ASN3

How does a Robot See the World?


Deadline


Mar 04, 2022 at 11:59:59 PM (Individual Submissions).



What do you need to do?


Most real-world robots rely on some form of robot perception: interpreting data from the robot's sensors. In this assignment, you'll work with data from a color camera. This sensor was chosen because it is cost-effective and ubiquitous; moreover, a wide variety of tasks can be accomplished with images/video from a color camera.

In this assignment, you'll learn how to interpret color images using Python, NumPy, and OpenCV to find/detect a cylindrical "orange" barrel in an .mp4 video, which acts as our makeshift replacement for a disaster survivor. For the purposes of this assignment, there is exactly one survivor/barrel in every frame of the video, and it is placed vertically on the ground. A sample image from the video is shown in Fig. 1 and the video preview is given in Fig. 2.


Fig. 1: Sample image frame from the video.


Fig. 2: Video in which barrel has to be detected.

Step 1: Download Data


The data is given in the form of a .zip file that contains an input color video called Vid.mp4, the masks video called Masks.mp4, a sample frame called Frame0064.png, and its corresponding mask called Label0064.png. The videos are in .mp4 format and the images are in .png format. The data can be downloaded from here. Feel free to convert the video to frames for debugging.

Step 2: Let's Learn OpenCV


You will be using the images that you downloaded in the previous step (extract the .zip file) in this step. I'll use the following set of questions to guide you in learning a few basic concepts of OpenCV.

Step 3: Learn About Color Images and Color Spaces


Just like the previous step, I'll use the following set of questions to guide you in learning a few basic concepts of color images.

Step 4: Your First Color Thresholder Application


Our human vision is really good at focusing on a particular color of interest. For example, when I ask you to focus on an "orange" colored ball, you can do it with ease, even though I never told you specifically what "orange" means. You can still focus on the "orange" color because you learned in your childhood what the color "orange" looks like. Similarly, we want to teach our robot to focus on/detect an object of a certain color. Let us consider that we are working with RGB images. The color "orange" spans a range of RGB values, so you'll need to zero out all the values that are "not orange". For example, an RGB value of \([255,145,0]\) is the stereotypical orange color, but the RGB values of \([209, 122, 9]\) and \([212, 120, 0]\) are also shades of orange (see Fig. 3 for an example). Now, answer the following fundamental question:

Let's say, we have a set of images across varying factors that can affect the way the captured image "looks". A simple example of how the "orange" barrel looks under different conditions is given below.


Fig. 3: The same "orange" barrel looks different under different conditions.

Now implement/answer the following:


Fig. 4: Left: Sample input, Right: Sample output.


Fig. 5: Input and "ideal" output video.

What do you need to submit?


A .zip file that contains a document (feel free to use Word, Google Docs, or any other software for this) converted to PDF with the following answers and the required code as mentioned below:

IMPORTANT NOTE: The submissions are made through ELMS with the name ASN3_DirID.pdf. Here, DirID is your directory ID, i.e., the first part of your terpmail email address. For example, if your terpmail email address is ABCD@terpmail.umd.edu, then your DirID is ABCD. Keep your submissions professional and grammatically correct, without spelling mistakes. Do not use slang or chat shorthand in your submissions. You'll receive a 25% grade penalty for not following the submission guidelines.