Learn with Jaishree - Object Detection Techniques

Hello Reader! Greetings from Jaishree :)

Have you ever wondered how a football is tracked in a live match?! How does a self-driving car identify the objects around it while driving?! All this happens through an amazing technique called ‘OBJECT DETECTION’. Today, in this blog, I’m going to give you an idea of how this works in the real world. I’m going to start with what object detection is and explain it with very simple examples. We are also going to learn about some of the deep learning techniques used for object detection.

Let’s get started…

WHAT IS OBJECT DETECTION?!

As the name suggests, object detection detects objects. Here the objects are visual ones: for example, people, animals, vehicles, accessories, buildings and much more. Object detection is a computer vision task that detects objects in images and videos.

The goal of object detection is to recognize and locate all the known objects in a still image or in video data. The information from the object detector is used in a wide range of real-world applications, and data scientists play an important role in building these algorithms using deep learning techniques.

TECHNIQUES USED FOR OBJECT DETECTION

Object detection is performed with either classical Machine Learning or Deep Learning, and the process differs between the two. In classical Machine Learning, the features (such as edges, colours or shapes) are designed and selected manually, and a classifier is then trained on them. In Deep Learning, feature selection is automatic: the features are learned from the data using convolutional neural networks (CNNs). Both are usually trained as supervised learning on labelled images. Deep learning detectors are generally more accurate, and many of them are fast enough for real-time use.

Difference between ML and Deep Learning

In this blog we are going to see some of the DEEP LEARNING approaches.

DEEP LEARNING TECHNIQUES

Object detection is used to understand what’s in the image and where the objects are found in the image. To achieve this task, there are two different approaches.

A single network makes a fixed number of predictions directly on the image (one stage)
One network proposes regions that may contain objects, and another network fine-tunes these proposals to predict the final output (two stage)

There are many deep learning techniques used for object detection. The image below shows the popular ones.

In this blog, I’m going to explain one of the fastest, most popular and most widely used techniques, YOLO.

YOLO - STATE OF THE ART OBJECT DETECTION ALGORITHM

YOLO is an abbreviation for “You Only Look Once”. It was introduced in 2015 and was much faster than the detectors that came before it, while staying competitive in accuracy. YOLO has become one of the standard ways of detecting objects in the field of computer vision because of its speed and accuracy. It treats detection as a regression problem: in a single pass, the network predicts the bounding boxes and the class probabilities of the detected objects. The YOLO algorithm divides the image into an S x S grid of equally sized cells. Each grid cell is responsible for detecting the objects whose centre falls inside it.
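To make the grid idea concrete, here is a minimal sketch of how we could find the cell that is responsible for an object. The function name `responsible_cell`, the 3 x 3 grid and the image size are my own assumptions for illustration; they are not taken from any YOLO library.

```python
def responsible_cell(x_center, y_center, img_w, img_h, S=3):
    """Return (row, col) of the grid cell that contains the object's centre.

    Hypothetical helper: the image is split into S x S equal cells,
    and the cell holding the centre point is the one responsible for the object.
    """
    # Scale the pixel position to grid units, then clamp so that a centre
    # lying exactly on the right/bottom edge still maps to the last cell.
    col = min(int(x_center / img_w * S), S - 1)
    row = min(int(y_center / img_h * S), S - 1)
    return row, col


# An object centred in the middle of a 640 x 480 image falls in the middle cell.
print(responsible_cell(320, 240, 640, 480))  # (1, 1)
```

Rows and columns are counted from 0, starting at the top-left corner of the image.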

Example: Consider an image classification problem with this image. Let’s say that we are trying to identify whether the image has a dog or a person.

Here I’m taking only 2 classes for easier understanding of the concept: C1 as Dog and C2 as Person.

In this image, the output is clear and simple: Dog is 1 and Person is 0. The bounding box shows exactly where the identified dog is in the image.

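To produce this output, the CNN creates a vector with seven values: Pc (the probability that an object is present), four values for the bounding box, and one value for each of the two classes. I have explained this in the image below.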

Now, here I use my own image to find the vector. We can clearly see that the Dog class is now 0 and the Person class is 1. The probability of an object, Pc, is 1 because a person is detected in the image.

What if there is no object in the image?! What if there is no person or dog in the image? The object probability, Pc, becomes zero, as in the image below.
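Here is a small sketch of the seven-value vector for the three cases above. I am assuming the layout [Pc, bx, by, bw, bh, C1, C2], where the four box values are the centre, width and height of the bounding box as fractions of the image size. The box numbers and the helper name `make_label` are made up for illustration.

```python
CLASSES = ["dog", "person"]  # C1 = Dog, C2 = Person, as in the example


def make_label(obj=None):
    """Build the vector [Pc, bx, by, bw, bh, C1, C2].

    obj is None (nothing in the image) or a tuple
    (class_name, bx, by, bw, bh) with box values between 0 and 1.
    """
    if obj is None:
        # Pc = 0: no object. The other six values do not matter in this
        # case, so this sketch simply fills them with zeros.
        return [0.0] * (5 + len(CLASSES))
    name, bx, by, bw, bh = obj
    one_hot = [1.0 if c == name else 0.0 for c in CLASSES]
    return [1.0, bx, by, bw, bh] + one_hot


print(make_label(("dog", 0.4, 0.7, 0.5, 0.4)))     # Dog = 1, Person = 0
print(make_label(("person", 0.5, 0.6, 0.3, 0.8)))  # Dog = 0, Person = 1
print(make_label())                                # Pc = 0, no object
```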

After this object localization, the input image is divided into a grid of equally sized cells, and the final detections are made based on the confidence scores of the bounding boxes and the class probabilities of the objects.
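The scoring step can be sketched roughly like this. I am assuming each cell gives one vector [Pc, bx, by, bw, bh, C1, C2], that the final score is Pc multiplied by the best class probability, and that we keep boxes scoring at least 0.5. The threshold, the sample numbers and the function name `detect` are my own assumptions.

```python
def detect(cell_predictions, class_names, threshold=0.5):
    """Keep the boxes whose score (Pc x best class probability) passes the threshold."""
    detections = []
    for pc, bx, by, bw, bh, *class_probs in cell_predictions:
        # Index of the most likely class for this cell.
        best = max(range(len(class_probs)), key=class_probs.__getitem__)
        score = pc * class_probs[best]
        if score >= threshold:
            detections.append((class_names[best], round(score, 2), (bx, by, bw, bh)))
    return detections


# Made-up predictions for three grid cells.
predictions = [
    [0.9, 0.3, 0.6, 0.2, 0.3, 0.8, 0.2],  # confident box, most likely a dog
    [1.0, 0.7, 0.5, 0.2, 0.7, 0.1, 0.9],  # confident box, most likely a person
    [0.1, 0.5, 0.5, 0.1, 0.1, 0.5, 0.5],  # low Pc, gets filtered out
]

for name, score, box in detect(predictions, ["dog", "person"]):
    print(name, score, box)
```

Only the first two cells survive the filter, because the third one has a very low object probability.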

Let me explain with an example. Consider this image, where there are two objects: a dog and a person.
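A two-object image like this could be encoded on the grid as in the sketch below. I am assuming a 3 x 3 grid, one object per cell, and box values given as fractions of the whole image (YOLO itself stores the centre relative to the cell, which I skip here to keep things simple). The positions of the dog and the person are invented numbers.

```python
CLASSES = ["dog", "person"]  # C1 = Dog, C2 = Person


def encode(objects, S=3):
    """Return an S x S grid where every cell holds [Pc, bx, by, bw, bh, C1, C2]."""
    # Start with "no object" (all zeros) in every cell.
    grid = [[[0.0] * (5 + len(CLASSES)) for _ in range(S)] for _ in range(S)]
    for name, bx, by, bw, bh in objects:
        # The cell that contains the object's centre is responsible for it.
        col = min(int(bx * S), S - 1)
        row = min(int(by * S), S - 1)
        one_hot = [1.0 if c == name else 0.0 for c in CLASSES]
        grid[row][col] = [1.0, bx, by, bw, bh] + one_hot
    return grid


objects = [
    ("dog", 0.25, 0.70, 0.30, 0.40),     # lower left of the image
    ("person", 0.70, 0.50, 0.25, 0.80),  # right side of the image
]
grid = encode(objects)

# Print only the cells that contain an object.
for r, row in enumerate(grid):
    for c, cell in enumerate(row):
        if cell[0] == 1.0:
            print((r, c), cell)
```

Every other cell keeps Pc = 0, which tells the network that there is nothing to detect there.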

This is how the YOLO technique works to detect objects in images. I hope you now have an idea of how YOLO works. Here is the YOLO architecture.

REAL TIME APPLICATIONS

We are now going to see some of the interesting real-time applications of object detection.

Self-Driving Cars

This is one of the best applications, because autonomous driving requires knowing about all the objects around the car. The self-driving car should know when to apply the brakes, when to turn and what to do next. It should detect objects such as other vehicles, traffic lights, road signs, signals, etc. Object detection is used to provide all this necessary information.

Tracking Objects

Wherever an object needs to be tracked, an object detection system can be used. From tracking the ball in a football match or the ball and bat in cricket to tracking a particular person in a video, there is a wide variety of applications.

Medical Imaging

Object detection assists clinicians in diagnosis and image-guided interventions, which helps in planning better treatment for patients. Fast tracking of deformable anatomical objects such as the heart, kidneys, lungs and brain is a crucial task in medical image analysis.

There are many other applications, such as:

Object Recognition as Image Search
Manufacturing Industries
Robotics
Automated CCTV
Automatic Image Annotation
Pedestrian Detection
Activity Recognition
Identity verification
Face Recognition and much more

CONCLUSION

Object detection is used in a wide range of industries, with uses ranging from personal security to productivity in the workplace. There are endless possibilities when it comes to future use cases. I really hope you have learnt something from this blog. Happy Learning :)

ABOUT AUTHOR

I’m a Mathematics Graduate with a Post Graduate Diploma in Retail Marketing Management. With 10+ years of experience in various business domains, I’m aspiring to enter the Data World. I love to code and am fascinated by data-driven technologies.