How is detection done using Computer Vision and Deep Learning?
Object detection here combines computer vision and deep learning techniques to detect the objects of interest in each frame. The pipeline also tracks the objects and determines their direction of motion, for example whether an object is moving towards or away from the camera or taking a left turn. Finally, it reports the total number of vehicles and people that passed along the road in a given period.
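The text does not say how direction is derived, so the following is only a minimal sketch of one common approach: compare an object's oldest and newest tracked centroids. The function name `estimate_direction`, the `min_shift` threshold, and the mapping of downward image motion to "towards camera" are all assumptions made for illustration; the real mapping depends on where the camera is mounted.

```python
def estimate_direction(history, min_shift=5.0):
    """Classify motion from a list of (x, y) centroids, oldest first.

    Hypothetical helper: assumes image y grows downward and that the
    camera looks along the road, so downward motion means approaching.
    """
    if len(history) < 2:
        return "unknown"  # not enough observations yet

    # Net displacement between the first and last observed centroid.
    dx = history[-1][0] - history[0][0]
    dy = history[-1][1] - history[0][1]

    # Ignore jitter smaller than min_shift pixels on both axes.
    if abs(dx) < min_shift and abs(dy) < min_shift:
        return "stationary"

    # Pick the dominant axis of motion.
    if abs(dy) >= abs(dx):
        return "towards camera" if dy > 0 else "away from camera"
    return "right" if dx > 0 else "left"
```

A per-direction count of vehicles and people could then be kept by calling such a function once per tracked ID when the object leaves the frame.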
The deep learning model that detects cars and people in the given videos is YOLO (You Only Look Once), trained on the COCO dataset. This YOLO model can detect up to 80 classes of objects. It also has a lightweight "tiny" architecture that compromises on accuracy but gives improved FPS. The objects in the videos are tracked using a centroid tracker, which uses the Euclidean distance between object centroids to associate objects across successive frames.
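The association step of a centroid tracker can be sketched as follows. This is an illustrative simplification rather than the exact tracker used here: the names `centroid` and `match_centroids`, the greedy closest-pair-first strategy, and the `max_distance` cutoff are assumptions. Bounding boxes are assumed to be `(x1, y1, x2, y2)` tuples such as a detector like YOLO might produce after post-processing.

```python
import math

def centroid(box):
    """Return the (x, y) center of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def match_centroids(tracked, detections, max_distance=50.0):
    """Greedily pair tracked IDs with new centroids by Euclidean distance.

    tracked: dict mapping object ID -> (x, y) centroid from the previous frame
    detections: list of (x, y) centroids from the current frame
    Returns (matches, unmatched), where matches maps object ID -> index
    into detections and unmatched lists the detection indices left over.
    """
    # Build every (distance, id, index) candidate; closest pairs come first.
    pairs = sorted(
        (math.dist(c, d), obj_id, j)
        for obj_id, c in tracked.items()
        for j, d in enumerate(detections)
    )
    matches, used = {}, set()
    for dist, obj_id, j in pairs:
        if dist > max_distance:
            break  # all remaining pairs are even farther apart
        if obj_id in matches or j in used:
            continue  # each ID and each detection is used at most once
        matches[obj_id] = j
        used.add(j)
    unmatched = [j for j in range(len(detections)) if j not in used]
    return matches, unmatched
```

For example, with `tracked = {0: (10, 10), 1: (100, 100)}` and detections at `(102, 98)`, `(12, 11)` and `(300, 300)`, ID 0 is paired with the second detection, ID 1 with the first, and the third is left unmatched so it can be registered as a new object.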
From the above video it is clear that when an object moves out of the frame, its ID is retained for some of the successive frames. This handles occlusion scenarios, where an object is temporarily hidden by another object. The video below shows an example of occlusion in which ID 50 and ID 51 are occluded by other vehicles and by the lighting but retain the same IDs.
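One way to get this ID retention is to count, per object, how many consecutive frames it has gone unmatched and to drop the ID only after a limit is exceeded. The sketch below is a minimal illustration under that assumption; the class name `SimpleCentroidTracker` and the parameters `max_disappeared` and `max_distance` are hypothetical and not taken from the project.

```python
import math

class SimpleCentroidTracker:
    """Toy centroid tracker that keeps IDs alive through short gaps."""

    def __init__(self, max_disappeared=5, max_distance=50.0):
        self.next_id = 0
        self.objects = {}      # ID -> last known (x, y) centroid
        self.disappeared = {}  # ID -> consecutive frames without a match
        self.max_disappeared = max_disappeared
        self.max_distance = max_distance

    def update(self, detections):
        """Update tracks with the current frame's list of (x, y) centroids."""
        # All candidate pairs, closest first (Euclidean distance).
        pairs = sorted(
            (math.dist(c, d), obj_id, j)
            for obj_id, c in self.objects.items()
            for j, d in enumerate(detections)
        )
        matched_ids, used = set(), set()
        for dist, obj_id, j in pairs:
            if dist > self.max_distance:
                break
            if obj_id in matched_ids or j in used:
                continue
            self.objects[obj_id] = detections[j]
            self.disappeared[obj_id] = 0  # seen again, reset the counter
            matched_ids.add(obj_id)
            used.add(j)

        # Unmatched IDs are retained until they exceed max_disappeared.
        for obj_id in list(self.objects):
            if obj_id not in matched_ids:
                self.disappeared[obj_id] += 1
                if self.disappeared[obj_id] > self.max_disappeared:
                    del self.objects[obj_id]
                    del self.disappeared[obj_id]

        # Detections that matched nothing become new objects.
        for j, d in enumerate(detections):
            if j not in used:
                self.objects[self.next_id] = d
                self.disappeared[self.next_id] = 0
                self.next_id += 1
        return dict(self.objects)
```

With `max_disappeared=2`, an object that is hidden for one or two frames and then reappears near its last position keeps its ID, while one that stays missing for three frames is dropped and would receive a new ID on return. A larger limit tolerates longer occlusions at the cost of a higher risk of reassigning a stale ID to a different object.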