Experience Real-Time Object Recognition with AI
Please allow camera access in your web browser to begin this exercise.
In this lesson, we will explore real-time object recognition using the COCO-SSD model and TensorFlow.js.
COCO-SSD is an AI model trained on the COCO (Common Objects in Context) dataset, which was released by Microsoft. It can quickly identify various objects such as people, animals, and vehicles.
In this context, categories like people, animals, and vehicles are called classes. COCO-SSD can recognize 80 different classes.
TensorFlow.js is a library that allows AI models to run directly in web browsers, enabling real-time object recognition through a webcam.
We’ll learn more about TensorFlow and how to use it in code in future lessons.
How Can You Experience It?
As in the previous exercise, this session identifies objects through a smart device's camera (e.g., a laptop's webcam or a smartphone's front camera).
Experience real-time object recognition by AI using a tablet or laptop.
How Does AI Recognize Objects?
When webcam video is passed to the COCO-SSD model, the AI analyzes it frame by frame to detect objects.
Based on its training on 80 object classes, the AI predicts which objects are present in each video frame.
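The frame-by-frame idea can be sketched as a small loop. This is a minimal illustration, not the lesson's actual implementation: it assumes a model object with a `detect(frame)` method that returns a Promise of detections, which is how the coco-ssd package's model behaves. The names `detectFrames` and `fakeModel`, and the string "frames", are hypothetical stand-ins so the loop can be tried without a camera.

```javascript
// Sketch: run a detector over a sequence of frames, one at a time.
// `model` is assumed to expose detect(frame) returning a Promise of
// [{ class, score, bbox }, ...].
async function detectFrames(model, frames) {
  const results = [];
  for (const frame of frames) {
    // Each frame is analyzed independently of the previous ones.
    const predictions = await model.detect(frame);
    results.push(predictions);
  }
  return results;
}

// Hypothetical stand-in model with made-up data (no camera or library needed).
const fakeModel = {
  async detect(frame) {
    return frame === "frame-with-person"
      ? [{ class: "person", score: 0.98, bbox: [0.1, 0.2, 0.5, 0.6] }]
      : [];
  },
};

// Print how many objects were detected in each frame.
detectFrames(fakeModel, ["frame-with-person", "empty-frame"]).then((results) => {
  console.log(results.map((predictions) => predictions.length));
});
```

In a browser, the model would typically come from the coco-ssd package's `load()` function, each frame would be the current image of a video element, and the loop would usually be driven by `requestAnimationFrame` rather than a fixed list.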
After analyzing a frame, the AI reports, for each detected object, the class with the highest probability and the coordinates of that object.
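To make "highest probability" concrete, here is a small sketch that picks the most confident detection from a list of results. The function name `bestDetection` and the sample data are assumptions made for illustration; only the `class`, `score`, and `bbox` fields come from the lesson.

```javascript
// Return the detection with the highest score, or null if the list is empty.
function bestDetection(predictions) {
  let best = null;
  for (const prediction of predictions) {
    if (best === null || prediction.score > best.score) {
      best = prediction;
    }
  }
  return best;
}

// Made-up sample results in the same shape as the lesson's example.
const sampleDetections = [
  { class: "car", score: 0.1, bbox: [0.3, 0.4, 0.7, 0.8] },
  { class: "person", score: 0.98, bbox: [0.1, 0.2, 0.5, 0.6] },
];

const winner = bestDetection(sampleDetections);
console.log(winner.class, winner.bbox);
```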
Below is an example of the object detection results returned by the AI. It contains two detections: one with a 98% probability of being a person, and another with a 10% probability of being a car.
[
  {
    "class": "person",             // Detected object's class
    "score": 0.98,                 // Probability of class match
    "bbox": [0.1, 0.2, 0.5, 0.6]   // Object's coordinates
  },
  {
    "class": "car",
    "score": 0.1,
    "bbox": [0.3, 0.4, 0.7, 0.8]
  }
]
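Results like these are usually filtered before being shown, so that low-confidence guesses such as the 10% car are hidden. The sketch below is one possible way to do that. The function name `describeDetections` and the 0.5 threshold are assumptions chosen for illustration, not values taken from the lesson.

```javascript
// Keep detections at or above a minimum score and turn them into short labels.
// The default threshold of 0.5 is an assumed value for this sketch.
function describeDetections(predictions, minScore = 0.5) {
  return predictions
    .filter((prediction) => prediction.score >= minScore)
    .map((prediction) => `${prediction.class} (${Math.round(prediction.score * 100)}%)`);
}

// The example results from the lesson, written as a JavaScript array.
const exampleResults = [
  { class: "person", score: 0.98, bbox: [0.1, 0.2, 0.5, 0.6] },
  { class: "car", score: 0.1, bbox: [0.3, 0.4, 0.7, 0.8] },
];

console.log(describeDetections(exampleResults)); // only the person passes the threshold
```

Lowering the threshold, for example `describeDetections(exampleResults, 0.05)`, would keep the car as well.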
Because TensorFlow.js runs in the browser, the detected object names and their coordinates can be displayed directly on the page, allowing users to view the recognition results in real time.
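One common way to display such results is to draw a labeled box over the video using a canvas. The sketch below is an illustration under assumptions, not the code this page uses: it assumes `bbox` is `[x, y, width, height]` in canvas pixels (the layout the coco-ssd package documents), and the function name `drawDetections`, the color, and the font are arbitrary choices.

```javascript
// Draw one labeled rectangle per detection on a canvas 2D context.
// Assumes bbox is [x, y, width, height] in canvas pixels.
function drawDetections(ctx, predictions) {
  ctx.lineWidth = 2;
  ctx.strokeStyle = "lime";
  ctx.fillStyle = "lime";
  ctx.font = "16px sans-serif";
  for (const { class: label, score, bbox } of predictions) {
    const [x, y, width, height] = bbox;
    // Outline the detected object.
    ctx.strokeRect(x, y, width, height);
    // Write the class name and score just above the box,
    // keeping the text inside the canvas when the box touches the top edge.
    ctx.fillText(`${label} ${Math.round(score * 100)}%`, x, Math.max(y - 4, 16));
  }
}
```

In a page, this could be called as `drawDetections(canvas.getContext("2d"), predictions)` after each frame is analyzed, where `canvas` is a canvas element layered over the video.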
The use of computers to process visual data such as images and videos is known as Computer Vision.
In the next lesson, we’ll explore how computer vision combines with AR (Augmented Reality) to power real-world applications.