Understanding the Knowledge System of Computer Vision

Introduction

Computer vision is an important field of artificial intelligence. To use a loose metaphor, I think of computer vision as the eyes of the AI era, which shows how important it is. Computer vision is a broad concept, and this article organizes it systematically for interested readers.

1. Computer Vision:

Three Levels: System Engineering Solution Level, Domain Task Module Level, Basic Algorithm Level.

Three Knowledge Points: Image Processing, Machine Learning, Basic Mathematics and Models.

Three Video Scenarios: Close Range (Mobile, Smart Hardware, PC, etc.), Indoor Mid-Range (Offices, Malls, Homes; Checkpoints, Entrances, etc.), Outdoor Long Range (Roads, Public Places, etc.)

2. System Engineering Solution Level: Web Image Structuring; Offline SDK Image Structuring; Key Frame Acquisition, Structuring, Serialization Behavior Analysis, Result Image Streaming.

Performance: High Concurrency; High Availability; Single-Image Processing Time (preferably within 200 ms, especially for video); Accuracy.
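As a minimal sketch of how the single-image time target above could be checked, the following measures wall-clock time per image and reports how many images exceed the 200 ms budget. The names `measure_latency_ms`, `summarize`, and the `process` callable are hypothetical and stand in for whatever pipeline is actually being evaluated.

```python
import statistics
import time
from typing import Callable, Dict, Iterable, List

BUDGET_MS = 200.0  # per-image target quoted in the text above


def measure_latency_ms(process: Callable[[object], object],
                       images: Iterable[object]) -> List[float]:
    """Return the wall-clock time, in milliseconds, spent in `process` per image."""
    timings = []
    for image in images:
        start = time.perf_counter()
        process(image)  # hypothetical pipeline entry point
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def summarize(timings_ms: List[float], budget_ms: float = BUDGET_MS) -> Dict[str, float]:
    """Summarize timings and report the share of images that exceeded the budget."""
    over = sum(1 for t in timings_ms if t > budget_ms)
    return {
        "count": len(timings_ms),
        "mean_ms": statistics.mean(timings_ms),
        "max_ms": max(timings_ms),
        "over_budget_ratio": over / len(timings_ms),
    }
```

In practice the mean alone can hide slow outliers, so the over-budget ratio and the maximum are reported alongside it.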

3. Domain Task Module Level: Five Domains (Person, Vehicle, Text, Object, Event)

Person: Human Body (Detection, Key Points, Attribute Classification, Behavior, Recognition or Image Search for People); Face (Detection, Key Points, Attribute Classification, Liveness Detection, Recognition);

Vehicle: Vehicle (Detection, Key Points, Brand Subclassification, Attribute Classification, Behavior, Recognition or Image Search for Vehicles); License Plate (Detection, Style Classification, Text Recognition);

Text: OCR (Image Preprocessing, Image-Based Classification, Full Text Detection, Specified Field Localization, Text Recognition, Text-Content-Based Classification); Application Areas: Bills, Certificates (Personal, Corporate), Licenses, License Plates, Natural Scenes (Internal System Images, House Numbers, Bus Stops, Objects, etc.)

Object: Animal (Detection, Key Points, Breed Subclassification, Attribute Classification, Behavior, Recognition or Image Search for Animals); Object (Detection, Key Points, Brand Subclassification, Attribute Classification, Recognition or Image Search for Objects)

Event: Specific Scene Detection, such as Smoke and Fire, Abandoned Objects, Industrial Vision, etc.

4. Basic Algorithm Level: Three Aspects (Detection and Segmentation, Classification and Recognition, Image Preprocessing)

Detection and Segmentation: Locating the Target Position, Classifying the Target Type, Extracting Target Key Points, Segmenting Target Pixels from the Image.

Classification and Recognition: Classification has three layers: Major Category, Subcategory, Fine Category. Once a target is detected, classification covers its brand or breed, its attributes (color, shape, category, etc.), its static behavior, and its sequential behavior. Recognition extracts target features and combines them with the category to recognize the target or search for it by image.

Image Preprocessing: Image Enhancement, Dehazing, Brightness Adjustment, Tilt Correction, etc.
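To make one of the preprocessing steps above concrete, here is a small pure-Python sketch of brightness adjustment on an 8-bit grayscale image stored as nested lists. The linear gain/offset model and the function name `adjust_brightness` are assumptions chosen for illustration; a real pipeline would more likely use an array library.

```python
from typing import List

Image = List[List[int]]  # rows of 8-bit grayscale pixel values (0..255)


def adjust_brightness(image: Image, alpha: float = 1.0, beta: float = 0.0) -> Image:
    """Apply out = alpha * pixel + beta to every pixel, clamped to the 0..255 range.

    alpha scales contrast (gain); beta shifts brightness (offset).
    """
    def clamp(value: float) -> int:
        return max(0, min(255, int(round(value))))

    return [[clamp(alpha * pixel + beta) for pixel in row] for row in image]
```

For example, `adjust_brightness(img, beta=20)` lifts every pixel by 20 levels, while values that would exceed 255 saturate at 255 instead of wrapping around.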

5. In-Depth Domain Insights:

In the face domain, the key concerns are detection sensitivity (e.g., detecting tilted faces), false detection rate, face feature extraction speed, and face recognition accuracy.

In the vehicle domain, the key concerns are license plate accuracy and sensitivity; accuracy of color, type, and brand; extraction of overall vehicle features and internal local features; and vehicle behavior analysis.

In the text domain, the key concerns are the impact of image quality on text detection and recognition, image preprocessing, the accuracy and false negative rate of text detection, text recognition, and semantic analysis of the text.
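The rates mentioned in the face, vehicle, and text paragraphs above can be computed from counts of true positives, false positives, and false negatives. The sketch below assumes that "false detection rate" means the share of detections that are wrong, FP / (TP + FP), and that the false negative (miss) rate is FN / (TP + FN); the source does not define these terms, so treat both definitions and the function name `detection_rates` as assumptions.

```python
from typing import Dict


def detection_rates(tp: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute common detection quality rates from raw counts.

    tp: detections that match a real target
    fp: detections with no matching real target (false detections)
    fn: real targets that were not detected (missed detections)
    """
    detected = tp + fp  # everything the detector reported
    actual = tp + fn    # everything that was really there
    return {
        "precision": tp / detected if detected else 0.0,
        "recall": tp / actual if actual else 0.0,
        # Assumed definition: share of reported detections that are wrong.
        "false_detection_rate": fp / detected if detected else 0.0,
        # Share of real targets the detector missed.
        "false_negative_rate": fn / actual if actual else 0.0,
    }
```

Calling `detection_rates(tp=80, fp=20, fn=20)` describes a detector that reported 100 boxes for 100 real targets, with 80 of them correct.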

In the video domain, there are three major challenges: (1) High false detection rate. (2) Missed detections: targets under occlusion, backlighting, or large tilt poses go undetected. (3) Speed: detection algorithms often cannot run fully in real time; under 100 ms per frame is considered good. Solutions for the speed issue: a. Process only video key frames or interval frames; b. Compress the image before detection, then restore the coordinates to the original image; c. Run time-consuming modules only at critical moments and perform data association the rest of the time.
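The three speed-up ideas above can be sketched in a few helper functions. This is an illustration under stated assumptions rather than the author's implementation: all function names are hypothetical, boxes are assumed to be `(x1, y1, x2, y2)` pixel tuples, and IoU overlap is swapped in as one common choice for the data association step, which the source does not specify.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def sample_frame_indices(total_frames: int, interval: int) -> List[int]:
    """Solution a: keep every `interval`-th frame index instead of all frames."""
    if interval < 1:
        raise ValueError("interval must be >= 1")
    return list(range(0, total_frames, interval))


def downscale_size(width: int, height: int, max_side: int) -> Tuple[int, int, float]:
    """Solution b, step 1: choose a smaller size whose longer side is at most `max_side`.

    Returns (new_width, new_height, scale); images already small enough keep scale 1.0.
    """
    scale = min(1.0, max_side / max(width, height))
    return round(width * scale), round(height * scale), scale


def restore_box(box: Box, scale: float) -> Box:
    """Solution b, step 2: map a box found on the downscaled image back to the original."""
    x1, y1, x2, y2 = box
    return (x1 / scale, y1 / scale, x2 / scale, y2 / scale)


def iou(a: Box, b: Box) -> float:
    """Solution c helper (assumed): overlap score for associating boxes between frames."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0
```

A typical flow would run the expensive detector only on the sampled, downscaled frames, restore each box with `restore_box`, and link boxes across frames when their `iou` exceeds a chosen threshold.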

Source: CSDN Blog, Author: shaoshuai_AI_DATA
