Overview of Essential OpenCV Knowledge for Computer Vision

This article is adapted from: Machine Learning Laboratory

Today I am very happy to share an article about OpenCV, focusing on the following questions:

1. How to deploy OpenCV.

2. What modules does OpenCV have and what can they do.

3. Familiarity and usage of OpenCV’s basic data structures.

I hope after reading the article, you can also start your journey with OpenCV.

What is OpenCV?

It is an open-source computer vision library that was initiated at Intel, with much of its early development carried out by Intel's team in Russia.

As an excellent computer vision library, it performs exceptionally well in many areas:

1. Programming Languages

Most modules are implemented in C++, with some in C, and interfaces are provided for Python, Ruby, MATLAB, and other languages.

2. Cross-Platform

It runs on desktop platforms such as Linux, Windows, and Mac OS, as well as mobile platforms like Android, iOS, and BlackBerry.

3. Active Development Team

At the time of writing, the library has been updated to OpenCV 4.0.

4. Rich APIs

Comprehensive traditional computer vision algorithms, covering mainstream traditional machine learning algorithms, and adding support for deep learning.

OpenCV can accomplish almost all image processing tasks. Here is a brief list:

Video Analysis

3D Reconstruction
Feature Extraction
Object Detection
Machine Learning
Computational Photography
Shape Analysis
Optical Flow Algorithms
Face and Object Recognition
Surface Matching
Text Detection and Recognition

How to Deploy OpenCV?

Generally, we will use the C++ and Python versions of OpenCV, so below we will introduce their installation, taking Ubuntu as an example.

2.1 Installing C++ OpenCV on Ubuntu

Install the libraries required for OpenCV

sudo apt-get install build-essential

sudo apt-get install cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev

sudo apt-get install python-dev python-numpy libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libjasper-dev libdc1394-22-dev

Download the OpenCV source code (version 3.2.0 is used in this example) and unpack it

unzip opencv-3.2.0.zip

cd ~/opencv-3.2.0

Compile OpenCV

cd ~/opencv-3.2.0

mkdir release

cd release

cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local ..

make

sudo make install

Compilation and installation often do not succeed on the first try. Here are some common issues:

1. If the ippicv download fails during the cmake step, download the package manually and place it where the build expects it.

2. If there is an include error with the LAPACK package, fix the corresponding include file path immediately after running cmake and before running make; changing it only after make has already failed has no effect.

3. If certain modules cannot be found, it is usually because the opencv_contrib modules were not included when OpenCV was compiled and installed.

2.2 Installing Python-OpenCV on Ubuntu

Install OpenCV

pip3 install opencv-python

Enter Python and import cv2

import cv2

Introduction to OpenCV Modules

OpenCV provides many built-in basic data structures for image processing and computer vision operations. They are all contained in the core module and have been optimized for speed and memory. The introduction below is based on version 4.0; refer to https://docs.opencv.org/master/d9/df8/tutorial_root.html.

The “modules directory” under the OpenCV directory lists the various modules included in OpenCV, among which core, highgui, imgproc are the most basic modules.

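core module implements the most essential data structures and their basic operations, such as array operation-related functions and interoperability with OpenGL (the drawing functions that belonged to core in OpenCV 2.x are part of imgproc in later versions).
highgui module implements the user interface for displaying images and videos in windows; since OpenCV 3, reading and writing image files is handled by the imgcodecs module and video I/O by the videoio module.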
imgproc module implements basic methods for image processing, including image filtering, geometric transformations, smoothing, threshold segmentation, morphological processing, edge detection, object detection, motion analysis, and object tracking.

For other higher-level directions and applications in image processing, OpenCV also has relevant modules implemented.

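features2d module is used for extracting image features and feature matching. In OpenCV 2.4 a separate nonfree module held patented algorithms such as SIFT; in later versions these live in the xfeatures2d module of opencv_contrib, and SIFT itself was moved into the main repository in version 4.4.0 after its patent expired.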
objdetect module implements some object detection functions, including classic face detection based on Haar and LBP features, pedestrian and vehicle detection based on HOG, using classifiers like Cascade Classification and Latent SVM.
stitching module implements image stitching functions.
FLANN module (Fast Library for Approximate Nearest Neighbors) includes fast approximate nearest neighbor search and clustering algorithms.
ml module is the machine learning module (SVM, decision trees, Boosting, etc.).
photo module includes image restoration and denoising.
video module is for video processing, such as background separation, foreground detection, and object tracking.
calib3d module is for camera calibration and 3D reconstruction, containing basic multi-view geometry algorithms, single and stereo camera calibration, object pose estimation, stereo correspondence algorithms, and 3D information reconstruction.
G-API module contains a highly efficient image processing pipeline engine.

Additionally, modules previously in opencv2 such as shape, superres, videostab, viz have been moved to opencv_contrib, which we will introduce in detail later.

Basic Data Structures in OpenCV

OpenCV provides various basic data types. Commonly used basic data structures include:

Mat Class
Point Class
Size Class
Rect Class
Scalar Class
Vec Class
Range Class

Next, we will focus on the Mat class.

4.1 Mat Class

To proficiently use OpenCV, the most important thing is to learn the Mat data structure. In OpenCV, Mat is defined as a class, and it can be viewed as a data structure that stores data in a matrix form.
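In the Python interface there is no separate Mat object: images are passed to and returned from cv2 functions as NumPy arrays. The sketch below only uses NumPy and shows how the matrix view carries over; the array sizes and pixel values are made up for illustration.

```python
import numpy as np

# 4 rows, 6 columns, 3 channels, 8-bit unsigned: the counterpart of a CV_8UC3 Mat.
img = np.zeros((4, 6, 3), dtype=np.uint8)

rows, cols, channels = img.shape
print("rows:", rows, "cols:", cols, "channels:", channels)
print("element type:", img.dtype)

# Indexing is (row, column); one pixel holds one value per channel.
img[1, 2] = (10, 20, 30)
print("pixel at row 1, col 2:", img[1, 2].tolist())
```

Note that the index order is row first, then column, which is the reverse of the (x, y) order used for points.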

What are the common attributes of Mat?

dims: the number of dimensions of the matrix; for example, a 2*3 matrix is 2-dimensional, and a 3*4*5 matrix is 3-dimensional.
data: a uchar pointer that points to the block of memory storing the matrix data.
rows, cols: number of rows and columns in the matrix.
type: represents the data type of the matrix elements (depth) together with the number of channels; the naming rule is CV_ + (bit depth) + (data type) + C + (number of channels).

Where the data type letter is:

U: unsigned integer

S: signed integer

F: floating point

For example, CV_8UC3 can be split into: CV_: type prefix, 8U: 8-bit unsigned integer (depth), C3: 3 channels (channels).

depth: the data type of a single channel of one element, which determines the number of bits per channel; this value is related to type. For example, in CV_8UC3, depth is CV_8U (8 bits).
channels: number of channels; if the image is RGB, HSV, etc., which are three-channel images, then channels = 3; if the image is grayscale, it is single channel, then channels = 1.
elemSize: the size in bytes of each element of the matrix (all channels together).

elemSize = channels * (bit depth) / 8.

For example: If the type is CV_8UC3, elemSize = 3 * 8 / 8 = 3 bytes.
elemSize1: the size in bytes of a single channel of one matrix element.

elemSize1 = (bit depth) / 8.

For example: If the type is CV_8UC3, elemSize1 = 8 / 8 = 1 byte.
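The naming rule and the two size formulas can be checked with a few lines of plain Python. This is a sketch only: parse_cv_type, elem_size and elem_size1 are hypothetical helpers written for this example, not OpenCV functions, and they only handle names that spell out the channel count, such as CV_8UC3.

```python
import re


def parse_cv_type(name):
    """Split a type name such as 'CV_8UC3' into (bits, kind, channels)."""
    match = re.fullmatch(r"CV_(\d+)([USF])C(\d+)", name)
    if match is None:
        raise ValueError(f"not a recognised type name: {name}")
    bits, kind, channels = match.groups()
    return int(bits), kind, int(channels)


def elem_size(name):
    """Bytes per matrix element, all channels together: channels * bits / 8."""
    bits, _, channels = parse_cv_type(name)
    return channels * bits // 8


def elem_size1(name):
    """Bytes per single channel of one element: bits / 8."""
    bits, _, _ = parse_cv_type(name)
    return bits // 8


for type_name in ("CV_8UC3", "CV_32FC1", "CV_16SC2"):
    print(type_name, parse_cv_type(type_name), elem_size(type_name), elem_size1(type_name))
```

For CV_8UC3 this reproduces the values worked out above: 3 bytes per element and 1 byte per channel.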

4.2 Other Data Types

1. Point Class

Contains two integer data members, x and y, which represent the coordinates of a point.

2. Size Class

Data members are width and height, generally used to represent the size of an image or matrix.

3. Rect Class

Data members are x, y, width, and height, which represent the coordinates of the top-left corner of the rectangle and the width and height of the rectangle.

4. Scalar Class

Scalar_(_Tp v0, _Tp v1, _Tp v2=0, _Tp v3=0)

A Scalar holds up to four values. When it is used as a color for a BGR image, the four parameters of this constructor represent the color components plus alpha:

v0: the B (blue) component

v1: the G (green) component

v2: the R (red) component

v3: the alpha (transparency) component.

5. Vec Class

A “one-dimensional matrix”.

Vec<int, n> is an instantiation of the Vec template class with element type int. The first parameter, int, indicates that the Vec stores values of type int; the second parameter, n, is an integer indicating that each Vec object stores n int values, that is, an n-dimensional vector (column vector).

6. Range Class

Used to specify a continuous subsequence, such as part of a contour or a span of rows or columns of a matrix.

Basic I/O Operations

Here we use the Python interface.

1. Image Reading and Writing

cv2.imread(filename, flags) # Read image; flags controls how it is loaded (color, grayscale, unchanged)

cv2.imshow(winname, img) # Display an image in the named window

cv2.imwrite(filename, img) # Save an image to the given file path

cv2.namedWindow(winname) # Create window

cv2.destroyAllWindows() # Destroy all windows

cv2.waitKey(delay) # delay > 0: wait up to delay milliseconds for a key press

# delay <= 0: wait indefinitely for a key press (the default is 0)

2. Image Resizing

dst = cv2.resize(src, dsize, fx=fx, fy=fy) # dsize is the output size as (width, height)

# fx, fy are the scaling ratios along the horizontal and vertical axes; they are used when dsize is None

3. Image Flipping

dst = cv2.flip(src, flipCode)

# flipCode = 0: flip around the X-axis (vertical flip, top and bottom swap)

# flipCode > 0: flip around the Y-axis (horizontal flip, left and right swap)

# flipCode < 0: flip around both the X and Y axes

4. Channel Splitting and Merging

b,g,r = cv2.split(image)

b = cv2.split(image)[0] # Split and keep one channel by its index (0 = B, 1 = G, 2 = R)

bgr = cv2.merge([b,g,r]) # Merge

Related Learning Materials

6.1 Online Resources

OpenCV Docs Official Documentation

https://docs.opencv.org/
OpenCV Official GitHub

https://github.com/opencv/opencv
OpenCV Chinese Tutorial

http://www.opencv.org.cn/opencvdoc/2.3.2/html/doc/tutorials/tutorials.html

6.2 Chinese Books

Python Computer Vision Programming
OpenCV 3 Computer Vision: Implementation in Python
OpenCV Algorithm Explanation: Based on Python and C++

Finally, I would like to recommend the OpenCV learning path.

Summary

This article briefly introduces the OpenCV framework, which is a tool that must be proficiently mastered in the field of computer vision. In this issue, we did not discuss specific algorithms and modules; in the future, we will launch “OpenCV Special Topics” for discussion.
