This article is adapted from | Machine Learning Laboratory

Today I am very happy to share an article about OpenCV, focusing on the following questions:
1. How to deploy OpenCV.
2. What modules does OpenCV have and what can they do.
3. Familiarity and usage of OpenCV’s basic data structures.
I hope after reading the article, you can also start your journey with OpenCV.
01
What is OpenCV?


OpenCV is an open-source computer vision library that was started at Intel, with much of the early development done by Intel's team in Russia.
As an excellent computer vision library, it performs exceptionally well in many areas:
1. Programming Languages
Most modules are implemented in C++, with some in C, and interfaces are provided for Python, Ruby, MATLAB, and other languages.
2. Cross-Platform
It runs on desktop platforms such as Linux, Windows, and Mac OS, as well as mobile platforms like Android, iOS, and BlackBerry.
3. Active Development Team
Currently updated to OpenCV 4.0
4. Rich APIs
Comprehensive traditional computer vision algorithms, covering mainstream traditional machine learning algorithms, and adding support for deep learning.
OpenCV can handle almost every common image processing task. Here is a brief list:
- Video Analysis
- 3D Reconstruction
- Feature Extraction
- Object Detection
- Machine Learning
- Computational Photography
- Shape Analysis
- Optical Flow Algorithms
- Face and Object Recognition
- Surface Matching
- Text Detection and Recognition
02
How to Deploy OpenCV?

Generally, we will use the C++ and Python versions of OpenCV, so below we will introduce their installation, taking Ubuntu as an example.
2.1 Installing C++ OpenCV on Ubuntu
Install the libraries required for OpenCV
sudo apt-get install build-essential
sudo apt-get install cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev
sudo apt-get install python-dev python-numpy libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libjasper-dev libdc1394-22-dev
Download the OpenCV source archive (version 3.2.0 in this example) and unpack it
unzip opencv-3.2.0.zip
cd ~/opencv-3.2.0
Compile OpenCV
cd ~/opencv-3.2.0
mkdir release
cd release
cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local ..
make
sudo make install
Compilation and installation rarely succeed on the first try. Here are some common issues:
1. If the ippcv download fails during compilation, the solution is to manually download it.
2. If there is an include error with the LAPACK package, the solution is to modify the corresponding include file path immediately after cmake; if you modify it after make fails, it will be ineffective.
3. If certain modules cannot be found, it is usually because the contrib module is missing during compilation and installation.
2.2 Installing Python-OpenCV on Ubuntu
Install OpenCV
pip3 install opencv-python
Enter Python and import cv2
import cv2
03
Introduction to OpenCV Modules

OpenCV provides many built-in basic data structures for image processing and computer vision operations. They all live in the core module and have been optimized for speed and memory. The introduction below is based on version 4.0; for reference, see https://docs.opencv.org/master/d9/df8/tutorial_root.html.
The “modules directory” under the OpenCV directory lists the various modules included in OpenCV, among which core, highgui, imgproc are the most basic modules.

- The core module implements the most essential data structures and their basic operations, such as drawing functions, array operations, and interoperability with OpenGL.
- The highgui module implements the interfaces for displaying images and videos in windows. In OpenCV 3 and later, reading and writing image files is handled by the imgcodecs module and video I/O by the videoio module.
- The imgproc module implements basic methods for image processing, including image filtering, geometric transformations, smoothing, threshold segmentation, morphological processing, edge detection, object detection, motion analysis, and object tracking.
For other higher-level directions and applications in image processing, OpenCV also has relevant modules implemented.
- The features2d module is used for extracting image features and feature matching, and the nonfree module implements some patented algorithms, such as SIFT features.
- The objdetect module implements some object detection functions, including classic face detection based on Haar and LBP features and pedestrian and vehicle detection based on HOG, using classifiers such as cascade classifiers and Latent SVM.
- The stitching module implements image stitching functions.
- The FLANN module (Fast Library for Approximate Nearest Neighbors) includes fast approximate nearest neighbor search and clustering algorithms.
- The ml module is the machine learning module (SVM, decision trees, Boosting, etc.).
- The photo module includes image restoration and denoising.
- The video module is for video processing, such as background separation, foreground detection, and object tracking.
- The calib3d module is for camera calibration and 3D reconstruction. It contains basic multi-view geometry algorithms, single and stereo camera calibration, object pose estimation, stereo correspondence algorithms, and 3D information reconstruction.
- The G-API module contains a highly efficient image processing pipeline engine.
Additionally, modules previously in opencv2 such as shape, superres, videostab, viz have been moved to opencv_contrib, which we will introduce in detail later.
04
Basic Data Structures in OpenCV

OpenCV provides various basic data types. The most commonly used basic data structures are:
- Mat Class
- Point Class
- Size Class
- Rect Class
- Scalar Class
- Vec Class
- Range Class

Next, we will focus on the Mat class.
4.1 Mat Class
To proficiently use OpenCV, the most important thing is to learn the Mat data structure. In OpenCV, Mat is defined as a class, and it can be viewed as a data structure that stores data in a matrix form.
What are the common attributes of Mat?
- dims: the number of dimensions of the matrix. For example, a 2*3 matrix is 2-dimensional, and a 3*4*5 matrix is 3-dimensional.
- data: a uchar pointer to the block of memory that stores the matrix data.
- rows, cols: the number of rows and columns in the matrix.
- type: the element type of the matrix, which encodes both the depth and the number of channels. The naming rule is CV_ + (bit depth) + (data type) + C + (number of channels), where the data type is one of:
U: unsigned integer
S: signed integer
F: floating point
For example, CV_8UC3 splits into CV_ (the type prefix), 8U (8-bit unsigned integer, the depth), and C3 (3 channels).
- depth: the data type of a single channel value, which determines how many bits it occupies. It is the depth part of type. For example, for CV_8UC3 the depth is CV_8U.
- channels: the number of channels. For three-channel images such as RGB or HSV, channels = 3; for a single-channel grayscale image, channels = 1.
- elemSize: the size in bytes of one matrix element, with all channels included.
elemSize = channels * (bit depth) / 8.
For example: if the type is CV_8UC3, elemSize = 3 * 8 / 8 = 3 bytes.
- elemSize1: the size in bytes of a single channel of one matrix element.
elemSize1 = (bit depth) / 8.
For example: if the type is CV_8UC3, elemSize1 = 8 / 8 = 1 byte.
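The naming rule and the two size formulas above can be checked with a few lines of Python. This is a minimal sketch: `describe_type` is a hypothetical helper written for this article, not an OpenCV function. The second half relies on the fact that the Python binding represents a Mat as a NumPy array, where a 3-channel image has `ndim == 3` even though the corresponding C++ Mat reports `dims == 2`.

```python
import re

import numpy as np


def describe_type(name):
    """Decode a type name such as 'CV_8UC3' (hypothetical helper)."""
    m = re.fullmatch(r"CV_(\d+)([USF])(?:C(\d+))?", name)
    if m is None:
        raise ValueError(f"not a recognised type name: {name}")
    bits = int(m.group(1))
    kind = m.group(2)
    # A name without the C suffix, e.g. CV_8U, means a single channel.
    channels = int(m.group(3)) if m.group(3) else 1
    elem_size1 = bits // 8
    return {
        "depth": f"CV_{bits}{kind}",
        "channels": channels,
        "elemSize1": elem_size1,
        "elemSize": channels * elem_size1,
    }


for name in ("CV_8UC1", "CV_8UC3", "CV_32FC2"):
    print(name, describe_type(name))

# NumPy analogue of a 4-row, 6-column CV_8UC3 matrix.
img = np.zeros((4, 6, 3), dtype=np.uint8)
rows, cols = img.shape[:2]
channels = img.shape[2] if img.ndim == 3 else 1
elem_size1 = img.itemsize           # bytes per channel value
elem_size = channels * elem_size1   # bytes per pixel
```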
4.2 Other Data Types
1. Point Class
Contains two integer data members, x and y, which represent the coordinates of a point.
2. Size Class
Data members are width and height, generally used to represent the size of an image or matrix.
3. Rect Class
Data members are x, y, width, and height, which represent the coordinates of the top-left corner of the rectangle and its width and height.
4. Scalar Class
Scalar_(_Tp v0, _Tp v1, _Tp v2=0, _Tp v3=0)
This constructor takes four values. When a Scalar is used as a color for an image stored in OpenCV's default BGR channel order, they are:
v0: the B (blue) component
v1: the G (green) component
v2: the R (red) component
v3: the Alpha (transparency) component
5. Vec Class
A "one-dimensional matrix".
Vec<int,n> is an instantiation of the vector template class with type int. The first parameter, int, indicates that the Vec stores int values; the second parameter, n, is an integer indicating that each Vec object stores n int values, making it an n-dimensional vector (column vector).
6. Range Class
Used to specify a contiguous subsequence, such as part of a contour or a span of rows or columns of a matrix.
05
Basic I/O Operations

Here we use the Python interface.
1. Image Reading and Writing
cv2.imread(filename, flags) # Read an image; flags selects how it is loaded, e.g. color or grayscale
cv2.imshow(window name, image) # Display an image in a window
cv2.imwrite(filename, image) # Save an image to a file
cv2.namedWindow(window name) # Create a window
cv2.destroyAllWindows() # Destroy all windows
cv2.waitKey([delay]) # delay > 0: wait up to delay milliseconds for a key press
# delay <= 0: wait indefinitely for a key press

2. Image Resizing
dst = cv2.resize(src, dsize, fx=fx, fy=fy) # dsize is the output size as (width, height)
# fx, fy are the horizontal and vertical scaling ratios, used when dsize is None
3. Image Flipping
dst = cv2.flip(src, flipCode)
# flipCode = 0: flip around the x-axis (top and bottom swap)
# flipCode > 0: flip around the y-axis (left and right swap)
# flipCode < 0: flip around both axes

4. Channel Splitting and Merging
b, g, r = cv2.split(image) # Split into the three channels
b = cv2.split(image)[0] # Take a single channel by index (0 is B in a BGR image)
bgr = cv2.merge([b, g, r]) # Merge

06
Related Learning Materials

6.1 Online Resources
- OpenCV Docs Official Documentation
https://docs.opencv.org/
- OpenCV Official GitHub
https://github.com/opencv/opencv
- OpenCV Chinese Tutorial
http://www.opencv.org.cn/opencvdoc/2.3.2/html/doc/tutorials/tutorials.html
6.2 Chinese Books
- Python Computer Vision Programming
- OpenCV 3 Computer Vision: Implementation in Python
- OpenCV Algorithm Explanation: Based on Python and C++
Finally, I would like to recommend the OpenCV learning path.

Summary
This article briefly introduces the OpenCV framework, which is a tool that must be proficiently mastered in the field of computer vision. In this issue, we did not discuss specific algorithms and modules; in the future, we will launch “OpenCV Special Topics” for discussion.