Image Recognition Preprocessing Techniques


In image recognition, the quality of the image directly affects the design and accuracy of the recognition algorithm. Besides optimization on the algorithm side, preprocessing techniques play a crucial role in the entire project, yet people often overlook this point.

In a text-recognition pipeline, for example, image preprocessing is the stage that extracts each character image and hands it over to the recognition module for identification.

The main purpose of image preprocessing is to eliminate irrelevant information in the image, restore useful real information, enhance the detectability of relevant information, and simplify data to improve the reliability of feature extraction, image segmentation, matching, and recognition. The preprocessing process generally includes steps such as digitization, geometric transformation, normalization, smoothing, restoration, and enhancement.

1. Filtering: Filtering is the operation of removing specific frequency bands from a signal and is an important means of suppressing and preventing interference. Note: mean, median, and Gaussian filtering all reduce image sharpness to some extent. Clear images do not require filtering; in general, filter only when the image contains obvious noise.
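As an illustration of median filtering, here is a minimal 3×3 version in pure Python (a sketch with a hypothetical helper name; the image is a list of lists of gray values, and borders are left unfiltered for brevity):

```python
def median_filter_3x3(img):
    """Replace each interior pixel with the median of its 3x3 neighborhood."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]  # copy; border pixels stay unchanged
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            window.sort()
            out[y][x] = window[4]  # median of the 9 values
    return out

# An isolated noise spike in a flat region is removed:
img = [[10, 10, 10],
       [10, 255, 10],
       [10, 10, 10]]
print(median_filter_3x3(img)[1][1])  # -> 10
```

This is why median filtering suits salt-and-pepper noise: the outlier never enters the output, unlike an average, which would merely dilute it.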

2. Image normalization is similar to histogram normalization. To a first approximation, it maps pixel values from the 0-255 range into 0-1, reducing the spread of the distribution. Note: it works well when the grayscale difference between target and background is large; otherwise it may cause adjacent regions to merge (adhesion) and segmentation to fail.
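A minimal sketch of this 0-255 to 0-1 mapping via min-max scaling (hypothetical helper name; real code would typically use a library routine):

```python
def normalize_01(img):
    """Min-max normalize grayscale values into the range [0, 1]."""
    flat = [p for row in img for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero on a flat image
    return [[(p - lo) / span for p in row] for row in img]

img = [[0, 128], [255, 64]]
out = normalize_01(img)
print(out[0][0], out[1][0])  # darkest pixel -> 0.0, brightest -> 1.0
```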

3. Image smoothing (a form of filtering) suppresses abrupt changes in image grayscale, while sharpening enhances grayscale changes and highlights image detail. Note: smoothing and sharpening are both filtering operations and use the same function; the difference lies only in the choice of operator (kernel).

For example, in MATLAB, build a filter with h = fspecial(type, parameters); and apply it with filterImage = imfilter(srcImage, h); — whether the result is smoothed or sharpened depends on the operator chosen for h.

4. Sharpening often "enhances" noise as well. Sharpening and smoothing can be combined: if the image has obvious noise, smooth first and then sharpen; if noise appears after sharpening, smooth again.
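The point that smoothing and sharpening share one mechanism and differ only in the operator can be sketched in pure Python (hypothetical helper names; an averaging kernel smooths, a Laplacian-based kernel sharpens, and borders are left unprocessed for brevity):

```python
def convolve3x3(img, k):
    """Filter a grayscale image (list of lists) with a 3x3 kernel."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]  # border pixels stay unchanged
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(img[y + dy][x + dx] * k[dy + 1][dx + 1]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
    return out

smooth_k = [[1/9] * 3 for _ in range(3)]   # averaging (smoothing) operator
sharpen_k = [[0, -1, 0],
             [-1, 5, -1],
             [0, -1, 0]]                   # Laplacian-based sharpening operator

img = [[10, 10, 10],
       [10, 50, 10],
       [10, 10, 10]]
smoothed = convolve3x3(img, smooth_k)    # center pulled toward its neighbors
sharpened = convolve3x3(img, sharpen_k)  # center pushed away from its neighbors
```

The same `convolve3x3` call performs both operations, mirroring the single `imfilter` call in the MATLAB example; only the kernel passed in changes.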

Preprocessing Process

Digitization

The grayscale of an original photograph is a continuous function of spatial position. Sampling its grayscale on an M×N grid and quantizing each sample into one of 2^b grayscale levels yields a digital image that a computer can process. Reconstructing the original image places requirements on the sizes of M, N, and b: within the spatial and grayscale resolution limits of the capture device, the larger M, N, and b are, the better the quality of the reconstructed image. When the sampling interval is no greater than half the period of the finest detail in the original image, the frequency spectrum of the reconstructed image equals that of the original, meaning the reconstruction can be identical to the original. The product of M, N, and b determines the storage an image occupies in a computer, so for a given storage budget, M, N, and b should be chosen according to the character of the image to achieve the best processing result.
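The storage figure M·N·b above can be checked in a couple of lines (a trivial sketch; the 512×512, 8-bit example is illustrative):

```python
def image_storage_bits(M, N, b):
    """Bits needed to store an M x N image quantized to 2**b gray levels."""
    return M * N * b

# A 512 x 512 image with 256 = 2**8 gray levels:
bits = image_storage_bits(512, 512, 8)
print(bits // 8, "bytes")  # 262144 bytes = 256 KiB
```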

Geometric Transformation

This transformation corrects systematic errors in the image acquisition system and random errors in instrument positioning. Systematic errors in satellite images, such as distortions caused by Earth's rotation, scanning-mirror speed, and map projection, can be represented by models and removed through geometric transformation. Random errors, such as those caused by changes in aircraft attitude and altitude, are difficult to model, so after correcting the systematic errors, the observed image is compared with an image of known, correct geometric position, and a number of ground control points in the image are used to solve for the coefficients of a set of bivariate polynomials that define the transformation.
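As a sketch of the control-point idea, a first-order bivariate polynomial x' = a0 + a1·x + a2·y can be solved exactly from three ground control points (in practice, more points and a least-squares fit of a higher-order polynomial are used; the helper names and point values below are hypothetical):

```python
def solve3(A, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination (assumes
    nonzero pivots, which holds for non-degenerate control points)."""
    M = [row[:] + [bv] for row, bv in zip(A, b)]
    for i in range(3):
        p = M[i][i]
        M[i] = [v / p for v in M[i]]
        for j in range(3):
            if j != i:
                f = M[j][i]
                M[j] = [vj - f * vi for vj, vi in zip(M[j], M[i])]
    return [M[k][3] for k in range(3)]

def fit_first_order(ctrl):
    """Fit x' = a0 + a1*x + a2*y from 3 control points (x, y, x')."""
    A = [[1, x, y] for x, y, _ in ctrl]
    b = [xp for _, _, xp in ctrl]
    return solve3(A, b)

# Control points generated by the "true" mapping x' = 2 + 0.5*x + 1.0*y:
ctrl = [(0, 0, 2.0), (10, 0, 7.0), (0, 10, 12.0)]
print(fit_first_order(ctrl))  # recovers the coefficients [2.0, 0.5, 1.0]
```

The same fit is done for y' with a second polynomial; together the two define the corrective geometric transformation.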

Normalization

This is a standard form of an image that ensures certain features maintain invariant properties under a given transformation. Certain properties of images, such as the area and perimeter of objects, are inherently invariant to coordinate rotation. Generally, the influence of certain factors or transformations on some properties of the image can be eliminated or weakened through normalization, allowing them to be selected as measurement bases for the image. For example, in remote sensing images where lighting is uncontrollable, grayscale histogram normalization is essential for image analysis. Grayscale normalization, geometric normalization, and transformation normalization are three methods to achieve invariant properties in images.
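As one illustration of grayscale normalization, shifting and scaling an image to a fixed mean and standard deviation makes it invariant to linear illumination changes (a pure-Python sketch; the helper name and target values are hypothetical):

```python
def grayscale_normalize(img, target_mean=128.0, target_std=40.0):
    """Shift and scale pixel values to a fixed mean and standard deviation,
    removing the effect of linear illumination changes."""
    flat = [p for row in img for p in row]
    n = len(flat)
    mean = sum(flat) / n
    var = sum((p - mean) ** 2 for p in flat) / n
    std = var ** 0.5 or 1.0  # guard against flat images
    scale = target_std / std
    return [[(p - mean) * scale + target_mean for p in row] for row in img]

# The same scene under brighter lighting (p -> 2p + 15) normalizes to
# (numerically) the same image:
a = [[10, 20], [30, 40]]
brighter = [[p * 2 + 15 for p in row] for row in a]
na, nb = grayscale_normalize(a), grayscale_normalize(brighter)
print(na[0][0], nb[0][0])  # both approximately 74.3
```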

Smoothing

This technique eliminates random noise in the image. The basic requirement of smoothing techniques is to eliminate noise without making the image contours or lines blurry. Common smoothing methods include median filtering, local averaging, and k-nearest neighbor averaging. The size of the local area can be fixed or can vary point by point according to the grayscale value. In addition, bandpass filtering methods in the spatial frequency domain are sometimes applied.

Restoration

This corrects image degradation caused by various factors, making the reconstructed or estimated image as close as possible to the ideal, undistorted image. Degradation is common in practice: atmospheric turbulence, optical-system aberrations, and relative motion between the camera and the object can all degrade remote sensing images. The basic restoration technique models the acquired degraded image g(x,y) as the convolution of a degradation function h(x,y) with the ideal image f(x,y); their Fourier transforms satisfy the relation G(u,v) = H(u,v)F(u,v). Once the degradation function has been determined from the degradation mechanism, F(u,v) can be obtained from this relation and f(x,y) recovered by the inverse Fourier transform. This is usually called the inverse filter.

In practice, H(u,v) falls off rapidly with distance from the origin of the uv plane, so to avoid amplifying noise at high frequencies, the restoration transfer function M(u,v) is taken to be 1/H(u,v) only for u² + v² ≤ W0², and is set to 1 when u² + v² exceeds that threshold; W0 should be chosen so that H(u,v) has no zeros within the range u² + v² ≤ W0². The algebraic approach to image restoration is based on the least-squares criterion: it seeks an estimate that minimizes a goodness-of-fit criterion function. This method is relatively simple and leads to the least-squares (Wiener) filter; when there is no noise, the Wiener filter reduces to the ideal inverse filter.
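The relation G = H·F and the inverse filter can be demonstrated in one dimension with a naive DFT (a sketch; the signal and blur kernel are illustrative, the kernel is chosen so H has no zeros, and with no noise the recovery is exact up to rounding):

```python
import cmath

def dft(x):
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[j] * cmath.exp(2j * cmath.pi * j * k / n) for j in range(n)) / n
            for k in range(n)]

def circ_convolve(f, h):
    """Circular convolution, the spatial-domain model of degradation."""
    n = len(f)
    return [sum(f[k] * h[(i - k) % n] for k in range(n)) for i in range(n)]

# Ideal signal f, degradation kernel h (a mild symmetric blur), degraded g:
f = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0]
h = [0.6, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2]
g = circ_convolve(f, h)

# Inverse filter: F(u) = G(u) / H(u), then inverse transform recovers f.
F = [Gu / Hu for Gu, Hu in zip(dft(g), dft(h))]
restored = [abs(v) for v in idft(F)]  # approximately equals f
```

With noise present the division would blow up wherever |H| is small, which is exactly why the threshold W0 and the Wiener filter are introduced above.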

Enhancement

This selectively strengthens and suppresses information in the image to improve its visual effect, or transforms the image into a form more suitable for machine processing, facilitating data extraction or recognition. For example, an image enhancement system can use a high-pass filter to highlight the contours of the image, allowing machines to measure the shape and perimeter of the contours. Many enhancement methods exist, such as contrast stretching, logarithmic transformation, density slicing, and histogram equalization, all of which alter image grayscale and highlight details. In practice, several methods are often combined and tuned experimentally until a satisfactory result is obtained.
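As one example of the enhancement methods listed above, histogram equalization maps gray levels through the image's cumulative distribution so that a low-contrast image spreads across the full range (a pure-Python sketch with a hypothetical helper name):

```python
def equalize(img, levels=256):
    """Histogram-equalize a grayscale image via its cumulative distribution."""
    flat = [p for row in img for p in row]
    n = len(flat)
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    cdf, total = [], 0
    for c in hist:
        total += c
        cdf.append(total)
    # Map each level through the normalized CDF onto the full output range.
    lut = [round((levels - 1) * c / n) for c in cdf]
    return [[lut[p] for p in row] for row in img]

# Values clustered in 100..103 are spread out toward 0..255:
img = [[100, 101], [102, 103]]
print(equalize(img))
```

The gray-level ordering is preserved (a brighter input pixel stays brighter), but the spacing between occupied levels is stretched, which is what raises the contrast.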

Source: http://www.cnblogs.com/developer-ios/p/7965928.html
