This article uses OpenCV to recognize the digits 0 to 9, a simple form of OCR. Features are extracted through contour analysis, and digits are recognized by matching on L1 distance. Once interference is removed, recognition accuracy can exceed 98%. The algorithm has two parts. The first extracts features that are scale-invariant and tolerate slight illumination changes and deformation. The second compares the feature data for similarity, which is how the ten digits 0 to 9 are recognized.
The first part of the algorithm extracts 42 feature values per digit: 40 are used for matching, and the remaining two serve as auxiliary checks (for example, the aspect ratios of 0 and 1 differ significantly). The main steps for feature extraction are as follows:
1. Image denoising and binarization
2. Contour discovery and ROI area segmentation
3. Horizontal and vertical projection to extract 20 features (10 bins each), with normalization
4. 5×4 grid segmentation to extract 20 features, with normalization
5. Aspect ratio and blank ratio, for a total of 42 features.
Preprocessing applies a Gaussian blur for denoising, then global thresholding to binarize the image. Contour discovery extracts the rectangular ROI of each digit, and steps 3 to 5 are carried out on each ROI. The horizontal and vertical projections are computed as follows:
The horizontal and vertical projections of the ROI are each divided into 10 bins. Because the ROI's width or height divided by 10 is generally not an integer, a bin boundary can fall inside a pixel; such a pixel is split between the neighboring bins in proportion to its overlap with each. The foreground pixels in each bin are then counted.
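The weighted binning described above can be sketched as follows. This is a minimal illustration, not the article's original code: it uses plain standard-library containers instead of cv::Mat so that it compiles on its own, and the names Image, weightedBins, normalizeSum and projectionFeatures are hypothetical. The article does not say how the bins are normalized, so dividing by the sum is an assumption.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Binary ROI: roi[y][x] != 0 marks a foreground pixel (assumed representation).
using Image = std::vector<std::vector<int>>;

// Spread a per-pixel profile (e.g. the foreground count of each column) over
// `bins` equal-width bins. Pixel i covers the interval [i, i+1); when that
// interval straddles a bin boundary, its count is split in proportion to the
// overlap with each bin.
std::vector<double> weightedBins(const std::vector<double>& profile, int bins) {
    std::vector<double> out(bins > 0 ? bins : 0, 0.0);
    const int n = static_cast<int>(profile.size());
    if (n == 0 || bins <= 0) return out;
    const double binWidth = static_cast<double>(n) / bins;
    for (int b = 0; b < bins; ++b) {
        const double lo = b * binWidth;
        const double hi = (b + 1) * binWidth;
        const int first = static_cast<int>(std::floor(lo));
        const int last = std::min(n - 1, static_cast<int>(std::ceil(hi)) - 1);
        for (int i = first; i <= last; ++i) {
            const double overlap =
                std::min(hi, i + 1.0) - std::max(lo, static_cast<double>(i));
            if (overlap > 0.0) out[b] += profile[i] * overlap;
        }
    }
    return out;
}

// Scale the values so they sum to 1 (assumed normalization scheme).
void normalizeSum(std::vector<double>& v) {
    const double total = std::accumulate(v.begin(), v.end(), 0.0);
    if (total > 0.0) {
        for (double& x : v) x /= total;
    }
}

// 10 bins from the column sums followed by 10 bins from the row sums.
std::vector<double> projectionFeatures(const Image& roi, int bins = 10) {
    const std::size_t h = roi.size();
    const std::size_t w = h ? roi[0].size() : 0;
    std::vector<double> colSum(w, 0.0), rowSum(h, 0.0);
    for (std::size_t y = 0; y < h; ++y) {
        for (std::size_t x = 0; x < w; ++x) {
            if (roi[y][x] != 0) {
                colSum[x] += 1.0;
                rowSum[y] += 1.0;
            }
        }
    }
    std::vector<double> features = weightedBins(colSum, bins);
    std::vector<double> rowBins = weightedBins(rowSum, bins);
    normalizeSum(features);
    normalizeSum(rowBins);
    features.insert(features.end(), rowBins.begin(), rowBins.end());
    return features;
}
```

For instance, a 15-pixel profile of ones spread over 10 bins gives each bin a weight of 1.5: every second pixel is shared half-and-half between two neighboring bins.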
Similarly, the digit ROI is divided into a 5×4 grid and the number of foreground pixels in each cell is counted, again with proportional weights at the cell boundaries, giving 20 normalized features.
The resulting 40 features are scale-invariant and tolerate slight deformation.
Training Data:
Input Data:
Recognition Results:
The training data and the input data differ in font and size, yet matching on the extracted features recognizes every digit, which demonstrates the scale invariance and the resistance to local interference of this algorithm.
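#include <opencv2/opencv.hpp>
#include <cstdio>

using namespace cv;

// Defined elsewhere in the original project; their bodies are not shown in
// this article, and the void signatures here are assumed from the calls below.
void train_data();
void test_data();

int main(int argc, char** argv) {
    // Load the sample image
    Mat src = imread("D:/vcprojects/images/td1.png");
    if (src.empty()) {
        printf("could not load image...\n");
        return -1;
    }
    namedWindow("input image", WINDOW_AUTOSIZE);
    imshow("input image", src);

    // Training
    train_data();

    // Testing
    test_data();

    waitKey(0);
    return 0;
}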
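The bodies of train_data() and test_data() are not shown in the article. As a rough illustration of the L1-distance matching described earlier, the sketch below compares a query feature vector against stored templates and returns the label of the closest one. The DigitTemplate struct and the l1Distance and classify functions are hypothetical names, and nearest-template selection is an assumption about how the distances are used.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// One stored training sample: a digit label and its feature vector
// (hypothetical layout, e.g. the 40 matching features).
struct DigitTemplate {
    int label;
    std::vector<double> features;
};

// L1 (Manhattan) distance: the sum of absolute element-wise differences.
// Vectors of different length are treated as infinitely far apart.
double l1Distance(const std::vector<double>& a, const std::vector<double>& b) {
    if (a.size() != b.size()) return std::numeric_limits<double>::infinity();
    double d = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        d += std::fabs(a[i] - b[i]);
    }
    return d;
}

// Return the label of the template with the smallest L1 distance to the
// query, or -1 when no template is comparable.
int classify(const std::vector<double>& query,
             const std::vector<DigitTemplate>& templates) {
    int bestLabel = -1;
    double bestDist = std::numeric_limits<double>::infinity();
    for (const DigitTemplate& t : templates) {
        const double d = l1Distance(query, t.features);
        if (d < bestDist) {
            bestDist = d;
            bestLabel = t.label;
        }
    }
    return bestLabel;
}
```

In this scheme "training" amounts to storing one or more feature vectors per digit, and the auxiliary values such as the aspect ratio could be used to reject a match before the distance is trusted.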