Image Text Recognition Using PaddleOCR in C#

Introduction to PaddlePaddle

PaddlePaddle (飞桨) is a deep learning platform developed by Baidu, aimed at providing developers with a comprehensive and flexible toolkit for building, training, and deploying various deep learning models.

It is open source, flexible, and scalable, and it supports distributed training. PaddlePaddle also supports end-to-end deployment, so trained models can be deployed to servers, mobile devices, and edge devices.

Additionally, PaddlePaddle has a rich library of pre-trained models covering common tasks such as image classification, object detection, and semantic segmentation. The community support and ecosystem are well-established, providing developers with abundant tutorials, documentation, and sample code to aid in the development and application of deep learning models.


Introduction to PaddleOCR

PaddleOCR is an open-source optical character recognition (OCR) toolkit built on the PaddlePaddle deep learning framework.

It provides an end-to-end OCR solution, supporting text detection, text recognition, and key point detection functions.

PaddleOCR has high flexibility and scalability, making it suitable for various text recognition needs in different scenarios, including ID card recognition, license plate recognition, and table recognition.

With pre-trained models, PaddleOCR can achieve high-precision text detection and recognition while supporting multi-language text recognition, including Chinese and English.

Moreover, PaddleOCR offers a rich set of APIs and model libraries, which makes it quick to integrate and deploy OCR in a variety of applications.

Introduction to PaddleSharp

PaddleSharp is a .NET library that wraps the PaddlePaddle inference engine so that it can be called from C#.

It provides C# developers with the ability to leverage the powerful features of PaddlePaddle in a familiar environment.

PaddleSharp focuses on inference rather than training: it loads pre-trained models and runs them from C#, including the OCR models used in this article.

The library covers pre-trained model loading and hardware-accelerated inference.

With PaddleSharp, developers can use the underlying computing libraries of PaddlePaddle to run inference efficiently on either the CPU or the GPU.

Overall, PaddleSharp provides a convenient tool for C# developers, enabling them to easily apply the deep learning capabilities of PaddlePaddle within the C# environment.

Winform Interface Design

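The form contains two buttons (button1 selects an image, button2 runs the recognition), a RichTextBox (richTextBox1) that shows the recognized text, and a PictureBox (pictureBox1) that previews the selected image.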

Steps

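Install the corresponding NuGet packages, that is, the packages that provide the namespaces used in the code (Sdcb.PaddleInference, Sdcb.PaddleOCR, Sdcb.PaddleOCR.Models.Local, and OpenCvSharp), together with the native runtime packages for your platform.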

Perform Image Text Recognition

The code used is also quite simple:

FullOcrModel model = LocalFullModels.ChineseV3;
using (PaddleOcrAll all = new PaddleOcrAll(model, PaddleDevice.Mkldnn())
{
    AllowRotateDetection = true, /* Allow recognition of angled text */
    Enable180Classification = false, /* Whether to recognize text rotated by more than 90 degrees (disabled here) */
})
{
    // Load the image from a local file
    using (Mat src2 = Cv2.ImRead(selectedPicture))
    {
        PaddleOcrResult result = all.Run(src2);
        richTextBox1.Text = result.Text;
    }
}

FullOcrModel model = LocalFullModels.ChineseV3;

This line selects the FullOcrModel to use, which combines the detection, classification, and recognition models that PaddleOCR needs. LocalFullModels.ChineseV3 is a pre-trained model for recognizing Chinese text.

using (PaddleOcrAll all = new PaddleOcrAll(model, PaddleDevice.Mkldnn())
{
    AllowRotateDetection = true, /* Allow recognition of angled text */
    Enable180Classification = false, /* Whether to recognize text rotated by more than 90 degrees (disabled here) */
})

This code creates a PaddleOcrAll object, which is used to run the OCR model and obtain recognition results.

PaddleDevice.Mkldnn() runs inference on the CPU, accelerated by Intel's MKL-DNN library (now known as oneDNN). AllowRotateDetection = true allows angled text to be detected, so text can be recognized even if it is not perfectly horizontal.

Enable180Classification = false turns off the classification of text rotated by more than 90 degrees, so text at such large angles (for example, upside-down text) may not be recognized correctly. Set it to true if your images contain such text.

The using statement ensures that the PaddleOcrAll object is disposed of as soon as the block ends, which releases the native resources held by the model instead of leaving them to leak.

using (Mat src2 = Cv2.ImRead(selectedPicture))

This line of code uses the Cv2.ImRead function from OpenCvSharp (the .NET wrapper for OpenCV) to read the image at the specified file path. It returns a Mat object, the class OpenCV uses to represent images.

selectedPicture is the path of the image file. The using statement ensures that the Mat object is disposed of when it is no longer needed, which releases the native memory that holds the image data.

PaddleOcrResult result = all.Run(src2);

This line of code passes the loaded image to the PaddleOCR model for text recognition.

all.Run(src2) runs the OCR model and returns the recognition result, which is stored in the PaddleOcrResult object.

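PaddleOcrResult is a record with two properties: Regions, which holds the individual recognized text regions, and Text, which holds the recognized text of all regions combined into one string.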

The Regions of this example are shown as follows:

The Text of this example is shown as follows:

The result of this example is shown in the following image:

All the code for this example:

using OpenCvSharp;  
using Sdcb.PaddleInference;  
using Sdcb.PaddleOCR.Models.Local;  
using Sdcb.PaddleOCR.Models;  
using Sdcb.PaddleOCR;  
using System.Diagnostics;  
  
namespace PaddleSharpDemo  
{  
    public partial class Form1 : Form  
    {  
        string selectedPicture;  
        public Form1()  
        {  
            InitializeComponent();  
        }  
     
        private void button1_Click(object sender, EventArgs e)  
        {  
            OpenFileDialog openFileDialog = new OpenFileDialog();  
            openFileDialog.Filter = "Image Files(*.BMP;*.JPG;*.GIF;*.PNG)|*.BMP;*.JPG;*.GIF;*.PNG|All files (*.*)|*.*";  
            openFileDialog.FilterIndex = 1;  
            openFileDialog.Multiselect = false;  
  
            if (openFileDialog.ShowDialog() == DialogResult.OK)  
            {  
                selectedPicture = openFileDialog.FileName;  
                MessageBox.Show($"The selected image path is: {selectedPicture}");  
                // Use the Image class to load the image  
                Image image = Image.FromFile(selectedPicture);  
                // Make the PictureBox display the image completely  
                pictureBox1.SizeMode = PictureBoxSizeMode.Zoom;  
                // Display the image in the PictureBox  
                pictureBox1.Image = image;  
  
            }  
            else  
            {  
                MessageBox.Show("You did not select any image this time!");  
            }  
        }  
  
        private void button2_Click(object sender, EventArgs e)
        {
            // Nothing to recognize until an image has been selected with button1
            if (string.IsNullOrEmpty(selectedPicture))
            {
                MessageBox.Show("Please select an image first.");
                return;
            }

            FullOcrModel model = LocalFullModels.ChineseV3;

            using (PaddleOcrAll all = new PaddleOcrAll(model, PaddleDevice.Mkldnn())
            {
                AllowRotateDetection = true, /* Allow recognition of angled text */
                Enable180Classification = false, /* Whether to recognize text rotated by more than 90 degrees (disabled here) */
            })
            {
                // Load the selected image from the local file
                using (Mat src2 = Cv2.ImRead(selectedPicture))
                {
                    PaddleOcrResult result = all.Run(src2);
                    richTextBox1.Text = result.Text;
                }
            }
        }
    }  
}

Using PaddleOCR from Command Line and Python Scripts

I chose PaddleSharp because I wanted to use PaddleOCR directly from C#. If you are not familiar with C#, you can use PaddleOCR from the command line or from a Python script instead.

The installation steps are covered on the official website and in many existing tutorials, so I will not repeat them here; this section only demonstrates the command-line and Python usage.

Command Line Usage

Command:

paddleocr --image_dir ./cover.png --use_angle_cls true --use_gpu false

Result:

Python Script Usage

The Python script is as follows:

from paddleocr import PaddleOCR, draw_ocr  
  
# PaddleOCR currently supports multiple languages, which can be switched by modifying the lang parameter  
# For example `ch`, `en`, `fr`, `german`, `korean`, `japan`  
ocr = PaddleOCR(use_angle_cls=True, lang="ch")  # need to run only once to download and load model into memory  
img_path = r'D:\Desktop\2024.04 Learning Content\cover.png'  # raw string, so backslashes such as \2 are not treated as escape sequences
result = ocr.ocr(img_path, cls=True)  
for idx in range(len(result)):  
    res = result[idx]  
    for line in res:  
        print(line)  
  
# Display results  
from PIL import Image  
result = result[0]  
image = Image.open(img_path).convert('RGB')  
boxes = [line[0] for line in result]  
txts = [line[1][0] for line in result]  
scores = [line[1][1] for line in result]  
im_show = draw_ocr(image, boxes, txts, scores, font_path='./fonts/simfang.ttf')  
im_show = Image.fromarray(im_show)  
im_show.save('result.jpg')
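
The nested lists returned by ocr.ocr can be awkward to work with directly. As a minimal sketch, assuming the same layout that the script above indexes (one list per image, each entry a [box, (text, score)] pair), the helper below flattens one image's result into dictionaries and drops low-confidence lines. The function name flatten_ocr_result, the min_score threshold, and the sample data are hypothetical and only illustrate the shape; they are not part of PaddleOCR.

```python
from typing import Any, Dict, List, Optional


def flatten_ocr_result(page: Optional[List[Any]], min_score: float = 0.5) -> List[Dict[str, Any]]:
    """Turn one image's OCR result into a list of {box, text, score} dicts."""
    records = []
    # Assumption: a page may be None when no text was detected, so treat that as empty.
    for line in page or []:
        box, (text, score) = line[0], line[1]
        if score >= min_score:
            records.append({"box": box, "text": text, "score": score})
    return records


# Hypothetical sample shaped like result[0] in the script above (not real OCR output).
sample_page = [
    [[[10, 10], [120, 10], [120, 40], [10, 40]], ("Hello", 0.98)],
    [[[10, 50], [90, 50], [90, 80], [10, 80]], ("w0rld", 0.31)],
]

records = flatten_ocr_result(sample_page, min_score=0.5)
for record in records:
    print(f"{record['text']} ({record['score']:.2f})")
```

With real output, the per-image list (result[0] as returned by ocr.ocr) would be passed in place of sample_page. The right threshold depends on the images, so 0.5 here is only a placeholder.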

The console output is shown as follows:

The generated image is shown as follows:

Conclusion

I previously wrote about Spire.OCR for image text recognition, but its recognition accuracy is lower than PaddleOCR's, and Spire.OCR is not open source. If you need OCR in a C# project, PaddleOCR is worth trying. That is all for this article; I hope it helps.
