Image Recognition Made Easy: A Practical Python Guide

Still frustrated by manually retyping text from table images in office documents? Don’t worry! Today, I’ll show you how to use Python and OCR (optical character recognition) to extract text from images automatically. Whether it’s meeting notes or report screenshots, a few lines of code are enough to handle them and boost your office efficiency!

1. Core Tools for Image Recognition: `Pytesseract`

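Pytesseract is a Python wrapper around the Tesseract OCR engine, designed for recognizing text in images. Tesseract is an open-source engine that was originally developed at Hewlett-Packard and later sponsored by Google; today it is maintained by the open-source community. It is accurate on clean, printed text.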

Installing Necessary Tools

First, ensure that you have installed the Tesseract OCR engine.

Windows users can download and install it from the Tesseract official installation page.
macOS users can install it via Homebrew:
```
brew install tesseract  
```
Linux users can install it via the package manager (shown here for Debian/Ubuntu):
```
sudo apt install tesseract-ocr  
```

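Install the Python libraries (opencv-python, pandas, and openpyxl are only needed for the preprocessing and Excel examples later on):

pip install pytesseract pillow opencv-python pandas openpyxl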

Simple Example: Recognizing Text in Images

Here’s a simple example demonstrating how to extract text from an image using Python:

from PIL import Image  
import pytesseract  

# Windows only: point pytesseract at the Tesseract executable if it is not on PATH.  
# On macOS and Linux, remove this line or adjust the path.  
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"  

# Load the image  
image = Image.open("table_image.png")  

# Extract text  
text = pytesseract.image_to_string(image)  
print("Recognized text is as follows:")  
print(text)

After running the code, you will see the text content from the image printed directly!

Tip: If the recognition result is not satisfactory, poor image quality is a likely cause. Try preprocessing the image first (covered in section 2).
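The hard-coded Windows path in the example above breaks on other machines. Here is a minimal sketch of a more portable lookup. The helper name `find_tesseract` is made up for this guide, and the fallback path is simply the one used above, so adjust it if Tesseract is installed elsewhere.

```python
import os
import shutil

# Same Windows location as in the example above (an assumption: change it
# if Tesseract was installed somewhere else).
WINDOWS_DEFAULT = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(fallback=WINDOWS_DEFAULT):
    """Return the path to the tesseract executable, or None if not found."""
    # First choice: whatever "tesseract" resolves to on PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Second choice: the fallback location, if a file really exists there.
    if os.path.isfile(fallback):
        return fallback
    return None


def configure_pytesseract():
    """Point pytesseract at the executable found by find_tesseract()."""
    import pytesseract  # imported here so find_tesseract() has no dependencies

    path = find_tesseract()
    if path is None:
        raise RuntimeError("Tesseract not found; install it or set the path manually.")
    pytesseract.pytesseract.tesseract_cmd = path
    return path
```

Call `configure_pytesseract()` once at the start of your script, before the first `image_to_string` call.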

2. Improve Recognition Accuracy: Image Preprocessing Techniques

Sometimes the text in an image is not recognized well. Preprocessing the image before passing it to Tesseract often improves accuracy.

Common Preprocessing Methods

Convert to Grayscale: Remove color information so that later steps work on brightness only.
Binarization: Convert the image to pure black and white to increase the contrast between text and background.
Remove Noise: Smooth out specks and compression artifacts that could be mistaken for characters.
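To see what binarization actually does to pixel values, here is a small pure-Python sketch on a made-up 2x3 grayscale grid. The function names are only for illustration. `binarize` mirrors a fixed threshold, while `otsu_threshold` uses Otsu's method (a technique not otherwise covered in this guide) to pick the threshold automatically. For real images you would use OpenCV rather than Python lists.

```python
def binarize(gray, threshold=150, max_value=255):
    """Set each pixel to max_value if it is above threshold, else to 0."""
    return [[max_value if px > threshold else 0 for px in row] for row in gray]


def otsu_threshold(gray):
    """Pick the threshold that maximizes between-class variance (Otsu's method)."""
    # Histogram of the 256 possible gray levels.
    hist = [0] * 256
    for row in gray:
        for px in row:
            hist[px] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0  # pixel count and intensity sum at or below t
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # nothing in the dark class yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is already in the dark class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Made-up sample: dark text pixels (30-40) on a bright background (200-220).
sample = [[30, 40, 200],
          [35, 210, 220]]
t = otsu_threshold(sample)
print(t, binarize(sample, t))
```

OpenCV offers the same idea through the `cv2.THRESH_OTSU` flag, which can be combined with `cv2.THRESH_BINARY` when a fixed threshold does not suit every image.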

Example Code

import cv2  
import pytesseract  
from PIL import Image  

# Use OpenCV to load the image  
image = cv2.imread("table_image.png")  

# Convert to grayscale  
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  

# Binarization  
_, binary = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY)  

# Save the preprocessed image  
cv2.imwrite("processed_image.png", binary)  

# Use Pytesseract to recognize  
processed_image = Image.open("processed_image.png")  
text = pytesseract.image_to_string(processed_image)  
print("Optimized recognition result:")  
print(text)

Tip: Image preprocessing is especially suitable for scenarios with blurry text or complex backgrounds. Give it a try!

3. Extracting Text from Tables: Using `Pytesseract` to Recognize Table Structures

If your image contains a table structure (like an Excel screenshot), extracting the content of each cell can be a bit more complex.

Let Pytesseract Output Table Structure

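You can obtain the position of each recognized piece of text with image_to_boxes (character-level boxes) or image_to_data (word-level boxes plus confidence scores). Keep in mind that Tesseract reports words, not table cells, so the cell structure has to be rebuilt from these positions.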

# Print the text and top-left position of each recognized word  
data = pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)  

# Iterate through all text blocks  
for i in range(len(data["text"])):  
    if data["text"][i].strip():  # Skip empty entries  
        print(f"Text: {data['text'][i]}, Position: ({data['left'][i]}, {data['top'][i]})")
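The dictionary returned by image_to_data also contains `block_num`, `par_num`, `line_num`, and `conf` for every entry. Here is a minimal sketch of using them to join words back into lines and drop low-confidence results. The helper name and the sample dictionary are made up for illustration; real output has more keys and entries.

```python
from collections import defaultdict


def group_words_into_lines(data, min_conf=0.0):
    """Group word-level OCR results into lines, in Tesseract's reading order.

    Returns a list of (top, text) tuples, one per line.
    """
    lines = defaultdict(list)
    for i, word in enumerate(data["text"]):
        if not word.strip():
            continue  # structural entries carry no text
        if float(data["conf"][i]) < min_conf:
            continue  # drop words Tesseract is unsure about
        key = (data["block_num"][i], data["par_num"][i], data["line_num"][i])
        lines[key].append((data["left"][i], data["top"][i], word))

    result = []
    for key in sorted(lines):
        words = sorted(lines[key])  # left to right within the line
        top = min(t for _, t, _ in words)
        result.append((top, " ".join(w for _, _, w in words)))
    return result


# Made-up sample in the shape of image_to_data(..., output_type=Output.DICT).
sample = {
    "text": ["", "Name", "Score", "Alice", "92", "  "],
    "conf": [-1, 96, 95, 91, 40, -1],
    "block_num": [1, 1, 1, 1, 1, 1],
    "par_num": [1, 1, 1, 1, 1, 1],
    "line_num": [0, 1, 1, 2, 2, 2],
    "left": [0, 10, 120, 10, 120, 0],
    "top": [0, 12, 12, 48, 49, 0],
}
for top, text in group_words_into_lines(sample):
    print(top, text)
```

Raising `min_conf` (for example to 50) is a quick way to filter out garbage characters picked up from table borders.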

Practical Application: Generating Excel Files

You can combine it with pandas to save the recognized text to Excel (pandas needs an Excel writer such as openpyxl installed):

import pandas as pd  

# Collect every recognized word (one word per row, not one cell per row)  
rows = []  
for i in range(len(data["text"])):  
    if data["text"][i].strip():  
        rows.append(data["text"][i])  

# Save to Excel  
df = pd.DataFrame(rows, columns=["Content"])  
df.to_excel("output.xlsx", index=False)  
print("Table content has been saved to output.xlsx!")
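The code above writes one word per spreadsheet row. To get something closer to the original table, the words can be split into cells by their horizontal spacing. The following is a sketch under two assumptions: Tesseract puts each table row on a single text line, and a gap wider than `col_gap` pixels marks a column boundary. The helper names, the `col_gap` default, and the sample data are hypothetical, and you will likely need to tune them for your images.

```python
def words_to_table(data, col_gap=40):
    """Turn word-level OCR output into rows of cell strings.

    Words on the same Tesseract line form a row; a horizontal gap wider
    than col_gap pixels between consecutive words starts a new cell.
    """
    lines = {}
    for i, word in enumerate(data["text"]):
        if not word.strip():
            continue
        key = (data["block_num"][i], data["par_num"][i], data["line_num"][i])
        lines.setdefault(key, []).append((data["left"][i], data["width"][i], word))

    table = []
    for key in sorted(lines):
        cells, prev_right = [], None
        for left, width, word in sorted(lines[key]):
            if prev_right is not None and left - prev_right <= col_gap:
                cells[-1] += " " + word  # small gap: same cell
            else:
                cells.append(word)  # first word or large gap: new cell
            prev_right = left + width
        table.append(cells)
    return table


def save_table(table, path="output.xlsx"):
    """Write the rows to an Excel file, one cell per column."""
    import pandas as pd  # imported here so words_to_table() has no dependencies

    pd.DataFrame(table).to_excel(path, index=False, header=False)


# Made-up sample in the shape of image_to_data(..., output_type=Output.DICT).
sample = {
    "text": ["Name", "Score", "Alice", "Smith", "92"],
    "left": [10, 200, 10, 70, 200],
    "width": [50, 60, 50, 55, 25],
    "block_num": [1, 1, 1, 1, 1],
    "par_num": [1, 1, 1, 1, 1],
    "line_num": [1, 1, 2, 2, 2],
}
print(words_to_table(sample))
```

With real data, call `save_table(words_to_table(data))` to produce the spreadsheet.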

4. Advanced Applications: Multilingual Recognition

Tesseract supports many languages; install the language data you need and pass the matching code through the lang parameter.

Installing Language Packs

For example, for Simplified Chinese on Debian/Ubuntu:

sudo apt install tesseract-ocr-chi-sim

Using Language Packs

text = pytesseract.image_to_string(image, lang="chi_sim")  
print("Chinese recognition result:")  
print(text)

Note: If processing multiple languages simultaneously, you can specify multiple language packs in the form of lang="eng+chi_sim".
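If a requested language is not installed, Tesseract fails at recognition time. A small sketch of checking this up front and building the `lang` string follows. The helper name `build_lang` is made up, and the list of installed languages is passed in explicitly; recent pytesseract versions can supply it via `pytesseract.get_languages()`, but check that your installed version provides that function.

```python
def build_lang(requested, installed):
    """Join language codes with '+', rejecting any that are not installed."""
    missing = [code for code in requested if code not in installed]
    if missing:
        raise ValueError(f"Language data not installed: {', '.join(missing)}")
    return "+".join(requested)


# Made-up list of installed languages, for illustration only.
installed = ["eng", "chi_sim", "osd"]
lang = build_lang(["eng", "chi_sim"], installed)
print(lang)
```

The result can be passed straight to `pytesseract.image_to_string(image, lang=lang)`.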

5. Small Exercise: Give It a Try

Download an image containing a table, try extracting all the text from the table and saving it as an Excel file.
Use OpenCV to preprocess the image and compare the recognition effects before and after preprocessing.
Try using Pytesseract to extract text content from a multilingual image.
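For the second exercise, it helps to put a number on "before" and "after". One possible approach, assuming you have typed out the correct text by hand as a reference, is a character-level similarity score from the standard library's difflib. The function name and the sample strings are made up for illustration.

```python
from difflib import SequenceMatcher


def similarity(ocr_text, expected_text):
    """Return a score between 0 and 1; 1 means the texts match exactly."""
    # Collapse whitespace so line-break differences do not dominate the score.
    a = " ".join(ocr_text.split())
    b = " ".join(expected_text.split())
    return SequenceMatcher(None, a, b).ratio()


# Made-up strings standing in for real OCR output.
reference = "Invoice total 120"
before = "lnvoice tota1 l20"
after = "Invoice  total\n120"
print(similarity(before, reference), similarity(after, reference))
```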

Conclusion

Friends, that’s the end of today’s Python journey! Happy learning, and may your Python skills keep improving!
