Python + OpenCV: Image Recognition Practical Guide
Recently, while working on an image recognition project, I found that OpenCV is a remarkably capable library. With just a few lines of code you can blur, resize, warp, and annotate images. Today I will share some of the image processing techniques I found most interesting.
Getting Started with OpenCV
To master image recognition, the first step is to set up the environment. Installing OpenCV takes a single command:
pip install opencv-python numpy
Once installed, let’s read an image and convert its color space to get a feel for the basic operations of OpenCV. Note that OpenCV loads color images in BGR channel order rather than RGB:
import cv2
import numpy as np
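# Read the image file (imread returns None if the file cannot be read)
img = cv2.imread('test.jpg')
if img is None:
    raise FileNotFoundError("Could not read 'test.jpg'")
# Convert from OpenCV's BGR order to RGB color space
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)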
Add Some Effects to the Photo
On one project, the product manager kept saying the images didn’t look cool enough. That is when OpenCV’s filters come in handy for adding some effects:
# Gaussian blur to soften the image
blurred = cv2.GaussianBlur(img, (7, 7), 0)
# Edge detection to highlight contours
edges = cv2.Canny(blurred, 100, 200)
Sometimes we also need to resize images, especially when dealing with high-resolution pictures:
# Scale the image proportionally
height, width = img.shape[:2]
scale = 0.5 # Scaling factor
resized = cv2.resize(img, (int(width * scale), int(height * scale)))
Make the Photo Come Alive
The most interesting part is making a static image look like it is moving. A simple way to do this is to warp the image with a sine wave, shifting each row horizontally by a different amount:
# Create dynamic effect
def create_wave_effect(image):
    rows, cols = image.shape[:2]
    # Generate displacement matrix for wave effect
    map_x = np.zeros((rows, cols), np.float32)
    map_y = np.zeros((rows, cols), np.float32)
    for i in range(rows):
        for j in range(cols):
            map_x[i, j] = j + 20 * np.sin(i / 30)
            map_y[i, j] = i
    return cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR)
Practical Image Recognition Tips
In actual projects, we often need to detect specific objects in images. Here’s a simple example of face detection:
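# Load face detector
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
# The detector works on grayscale images
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Detect faces (scaleFactor=1.3, minNeighbors=5)
faces = face_cascade.detectMultiScale(img_gray, 1.3, 5)
# Mark face positions with a blue rectangle (colors are in BGR order)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)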
I once received a request to recognize text in images, and preprocessing the image first improved the recognition rate:
# Image preprocessing
def preprocess_for_ocr(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Adaptive thresholding
    binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
    return binary
If you have followed along this far, you can probably see that OpenCV is easy to get started with. It makes image processing work both efficient and fun. I used to think it was hard to learn, but after working through the code step by step, I realized that many seemingly complex effects take only a few lines.