Common Image Processing Techniques

In this article, let us learn the following content.

Using some common image processing techniques through PIL and OpenCV, such as converting RGB images to grayscale, rotating images, denoising images, detecting edges in images, and cropping regions of interest in images.

Using template matching in OpenCV to search for objects in images.

Required libraries: PIL (Pillow), OpenCV, imutils. The examples also use NumPy and Matplotlib.

Why do we need to learn image processing techniques?

Deep learning plays a major role in the analysis, recognition, and semantic understanding of images. Image classification, object detection, and instance segmentation are common applications of deep learning to images. To build better training datasets, we must first understand basic image processing techniques such as image enhancement, which includes cropping, denoising, and rotating images. Basic image processing techniques also aid Optical Character Recognition (OCR).

Image processing techniques improve the interpretability of images by making key features and text easier to recognize, which in turn supports the classification or detection of objects present in the images.

Image source: Unsplash

Here we provide code and images

Import Required Libraries

import cv2
from PIL import Image

First, we use OpenCV and PIL to display images

Using OpenCV to read and display images

image = cv2.imread(r'love.jpg')
cv2.imshow("Image", image)
cv2.waitKey(0)

If the image is too large, the image window will not match the screen display ratio.

So how do we display the full image on the screen?

By default, when displaying oversized images, the images will be cropped and cannot be fully displayed. To view the complete image, we will use namedWindow(name, flag) in OpenCV to create a new window for displaying the image.

The first parameter name is the title of the window, which will be used as an identifier. When you set flag to cv2.WINDOW_NORMAL, the full image will be displayed, and the window size can be adjusted. Of course, there are other options for the flag parameter.

image = cv2.imread(r'love.jpg')
cv2.namedWindow('Normal Window', cv2.WINDOW_NORMAL)
cv2.imshow('Normal Window', image)
cv2.waitKey(0)

Resizing Images

When we resize an image, we can change the height or width of the image, or change both height and width while maintaining the aspect ratio. The aspect ratio of an image is the ratio of its width to its height.

image = cv2.imread(r'taj.jpg')
scale_percent = 200  # percent of original size
width = int(image.shape[1] * scale_percent / 100)
height = int(image.shape[0] * scale_percent / 100)
dim = (width, height)  # cv2.resize expects (width, height)
# INTER_CUBIC suits enlarging; INTER_AREA is the usual choice for shrinking
resized = cv2.resize(image, dim, interpolation=cv2.INTER_CUBIC)
cv2.imshow("Resize", resized)
cv2.waitKey(0)

Using PIL to Read and Display Images

We will use open() to load the image and then use show() to display it.

Using image.show() creates a temporary file

pil_image= Image.open(r'love.jpg')
pil_image.show("PIL Image")

If we are interested in the edges or other features of the objects in the image, how can we identify them?

Grayscale images are often used for edge detection of target objects because they capture the contrast and brightness gradients in the image, which is where edges and other image features show up.

A grayscale image has a single channel and is stored as a 2D array, whereas an RGB image has three channels: red, green, and blue. Because a grayscale image contains less information per pixel than a color image, it is faster to process.

Using OpenCV to Convert Color Images to Grayscale

Below is the method of converting a color image to a grayscale image using cvtColor() and the conversion result.

image = cv2.imread(r'love.jpg')
# cv2.imread() returns channels in BGR order, so use COLOR_BGR2GRAY
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
cv2.namedWindow('Gray Image', cv2.WINDOW_NORMAL)
cv2.imshow('Gray Image', gray_image)
cv2.waitKey(0)

Converted grayscale image

Using PIL to Convert Color Images to Grayscale

convert() provides another way to convert images, with the "L" mode used to convert to grayscale and the "RGB" mode used to convert to a three-band color image.

pil_image= Image.open(r'love.jpg')
gray_pil=pil_image.convert('L')
gray_pil.show()
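
To see what the modes mean in practice, the following sketch converts a small synthetic image (an assumption standing in for love.jpg) and inspects its bands. It also shows that converting an "L" image back to "RGB" restores three bands but not the original colors.

```python
from PIL import Image

# A solid red 4 x 4 image stands in for a real photo
red = Image.new("RGB", (4, 4), (255, 0, 0))
gray = red.convert("L")
back = gray.convert("RGB")

print(red.mode, red.getbands())    # RGB image: three bands
print(gray.mode, gray.getbands())  # L image: one band
# Pillow documents L = R * 299/1000 + G * 587/1000 + B * 114/1000,
# so pure red lands near 0.299 * 255
print(gray.getpixel((0, 0)))
# Three bands again, but all equal: the color information is gone
print(back.getpixel((0, 0)))
```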

Using OpenCV for Edge Detection

We will use the Canny operator to detect edges in the image. Canny edge detection is a multi-stage algorithm that is usually applied to a grayscale image.

Canny(): The first parameter is the input image, the second and third parameters are the values of threshold1 and threshold2.

Pixels with intensity gradients greater than threshold2 are considered edges, while those below threshold1 are considered non-edges and are discarded. Pixels whose gradient lies between the two thresholds are kept as edges only if they are connected to a pixel above threshold2.

image= cv2.imread(r'taj.jpg')
cv2.namedWindow("Edge", cv2.WINDOW_NORMAL)
edges = cv2.Canny(image, 100, 200)  # threshold1=100, threshold2=200
cv2.imshow("Edge", edges)
cv2.waitKey(0)

Canny edge processing

If the image is tilted or rotated, how should we adjust it?

OCR performs poorly on tilted text, so we need to correct the original image. We can use rotate() in OpenCV and PIL to perform angle correction on the image.

Using OpenCV to Rotate Images

rotate() will rotate the image based on the value of the second parameter rotationCode. The rotation parameter values are as follows:

cv2.ROTATE_90_CLOCKWISE

cv2.ROTATE_90_COUNTERCLOCKWISE

cv2.ROTATE_180

image = cv2.imread(r'love.jpg')
cv2.namedWindow("Rotated Image", cv2.WINDOW_NORMAL)
rotated_img= cv2.rotate(image,cv2.ROTATE_90_CLOCKWISE )
cv2.imshow("Rotated Image", rotated_img)
cv2.waitKey(0)

Using OpenCV to rotate the image 90 degrees clockwise

If we want to rotate the image by a specific angle, what should we do?

Rotate the image by a specific angle

In the code below, the image is rotated in increments of 60 degrees

Using imutils's rotate()

import imutils
import numpy as np
image = cv2.imread(r'love.jpg')
# loop over the rotation angles
for angle in np.arange(0, 360, 60):
    cv2.namedWindow("Rotated", cv2.WINDOW_NORMAL)
    rotated = imutils.rotate(image, angle)
    cv2.imshow("Rotated", rotated)
    cv2.waitKey(0)

Using imutils to rotate the image in increments of 60 degrees

Using PIL to Rotate Images

Here we use PIL to rotate the image 110 degrees

pil_image= Image.open(r'love.jpg')
rotate_img_pil=pil_image.rotate(110)
rotate_img_pil.show()
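
PIL's rotate() turns the image counterclockwise and, by default, keeps the original canvas size, so parts of the rotated image can be clipped. Passing expand=True enlarges the canvas to fit the whole result. A short sketch, with a synthetic blank image standing in for love.jpg:

```python
from PIL import Image

# Synthetic 200 x 100 (width x height) image stands in for a real photo
wide = Image.new("RGB", (200, 100), (255, 255, 255))

same_canvas = wide.rotate(90)                # canvas stays 200 x 100, content is clipped
grown_canvas = wide.rotate(90, expand=True)  # canvas grows to fit the rotated image

print(same_canvas.size, grown_canvas.size)
```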

Using PIL to rotate the image 110 degrees

When images are degraded due to noise and affect image analysis, how should we improve the image quality?

Using OpenCV for Denoising Images

Noise is unwanted signal; in images, it shows up as random variation in pixel values that distorts the picture.

To reduce noise in images using OpenCV, we first load the noisy image.

image= cv2.imread(r'taj.jpg')
cv2.namedWindow("Noised Image", cv2.WINDOW_NORMAL)
cv2.imshow("Noised Image", image)
cv2.waitKey(0)

OpenCV has various methods to eliminate noise in images. We will use cv2.fastNlMeansDenoisingColored() to eliminate noise in color images. Common parameters of the fastNlMeansDenoising functions:

src: source image

dst: output image of the same size and type as src

h: adjusts the filter strength. A higher h value removes more noise but also removes image details, while a lower h value retains image details along with some noise.

hColor (listed as hForColorComponents in some tutorials): same as h, but applied to the color components of color images; usually set to the same value as h

templateWindowSize: size of the patch used to compare pixels; should be odd (default and recommended value 7)

searchWindowSize: size of the window searched for similar patches; should be odd (default and recommended value 21)

image= cv2.imread(r'taj.jpg')
cv2.namedWindow("Denoised Image", cv2.WINDOW_NORMAL)
denoised_image = cv2.fastNlMeansDenoisingColored(image,None, h=5)
cv2.imshow("Denoised Image", denoised_image)
cv2.waitKey(0)

How to extract certain regions of interest from an image?

Cropping Images

Cropping images allows us to extract regions of interest from images. We will crop the image of the Taj Mahal, removing other details from the image, leaving only the Taj Mahal.

Using OpenCV to Crop Images

In OpenCV, cropping is done by slicing the image array, where we first pass the starting and ending y coordinates, then the starting and ending x coordinates.

image[y_start:y_end, x_start:x_end]

image= cv2.imread(r'taj.jpg')
cropped_img = image[15:170, 20:200]  # rows 15..169 (y), columns 20..199 (x)
cv2.imshow("Cropped", cropped_img)
cv2.waitKey(0)
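
One thing to keep in mind with slicing: NumPy returns a view of the original array, not a copy, so editing the crop also edits the source image. The sketch below uses a synthetic black image in place of taj.jpg to show the difference between a view and an independent copy.

```python
import numpy as np

# Synthetic 300 x 400 (height x width) BGR image stands in for taj.jpg
image = np.zeros((300, 400, 3), dtype=np.uint8)

view = image[15:170, 20:200]                # rows (y) first, then columns (x)
independent = image[15:170, 20:200].copy()  # separate memory

print(view.shape)  # (y_end - y_start, x_end - x_start, channels)

# Writing into the slice also changes the original, because a slice is a view
view[:] = 255
print(image[15, 20], independent[0, 0])
```

Calling .copy() on the slice is the usual way to get a crop that can be modified safely.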

Using PIL to Crop Images

PIL's crop() allows us to crop rectangular regions of the image. crop() takes a single 4-tuple (left, upper, right, lower) holding the pixel coordinates of the top left and bottom right corners of the rectangle.

pil_image = Image.open(r'taj.jpg')
# Get the Size of the image in pixels
width, height = pil_image.size
# Setting the cropped image coordinates
left = 3
top = height /25
right = 200
bottom = 3 * height / 4
# Crop the image based on the above dimension
cropped_image = pil_image.crop((left, top, right, bottom))
# Shows the image in image viewer
cropped_image.show()
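
The two cropping conventions are easy to mix up: PIL takes (left, upper, right, lower), while NumPy slicing takes rows first and columns second. The sketch below applies the same box both ways on a synthetic gradient image (an assumption, used so the pixel values can be compared).

```python
import numpy as np
from PIL import Image

# Synthetic 100 x 120 (height x width) grayscale gradient as stand-in data
pixels = (np.arange(100 * 120) % 256).astype(np.uint8).reshape(100, 120)
pil_image = Image.fromarray(pixels)

left, top, right, bottom = 20, 15, 80, 60
pil_crop = pil_image.crop((left, top, right, bottom))  # (left, upper, right, lower)
numpy_crop = pixels[top:bottom, left:right]            # [y_start:y_end, x_start:x_end]

print(pil_crop.size)     # PIL reports (width, height)
print(numpy_crop.shape)  # NumPy reports (height, width)
```

Both crops contain the same pixels; only the order in which coordinates and sizes are written differs.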

Template Matching

We can provide a template and use matchTemplate() in OpenCV to search for that template in the image and extract its position.

The template slides over the entire image, much like a kernel in a 2D convolution, and a similarity score between the template and the image is computed at each position.

minMaxLoc() returns the minimum and maximum scores in the result together with their locations. The location of the best score is the top left corner of the matched rectangle; the width and height of the rectangle are those of the template.

There are 6 methods for template matching:

cv2.TM_SQDIFF

cv2.TM_SQDIFF_NORMED

cv2.TM_CCORR

cv2.TM_CCORR_NORMED

cv2.TM_CCOEFF

cv2.TM_CCOEFF_NORMED

In the following example, we will create a template by cropping a small part from the main image. The method used for template matching is TM_CCOEFF_NORMED. The matching threshold is set to 0.95. When the matching score exceeds 0.95, the code draws a rectangle around the best-matching area.

import cv2
import numpy as np
from matplotlib import pyplot as plt

# Load the main image and the template as grayscale (flag 0)
img = cv2.imread(r'love.jpg', 0)
cv2.imshow("main", img)
cv2.waitKey(0)
template = cv2.imread(r'template1.png', 0)
cv2.imshow("Template", template)
cv2.waitKey(0)
w, h = template.shape[::-1]  # shape is (height, width), so reverse it

methods = ['TM_CCOEFF_NORMED']
for meth in methods:
    method = getattr(cv2, meth)
    # Apply template matching
    res = cv2.matchTemplate(img, template, method)
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
    threshold = 0.95
    loc = np.where(res > threshold)
    if len(loc[0]) > 0:
        # For TM_SQDIFF and TM_SQDIFF_NORMED the best match is the minimum;
        # for all other methods it is the maximum
        if method in [cv2.TM_SQDIFF, cv2.TM_SQDIFF_NORMED]:
            top_left = min_loc
        else:
            top_left = max_loc
        bottom_right = (top_left[0] + w, top_left[1] + h)
        # Draw the rectangle with gray level 100 and a thickness of 20 pixels
        cv2.rectangle(img, top_left, bottom_right, 100, 20)
        plt.subplot(121), plt.imshow(res, cmap='gray')
        plt.title('Matching Result'), plt.xticks([]), plt.yticks([])
        plt.subplot(122), plt.imshow(img, cmap='gray')
        plt.title('Detected Point'), plt.xticks([]), plt.yticks([])
        plt.suptitle(meth)
        plt.show()
    else:
        print("Template not matched")

Conclusion

The common image processing techniques we discussed can be used in image analysis tasks such as image classification, object detection, and OCR.
