Python Convolutional Neural Network (CNN) for Face Recognition

First, you need to install Python and a user-friendly IDE (the Python counterpart of R's RStudio), such as PyCharm or VS Code.

Next, you need to find the data. The original author has placed the data on Kaggle (https://www.kaggle.com/datasets/jessicali9530/lfw-dataset/code), but I have already downloaded it. Just reply with “Face Recognition” to get the complete data.

Be sure to set the read path so that it points at the directory containing the folders named after each person.

Without further ado, let’s look at the code.

First, you need to load the libraries. This is similar to R's library() function; in Python, the syntax is import … as …

import os
from sklearn.utils import Bunch
import numpy as np
import matplotlib.pyplot as plt
from skimage.transform import resize
from PIL import Image  # needed later by predict_image
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader
from sklearn.model_selection import train_test_split

os: Provides functionality for interacting with the operating system.

Bunch: A helper class for storing datasets.

numpy: Used for numerical calculations.

matplotlib.pyplot: Used for plotting.

skimage.transform.resize: Used for resizing images.

torch: The core deep learning framework.

torch.nn: Contains modules needed to build neural networks.

torch.optim: Contains optimization algorithms.

TensorDataset and DataLoader: Used to create datasets and data loaders.

train_test_split: Used to split datasets into training and testing sets.
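Before going further, it is worth a quick sanity check that PyTorch is installed correctly. A minimal optional snippet (not part of the original script):

print(torch.__version__)          # PyTorch version string
print(torch.cuda.is_available())  # True if a CUDA GPU can be used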

Set the path

local_data_home = 'E:/Deepseek/archive/lfw-deepfunneled/lfw-deepfunneled'
def load_lfw_from_local(directory, min_faces_per_person=70, resize_factor=0.4):
    images = []
    target = []
    target_names = []

    # Traverse all subdirectories (each person)
    for subdir in sorted(os.listdir(directory)):
        subdir_path = os.path.join(directory, subdir)
        if os.path.isdir(subdir_path):
            # Count the number of images in this directory
            num_images = len([f for f in os.listdir(subdir_path) if f.endswith('.jpg')])
            
            # If the number of images is less than min_faces_per_person, skip
            if num_images < min_faces_per_person:
                continue
            
            target_name = subdir.replace('_', ' ')
            target_names.append(target_name)
            label = len(target_names) - 1
            
            # Load images
            for img_file in sorted(os.listdir(subdir_path)):
                if img_file.endswith('.jpg'):
                    img_path = os.path.join(subdir_path, img_file)
                    image = plt.imread(img_path)
                    
                    # Ensure the image is grayscale
                    if image.ndim == 3 and image.shape[2] == 3:  # Color image
                        image = np.mean(image, axis=2)  # Convert to grayscale
                    
                    # Resize the image (if necessary)
                    if resize_factor != 1.0:
                        new_height = int(image.shape[0] * resize_factor)
                        new_width = int(image.shape[1] * resize_factor)
                        image = resize(image, (new_height, new_width), anti_aliasing=True)
                    
                    images.append(image)
                    target.append(label)

    return Bunch(
        images=np.array(images),
        data=np.array([img.flatten() for img in images]),
        target=np.array(target),
        target_names=np.array(target_names),
        DESCR="LFW faces dataset"
    )

Function definition:

load_lfw_from_local(directory, min_faces_per_person=70, resize_factor=0.4): Loads the LFW dataset from a local directory.

Parameters:

directory: The path to the dataset directory.

min_faces_per_person: Minimum number of images per person to be included in the dataset.

resize_factor: The scaling factor for resizing images.

Variable initialization:

images: A list to store all images.

target: A list to store labels for each image.

target_names: A list to store names of each target.

Traverse subdirectories:

for subdir in sorted(os.listdir(directory)): Traverse each subdirectory in the dataset directory.

subdir_path = os.path.join(directory, subdir): Construct the full path of the subdirectory.

if os.path.isdir(subdir_path): Check if it is a directory.

num_images = len([f for f in os.listdir(subdir_path) if f.endswith('.jpg')]): Count the number of .jpg files in this directory.

if num_images < min_faces_per_person: If the number of images is insufficient, skip this directory.

Process images:

target_name = subdir.replace('_', ' '): Replace underscores in the subdirectory name with spaces as the target name.

target_names.append(target_name): Add the target name to the target_names list.

label = len(target_names) - 1: Assign a unique label to each target.

for img_file in sorted(os.listdir(subdir_path)): Traverse each image file in this directory.

if img_file.endswith('.jpg'): Check if the file extension is .jpg.

img_path = os.path.join(subdir_path, img_file): Construct the full path of the image file.

image = plt.imread(img_path): Read the image file.

if image.ndim == 3 and image.shape[2] == 3: Check if the image is a color image.

image = np.mean(image, axis=2): Convert the color image to grayscale.

if resize_factor != 1.0: If resizing is needed.

new_height = int(image.shape[0] * resize_factor): Calculate the new height.

new_width = int(image.shape[1] * resize_factor): Calculate the new width.

image = resize(image, (new_height, new_width), anti_aliasing=True): Resize the image.

images.append(image): Add the processed image to the images list.

target.append(label): Add the image’s label to the target list.

Return dataset:

return Bunch(…): Returns a Bunch object containing images, labels, target names, and other information.

Load the dataset

faces = load_lfw_from_local(local_data_home, min_faces_per_person=70, resize_factor=0.4)

Call the load_lfw_from_local function to load the LFW dataset and store it in the faces variable.
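Before building tensors, it can help to inspect what was loaded. A small optional check (the exact numbers depend on your local copy of LFW and the min_faces_per_person threshold):

print(faces.images.shape)         # (n_samples, height, width)
print(faces.target_names)         # the people kept after filtering
print(np.bincount(faces.target))  # number of images per person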

Get image dimensions

n_samples, h, w = faces.images.shape[:3]

faces.images.shape returns the shape of the image array.

n_samples: Number of samples.

h: Height of the images.

w: Width of the images.

Convert image data to PyTorch tensors

X = torch.tensor(faces.images, dtype=torch.float32).unsqueeze(1)  # Add channel dimension
y = torch.tensor(faces.target, dtype=torch.long)

torch.tensor(faces.images, dtype=torch.float32): Convert the image array to a float PyTorch tensor.

.unsqueeze(1): Add a channel dimension at dimension 1, making the image shape (N, C, H, W), where N is the number of samples, C is the number of channels (1 for grayscale), H and W are the height and width of the images, respectively.

torch.tensor(faces.target, dtype=torch.long): Convert the label array to a long PyTorch tensor.
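A quick shape check is a cheap way to catch mistakes here. Assuming resize_factor=0.4 on the 250×250 deep-funneled images, each input is 100×100:

print(X.shape)  # torch.Size([n_samples, 1, 100, 100])
print(y.shape)  # torch.Size([n_samples])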

Split into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

train_test_split(X, y, test_size=0.2, random_state=42): Split the dataset into training and testing sets.

X: Feature tensor.

y: Label tensor.

test_size=0.2: The testing set occupies 20% of the total dataset.

random_state=42: Set the random seed to ensure reproducibility.
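LFW is quite imbalanced (some people have far more photos than others), so an optional variation is to stratify the split. This is a sketch, not part of the original code:

# Optional: keep each person's train/test proportion the same
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y.numpy()
)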

Create Dataset and DataLoader

train_dataset = TensorDataset(X_train, y_train)
test_dataset = TensorDataset(X_test, y_test)

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

TensorDataset(X_train, y_train): Create a training dataset object.

TensorDataset(X_test, y_test): Create a testing dataset object.

DataLoader(train_dataset, batch_size=32, shuffle=True): Create a training data loader with a batch size of 32 and shuffle the data.

DataLoader(test_dataset, batch_size=32, shuffle=False): Create a testing data loader with a batch size of 32 without shuffling the data.
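To confirm what the loader yields, you can peek at a single batch (an optional check):

batch_images, batch_labels = next(iter(train_loader))
print(batch_images.shape)  # torch.Size([32, 1, 100, 100]) for a full batch
print(batch_labels[:8])    # the first few integer labels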

Define SimpleCNN model

class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.fc1 = nn.Linear(64 * (h // 4) * (w // 4), 128)
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, len(faces.target_names))

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))
        x = self.pool(torch.relu(self.conv2(x)))
        x = x.view(-1, 64 * (h // 4) * (w // 4))
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        return x

Class definition:

SimpleCNN(nn.Module): Defines a convolutional neural network class inheriting from nn.Module.

Constructor (__init__):

super(SimpleCNN, self).__init__(): Initialize the parent class.

self.conv1 = nn.Conv2d(1, 32, kernel_size=3, padding=1): Define the first convolutional layer with 1 input channel and 32 output channels, kernel size of 3×3, and padding of 1.

self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1): Define the second convolutional layer with 32 input channels and 64 output channels, kernel size of 3×3, and padding of 1.

self.pool = nn.MaxPool2d(kernel_size=2, stride=2): Define the max pooling layer with a pooling window size of 2×2 and stride of 2.

self.fc1 = nn.Linear(64 * (h // 4) * (w // 4), 128): Define the first fully connected layer with input features of 64 * (h // 4) * (w // 4) and output features of 128.

self.fc2 = nn.Linear(128, 64): Define the second fully connected layer with input features of 128 and output features of 64.

self.fc3 = nn.Linear(64, len(faces.target_names)): Define the third fully connected layer with input features of 64 and output features equal to the number of target classes.

Forward propagation (forward):

x = self.pool(torch.relu(self.conv1(x))): First convolutional layer followed by ReLU activation and max pooling layer.

x = self.pool(torch.relu(self.conv2(x))): Second convolutional layer followed by ReLU activation and max pooling layer.

x = x.view(-1, 64 * (h // 4) * (w // 4)): Flatten the feature map.

x = torch.relu(self.fc1(x)): First fully connected layer followed by ReLU activation.

x = torch.relu(self.fc2(x)): Second fully connected layer followed by ReLU activation.

x = self.fc3(x): Third fully connected layer.

return x: Return the final output.
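If the 64 * (h // 4) * (w // 4) input size of fc1 looks mysterious: each 2×2 max pooling halves the height and width, and two poolings give h // 4 and w // 4. A small sketch tracing a dummy input through the layers makes this visible:

# Trace shapes through the two conv/pool stages with a fake grayscale image
m = SimpleCNN()
dummy = torch.randn(1, 1, h, w)           # one random "image"
out = m.pool(torch.relu(m.conv1(dummy)))  # -> (1, 32, h // 2, w // 2)
out = m.pool(torch.relu(m.conv2(out)))    # -> (1, 64, h // 4, w // 4)
print(out.shape)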

Instantiate the model

model = SimpleCNN()

Create an instance of the SimpleCNN class and assign it to the model variable.

Define loss function and optimizer

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

criterion = nn.CrossEntropyLoss(): Define the cross-entropy loss function for multi-class problems.

optimizer = optim.Adam(model.parameters(), lr=0.001): Define the Adam optimizer with a learning rate of 0.001.
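Note that nn.CrossEntropyLoss applies log-softmax internally, which is why forward() returns raw logits with no softmax layer. A tiny illustration:

logits = torch.tensor([[2.0, 0.5, -1.0]])    # one sample, three classes
label = torch.tensor([0])                    # the true class is 0
print(nn.CrossEntropyLoss()(logits, label))  # small loss: class 0 has the largest logit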

Train the model

num_epochs = 10

for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    for inputs, labels in train_loader:
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    
    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {running_loss/len(train_loader):.4f}')

Training loop:

num_epochs = 10: Set the number of training epochs to 10.

for epoch in range(num_epochs): Loop through 10 epochs of training.

model.train(): Set the model to training mode.

running_loss = 0.0: Initialize the cumulative loss for the current epoch.

for inputs, labels in train_loader: Iterate through each batch in the training data loader.

optimizer.zero_grad(): Clear the gradients.

outputs = model(inputs): Forward propagation to get model outputs.

loss = criterion(outputs, labels): Calculate the loss.

loss.backward(): Backpropagation to compute gradients.

optimizer.step(): Update model parameters.

running_loss += loss.item(): Accumulate the loss for the current batch.

print(...): Print the average loss for the current epoch.
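If you have a CUDA GPU, the same loop runs much faster on it. A minimal sketch of the changes (move the model first, then rebuild the optimizer, then move each batch):

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = SimpleCNN().to(device)
optimizer = optim.Adam(model.parameters(), lr=0.001)

for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(inputs), labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {running_loss/len(train_loader):.4f}')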

Test the model

model.eval()
correct = 0
total = 0

with torch.no_grad():
    for inputs, labels in test_loader:
        outputs = model(inputs)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f'Accuracy of the network on the test images: {100 * correct / total:.2f}%')

Testing loop:

model.eval(): Set the model to evaluation mode.

correct = 0: Initialize the count of correct predictions.

total = 0: Initialize the total sample count.

with torch.no_grad(): Disable gradient computation to save memory and speed up computation.

for inputs, labels in test_loader: Iterate through each batch in the testing data loader.

outputs = model(inputs): Forward propagation to get model outputs.

_, predicted = torch.max(outputs.data, 1): Get the predicted class labels.

total += labels.size(0): Accumulate the number of samples in the current batch.

correct += (predicted == labels).sum().item(): Accumulate the count of correct predictions.

print(...): Print the accuracy on the test set.
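Overall accuracy can hide weak classes in an imbalanced dataset. An optional follow-up that reports accuracy per person:

from collections import Counter

correct_per_class, total_per_class = Counter(), Counter()
model.eval()
with torch.no_grad():
    for inputs, labels in test_loader:
        _, predicted = torch.max(model(inputs), 1)
        for p, t in zip(predicted.tolist(), labels.tolist()):
            total_per_class[t] += 1
            correct_per_class[t] += int(p == t)

for idx, name in enumerate(faces.target_names):
    if total_per_class[idx] > 0:
        print(f'{name}: {100 * correct_per_class[idx] / total_per_class[idx]:.1f}%')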

Define prediction function

def predict_image(model, image_path):
    image = Image.open(image_path).convert('L')
    image = image.resize((100, 100))
    image = np.array(image).astype(np.float32) / 255.0
    image = torch.tensor(image, dtype=torch.float32).view(-1, 1, 100, 100)
    
    model.eval()
    with torch.no_grad():
        output = model(image)
        _, predicted = torch.max(output.data, 1)
    
    return faces.target_names[predicted.item()]

Function definition:

predict_image(model, image_path): Defines a function to predict a single image using the trained model.

Parameters:

model: The trained CNN model.

image_path: The path of the input image.

Process the image:

image = Image.open(image_path).convert('L'): Open the image and convert it to grayscale.

image = image.resize((100, 100)): Resize the image to (100, 100).

image = np.array(image).astype(np.float32) / 255.0: Convert the image to a NumPy array and normalize it to the range [0, 1].

image = torch.tensor(image, dtype=torch.float32).view(-1, 1, 100, 100): Convert the NumPy array to a PyTorch tensor and reshape it to (1, 1, 100, 100).

Forward propagation:

model.eval(): Set the model to evaluation mode.

with torch.no_grad(): Disable gradient computation.

output = model(image): Perform forward propagation to get the model output.

_, predicted = torch.max(output.data, 1): Get the predicted class labels.

Return prediction result:

return faces.target_names[predicted.item()]: Return the predicted name.

Use the model for prediction

image_path = r"E:\Deepseek\archive\lfw-deepfunneled\lfw-deepfunneled\Aaron_Peirsol\Aaron_Peirsol_0003.jpg"  # raw string, so backslashes are not treated as escapes
predicted_name = predict_image(model, image_path)
print(f"The predicted name is: {predicted_name}")

image_path: Specify the image path to predict.

predicted_name = predict_image(model, image_path): Call the predict_image function for prediction.

print(f"The predicted name is: {predicted_name}"): Print the prediction result.

Don’t forget to save the model

# Save the model
torch.save(model.state_dict(), 'simple_cnn_model.pth')

Since we saved the state_dict (the weights) rather than the whole model, loading means rebuilding the architecture and then calling load_state_dict:

# Load the saved weights into a fresh model
loaded_model = SimpleCNN()
loaded_model.load_state_dict(torch.load('simple_cnn_model.pth'))
loaded_model.eval()
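After loading, the model behaves exactly like the freshly trained one, for example:

print(predict_image(loaded_model, image_path))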

This sharing ends here. Next time, I will show you how to plot the ROC curve of this model. Don’t forget to like and follow!
