From a high-level perspective of our deep learning project, we have prepared the data, and now we are ready to build our model.
- Prepare Data
- Build Model
- Train Model
- Analyze Model Results
When we refer to the model, we mean our network. “Model” and “network” mean the same thing. What we want our network to ultimately do is to model or approximate a function that maps image inputs to the correct output classes.
1. Prerequisites
To build neural networks in PyTorch, we extend the PyTorch class torch.nn.Module. This means we need to utilize a bit of object-oriented programming (OOP) in Python.
In this article, we will quickly review the details needed to use neural networks in PyTorch, but if you find you need more, there is an overview tutorial in the Python documentation.
https://docs.python.org/3/tutorial/classes.html
To build a Convolutional Neural Network (CNN), we need a general understanding of how CNNs work and the components used to build them. The deep learning fundamentals series is a great prerequisite for this one, so I strongly recommend covering it first if you haven’t already. If you just want a crash course on CNNs, you can check out the following articles:
- Explanation of Convolutional Neural Networks (CNN)
- Visualizing Convolutional Filters from CNNs
- Explanation of Zero Padding in Convolutional Neural Networks
- Explanation of Max Pooling in Convolutional Neural Networks
- Explanation of Learnable Parameters in Convolutional Neural Networks (CNN)
Now let’s quickly review object-oriented programming.
https://deeplizard.com/learn/video/YRhxdVk_sIs
https://deeplizard.com/learn/video/cNBBNAxC8l4
https://deeplizard.com/learn/video/qSTv_m-KFk0
https://deeplizard.com/learn/video/ZjM_XQa5s6s
https://deeplizard.com/learn/video/gmBfb6LNnZs
2. Quick Review of Object-Oriented Programming
When we write programs or build software, there are two key components: code and data. With object-oriented programming, we orient our program design and structure around objects.
We define objects in code using classes. A class defines the specifications of an object, specifying the data and code that each object of the class should have.
When we create an object of a class, we call this object an instance of the class, and all instances of a given class have two core components:
- Methods (code)
- Attributes (data)
Methods represent the code and attributes represent the data, and both the methods and the attributes are defined by the class.
In a given program, many objects, also known as instances, of a given class can exist simultaneously, and all instances have the same available attributes and the same available methods.
The difference between objects of the same class lies in the values contained in each object’s attributes. Each object has its own attribute values. These values determine the internal state of the object. The code and data of each object are encapsulated within the object.
Let’s build a simple Lizard class to demonstrate how classes encapsulate data and code:
class Lizard: #class declaration
    def __init__(self, name): #class constructor (code)
        self.name = name #attribute (data)

    def set_name(self, name): #method declaration (code)
        self.name = name #method implementation (code)
The first line declares the class and specifies the class name, which in this case is Lizard.
The second line defines a special method called the class constructor. The class constructor is called when creating a new instance of the class. As parameters, we have self and name.
The self parameter allows us to create attributes that are stored or encapsulated within the object. When we call this constructor or any other method, we do not pass the self parameter. Python does this automatically for us.
Any other parameter values are passed in by the caller, and these values can be used in calculations or saved and accessed later via self.
After the constructor, we can define any number of specialized methods, like this one that allows the caller to change the name value stored in self. All we have to do is call the method and pass a new value for the name. Let’s see this in action.
> lizard = Lizard('deep')
> print(lizard.name)
deep

> lizard.set_name('lizard')
> print(lizard.name)
lizard
We create an instance of the class by specifying the class name and passing the constructor parameters. The constructor will receive these parameters, the constructor code will run, and the passed name will be saved.
Then, we can access the name and print it, and we can also call the set_name() method to change the name. Multiple such Lizard instances can exist in a program, each containing its own data.
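For instance, here is a tiny sketch (reusing the Lizard class from above) showing two instances that each hold their own name value:

lizard_1 = Lizard('deep')
lizard_2 = Lizard('lizard')
print(lizard_1.name) # deep
print(lizard_2.name) # lizard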
From an object-oriented perspective, an important part of this setup is the combination of attributes and methods encapsulated within the object.
Now let’s shift gears and see how object-oriented programming fits into PyTorch.
PyTorch’s torch.nn Package
To build neural networks in PyTorch, we use the torch.nn package, which is the neural networks (nn) library of PyTorch. We typically import the package like this:
import torch.nn as nn
This allows us to access the neural network package with the nn alias. So from now on, when we say nn, we mean torch.nn. The neural network library in PyTorch contains all the typical components needed to build neural networks.
The main component needed to build a neural network is a layer, so as we would expect, PyTorch’s neural network library contains some classes to help us build layers.
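As a minimal sketch (the layer sizes here are arbitrary, chosen only for illustration), creating a couple of these pre-built layer classes looks like this:

import torch.nn as nn

conv = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5) # a convolutional layer
fc = nn.Linear(in_features=120, out_features=10)               # a linear (fully connected) layer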
PyTorch’s nn.Module Class
As we know, deep neural networks are built using multiple layers. This is what makes a network deep.
Each layer in a neural network has two main components:
- Transformations (code)
- A set of weights (data)
This fact makes layers great candidates to be represented as objects using OOP. OOP is short for object-oriented programming.
In fact, this is the case for PyTorch. In the neural network package, there is a class called Module, which is the base class for all neural network modules, including layers.
This means that all layers in PyTorch extend the nn.Module class and inherit all built-in functionality from nn.Module. In object-oriented programming, this concept is called inheritance.
Even neural networks themselves extend the nn.Module class. This makes sense, since a neural network can itself be thought of as one big layer (if needed, take a moment and let that sink in).
Neural networks and layers in PyTorch extend the nn.Module class. This means that when building new layers or neural networks in PyTorch, we must extend the nn.Module class.
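To make this concrete, here is a short sketch (the layer size is arbitrary, chosen only for illustration) showing that a layer built from torch.nn is itself an nn.Module instance, carries a set of weights (data), and applies a transformation when called (code):

import torch
import torch.nn as nn

fc = nn.Linear(in_features=4, out_features=3)

print(isinstance(fc, nn.Module)) # True: layers extend nn.Module
print(fc.weight.shape)           # torch.Size([3, 4]): the layer's weights (data)

t = torch.ones(4)                # a sample input tensor
print(fc(t).shape)               # torch.Size([3]): calling the layer applies its transformation (code)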
PyTorch’s nn.Modules Have a forward() Method
When we pass a tensor as input to the network, the tensor flows forward through each layer until it reaches the output layer. This process of the tensor flowing forward through the network is called forward propagation.
The tensor input is passed forward through the network.

Every PyTorch nn.Module has a forward() method, so when we build our layers and networks, we must provide an implementation of the forward() method. The forward() method is the actual transformation.
PyTorch’s nn.functional Package
When we implement the forward() method of nn.Module subclasses, we will usually use functions from the nn.functional package. This package provides us with many neural network operations that can be used to build layers. In fact, many nn.Module layer classes use nn.functional functions to perform their operations.
The nn.functional package contains methods used by nn.Module subclasses to implement their forward() functions. Later, we will observe an example by looking at the PyTorch source code for the nn.Conv2d convolution layer class.
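As a rough sketch of how this typically looks (the tensor shape and layer sizes here are arbitrary, chosen only for illustration), nn.functional is usually imported under the F alias, and its functions are called directly on tensors:

import torch
import torch.nn as nn
import torch.nn.functional as F

conv = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)

t = torch.randn(1, 1, 28, 28)      # a batch of one 28x28 single-channel image
t = conv(t)                        # nn.Module layer: a transformation with learnable weights
t = F.relu(t)                      # nn.functional operation: no weights, just the computation
t = F.max_pool2d(t, kernel_size=2) # another nn.functional operation
print(t.shape)                     # torch.Size([1, 6, 12, 12])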
Building Neural Networks in PyTorch
Now, we have enough information to provide an overview of building neural networks in PyTorch. The steps are as follows:
Streamlined version:
- Extend the nn.Module base class.
- Define layers as class attributes.
- Implement the forward() method.
More detailed version:
- Create a neural network class that extends the nn.Module base class.
- In the class constructor, define the layers of the network as class attributes using pre-built layers from torch.nn.
- Use the network’s layer attributes and operations from the nn.functional API to define the forward propagation of the network.
(1) Extend PyTorch’s nn.Module Class
Just like we did in the lizard class example, let’s create a simple class to represent a neural network.
class Network:
    def __init__(self):
        self.layer = None

    def forward(self, t):
        t = self.layer(t)
        return t
This gives us a simple network class that has a single dummy layer and a dummy implementation of the forward() function.
The implementation of the forward() function takes the tensor t and transforms it using the dummy layer. After the transformation, the new tensor is returned.
This is a good start, but the class has not yet extended the nn.Module class. To make our Network class extend nn.Module, we need to do two more things:
- Specify the nn.Module class in the parentheses on line 1.
- Insert a call to the superclass constructor on line 3, inside the constructor.
This gives us:
class Network(nn.Module): # line 1
    def __init__(self):
        super().__init__() # line 3
        self.layer = None

    def forward(self, t):
        t = self.layer(t)
        return t
These changes convert our simple neural network into a PyTorch neural network since we are now extending the nn.Module base class of PyTorch.
And that’s it! Now we have a Network class that has all the functionalities of the PyTorch nn.Module class.
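As a quick sanity check (a minimal sketch; the forward pass is still a dummy because self.layer is None), we can create an instance and confirm the inheritance:

network = Network()
print(isinstance(network, nn.Module)) # True: our Network is now a PyTorch module
print(network)                        # Network() - no real layers registered yet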
(2) Define the Layers of the Network as Class Attributes
Currently, our Network class has a single dummy layer as an attribute. Now, let’s replace it with some real layers that come pre-built for us in PyTorch’s nn library. Since we are building a CNN, the two types of layers we will use are linear layers and convolutional layers.
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)

        self.fc1 = nn.Linear(in_features=12 * 4 * 4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)

    def forward(self, t):
        # implement the forward pass
        return t
Alright. At this point, we have a Python class named Network that extends PyTorch’s nn.Module class. Inside the Network class, we have five layers defined as attributes: two convolutional layers, self.conv1 and self.conv2, and three linear layers, self.fc1, self.fc2, and self.out.
We used the abbreviation fc in fc1 and fc2 because linear layers are also known as fully connected layers. We might also hear them called dense layers. Linear, dense, and fully connected are all names for the same type of layer. PyTorch uses the word linear, hence the nn.Linear class name.
We use the name out for the last linear layer because the last layer in the network is the output layer.
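We have not implemented the forward() method yet; that comes later in the series. Purely as a hedged preview (one possible implementation, not the definitive one, and assuming 28 x 28 single-channel input images, which is what in_channels=1 and the 12 * 4 * 4 flattened size imply), the forward pass for this architecture could look roughly like this:

# a possible forward() method for the Network class above
# assumes: import torch.nn.functional as F
def forward(self, t):
    t = F.relu(self.conv1(t))
    t = F.max_pool2d(t, kernel_size=2, stride=2)

    t = F.relu(self.conv2(t))
    t = F.max_pool2d(t, kernel_size=2, stride=2)

    t = t.reshape(-1, 12 * 4 * 4)   # flatten before the linear layers
    t = F.relu(self.fc1(t))
    t = F.relu(self.fc2(t))
    t = self.out(t)                 # raw scores for the 10 output classes
    return t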
Summary
Now, we should have a good idea of how to start building neural networks in PyTorch using the torch.nn library. In the next article, we will explore the different types of parameters of layers and learn how to choose them. See you there.