Building CNN Networks with Object-Oriented Programming | PyTorch Series


From a high-level perspective of our deep learning project, we have prepared the data, and now we are ready to build our model.

  • Prepare Data

  • Build Model

  • Train Model

  • Analyze Model Results

When we talk about the model, we refer to our network. “Model” and “Network” mean the same thing. We want our network to ultimately model or approximate a function that maps image inputs to the correct output classes.

1. Prerequisites

To build neural networks in PyTorch, we extend the PyTorch class torch.nn.Module. This means we need to utilize a bit of Object-Oriented Programming (OOP) in Python.

In this article, we will quickly review the details required for using neural networks in PyTorch, but if you find you need more, there is an overview tutorial in the Python documentation.

https://docs.python.org/3/tutorial/classes.html

To construct a Convolutional Neural Network (CNN), we need a rough understanding of how CNNs work and the components used to build them. The deep learning fundamentals series is a great prerequisite for this one, so I strongly recommend covering it first (if you haven’t already). If you just want a crash course on CNNs, you can check out the following specific articles:

  • Explanation of Convolutional Neural Networks (CNNs)

  • Visualizing Convolution Filters from CNNs

  • Explanation of Zero Padding in Convolutional Neural Networks

  • Explanation of Max Pooling in Convolutional Neural Networks

  • Explanation of Learnable Parameters in Convolutional Neural Networks (CNNs)

Now let’s quickly review Object-Oriented Programming.

https://deeplizard.com/learn/video/YRhxdVk_sIs
https://deeplizard.com/learn/video/cNBBNAxC8l4
https://deeplizard.com/learn/video/qSTv_m-KFk0
https://deeplizard.com/learn/video/ZjM_XQa5s6s
https://deeplizard.com/learn/video/gmBfb6LNnZs

2. Quick Review of Object-Oriented Programming

When we write programs or build software, there are two key components: code and data. With Object-Oriented Programming, we orient the design and structure of our programs around objects.

Objects are defined in code using classes. A class defines the specifications of an object, specifying the data and code that each object of the class should have.

When we create an object of a class, we call that object an instance of the class, and all instances of a given class have two core components:

  • Methods (code)

  • Attributes (data)

Methods represent the code, while attributes represent the data, so methods and attributes are defined by the class.

In a given program, many objects can exist at once. Instances of a given class can exist simultaneously, and all instances have the same available attributes and the same available methods. In this respect, they are uniform.

The distinction between objects of the same class lies in the values contained in each object’s attributes. Each object has its own attribute values, which determine the internal state of the object. The code and data of each object are encapsulated within the object.

Let’s build a simple Lizard class to demonstrate how a class encapsulates data and code:

class Lizard:  # class declaration
    def __init__(self, name):  # class constructor (code)
        self.name = name  # attribute (data)

    def set_name(self, name):  # method declaration (code)
        self.name = name  # method implementation (code)

The first line declares the class and specifies the class name, which in this case is Lizard.

The second line defines a special method called the class constructor. The class constructor is called when a new instance of the class is created. As parameters, we have self and name.

The self parameter allows us to create the attribute values that are stored or encapsulated within the object. When we call this constructor or any other method, we do not pass the self parameter. Python does this for us automatically.

The values of any other parameters are supplied by the caller, and these values can be used in computations or saved via self for later access.

Once the constructor is complete, we can create any number of dedicated methods, such as this one, which allows the caller to change the value of name stored in self. All we need to do here is call that method and pass a new value for the name. Let’s see how it works.

> lizard = Lizard('deep')
> print(lizard.name)
deep
> lizard.set_name('lizard')
> print(lizard.name)
lizard

We create an instance of the class by specifying the class name and passing the constructor parameters. The constructor receives these parameters, the constructor code runs, and the passed name is saved.

Then we can access the name and print it, and we can also call the set_name() method to change the name. There can be multiple such instances of Lizard in a program, each containing its own data.
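To make the encapsulation point concrete, here is a short sketch (reusing the Lizard class defined above) showing that two instances hold independent attribute values:

```python
# Two instances of the same class share methods but hold separate data.
class Lizard:
    def __init__(self, name):
        self.name = name

    def set_name(self, name):
        self.name = name

lizard1 = Lizard('deep')
lizard2 = Lizard('lizard')

lizard1.set_name('blue')  # changes only lizard1's internal state
print(lizard1.name)  # blue
print(lizard2.name)  # lizard
```

Calling set_name() on lizard1 leaves lizard2 untouched, because each object's data lives inside that object.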

From an object-oriented perspective, the important part of this setup is combining attributes and methods and encapsulating them within the object.

Now let’s shift gears and see how Object-Oriented Programming fits into PyTorch.

PyTorch’s torch.nn Package

To build neural networks in PyTorch, we use the torch.nn package, which is PyTorch’s neural network (nn) library. We typically import the package like this:

import torch.nn as nn

This allows us to access the neural network package using the nn alias. So from now on, when we say nn, we refer to torch.nn. PyTorch’s neural network library contains all the typical components needed for building neural networks.

The main component required to build neural networks is a layer, so as we would expect, PyTorch’s neural network library contains some classes to help us build layers.

PyTorch’s nn.Module Class

It is well known that deep neural networks consist of multi-layer structures. That’s why they are called deep networks. Each layer in a neural network has two main components:

  • Transformations (code)

  • A set of weights (data)

This fact makes layers natural candidates for representation as objects using Object-Oriented Programming.
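To see why layers map so cleanly onto objects, here is a framework-free sketch (the LinearLayer class below is a made-up illustration, not PyTorch's): a layer object bundles a set of weights (data) with a transformation (code).

```python
import random

class LinearLayer:
    """Hypothetical layer class: weights are attributes, the transformation is a method."""
    def __init__(self, in_features, out_features):
        # weights (data): one row of weights per output feature
        self.weights = [
            [random.uniform(-1, 1) for _ in range(in_features)]
            for _ in range(out_features)
        ]

    def forward(self, x):
        # transformation (code): a weighted sum for each output feature
        return [sum(w * v for w, v in zip(row, x)) for row in self.weights]

layer = LinearLayer(in_features=3, out_features=2)
output = layer.forward([1.0, 2.0, 3.0])
print(len(output))  # 2
```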

In fact, this is the case in PyTorch. In the neural network package, there is a class called Module, which is the base class for all neural network modules, including layers.

This means that all layers in PyTorch extend the nn.Module class and inherit all the built-in functionality of PyTorch in nn.Module. In Object-Oriented Programming, this concept is called inheritance.

Even neural networks extend the nn.Module class. This makes sense because a neural network itself can be thought of as one large layer (let that idea sink in over time if needed).

Neural networks and layers in PyTorch extend the nn.Module class. This means that when we build new layers or neural networks in PyTorch, we must extend the nn.Module class.

PyTorch’s nn.Module Has a forward() Method

When we pass a tensor as input to the network, the tensor flows forward through each layer until it reaches the output layer. This process of the tensor flowing forward through the network is called forward propagation.

The tensor input is passed forward through the network. Every PyTorch nn.Module has a forward() method, so when we build layers and networks, we must provide an implementation of forward(); this method defines the transformation the module applies to its input.
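Calling a module instance dispatches to its forward() method. Here is a framework-free sketch of that pattern (PyTorch's real nn.Module.__call__ also runs hooks and other bookkeeping; Doubler is a hypothetical module for illustration):

```python
class Module:
    # Calling an instance dispatches to its forward() method.
    def __call__(self, *args):
        return self.forward(*args)

class Doubler(Module):
    # Hypothetical module: its transformation doubles every element.
    def forward(self, t):
        return [2 * v for v in t]

doubler = Doubler()
print(doubler([1, 2, 3]))  # [2, 4, 6]
```

This is why, in PyTorch, we write network(t) rather than network.forward(t).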

PyTorch’s nn.functional Package

When we implement the forward() method for subclasses of nn.Module, we typically use functions from the nn.functional package. This package provides us with many neural network operations that can be used to build layers. In fact, many nn.Module layer classes use nn.functional functions to perform their operations.

The nn.functional package contains methods that subclasses of nn.Module use to implement their forward() functions. Later, we will observe an example by looking at the source code of the nn.Conv2d convolution layer class in PyTorch.
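As a small sketch (assuming torch is installed; TinyNet is a made-up example class), a forward() implementation typically mixes layer attributes, which carry weights, with stateless operations from nn.functional:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        # the layer attribute holds learnable weights (data)
        self.fc = nn.Linear(in_features=4, out_features=2)

    def forward(self, t):
        # F.relu is a stateless function from nn.functional (code)
        return F.relu(self.fc(t))

net = TinyNet()
out = net(torch.randn(3, 4))
print(tuple(out.shape))  # (3, 2)
```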

Building Neural Networks in PyTorch

Now we have enough information to provide an overview of building neural networks in PyTorch. The steps are as follows:

Short version:

  1. Extend the nn.Module base class.

  2. Define layers as class attributes.

  3. Implement the forward() method.

More detailed version:

  1. Create a neural network class that extends the nn.Module base class.

  2. In the class constructor, use pre-built layers from torch.nn to define the layers of the network as class attributes.

  3. Use the network’s layer attributes and operations from the nn.functional API to define the forward propagation of the network.

(1) Extending PyTorch’s nn.Module Class

Just like we did in the Lizard class example, let’s create a simple class to represent the neural network.

class Network:
    def __init__(self):
        self.layer = None

    def forward(self, t):
        t = self.layer(t)
        return t

This gives us a simple network class with a single dummy layer inside the constructor and a dummy implementation of the forward() function.

The forward() function’s implementation takes the tensor t and transforms it using the dummy layer. After the tensor is transformed, the new tensor is returned.

This is a good start, but the class has not yet extended the nn.Module class. To make our Network class extend nn.Module, we need to do two more things:

  1. Specify the nn.Module class in the parentheses of line 1.

  2. Insert a call to the superclass constructor on line 3 inside the constructor.

This gives us:

class Network(nn.Module):  # line 1
    def __init__(self):
        super().__init__()  # line 3
        self.layer = None

    def forward(self, t):
        t = self.layer(t)
        return t

These changes convert our simple neural network into a PyTorch neural network because we are now extending the nn.Module base class from PyTorch.

And there we have it! Now we have a Network class that has all the functionality of the PyTorch nn.Module class.
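One quick way to see the inherited functionality (a sketch, assuming torch is installed): the extended class immediately picks up nn.Module behavior, such as a readable string representation.

```python
import torch.nn as nn

class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = None  # dummy layer for now

    def forward(self, t):
        t = self.layer(t)
        return t

network = Network()
print(isinstance(network, nn.Module))  # True
print(network)  # Network()  <- string representation provided by nn.Module
```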

(2) Defining the Layers of the Network as Class Attributes

Currently, our Network class has a single dummy layer as an attribute. Now, let’s replace it with some real layers that come pre-built for us in PyTorch’s nn library. Since we are building a CNN, the two types of layers we will use are linear layers and convolutional layers.

class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12 * 4 * 4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)

    def forward(self, t):
        # implement the forward pass
        return t

Great. At this point, we have a Python class named Network that extends PyTorch’s nn.Module class. Inside the Network class, we have five layers defined as attributes. We have two convolutional layers, self.conv1 and self.conv2, and three linear layers, self.fc1, self.fc2, and self.out.

We used the abbreviation fc in fc1 and fc2 because linear layers are also known as fully connected layers. They are also sometimes called dense layers. Linear, dense, and fully connected all refer to the same type of layer. PyTorch uses the term linear, hence the nn.Linear class name.

We use the name out for the last linear layer because it is the output layer of the network.
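With the layers defined as attributes, we can already inspect them through functionality inherited from nn.Module, which tracks every layer we assign (a sketch assuming torch is installed; forward() is still a stub here):

```python
import torch.nn as nn

class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12 * 4 * 4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)

    def forward(self, t):
        return t  # forward pass not implemented yet

network = Network()
# nn.Module registers every layer assigned as an attribute:
# 5 layers, each with a weight and a bias tensor
print(len(list(network.parameters())))  # 10
print(tuple(network.conv1.weight.shape))  # (6, 1, 5, 5)
```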

Summary


Now we should have a good idea of how to start building neural networks in PyTorch using the torch.nn library. In the next article, we will explore the different types of parameters for layers and learn how to choose them. See you next time.

