Reference Directory:
- 1 Matrix and Scalar
- 2 Hadamard Product
- 3 Matrix Multiplication
- 4 Power and Square Root
- 5 Logarithmic Operations
- 6 Approximation Operations
- 7 Clamping Operations
This lesson covers some basic operations in PyTorch, including addition, subtraction, multiplication, and division, as well as matrix multiplication. The content is not extensive and mainly serves as background knowledge. Later in the series there will be practical tasks: obtaining the EfficientNet pre-trained model in PyTorch and a cat vs. dog classification task. EfficientNet is in lesson 13 and cat vs. dog classification is in lesson 14; lesson 11 covers MobileNet in detail along with a PyTorch code analysis, and lesson 12 does the same for SENet (since EfficientNet builds on these two networks). Further down the line, I plan to organize some outstanding papers and code from the past two years, as well as effective techniques for improving accuracy. Of course, I have not yet covered the various optimizers in PyTorch in detail (but in practice it is usually SGDM, i.e. SGD with momentum).
I hope everyone enjoys this series~ If you know friends who want to learn PyTorch, I would sincerely appreciate you sharing it with them. Thank you all!
1 Matrix and Scalar

I won't elaborate on addition, subtraction, multiplication, and division beyond the operators +, -, *, /. A matrix-scalar operation applies the operation between the scalar and each element of the matrix (tensor):
import torch
a = torch.tensor([1,2])
print(a+1)
>>> tensor([2, 3])
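The other three operators behave the same way element-wise; a quick sketch (note that in recent PyTorch versions, / always returns a floating-point result):

a = torch.tensor([1, 2])
print(a - 1)
print(a * 2)
print(a / 2)
>>> tensor([0, 1])
>>> tensor([2, 4])
>>> tensor([0.5000, 1.0000])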
2 Hadamard Product

This is the product of two tensors of the same size, where each pair of corresponding elements is multiplied: the Hadamard product, also known as element-wise multiplication.
a = torch.tensor([1,2])
b = torch.tensor([2,3])
print(a*b)
print(torch.mul(a,b))
>>> tensor([2, 6])
>>> tensor([2, 6])
The function torch.mul() is equivalent to the * operator.
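The same element-wise rule applies to tensors of any matching shape; for example, with 2-D tensors:

a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[2, 0], [0, 2]])
print(a * b)
>>> tensor([[2, 0],
        [0, 8]])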
Division works the same way:
a = torch.tensor([1.,2.])
b = torch.tensor([2.,3.])
print(a/b)
print(torch.div(a,b))
>>> tensor([0.5000, 0.6667])
>>> tensor([0.5000, 0.6667])
We can see that torch.div() is essentially /. Likewise, torch.add() is + and torch.sub() is -, but the operator forms are simpler and more commonly used.
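For completeness, a quick sketch of the function forms:

a = torch.tensor([1., 2.])
b = torch.tensor([2., 3.])
print(torch.add(a, b))  # same as a + b
print(torch.sub(a, b))  # same as a - b
>>> tensor([3., 5.])
>>> tensor([-1., -1.])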
3 Matrix Multiplication

If we want to perform matrix multiplication in the linear-algebra sense, how do we do it? There are three ways to write this operation:
- torch.mm()
- torch.matmul()
- the @ operator

These are worth memorizing; otherwise they can be confusing when first encountered:
a = torch.tensor([1.,2.]).view(2,1)
b = torch.tensor([2.,3.]).view(1,2)
print(torch.mm(a, b))
print(torch.matmul(a, b))
print(a @ b)
Output:
tensor([[2., 3.],
        [4., 6.]])
tensor([[2., 3.],
        [4., 6.]])
tensor([[2., 3.],
        [4., 6.]])
This works for two-dimensional matrices; torch.mm() only supports 2-D tensors. If the operands are higher-dimensional tensors, use torch.matmul() instead (the @ operator is equivalent to it). But how does matrix multiplication work with multi-dimensional tensors? Only the last two dimensions take part in the matrix operation; the preceding dimensions act like batch indices. For example:
a = torch.rand((1,2,64,32))
b = torch.rand((1,2,32,64))
print(torch.matmul(a, b).shape)
>>> torch.Size([1, 2, 64, 64])
We can see that during matrix multiplication only the last two dimensions matter: a 64×32 matrix times a 32×64 matrix gives a 64×64 result. The preceding dimensions must match and act like batch indices, determining which pairs of matrices are multiplied.
Tip:
a = torch.rand((3,2,64,32))
b = torch.rand((1,2,32,64))
print(torch.matmul(a, b).shape)
>>> torch.Size([3, 2, 64, 64])
This also works thanks to the automatic broadcasting mechanism, which will be discussed later. For now, just know that in this situation the size-1 first dimension of b is replicated three times so that b matches a for the matrix multiplication.
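If you want to make the broadcasting explicit, a minimal sketch (using the shapes above) is to expand b by hand and check that the two results agree:

a = torch.rand((3, 2, 64, 32))
b = torch.rand((1, 2, 32, 64))
# expand() repeats b along its size-1 first dimension without copying memory
b_expanded = b.expand(3, 2, 32, 64)
print(torch.allclose(torch.matmul(a, b), torch.matmul(a, b_expanded)))
>>> True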
4 Power and Square Root

print('Power operation')
a = torch.tensor([1.,2.])
b = torch.tensor([2.,3.])
c1 = a ** b
c2 = torch.pow(a, b)
print(c1,c2)
>>> tensor([1., 8.]) tensor([1., 8.])
Similar to the above, no need to elaborate. The square root can be computed with torch.sqrt(), or equivalently with a ** 0.5.
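A quick sketch of both spellings of the square root:

a = torch.tensor([1., 4., 9.])
print(torch.sqrt(a))
print(a ** 0.5)
>>> tensor([1., 2., 3.])
>>> tensor([1., 2., 3.])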
5 Logarithmic Operations

In school we learned that ln denotes the base-e logarithm, but PyTorch does not follow that naming: torch.log() is the base-e logarithm, while torch.log2() and torch.log10() use bases 2 and 10 respectively.
import numpy as np
print('Logarithmic operations')
a = torch.tensor([2,10,np.e])
print(torch.log(a))
print(torch.log2(a))
print(torch.log10(a))
>>> tensor([0.6931, 2.3026, 1.0000])
>>> tensor([1.0000, 3.3219, 1.4427])
>>> tensor([0.3010, 1.0000, 0.4343])
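For any other base there is no dedicated function, but the change-of-base formula log_b(x) = log(x) / log(b) works; a minimal sketch (the base 3 here is just an example):

a = torch.tensor([3., 9., 27.])
print(torch.log(a) / torch.log(torch.tensor(3.)))  # roughly tensor([1., 2., 3.])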
6 Approximation Operations

- .ceil() rounds up to the nearest integer
- .floor() rounds down to the nearest integer
- .trunc() truncates the fractional part, keeping the integer part
- .frac() returns the fractional part
- .round() rounds to the nearest integer
a = torch.tensor(1.2345)
print(a.ceil())
>>> tensor(2.)
print(a.floor())
>>> tensor(1.)
print(a.trunc())
>>> tensor(1.)
print(a.frac())
>>> tensor(0.2345)
print(a.round())
>>> tensor(1.)
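Note that .floor() and .trunc() differ on negative numbers: floor moves toward negative infinity, while trunc simply drops the fractional part. A quick sketch:

b = torch.tensor([-1.7, 1.7])
print(b.floor())   # tensor([-2., 1.])
print(b.trunc())   # tensor([-1., 1.])
print(b.round())   # tensor([-2., 2.])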
7 Clamping Operations

This operation restricts a number to a specified range [min, max]: values below min are set to min, and values above max are set to max. Clamping is used in some generative adversarial networks; for example, WGAN uses it to forcibly clip the model's weight values into a fixed range.
a = torch.rand(5)
print(a)
print(a.clamp(0.3,0.7))
Output:
tensor([0.5271, 0.6924, 0.9919, 0.0095, 0.0340])
tensor([0.5271, 0.6924, 0.7000, 0.3000, 0.3000])
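As a hypothetical illustration of the WGAN-style weight clipping mentioned above (the toy critic and the 0.01 bound are assumptions for this sketch, not something prescribed by the text):

import torch.nn as nn

# A toy critic; the architecture and the 0.01 bound are illustrative only
critic = nn.Linear(10, 1)

# After each optimizer step, clamp every parameter into [-0.01, 0.01] in place
with torch.no_grad():
    for p in critic.parameters():
        p.clamp_(-0.01, 0.01)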