How to Use C++ to Call and Deploy PyTorch Models?

Author丨Civ@Zhihu (Authorized)
Source丨https://www.zhihu.com/question/66532235/answer/2782357337
Editor丨Jishi Platform

Jishi Guide

This article uses the C++ inference framework ncnn as an example to introduce the general deployment process. The approach is similar for other C++ inference frameworks; the only learning cost is the API of the inference framework itself.

There are many ways to do this. A relatively simple path is:

PyTorch model -> ONNX format -> C++ inference framework

1. Convert PyTorch Model to ONNX

ONNX is an open format built to represent machine learning models. ONNX defines a common set of operators – the building blocks of machine learning and deep learning models – and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers.

In simple terms, ONNX can be seen as an intermediate format. Most machine learning/deep learning frameworks can convert their models to ONNX and can also convert ONNX back to their own framework format, as shown in the figure below.

Figure 1: Model conversion between different frameworks using ONNX

ONNX official website: https://onnx.ai/

In PyTorch, you can easily save a PyTorch model in ONNX format using the following method:

import torch

# model is your trained torch.nn.Module; put it in inference mode before exporting
model.eval()

# The exporter runs the model once on an example input, so it needs a tensor of the right shape.
# The dimensions correspond to (batch_size, channels, H, W); adjust them to your model.
# The input must live on the same device as the model (here: a model already moved to the GPU).
dummy_input = torch.randn(1, 3, 224, 224, device="cuda")

# "model.onnx" is the output file; change it to your own path
torch.onnx.export(model, dummy_input, "model.onnx")

torch.onnx.export has some additional parameters for more flexible usage; for details, see https://pytorch.org/docs/stable/onnx.html. The example in this article is sufficient for you to successfully deploy your model.

Note that ONNX aims to be “universal”, so operator incompatibilities are inevitable in some cases. Specifically, when you convert a model from one framework (e.g., PyTorch) to ONNX and then convert the ONNX model to another framework's format (e.g., ncnn), you may get errors of the form "xxx operator not supported". The causes vary and have to be analyzed case by case, so no examples are given here.

Some effective solutions:

  1. Simplify the ONNX model with onnx-simplifier (onnxsim). It is very effective. My personal recommendation: whenever you use ONNX, run the model through onnxsim once. GitHub link: https://github.com/daquexian/onnx-simplifier. It is easy to use: install it with “pip install onnxsim”, then run “onnxsim input_onnx_model_path output_onnx_model_path”. Calling it from code is just as simple; see the examples in the GitHub repository.

  2. Avoid computations that depend on the size of intermediate tensors. For example, in some image-to-image tasks you may resize other tensors based on the size of an intermediate tensor: you first read its H and W and then pass them as arguments to other functions. When the exporter encounters such code, ONNX appears to create two values for H and W, but they are bound to the H and W observed during the single forward pass with dummy_input. Once bound, these values no longer change, so using a different input size later is very likely to cause errors (I have not verified this carefully, but the intermediate results point to this behavior).

I also strongly recommend using a network visualization tool. When a model conversion fails, it helps you find where the error occurs. I personally prefer netron; repository: https://github.com/lutzroeder/netron

Here’s an image from the repository showing the effect:

Figure 2: A model visualized in Netron

2. Convert ONNX to ncnn

ncnn is a lightweight inference framework open-sourced by Tencent. Its biggest strength is ease of use. However, when power consumption and latency are the main concerns, it is worth trying other frameworks as well, such as TensorFlow Lite.

ncnn repository: https://github.com/Tencent/ncnn

ncnn provides a tool, onnx2ncnn, that converts ONNX models to the ncnn format. It ships with the release packages: https://github.com/Tencent/ncnn/releases. For example, on Windows, you can download https://github.com/Tencent/ncnn/releases/download/20221128/ncnn-20221128-windows-vs2017.zip. After extracting it, you will find onnx2ncnn.exe in the bin folder under x64 or x86. Run the following command to convert an ONNX model to the ncnn format:

onnx2ncnn.exe onnx_model_path [ncnn.param] [ncnn.bin]

Replace onnx_model_path with the path to your own ONNX model. The last two arguments are optional. If they are omitted, the converted model is written as two files, ncnn.param and ncnn.bin, in the current working directory (that is, next to onnx2ncnn.exe if you run it from its own folder). To choose the output paths yourself, specify both arguments.
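
If you convert models from a script or a build step, it can help to always pass both output paths explicitly. The following is only a minimal sketch of that idea: with_extension and make_onnx2ncnn_command are hypothetical helper names that are not part of ncnn, and the sketch assumes the paths contain no spaces (otherwise they would need quoting).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: replaces the file extension of `path` with `new_ext`,
// e.g. ("model.onnx", ".param") -> "model.param".
// A dot that belongs to a directory name is not treated as an extension.
std::string with_extension(const std::string& path, const std::string& new_ext)
{
    const std::size_t slash = path.find_last_of("/\\");
    const std::size_t dot = path.find_last_of('.');
    const bool has_ext = dot != std::string::npos
                         && (slash == std::string::npos || dot > slash);
    return (has_ext ? path.substr(0, dot) : path) + new_ext;
}

// Hypothetical helper: builds an onnx2ncnn command line that names both
// output files, so the result does not depend on the working directory.
std::string make_onnx2ncnn_command(const std::string& tool,
                                   const std::string& onnx_path)
{
    assert(!onnx_path.empty());
    return tool + " " + onnx_path
           + " " + with_extension(onnx_path, ".param")
           + " " + with_extension(onnx_path, ".bin");
}
```

For example, make_onnx2ncnn_command("onnx2ncnn.exe", "models/net.onnx") yields a command that writes models/net.param and models/net.bin next to the ONNX file.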

3. Model Inference under ncnn

Inference in any framework takes only two preparatory steps: loading the model and converting the input data to the framework's format. After that, a single forward pass produces the result.

One way to load a model in ncnn (there are others):

ncnn::Net model;                   // Define a model
// Both load calls return 0 on success, so check the return values in real code
model.load_param("model.param");   // Load the network structure (.param file)
model.load_model("model.bin");     // Load the weights (.bin file)

After loading the model, you just need to convert the data to ncnn format. The input format for ncnn models is ncnn::Mat.

The ways to convert OpenCV’s cv::Mat to ncnn::Mat are documented in full here:

https://github.com/Tencent/ncnn/wiki/use-ncnn-with-opencv

For example:

// cv::Mat a(h, w, CV_8UC3);
ncnn::Mat in = ncnn::Mat::from_pixels(a.data, ncnn::Mat::PIXEL_BGR2RGB, a.cols, a.rows);
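
To make it clearer what such a conversion involves, here is a conceptual sketch written with the standard library only. It is not ncnn's implementation: the function name bgr_interleaved_to_rgb_planar is made up for illustration, and it assumes the pixels are stored contiguously as B, G, R bytes per pixel. The idea is that interleaved 8-bit pixels become one float plane per channel, with the channel order swapped from BGR to RGB. A real ncnn::Mat may additionally pad each channel for memory alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts an interleaved 8-bit BGR image (H x W x 3) into planar float RGB
// (3 x H x W). Values are only cast to float, so they stay in the 0..255
// range; any mean subtraction or scaling would be a separate step.
std::vector<float> bgr_interleaved_to_rgb_planar(const std::vector<std::uint8_t>& bgr,
                                                 int width, int height)
{
    assert(width > 0 && height > 0);
    const std::size_t plane = static_cast<std::size_t>(width) * height;
    assert(bgr.size() == plane * 3);

    std::vector<float> rgb(plane * 3);
    for (std::size_t i = 0; i < plane; ++i)
    {
        rgb[0 * plane + i] = static_cast<float>(bgr[3 * i + 2]); // R plane
        rgb[1 * plane + i] = static_cast<float>(bgr[3 * i + 1]); // G plane
        rgb[2 * plane + i] = static_cast<float>(bgr[3 * i + 0]); // B plane
    }
    return rgb;
}
```

In practice you would call ncnn::Mat::from_pixels as shown above rather than writing this loop yourself; the sketch is only meant to show the layout change that "converting the data to the framework format" refers to.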

In JNI, to convert an Android bitmap to ncnn::Mat, refer to the official example: https://github.com/nihui/ncnn-android-squeezenet/blob/master/app/src/main/jni/squeezencnn_jni.cpp

The code is as follows:

// ncnn from bitmap
ncnn::Mat in = ncnn::Mat::from_android_bitmap(env, bitmap, ncnn::Mat::PIXEL_BGR);

With the model and input ready, simply forward once and retrieve the result:

ncnn::Extractor ex = model.create_extractor();

// input_name is the name of the model's input blob;
// you can look it up by opening the .param file in netron
ex.input(input_name, in);

ncnn::Mat out;  // Holds the output

// output_name is the name of the output blob you want;
// it can be looked up in netron in the same way
ex.extract(output_name, out);
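
What you do with the output depends on the model. As one hedged illustration, assume the model is a classifier and the output is a flat list of per-class scores that you have already copied into a std::vector<float> (with ncnn this could be done by reading out[i] for i from 0 to out.w - 1). The helper name argmax is chosen here for illustration only. Picking the predicted class is then a search for the largest score:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Returns the index of the highest score, i.e. the predicted class.
// If several scores tie for the maximum, the first one wins.
std::size_t argmax(const std::vector<float>& scores)
{
    assert(!scores.empty());
    const auto best = std::max_element(scores.begin(), scores.end());
    return static_cast<std::size_t>(std::distance(scores.begin(), best));
}
```

Other kinds of models (detection, segmentation, image-to-image) need their own post-processing, so treat this only as the simplest case.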

Conclusion

Once the model has been converted, most deployment paths look like the one above, and the learning cost is not high. The main cost is learning the inference framework itself. The inference frameworks provided by chip manufacturers are comparatively complex and come with various idiosyncratic rules.
