
Jishi Guide
This article uses the C++ inference framework ncnn as an example to introduce the general deployment process. The approach is similar for other C++ inference frameworks, with the only learning cost being the API of the inference framework itself.
There are many ways to deploy a PyTorch model; a relatively simple path is:
PyTorch Model –> ONNX Format –> C++ Inference Framework
This article introduces the general process using the C++ inference framework ncnn as an example. The approach is similar for other C++ inference frameworks, with the only learning cost being the API of the inference framework itself.
1. Convert PyTorch Model to ONNX
ONNX is an open format built to represent machine learning models. ONNX defines a common set of operators – the building blocks of machine learning and deep learning models – and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers.
In simple terms, ONNX can be thought of as an intermediate format. Most machine learning/deep learning frameworks can convert their models to ONNX and, likewise, convert ONNX models back into their own format, as shown in the figure below.

ONNX official website: https://onnx.ai/
In PyTorch, you can easily export a model to the ONNX format with the following code:
import torch
# Specify a dummy input; its shape determines the exported model's input size
# The dimensions correspond to (batch_size, channels, H, W); adjust them to your needs
dummy_input = torch.randn(1, 3, 224, 224, device="cuda")
# model is your trained network; it must be on the same device as dummy_input
# "model.onnx" is the output file, change it to your desired path
torch.onnx.export(model, dummy_input, "model.onnx")
torch.onnx.export also accepts additional parameters (such as input_names, output_names, dynamic_axes, and opset_version) that allow for more flexible usage; see the official documentation at https://pytorch.org/docs/stable/onnx.html for details. The example above is sufficient for you to successfully deploy your own model.
It is important to note that ONNX’s goal is “universality,” so there may inevitably be situations where operators are incompatible. Specifically, when you convert a model from a certain framework (e.g., PyTorch) to ONNX and then convert ONNX to another framework’s model (e.g., ncnn), errors may occur (e.g., xxx operator not supported). There are various types of incompatibility, and specific cases need to be analyzed individually.
Some effective solutions include:
- Use ONNXSIM to simplify the ONNX model. This is very effective; I personally recommend that whenever you use ONNX, you process the model with ONNXSIM once. GitHub address: https://github.com/daquexian/onnx-simplifier. It is very easy to use: install it with pip install onnxsim, then run onnxsim input_onnx_model_path output_onnx_model_path.
- Avoid relying on the size of intermediate variables for calculations. For example, in some image-to-image tasks, you may resize other tensors based on the size of an intermediate tensor: you first obtain its height H and width W and then pass them as parameters to other methods. When ONNX encounters such operations, it appears to create two variables for H and W, but their values are bound to the sizes obtained during the forward pass with dummy_input and do not change afterwards. As a result, using a different input size later is very likely to cause errors (this has not been thoroughly verified, but the intermediate results seem to suggest it).
Additionally, I strongly recommend using some network visualization tools. When encountering model conversion errors, they can help easily locate the source of the error. I personally prefer Netron, available at: https://github.com/lutzroeder/netron
The Netron repository includes screenshots showing what the visualization looks like.

2. Convert ONNX to ncnn
ncnn is a lightweight inference framework open-sourced by Tencent, and its biggest strength is ease of use. However, when power consumption and latency are the primary concerns, it is worth trying other frameworks as well, such as TensorFlow Lite.
ncnn address: https://github.com/Tencent/ncnn
ncnn provides a tool to convert ONNX models to the ncnn format. Prebuilt packages can be found on the releases page: https://github.com/Tencent/ncnn/releases. For example, on Windows, download the package for your platform; after extracting it, you will find onnx2ncnn.exe in the bin folder under x64 or x86. Use the following command on the command line to convert an ONNX model to the ncnn format:
onnx2ncnn.exe onnx_model_path [ncnn.param] [ncnn.bin]
Replace onnx_model_path with the path to your ONNX model. The last two parameters are optional. If they are not provided, the converted ncnn model files, one .param file and one .bin file, will be generated in the same directory as onnx2ncnn.exe. You can also specify the output paths by supplying the last two parameters.
3. Perform Model Inference in ncnn
Inference in any framework requires only two steps: load the model and convert the data to the framework format.
The method to load the model in ncnn is (there are other methods):
ncnn::Net model; // Define a model
model.load_param("model.param"); // Load the model's param file
model.load_model("model.bin"); // Load the model's bin file
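Optionally, you can check whether loading succeeded; in current ncnn versions both loaders return 0 on success. Below is a minimal sketch, assuming the placeholder file names above and a hypothetical helper name:
#include "net.h" // ncnn header providing ncnn::Net

// Returns true if both the .param and .bin files were loaded successfully
bool load_ncnn_model(ncnn::Net& model)
{
    // load_param must be called before load_model; both return 0 on success
    if (model.load_param("model.param") != 0)
        return false;
    if (model.load_model("model.bin") != 0)
        return false;
    return true;
}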
After loading the model, you only need to convert the data into ncnn's format. The input format for ncnn models is ncnn::Mat.
The methods for converting OpenCV's cv::Mat to ncnn::Mat are listed here:
https://github.com/Tencent/ncnn/wiki/use-ncnn-with-opencv
For example:
// cv::Mat a(h, w, CV_8UC3);
ncnn::Mat in = ncnn::Mat::from_pixels(a.data, ncnn::Mat::PIXEL_BGR2RGB, a.cols, a.rows);
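In practice, you usually also need to resize the image to the model's expected input size and apply the same normalization used during training. Below is a minimal sketch assuming a model that takes 224x224 RGB input with the common ImageNet mean/std preprocessing; adjust the values to match your own pipeline:
// a is an OpenCV image in BGR order, e.g. cv::Mat a = cv::imread("test.jpg");
// Convert BGR to RGB and resize to the 224x224 input assumed in this example
ncnn::Mat in = ncnn::Mat::from_pixels_resize(a.data, ncnn::Mat::PIXEL_BGR2RGB,
                                             a.cols, a.rows, 224, 224);

// ncnn applies (pixel - mean) * norm per channel;
// these values reproduce the ImageNet mean/std on 0-255 pixel values
const float mean_vals[3] = {0.485f * 255.f, 0.456f * 255.f, 0.406f * 255.f};
const float norm_vals[3] = {1.f / (0.229f * 255.f), 1.f / (0.224f * 255.f), 1.f / (0.225f * 255.f)};
in.substract_mean_normalize(mean_vals, norm_vals);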
In JNI, to convert an Android Bitmap to ncnn::Mat, you can refer to the official example: https://github.com/nihui/ncnn-android-squeezenet/blob/master/app/src/main/jni/squeezencnn_jni.cpp
The code is as follows:
// ncnn from bitmap
ncnn::Mat in = ncnn::Mat::from_android_bitmap(env, bitmap, ncnn::Mat::PIXEL_BGR);
With the model and input ready, simply forward once and retrieve the result:
ncnn::Extractor ex = model.create_extractor();
// Replace input_name with the name of the model's input blob;
// you can find it by opening the .param file with Netron (or a text editor)
ex.input(input_name, in);
ncnn::Mat out; // Used to store the output result
// Replace output_name with the name of the model's output blob, found the same way
ex.extract(output_name, out);
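Putting the pieces together, a complete minimal inference routine might look like the sketch below. It assumes an image classification model; the file names, the blob names "input" and "output", and the 224x224 input size are placeholders taken from the examples above, so replace them with the values from your own .param file, and link against ncnn and OpenCV when compiling.
#include <cstdio>
#include <opencv2/opencv.hpp>
#include "net.h" // ncnn

int main(int argc, char** argv)
{
    if (argc < 2)
    {
        fprintf(stderr, "usage: %s image_path\n", argv[0]);
        return -1;
    }

    // Load the converted ncnn model (placeholder file names)
    ncnn::Net model;
    if (model.load_param("model.param") != 0 || model.load_model("model.bin") != 0)
    {
        fprintf(stderr, "failed to load model files\n");
        return -1;
    }

    // Read the image with OpenCV and convert it to ncnn::Mat (resized to 224x224)
    cv::Mat bgr = cv::imread(argv[1]);
    if (bgr.empty())
    {
        fprintf(stderr, "failed to read image %s\n", argv[1]);
        return -1;
    }
    ncnn::Mat in = ncnn::Mat::from_pixels_resize(bgr.data, ncnn::Mat::PIXEL_BGR2RGB,
                                                 bgr.cols, bgr.rows, 224, 224);
    // If your model expects normalized input, apply in.substract_mean_normalize(...) here

    // Forward once; "input" and "output" are placeholder blob names,
    // check your .param file (e.g. with Netron) for the real ones
    ncnn::Extractor ex = model.create_extractor();
    ex.input("input", in);
    ncnn::Mat out;
    ex.extract("output", out);

    // For a classification model the output is a vector of scores;
    // print the first few values as a sanity check
    for (int i = 0; i < out.w && i < 5; i++)
        printf("score[%d] = %f\n", i, out[i]);

    return 0;
}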
Conclusion
Most model-deployment paths look like this, and the conversion itself is not hard to learn; the main cost lies in learning the inference framework. The inference frameworks provided by chip manufacturers are relatively complex, with all kinds of peculiar rules.