Source: Quantum Bit
This article is about 1700 words long and takes about 8 minutes to read.
When it comes to computer vision, you can't do without CNNs.
But what do convolution, pooling, Softmax, and the rest actually look like, and how are they interconnected? Imagining it from code alone can be rather daunting. So someone simply visualized the whole thing in 3D using Unity. It's not just the architecture: the training process is also clearly presented, for example the real-time changes occurring in each layer as the epoch (iteration) count increases during training.


To better showcase the network's details, users can freely fold and expand each layer, for instance switching the feature maps between linear and grid layouts.

Folding the feature-map outputs of a convolutional layer.

Performing edge bundling on the fully connected layers, and so on.

This kind of visualization can be constructed by loading TensorFlow checkpoints.

It can also be designed in the Unity editor.

Feels pretty amazing, doesn't it? Recently, this project has gained popularity on social media.

Netizens have commented: "If I could see this process during training, I could endure it for much longer." "Please open source it."

The author of this project is a 3D effects artist from Vienna. According to him, he built this CNN visualization tool because, when he was learning about neural networks, he often found it hard to understand how convolutional layers were interconnected, and how they connected to layers of other types. The tool's main features include visual representations of convolution, max pooling, and fully connected layers, along with various simplification mechanisms that make the visualization clearer. In short, it aims to help beginners grasp the key ideas of CNNs in the most intuitive way possible.
How to Create a 3D Network with Unity
Before diving into Unity, the author first built a prototype of the 3D network visualization in Houdini. This provided a construction blueprint for the Unity version, working out how to demonstrate the convolution calculations, the shapes of the feature maps, the edge-bundling effects, and other issues. Its node editor looks like this:

Then the 3D neural network can be built in Unity. First, the "shape" of the neural network has to be defined. Since the author had never used Unity before, he started by learning about shaders and procedural geometry. Here he ran into some limitations. He used ShaderLab, Unity's shader language, which does not allow arbitrary data to flow between stages: only predefined variables can be passed between the vertex, geometry, and pixel shaders. Nor can vertex attributes be assigned arbitrarily; only predefined attributes such as position, color, and UV are available. (This may also be one reason the 3D network cannot change colors in real time.)

After researching instancing, the author decided to use a geometry shader to generate the connections of the neural network: the start and end points are passed to the vertex shader and forwarded directly to the geometry shader. Each line can consist of up to 120 vertices, since Unity limits geometry-shader output to 1024 scalar floating-point values. The resulting network shape looks something like this:
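As a quick sanity check on that vertex budget, the arithmetic can be worked through explicitly. The per-vertex layout below (a float4 position plus a float4 color) is an assumption for illustration, not something the author states:

```python
# Assumed per-vertex output layout: float4 position + float4 color = 8 scalars.
SCALAR_BUDGET = 1024        # geometry-shader output limit mentioned in the article
SCALARS_PER_VERTEX = 4 + 4

max_vertices = SCALAR_BUDGET // SCALARS_PER_VERTEX
print(max_vertices)  # 128, so 120 vertices per line fits with a little headroom
```

Under that layout, the 120-vertex limit the author chose sits just under the 128-vertex ceiling.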

Next, the corresponding 3D network image is generated from the model's TensorFlow code. The native TensorFlow .ckpt file needs to store the data required to reconstruct the model graph, the binary weights, the activation values, and the specific layer names. For example, with a greyscale CIFAR-10 dataset, you write a checkpoint file and set randomly initialized weights.
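The article doesn't show the code for this step, but "a checkpoint with randomly initialized weights" amounts to something like the sketch below. The layer names and sizes are hypothetical, and `np.savez` stands in for TensorFlow's `Saver.save`, which writes the actual .ckpt:

```python
import numpy as np

# Randomly initialized weights for a small greyscale CIFAR-10-style CNN.
# Layer names and shapes are illustrative guesses, not the author's model.
rng = np.random.default_rng(42)
weights = {
    "conv1/kernel": 0.05 * rng.standard_normal((5, 5, 1, 16)),   # 5x5 conv, 1 -> 16 channels
    "conv2/kernel": 0.05 * rng.standard_normal((5, 5, 16, 32)),  # 5x5 conv, 16 -> 32 channels
    "fc/kernel":    0.05 * rng.standard_normal((2048, 10)),      # flattened maps -> 10 classes
}

# Persist the named tensors to disk; in the real pipeline this is
# tf.compat.v1.train.Saver().save(sess, "...ckpt") instead of savez.
np.savez("cifar10_grey_init.npz", **weights)
```

The important part is simply that every weight tensor is stored under a stable layer name, so the visualization can find it again later.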

After that, these checkpoint files are loaded, a TensorFlow session is started, and training examples are fed in to query each layer's activations. The shape, name, weights, and activations of every layer are then written to a JSON file for easy reading, and the weight values are used to assign color data to each layer's Unity mesh.
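A minimal sketch of that JSON-writing step, with hypothetical layer names; in the real pipeline the arrays would come from `sess.run()` on the restored checkpoint rather than from a random generator:

```python
import json
import numpy as np

# Stand-ins for per-layer weights and activations; in the real pipeline
# these are fetched from the TensorFlow session after loading the .ckpt.
rng = np.random.default_rng(0)
layers = {
    "conv1": rng.standard_normal((5, 5, 1, 16)),
    "fc":    rng.standard_normal((256, 10)),
}
activations = {name: np.abs(w).mean(axis=tuple(range(w.ndim - 1)))
               for name, w in layers.items()}

# One record per layer: shape, name, weights, activations --
# everything Unity needs to read back and color the layer's mesh.
records = [
    {
        "name": name,
        "shape": list(w.shape),
        "weights": w.ravel().tolist(),
        "activations": activations[name].tolist(),
    }
    for name, w in layers.items()
]

with open("layers.json", "w") as f:
    json.dump(records, f)
```

Flattening each tensor with `ravel()` keeps the JSON simple; the stored `shape` field lets the Unity side reshape the flat list back into the original tensor.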

The final result is quite good. The author also recorded a development video, which can be found at the end of the article.
There Are Many Related Studies
In fact, many researchers have worked on neural network visualization before. Last May, for example, a Chinese PhD student visualized convolutional neural networks, clearly displaying the changes in each layer; clicking on a neuron shows its "operation". That was a 10-layer pre-trained model loaded with TensorFlow.js, which runs the CNN directly in the browser and supports real-time interaction to display changes in the neurons. However, it was still a 2D project. More recently, someone has also created a 3D visualized neural network like the one above:
That project also uses edge bundling, ray tracing, and other techniques, combined with feature extraction, fine-tuning, and normalization, to visualize neural networks. Its goal is to estimate the importance of different parts of the network: each part is drawn in a different color, and the interconnections are predicted from the importance of the nodes.

The general processing pipeline is as follows:

If you are interested in this type of 3D neural network visualization, you can find the corresponding open-source project address at the end of the article.
Author Introduction:

Stefan Sietzen, currently residing in Vienna, previously worked as a freelancer in 3D visuals. He is now pursuing a master's degree at Vienna University of Technology and is very interested in visual computing; this 3D neural network is one of the projects from his master's program.
Development process: https://vimeo.com/stefsietz
Open-sourced 3D neural network project: https://github.com/julrog/nn_v
Editor: Huang Jiyan
Proofreader: Lin Yilin