Visualizing CNN: An Interactive Tool for Understanding Convolution

What is CNN? Is it the Cable News Network?

Anyone starting out in AI runs into the term CNN (Convolutional Neural Network) almost immediately.

Yet attempts to understand what a CNN is, and why it can recognize faces or distinguish sounds, often end in confusion, and the whole subject gets filed away as a mystery.

Now someone has built an explanation that succeeds where Wikipedia falls short.

CNN Explainer is an online interactive visualization tool that takes a CNN apart and shows beginners what it is and how it recognizes objects.

It uses TensorFlow.js to load a pre-trained 10-layer model, so a real CNN runs directly in your browser; all you need to do is open the page.

The page is also interactive: click any cell in the grid (each cell represents a “neuron” in the CNN) to see what its inputs are and how they are transformed, in detail.

You can even follow each individual convolution operation.

Understanding Convolution

Using CNN Explainer is simple: just click.

Clicking on a neuron opens an elastic explanation view, which plays an animation of the convolution kernel sliding across its input.

Clicking on one of the convolution images then drills down into the calculation itself.

Here you can see how a 3×3 kernel and the 3×3 patch of input beneath it are reduced to a single number: the values are multiplied element by element and the products are summed.
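
As a rough illustration of that calculation, here is a minimal Python sketch. It is not the tool's own TensorFlow.js code: the function names, the example image and the kernel values are all made up for this example, and it assumes stride 1, no padding and no bias term.

```python
def conv_single(patch, kernel):
    """Reduce one patch to a single number: element-wise multiply, then sum."""
    total = 0
    for patch_row, kernel_row in zip(patch, kernel):
        for p, k in zip(patch_row, kernel_row):
            total += p * k
    return total


def conv2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding)."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Cut out the k-by-k patch whose top-left corner is (i, j).
            patch = [r[j:j + k] for r in image[i:i + k]]
            row.append(conv_single(patch, kernel))
        output.append(row)
    return output


# Hypothetical 4x4 single-channel image and 3x3 kernel.
image = [
    [1, 2, 3, 0],
    [0, 1, 2, 3],
    [3, 0, 1, 2],
    [2, 3, 0, 1],
]
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]

feature_map = conv2d(image, kernel)
print(feature_map)  # [[-2, -2], [2, -2]]
```

Each entry of the 2×2 result corresponds to one position of the sliding kernel, which is what the animation in the tool steps through.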

Understanding ReLU and Max Pooling Layers

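Clicking on a neuron in a ReLU layer shows how the ReLU function works: it outputs max(0, x), keeping positive inputs and replacing negative ones with zero.

Clicking on a neuron in a pooling layer likewise shows how max pooling works: each output is the largest value within a small window of the input.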
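
The two operations can be sketched in a few lines of Python. This is only an illustration under assumed settings: the 2×2 window, the stride of 2 and the input values are choices made for this example, not details taken from the tool.

```python
def relu(x):
    """ReLU: keep positive values, replace negative values with zero."""
    return max(0.0, x)


def max_pool2d(feature_map, size=2, stride=2):
    """Keep the largest value inside each size-by-size window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, stride):
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map, as a convolution layer might produce.
activations = [
    [-1.0, 2.0, 0.5, -3.0],
    [4.0, -2.0, 1.5, 0.0],
    [0.0, 1.0, -1.0, 2.5],
    [-0.5, 3.0, 2.0, -4.0],
]

rectified = [[relu(v) for v in row] for row in activations]
pooled = max_pool2d(rectified)
print(pooled)  # [[4.0, 1.5], [3.0, 2.5]]
```

With these settings, pooling halves the width and height of the feature map while keeping the strongest response in each region.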

Understanding CNN Output Predictions

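Clicking on an output neuron on the far right opens the elastic explanation view for the prediction.

There you can view the details of the Softmax function, which converts the network's raw output scores into probabilities that sum to 1.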
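
For readers who want to see the arithmetic, here is a minimal Softmax sketch in Python. The input scores are invented for illustration, and subtracting the maximum score before exponentiating is a common numerical-stability step assumed here, not something the article describes.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for three classes.
scores = [2.0, 1.0, 0.1]
probs = softmax(scores)
print(probs)
```

The largest score always receives the largest probability, but the other classes keep a nonzero share, which is why a prediction can look like one thing while "possibly" being another.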

Try Recognizing a ‘Raccoon’?

CNN Explainer comes with 10 default images, and you can also add your own.

For example:

Stuffed bell pepper? Bell pepper pizza? Or what?

After you paste an image link or upload an image, it passes through the 10 layers and the model reaches a conclusion:

It is a bell pepper, but it could also be a bug.

However, the model can only assign an image to one of its 10 original categories, listed on the right. Feed it a raccoon, for example:

It is identified as espresso.
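
The reason is that a classifier of this kind always picks the highest-scoring entry from a fixed label list. The Python sketch below illustrates the idea; the three labels are simply the category names mentioned in this article, and the scores are made up rather than taken from the real model.

```python
import math

# Illustrative subset of labels: the category names mentioned in this article.
LABELS = ["bell pepper", "bug", "espresso"]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, labels):
    """Return the label with the highest probability, and that probability."""
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs[best]


# Made-up scores for an image that fits none of the categories well.
raccoon_scores = [0.2, 0.1, 0.4]
label, confidence = classify(raccoon_scores, LABELS)
print(label, round(confidence, 2))
```

Even when no category is a good match and the winning probability is low, the function still returns one of the known labels; there is no "none of the above" option.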

Created by a Chinese PhD Student at Georgia Tech

Finally, the author of CNN Explainer is Zijie Wang, a Chinese student at Georgia Tech who started his PhD in machine learning last year after graduating from the University of Wisconsin-Madison with a GPA of 3.95/4.00.

He has also worked on other interesting data visualization projects, such as one showing where the Chinese undergraduates at the University of Wisconsin-Madison come from.

Links

CNN Explainer: https://poloclub.github.io/cnn-explainer/

GitHub: https://github.com/poloclub/cnn-explainer

Paper: https://arxiv.org/abs/2004.15004

The author is a contracted writer for NetEase News’s “Each Has Its Own Attitude”
