A Comparison of CNN and RNN



CNN and RNN are the two most commonly used network architectures in deep learning. Some readers may not be clear about the differences between the two, and today I happened to see an image that explains the differences between CNN and RNN quite clearly.

Firstly, CNN places a strict constraint on the dimensions of its input data. For instance, if you train an image-recognition model with CNN on images of size 48*48 pixels, you must also resize every image to 48*48 at prediction time. In image recognition this constraint may not seem like a big deal; people might say, 'It's fine, I can just resize the images before prediction.' But consider another scenario, such as building an intelligent question-answering system with CNN. CNN would require every question and answer to have a fixed length, which is quite problematic: the model would force questions and answers into fixed-length pairs. RNN has no such constraint.
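The fixed-size constraint can be seen in a minimal sketch (all sizes here are illustrative): a CNN that ends in a dense layer flattens its feature map, so the dense weight matrix hard-codes the input resolution, and any other resolution simply fails.

```python
import numpy as np

# Illustrative sketch: the dense layer after flattening fixes the
# expected input size at 48*48 (as in the text above).
H = W = 48
dense_weights = np.random.randn(H * W, 10)  # fixed: 2304 inputs -> 10 classes

def cnn_head(image):
    # Only works if the image is exactly 48x48 -- anything else
    # breaks the matrix multiply below.
    return image.reshape(-1) @ dense_weights

ok = cnn_head(np.zeros((48, 48)))      # fine: produces a (10,) output

mismatch = False
try:
    cnn_head(np.zeros((64, 64)))       # wrong size -> shape error
except ValueError:
    mismatch = True
```

This is why, in practice, all images must be resized to the training resolution before prediction.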

Moreover, RNN better captures the relationships between the elements of a sequence. For example, when training a translation system, the translation of a phrase must take its context into account, which requires the model to consider what comes before and after each input. CNN does not have this property.

Now let’s use another image to illustrate the specific applications of CNN and RNN:


One to one: This is the CNN scenario, mapping a fixed-size input to a fixed-size output.

One to many: This is an RNN scenario with sequence output, somewhat like telling a story from an image: a fixed input image produces an output sequence that describes the image.

Many to one: This is an RNN scenario with sequence input. For example, in sentiment analysis we input a piece of text of variable length and get back a single sentiment label.

Many to many: This is an RNN scenario, commonly seen in sequence-to-sequence tasks. For instance, using Jay Chou's lyrics as training data to write a song in his style, where both the input and output lengths can vary.
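The many-to-one case above can be sketched with the same recurrence plus a single output layer (all names and sizes are illustrative, not a real sentiment model): a variable-length sequence of token vectors goes in, one score comes out.

```python
import numpy as np

rng = np.random.default_rng(1)
W_x = rng.normal(size=(3, 4))   # token vector -> hidden
W_h = rng.normal(size=(4, 4))   # hidden -> hidden recurrence
w_out = rng.normal(size=4)      # final hidden state -> one score

def sentiment(tokens):
    h = np.zeros(4)
    for x in tokens:                    # the input length may vary freely
        h = np.tanh(x @ W_x + h @ W_h)
    return float(h @ w_out)             # single output: "many to one"

s1 = sentiment(rng.normal(size=(5, 3)))    # a 5-token "sentence"
s2 = sentiment(rng.normal(size=(12, 3)))   # a 12-token "sentence"
```

The final hidden state summarizes the whole input, so sentences of different lengths all reduce to one number, which is exactly the shape a sentiment classifier needs.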

WeChat ID: Ordinary People Machine Learning