Hello everyone, today we will discuss Generative Adversarial Networks and how to create anime avatars.
In this lesson, we will design and implement a Deep Convolutional Generative Adversarial Network (DCGAN).
Then we will use this network to generate various anime avatars.
1. What Are Generative Adversarial Networks
Generative Adversarial Network is abbreviated as GAN.
A GAN is an unsupervised deep learning model, first proposed by Ian Goodfellow and his colleagues in 2014.
The algorithm generates new data similar to the original dataset through competitive learning between two networks.
The generated images are very similar to the original real images.
Using GANs to generate data is cost-effective, and the results can be applied directly in many fields, such as gaming, film production, and artistic creation.
Due to the wide application of GANs, many variant algorithms have emerged.
Such as DCGAN, CycleGAN, ProGAN, etc., each with different characteristics and application scenarios.
2. Architecture of Generative Adversarial Networks
Generative Adversarial Networks differ from traditional neural networks in that they consist of two main networks: the generator and the discriminator.
The generator’s main function is to generate realistic data instances based on random noise.
For example, the image shows that a segment of random noise is input to the generator, which generates a realistic-looking handwritten digit image.
The discriminator’s main function is to judge the authenticity of the data.
It acts as an image classifier that determines whether the input data is real training data or fake data generated by the generator.
The training of GANs is a game process where the generator and discriminator compete against each other.
The generator continuously learns how to better simulate real data, while the discriminator learns to recognize real and fake data.
The generator’s goal is to produce fake data that the discriminator cannot distinguish.
Meanwhile, the discriminator’s goal is to accurately distinguish between real and generated data.
As training progresses, the generator improves its generation based on the discriminator’s feedback.
Similarly, the discriminator adjusts and optimizes its judgment strategy based on the data generated by the generator.
Ultimately, after multiple rounds of alternating iterations, the two reach a balance: the generator produces fake images close to the training set, and the discriminator can no longer reliably tell them apart.
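The adversarial game described above is often summarized as a single minimax objective. This is the standard formulation from the original GAN paper, included here for reference rather than taken from this lesson:

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]

Here D(x) is the discriminator's estimate that x is a real sample, and G(z) is the image the generator produces from noise z. The discriminator tries to maximize V, while the generator tries to minimize it.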
3. Generating Anime Avatars with DCGAN
Next, we will implement a DCGAN to generate anime avatars.
The experiment mainly includes three parts:
1) Data preparation and processing
2) Implementation of the discriminator and generator models
3) Training the DCGAN model.
First, we search for “anime face” on the Kaggle platform to obtain anime avatar data.
This dataset contains 63,632 different-sized anime avatars.
Next, we need to implement a dataset reading class AnimeDataset to read the anime avatar data.
Subsequently, we can use DataLoader to read the avatar data from AnimeDataset.
DCGAN improves upon the standard GAN by replacing fully connected layers with transposed convolutional layers in the generator and convolutional layers in the discriminator.
This allows DCGAN to handle image data more efficiently, improving both the quality of the generated images and the reliability of the discriminator's real-versus-fake judgments.
The generator in DCGAN includes transposed convolutional layers:
When the generator is working, it receives a random noise vector z.
The noise vector z undergoes transposed convolution in the generator, continuously increasing in size until it is transformed into the generated image.
The code is as follows:
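A minimal sketch of such a generator, following the standard DCGAN architecture. The noise dimension nz = 100 and the channel width ngf = 64 are conventional DCGAN choices assumed here, not values stated in the lesson:

```python
import torch
import torch.nn as nn


class Generator(nn.Module):
    """DCGAN generator: upsamples a noise vector z into a 64x64 RGB image."""

    def __init__(self, nz=100, ngf=64, nc=3):
        super().__init__()
        self.main = nn.Sequential(
            # input: (nz, 1, 1) -> (ngf*8, 4, 4)
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # (ngf*8, 4, 4) -> (ngf*4, 8, 8)
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # (ngf*4, 8, 8) -> (ngf*2, 16, 16)
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # (ngf*2, 16, 16) -> (ngf, 32, 32)
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            # (ngf, 32, 32) -> (nc, 64, 64), squashed into [-1, 1]
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def forward(self, z):
        return self.main(z)
```

Each ConvTranspose2d with kernel 4, stride 2, padding 1 doubles the spatial size, which is how the 1×1 noise vector grows into a 64×64 image.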
We need to pay special attention to the usage of the transposed convolutional layer ConvTranspose2d.
The discriminator of DCGAN is a convolutional neural network consisting of five convolutional layers, used for image classification:
When the discriminator is working, it receives a 64×64 3-channel color image.
After passing through five convolutional layers, it produces a single output indicating whether the input image is real or fake.
The code is as follows:
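A minimal sketch of such a discriminator, again in the standard DCGAN style. The channel width ndf = 64 is a conventional choice assumed here:

```python
import torch
import torch.nn as nn


class Discriminator(nn.Module):
    """DCGAN discriminator: scores a 64x64 RGB image as real (1) or fake (0)."""

    def __init__(self, ndf=64, nc=3):
        super().__init__()
        self.main = nn.Sequential(
            # (nc, 64, 64) -> (ndf, 32, 32)
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # (ndf, 32, 32) -> (ndf*2, 16, 16)
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            # (ndf*2, 16, 16) -> (ndf*4, 8, 8)
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            # (ndf*4, 8, 8) -> (ndf*8, 4, 4)
            nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 8),
            nn.LeakyReLU(0.2, inplace=True),
            # (ndf*8, 4, 4) -> (1, 1, 1): probability the input is real
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.main(x).view(-1)
```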
From the code, we can see that the DCGAN discriminator is a convolutional neural network built from Conv2d layers.
The training process of DCGAN is the same as that of ordinary GANs.
The discriminator adjusts and optimizes its judgment strategy based on real sample data and the data generated by the generator G.
The generator further improves its outputs based on the discriminator D's judgment results.
Ultimately, after multiple rounds of alternating iterations, training balance is achieved.
The code is as follows:
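A sketch of the alternating training loop, wrapped here in a hypothetical train_dcgan function for readability; netG and netD are assumed to be the generator and discriminator models defined earlier, and the learning rate and Adam betas follow common DCGAN defaults rather than values given in the lesson:

```python
import torch
import torch.nn as nn
import torch.optim as optim


def train_dcgan(netG, netD, dataloader, nz=100, epochs=1, lr=0.0002, device="cpu"):
    """Alternate discriminator and generator updates over the dataset."""
    criterion = nn.BCELoss()
    # Two independent optimizers so netG and netD do not interfere
    optimizerD = optim.Adam(netD.parameters(), lr=lr, betas=(0.5, 0.999))
    optimizerG = optim.Adam(netG.parameters(), lr=lr, betas=(0.5, 0.999))
    # Fixed noise vector, reused to visually check generation quality
    fixed_noise = torch.randn(16, nz, 1, 1, device=device)

    for epoch in range(epochs):
        for i, real in enumerate(dataloader):
            real = real.to(device)
            b = real.size(0)

            # ---- Part 1: discriminator iteration ----
            netD.zero_grad()
            loss_real = criterion(netD(real), torch.ones(b, device=device))
            noise = torch.randn(b, nz, 1, 1, device=device)
            fake = netG(noise)
            # detach() so this step does not update the generator
            loss_fake = criterion(netD(fake.detach()), torch.zeros(b, device=device))
            (loss_real + loss_fake).backward()
            optimizerD.step()

            # ---- Part 2: generator iteration ----
            netG.zero_grad()
            # The generator wants the updated D to label its fakes as real
            lossG = criterion(netD(fake), torch.ones(b, device=device))
            lossG.backward()
            optimizerG.step()

            if i % 10 == 0:  # print debugging information every 10 batches
                print(f"epoch {epoch} batch {i} "
                      f"lossD {(loss_real + loss_fake).item():.3f} "
                      f"lossG {lossG.item():.3f}")
    return fixed_noise
```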
First, we use a data transformation object transform to crop and resize the images, convert them to tensors, and normalize them.
Then, we read the anime avatar data using AnimeDataset. Next, we use DataLoader to read and process the data.
Then we define the models, including the generator netG and the discriminator netD.
Optimizing the generator and discriminator requires two different optimizers optimizerG and optimizerD.
They will optimize netG and netD independently, without interference.
During the iterative process of the GAN, the iterations of the discriminator and generator alternate.
The first part is the iteration of the discriminator.
The second part is the iteration of the generator.
Debugging information is printed every 10 batches.
We can use a fixed random vector fixed_noise to pass into the current generator netG to generate fake images for effect testing.
By observing the generated fake images, we can see that as iterations progress, the quality of data generation improves.
After training the generator, we can use netG to generate images.
Here we need to write a new Python file.
In the main function, we define the noise vector dimension and the number of channels, and pass them to Generator to create netG.
Then we load the trained generator weights into netG and set it to evaluation mode.
Next, define a random noise fixed_noise and pass it to netG to generate fake images fake.
Finally, call save_image to save the images.
So, that’s it for the Generative Adversarial Network and creating anime avatars. Thank you for watching, and see you in the next lesson.