This article shares a recent literature review on super-resolution titled ‘Generative Adversarial Networks for Image Super-Resolution: A Survey’. It discusses the performance, advantages, disadvantages, complexity, challenges, and potential research points of 193 related papers.
Specific information is as follows:
- Authors: Tian Chunwei, Zhang Xuanyu, Lin Junwei, Zuo Wangmeng, Zhang Yanning
- Affiliations: Northwestern Polytechnical University, National Engineering Laboratory for Integrated Big Data Application Technology in Air, Space and Sea, Western Norway University of Applied Sciences, Harbin Institute of Technology, and Pengcheng Laboratory
- Paper link: http://arxiv.org/abs/2204.13620
Single image super-resolution plays an important role in the field of image processing. Recently, Generative Adversarial Networks (GANs) have achieved outstanding performance on low-resolution images with small samples. However, there is still a lack of a summary of the different GANs used in image super-resolution. This article summarizes the applications of GANs in image super-resolution from different perspectives.
First, the development of GANs is introduced. Second, popular GAN methods for image applications are presented from both small- and large-sample perspectives. Then, the motivations, implementations, and differences of supervised, semi-supervised, and unsupervised training of GANs in image super-resolution are analyzed, based on optimization methods and discriminative learning methods.
Next, these GAN methods are compared regarding their qualitative and quantitative performance in image super-resolution. Finally, the challenges and potential research points of GANs in image super-resolution are provided. The framework of this article is shown in Figure 1:
Figure 1 The framework of this article
For a summary of GAN-related models mentioned in the introduction, see Table 1:
Table 1 Summary of GANs in the introduction
In Chapter Two, a brief introduction to the development of GAN methods is provided, focusing on the original GAN network model and several classic variants, including BiGAN, CycleGAN, and StyleGAN. Their network structures are shown in Figures 2 to 6:
Figure 2 Network structure of the original GAN
Figure 3 Network structure of BiGAN
Figure 4 Network structure of CycleGAN
Figure 5 Network structure of StyleGAN
Figure 6 Network structure of the generator in StyleGAN
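All of the variants above build on the original GAN's adversarial objective: the discriminator D maximizes log D(x) + log(1 - D(G(z))) while the generator G tries to fool it. As a minimal sketch of those two losses (hypothetical helper names; pure Python, using the standard non-saturating generator loss rather than the minimax form):

```python
import math

def discriminator_loss(d_real, d_fake):
    """Original-GAN discriminator loss, averaged over a batch:
    -(E[log D(x)] + E[log(1 - D(G(z)))])."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)

def generator_loss(d_fake):
    """Non-saturating generator loss: -E[log D(G(z))]."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# At the theoretical equilibrium D outputs 0.5 everywhere,
# so the discriminator loss settles at 2*ln(2) ≈ 1.386.
print(discriminator_loss([0.5], [0.5]))  # ≈ 1.3863
print(generator_loss([0.5]))             # ≈ 0.6931
```

In practice each training step alternates: update D on a batch of real and generated images, then update G through the frozen discriminator; this alternation is also the source of the training instability discussed later in the survey.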
Popular GANs for Image Applications
In this chapter, popular GAN network models that are frequently applied are introduced from the perspectives of small and large samples. The framework of this chapter is shown in Figure 7:
Figure 7 The framework of Chapter Three
3.1 Popular GANs Based on Large Samples for Image Applications
In this section, models such as BEGAN, MGAN, SGAN, Perceptual GAN, and SOD-MTGAN are introduced in two directions: image generation and object detection. More related information can be found in Tables 2 and 3:
Table 2 Large sample-based GANs applied in image generation
Table 3 Large sample-based GANs applied in object detection
3.2 Popular GANs Based on Small Samples for Image Applications
In this section, models such as RAMT-GAN, CycleGAN, ArCycleGAN, Perceptual GAN, and SOD-MTGAN are introduced in two directions: style transfer and image restoration. More related information can be found in Tables 4 and 5:
Table 4 Small sample-based GANs applied in style transfer
Table 5 Small sample-based GANs applied in image restoration
GANs Applied in Image Super-Resolution
In this chapter, a review of some GAN models currently applied in the field of image super-resolution is conducted based on supervised, semi-supervised, and unsupervised classifications. The framework of this chapter is shown in Figure 8:
Figure 8 The framework of Chapter Four
4.1 Supervised GANs for Image Super-Resolution
In this section, supervised GANs are classified and reviewed from four aspects: improved structure, prior knowledge-based, improved loss functions, and multi-tasking. Details of the improved structure can be found in Table 6; more information regarding prior knowledge, improved loss functions, and multi-tasking can be found in Table 7:
Table 6 Supervised GANs for image super-resolution based on improved structures
Table 7 Supervised GANs for image super-resolution based on prior knowledge, improved loss functions, and multi-tasking
4.2 Semi-Supervised GANs for Image Super-Resolution
In this section, semi-supervised GANs are classified and reviewed from three aspects: improved structure, improved loss functions, and multi-tasking. More detailed content can be found in Table 8:
Table 8 Semi-Supervised GANs for Image Super-Resolution
4.3 Unsupervised GANs for Image Super-Resolution
In this section, unsupervised GANs are classified and reviewed from four aspects: improved structure, prior knowledge-based, improved loss functions, and multi-tasking. More details regarding the improved structure can be found in Table 9; further information on prior knowledge and improved loss functions can be found in Table 10; and more on multi-tasking can be found in Table 11:
Table 9 Unsupervised GANs for image super-resolution based on improved structures
Table 10 Unsupervised GANs for image super-resolution based on prior knowledge and improved loss functions
Table 11 Unsupervised GANs for image super-resolution based on multi-tasking
Comparison of GANs’ Performance in Image Super-Resolution
To help readers better understand the applications of GANs in the super-resolution field, this chapter presents the datasets and environmental configurations used by different models and compares the super-resolution results of different models qualitatively and quantitatively.
5.1 Datasets
In this section, a classification review of the datasets used by all models mentioned in the text is conducted from the perspectives of training methods and model usage. More content classified by training methods can be found in Table 12; more content classified by model usage can be found in Table 13:
Table 12 Datasets used for image super-resolution GANs (classified by training methods; divided into training datasets and testing datasets)
Table 13 Datasets used by various GANs for image super-resolution (classified by model usage; divided into training datasets and testing datasets)
5.2 Environmental Configurations
In this section, a classification review of the environmental configurations of most models mentioned in the text is conducted from the perspectives of training methods, batch size, scaling factors, training frameworks, learning rates, and iteration counts. More content can be found in Table 14:
Table 14 Environmental configurations of different GANs for image super-resolution
5.3 Performance Comparison
To let readers understand the performance of different GANs in image super-resolution, both quantitative and qualitative analyses are used to evaluate their super-resolution results. The Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity (SSIM) of different methods are analyzed quantitatively across the three training paradigms and different datasets (see Table 15); the running time and complexity of different algorithms in image super-resolution are also presented (see Table 16). Qualitative analysis is used to evaluate the visual quality of the restored images, as shown in Figures 9 and 10.
Table 15 PSNR and SSIM scores of different GAN models for image super-resolution classified by three training methods on Set14, BSD100, and DIV2K
Table 16 Testing time and parameter count of different GAN models during 4x image super-resolution
Figure 9 Visual results of different GANs for 4x super-resolution on the BSD100 dataset. The methods for obtaining each image are as follows: (a) Original image; (b) Bicubic interpolation; (c) ESRGAN; (d) RankSRGAN; (e) KernelGAN; (f) PathSRGAN
Figure 10 Visual results of different GANs for 4x super-resolution on the Set14 dataset. The methods for obtaining each image are as follows: (a) Original image; (b) Bicubic interpolation; (c) ESRGAN; (d) RankSRGAN; (e) KernelGAN; (f) PathSRGAN
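For reference, the two quantitative metrics used throughout Table 15 can be sketched in plain Python. This is a simplified, single-window SSIM computed over the whole image with the standard constants; the usual implementation (e.g., in scikit-image) instead averages SSIM over local Gaussian-weighted windows:

```python
import math

def psnr(x, y, max_val=255.0):
    """Peak Signal-to-Noise Ratio between two equal-length pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)

def ssim_global(x, y, max_val=255.0):
    """SSIM with a single window spanning the whole image."""
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

ref = [10, 20, 30, 40]
deg = [p + 10 for p in ref]          # uniform error of 10 → MSE = 100
print(round(psnr(ref, deg), 2))      # 28.13
print(ssim_global(ref, ref))         # 1.0 for identical images
```

PSNR depends only on the mean squared error, while SSIM also compares luminance, contrast, and structure, which is why the survey notes that neither alone fully captures perceptual quality.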
Challenges and Prospects of GAN Models in Image Super-Resolution
This chapter summarizes some challenges and issues found during the work:
- Unstable training. Due to the adversarial game between the generator and the discriminator, GANs are unstable during training.
- High computational resource and memory consumption. GANs consist of a generator and a discriminator, which may increase computational cost and memory consumption, placing higher demands on digital devices.
- No-reference high-quality images. Most existing GANs rely on paired high-quality and low-resolution images to train image super-resolution models, and such pairs may be hard to obtain from digital devices in the real world.
- Complex image super-resolution. Most GANs can only handle a single task, such as image super-resolution or synthetic-noise image super-resolution. However, images captured by digital cameras in the real world suffer from combined degradations, such as low resolution, poor lighting, and complex noise. Additionally, there are higher requirements for combining low-resolution images with image recognition. Therefore, existing image super-resolution GANs cannot effectively restore low-resolution images under these conditions.
- Metrics for GANs used in image super-resolution. Most existing GANs use PSNR and SSIM to test super-resolution performance. However, PSNR and SSIM cannot fully measure the quality of the restored images. Therefore, finding effective metrics for GANs in image super-resolution is crucial.
To address these issues, the following potential research points regarding GANs for image super-resolution are introduced.
- Enhance the generator and discriminator to extract salient features, thereby improving the stability of GANs in image super-resolution. For example, attention mechanisms (e.g., Transformers), residual learning, and concatenation operations can be used in the generator and discriminator to extract more effective features and improve the stability of fast image super-resolution GAN models.
- Design lightweight GANs for image super-resolution. Reducing the number of convolutional kernels, using group convolutions, exploiting priors, and adopting shallow network structures can lower the complexity of image super-resolution models.
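As a rough sketch of why group convolutions help here, the hypothetical helper below counts the weights of a 2D convolution layer: splitting the channels into g groups divides the weight count by g, since each group only connects its own slice of input channels to its slice of output channels.

```python
def conv_params(in_ch, out_ch, k, groups=1):
    """Weight count of a k x k 2D convolution (bias ignored).
    Each of the `groups` groups maps in_ch/groups input channels
    to out_ch/groups output channels."""
    assert in_ch % groups == 0 and out_ch % groups == 0
    return groups * (in_ch // groups) * (out_ch // groups) * k * k

standard = conv_params(64, 64, 3)            # 64*64*9 = 36864 weights
grouped  = conv_params(64, 64, 3, groups=8)  # 4608 weights: an 8x reduction
print(standard, grouped)
```

The same arithmetic is why depthwise convolutions (groups equal to the channel count) are a common building block of lightweight generators.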
- Use self-supervised methods to obtain high-quality reference images.
- Combine the attributes of different low-level tasks to decompose a complex low-level task into single low-level tasks handled by different GANs in separate stages, which helps restore complex low-resolution images and supports high-level vision tasks.
- Use image quality assessment techniques as metrics to evaluate the quality of the restored images in different regions.