From: Open Source Frontline (ID: OpenSourceTop)
Compiled from:
https://github.com/yuanxiaosc/DeepNude-an-Image-to-Image-technology, programmer community reports, etc.
A while ago, a programmer developed an application called DeepNude, previously covered in “Is Technology Innocent? The AI Undressing App Was Taken Offline Just Hours After Launch.”
The app is very simple to use: open the software → transform → generate photo. Judging from the results posted online, the output can indeed pass for the real thing, as shown below:
The program has since been taken offline, but a programmer found a GitHub project, created by yuanxiaosc, that covers the image generation and image inpainting technology used by DeepNude.
This repository contains the pix2pixHD algorithm (proposed by NVIDIA) that DeepNude relies on and, more importantly, the general Image-to-Image theory and practical research behind it.
Image-to-Image Demo
This section provides a hands-on Image-to-Image demo: from black-and-white sketches to colorful cats, shoes, and handbags. The DeepNude software mainly uses Image-to-Image technology, which in theory can convert the input image into any image you want.
Draw a simple sketch of a cat in the left box, then click the process button to output a model-generated cat.
Try it here: https://affinelayer.com/pixsrv/
DeepNude’s Technology Stack
- Python + PyQt
- PyTorch
- Deep computer vision
Image-to-Image Theoretical Research
This section covers the AI/deep learning research, especially in computer vision, that underlies DeepNude:
1. Pix2Pix
Pix2Pix is a general solution for image-to-image translation problems using conditional adversarial networks proposed by the University of California, Berkeley. (GitHub address: https://github.com/phillipi/pix2pix)
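To make the idea concrete, below is a minimal sketch of the pix2pix training objective in PyTorch: an adversarial term that conditions the discriminator on the input image, plus an L1 term that keeps the output close to the paired target. The `generator` and `discriminator` modules are assumed placeholders, not code from the official repository.

```python
# Minimal sketch of the pix2pix objective (conditional GAN + L1).
# `generator` and `discriminator` are hypothetical nn.Module placeholders.
import torch
import torch.nn as nn

adv_loss = nn.BCEWithLogitsLoss()  # adversarial loss on discriminator logits
l1_loss = nn.L1Loss()              # reconstruction loss toward the paired target
LAMBDA = 100.0                     # L1 weight reported in the pix2pix paper

def generator_step(generator, discriminator, x, y):
    """x: input image (e.g. a sketch), y: its paired target image."""
    fake = generator(x)
    # The discriminator sees (input, output) pairs, so it is conditioned on x.
    logits = discriminator(torch.cat([x, fake], dim=1))
    g_adv = adv_loss(logits, torch.ones_like(logits))  # try to fool the discriminator
    g_l1 = l1_loss(fake, y)                            # stay close to the target
    return g_adv + LAMBDA * g_l1

def discriminator_step(generator, discriminator, x, y):
    fake = generator(x).detach()  # do not backprop into the generator here
    real_logits = discriminator(torch.cat([x, y], dim=1))
    fake_logits = discriminator(torch.cat([x, fake], dim=1))
    return 0.5 * (adv_loss(real_logits, torch.ones_like(real_logits))
                  + adv_loss(fake_logits, torch.zeros_like(fake_logits)))
```

In the paper, the discriminator is a PatchGAN that classifies local image patches rather than the whole image, which encourages sharp high-frequency detail.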
2. Pix2PixHD
Pix2PixHD generates high-resolution images from semantic maps. A semantic map is a color image in which different color blocks represent different classes of objects, such as pedestrians, cars, traffic signs, and buildings. Pix2PixHD takes a semantic map as input and produces a high-resolution, photorealistic image from it. Earlier methods mostly produced rough, low-resolution images that looked unrealistic, whereas this research generates images at 2048×1024 resolution, very close to full-HD photos. (GitHub address: https://github.com/NVIDIA/pix2pixHD)
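A key idea in pix2pixHD is a coarse-to-fine generator: a global network is trained at reduced resolution, and a local enhancer is then wrapped around it to add high-resolution detail. The sketch below illustrates that structure under simplifying assumptions (the paper fuses intermediate feature maps by element-wise sum; here the images are simply concatenated for brevity), and the module names are illustrative, not NVIDIA's actual code.

```python
# Simplified sketch of a coarse-to-fine generator in the spirit of pix2pixHD.
# `global_net` and `enhancer` are hypothetical sub-networks.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoarseToFine(nn.Module):
    def __init__(self, global_net: nn.Module, enhancer: nn.Module):
        super().__init__()
        self.global_net = global_net  # trained first, on downsampled inputs
        self.enhancer = enhancer      # then added to produce full-resolution output

    def forward(self, semantic_map):
        # Run the global generator on a 2x-downsampled semantic map.
        low = F.interpolate(semantic_map, scale_factor=0.5, mode="nearest")
        coarse = self.global_net(low)
        # Upsample the coarse image and refine it at full resolution.
        coarse_up = F.interpolate(coarse, scale_factor=2.0,
                                  mode="bilinear", align_corners=False)
        return self.enhancer(torch.cat([semantic_map, coarse_up], dim=1))
```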
3. CycleGAN
CycleGAN uses a cycle-consistency loss function for training without paired data. In other words, it can convert from one domain to another without one-to-one mapping between the source and target domains. This opens up the possibility of performing many interesting tasks, such as photo enhancement, image coloring, style transfer, etc. You only need source and target datasets.
The CycleGAN model can be used to achieve four main functions: photo style transfer, photo effect enhancement, seasonal transformation of landscapes, and object transfiguration. A sketch of the loss that makes this possible follows.
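Concretely, the cycle-consistency idea can be written down in a few lines: with two generators G: X→Y and F: Y→X, translating an image to the other domain and back should reconstruct the original. The code below is an illustrative PyTorch sketch following the paper's notation (including the weight lam=10), not the official implementation.

```python
# Minimal sketch of CycleGAN's cycle-consistency loss.
# G and F are hypothetical generator modules for X->Y and Y->X.
import torch.nn as nn

l1 = nn.L1Loss()

def cycle_consistency_loss(G, F, x, y, lam=10.0):
    """x, y are unpaired batches from domains X and Y."""
    forward = l1(F(G(x)), x)   # x -> G(x) -> F(G(x)) should reconstruct x
    backward = l1(G(F(y)), y)  # y -> F(y) -> G(F(y)) should reconstruct y
    return lam * (forward + backward)
```

This term is added to the usual adversarial losses for G and F; it is what lets CycleGAN train without one-to-one paired examples.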
4. Image Inpainting
In the demonstration video, you simply brush away unwanted content in the image with a tool; even if the erased region is very irregular, NVIDIA’s model can “restore” the image, filling the blank with highly realistic content. It amounts to one-click photo editing “without any traces of editing.” The research comes from Guilin Liu’s team at NVIDIA, who released a deep learning method that can edit images or reconstruct damaged ones, even when the image has holes or missing pixels. It was the state-of-the-art method as of 2018.
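The underlying paper, “Image Inpainting for Irregular Holes Using Partial Convolutions,” renormalizes each convolution window by the fraction of valid (non-hole) pixels and shrinks the hole mask layer by layer. Below is a simplified sketch of that idea with a single-channel mask; it is an assumption-laden illustration, not NVIDIA’s released code.

```python
# Simplified sketch of a partial convolution: convolve only valid pixels,
# rescale by the share of valid pixels per window, and update the mask.
# Single-channel mask for brevity; not NVIDIA's released implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PartialConv2d(nn.Conv2d):
    def forward(self, x, mask):
        # mask: (N, 1, H, W), 1 for valid pixels and 0 for holes.
        with torch.no_grad():
            ones = torch.ones(1, 1, *self.kernel_size, device=x.device)
            valid = F.conv2d(mask, ones, stride=self.stride, padding=self.padding)
            scale = (self.kernel_size[0] * self.kernel_size[1]) / valid.clamp(min=1.0)
            new_mask = (valid > 0).float()  # hole shrinks where any valid pixel was seen
        out = super().forward(x * mask)     # zero out hole pixels before convolving
        if self.bias is not None:
            b = self.bias.view(1, -1, 1, 1)
            out = (out - b) * scale + b     # renormalize, then re-apply the bias
        else:
            out = out * scale
        return out * new_mask, new_mask     # zero the output where no valid input existed
```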
In fact, Image-to-Image is not strictly necessary: GANs can also generate images directly from random noise, or from text, as the sketch below and the two examples that follow show.
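As a minimal illustration of the noise-to-image path, here is a DCGAN-style generator sketch in PyTorch; the layer sizes and the 100-dimensional latent code are illustrative choices, not taken from any specific paper mentioned here.

```python
# Minimal sketch: generate a 64x64 RGB image purely from random noise.
import torch
import torch.nn as nn

generator = nn.Sequential(
    # Progressively upsample a 100-d noise vector into a 64x64 image.
    nn.ConvTranspose2d(100, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(),  # 4x4
    nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(),  # 8x8
    nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(),    # 16x16
    nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(),     # 32x32
    nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh(),                          # 64x64
)

z = torch.randn(1, 100, 1, 1)  # random latent code
image = generator(z)           # shape (1, 3, 64, 64), values in [-1, 1]
```

Text-to-image models such as the two below replace or augment the noise vector with an encoding of the input sentence.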
1. Obj-GAN
A new AI technology called Obj-GAN, developed by Microsoft Research AI, can understand a natural language description, draw a sketch layout, and synthesize an image, then refine the details based on the sketch layout and the individual words of the text. In other words, the network can generate an image of an everyday scene from its textual description.
[Figure: Obj-GAN effects]
[Figure: Obj-GAN model architecture]
2. StoryGAN
Alongside Obj-GAN, Microsoft also proposed a GAN that can draw stories: StoryGAN, which outputs a comic-strip-like sequence of images from the text of a story.
Current state-of-the-art text-to-image models can generate realistic bird images from a single-sentence description, but text-to-image generation goes far beyond one image per sentence. Given a multi-sentence paragraph, StoryGAN generates a series of images, one per sentence, visualizing the entire story.
[Figure: StoryGAN effects]
The most widely used Image-to-Image technology today is probably found in beauty apps, so why not develop a smarter beauty camera?
Technology itself is innocent, but we should not amuse ourselves to death. It is important to understand what can and cannot be done, and I hope everyone uses these technologies in the right way.