Introduction

The GANpaint tool developed by MIT allows for easy photo editing, enabling users to upload any photo for editing without damaging its original details. In addition to helping artists and designers quickly adjust visual effects, researchers say this work may help computer scientists identify “fake” images.

While we are still marveling at GauGAN, which can turn sketches into landscape photos with just a few strokes, MIT has brought out a similarly impressive technology.

Since their advent, GANs have produced one surprise after another, from texture filling to face swapping. This time, a tool called GANpaint can edit images with just a few brush strokes.

Teams from MIT and IBM launched “GANpaint Studio,” which lets users upload any photo and then edit it, for example by changing the size of objects or adding entirely new items (such as trees and buildings), while the system automatically generates a realistic photographic result.

In addition to helping artists and designers quickly adjust visual effects, researchers say this work may also help computer scientists identify “fake” images.

No wonder netizens exclaimed: “In the future, we won’t need to work hard to demolish buildings!”

So how does this magical tool operate? Here is the link:

http://gandissect.res.ibm.com/ganpaint.html?project=churchoutdoor&layer=layer4

First, select the base image at the bottom of the interface; second, choose an object from the left, such as “grass,” “door,” “sky,” etc.; then select the operation to be performed, such as “paint,” “erase,” etc.; finally, just brush over the area in the image you want to change.

“Current machine learning systems are black boxes that we don’t always know how to improve, somewhat like an old TV whose picture you fixed as a child by hitting it,” said David Bau, lead author of the paper and a PhD student in computer science and artificial intelligence at MIT. He described the project as the first time computer scientists have actually been able to “paint with the neurons” of a neural network.

A Blessing for Designers

The project leader, Professor Antonio Torralba from MIT, believes the project has enormous potential applications. Designers and artists can use it to adjust visual effects more quickly. Imagine a director finishing a shoot only to find a critical prop missing from the background! At this point, GANpaint Studio can showcase its capabilities.

By zeroing in on the “artifact” units that need to be removed, GANpaint Studio can also be used to improve and debug other GANs currently under development. Additionally, it can help researchers better understand neural networks and their underlying structures.

An unexpected discovery is that the system seems to have learned some simple rules about the relationships between objects. In some way, it knows where objects belong and where they do not (for example, a window should not appear in the sky).

In different environments, it can create different visual effects. For example, suppose there are two different buildings in the image, and we need to add two doors to these buildings. GANpaint Studio will not crudely copy two identical doors but will add two different doors.

“All drawing applications will follow user instructions, but ours may decide not to draw anything if the user asks it to place an object in an unreasonable position,” Torralba said. “This is a drawing tool with a strong personality; it opens a window for us to understand how GANs learn to represent the visual world.”

“This work is indeed exciting, allowing us to see how GANs learn things directly, and it starts to look a bit like common sense cognition,” said Jaakko Lehtinen, an associate professor at Aalto University in Finland. “I believe this ability is a key foundation for having autonomous systems that can function in the human world, which is infinite, complex, and constantly changing.”

Identifying Fake Images

The team’s goal is to give people better control over GANs. However, they recognize that as performance improves, so does the potential for abuse; imagine how frightening it would be if these techniques were used on medical photos! Co-author Zhu Junyan believes that a better understanding of GANs, including the mistakes they make, will help researchers detect and weed out fake photos.

“You need to understand your opponent first to defend against it,” said Zhu Junyan, a postdoctoral researcher at CSAIL. “This understanding may help us detect fake images more easily.”

To develop the system, the research team first identified the units within GANs associated with specific types of objects (such as trees). Then, they tested these units individually to see if removing them would cause certain objects to disappear or appear. Importantly, they also identified the units that caused visual errors (artifacts) and worked to remove them to improve the overall quality of the images.
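The unit-removal test described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that one generator layer's output is available as a NumPy array of shape (units, height, width) and that "removing" a unit means setting its whole channel to zero; the function name `ablate_units` and the toy data are hypothetical and are not taken from the GANpaint code.

```python
import numpy as np

def ablate_units(features, unit_ids):
    """Return a copy of a (units, height, width) feature map with the chosen units zeroed."""
    out = features.copy()          # leave the original activations untouched
    out[list(unit_ids)] = 0.0      # switch off every spatial location of each selected unit
    return out

# Hypothetical layer output: 8 units on a 4x4 grid.
rng = np.random.default_rng(0)
features = rng.random((8, 4, 4)) + 0.1
ablated = ablate_units(features, [2, 5])
```

In a real generator, the modified feature map would then be passed through the remaining layers to see whether the associated object disappears from the output image.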

Hendrik Strobelt, a research scientist at IBM, said, “Whenever GANs produce very unrealistic images, the reasons for these errors used to be a mystery. But now we find that these errors are triggered by specific neurons, so we can turn off those neurons first to improve the quality of the images.”

“This system opens a door for better understanding GAN models, which will help us conduct any type of research on GANs,” Lehtinen said.

What Makes GANpaint So Magical?

Generative Adversarial Networks (GANs) have recently achieved impressive results in many practical applications, and with improvements in sample quality and training stability, many variants of GANs have emerged. However, the visualization and understanding of GANs are largely missing.

Therefore, researchers from MIT, The Chinese University of Hong Kong, IBM, and others, including David Bau, Zhu Junyan, Joshua B. Tenenbaum, and Zhou Bolei, published a paper proposing the GANpaint tool. They will present their work at the SIGGRAPH conference in Los Angeles next month.

In this work, the researchers proposed an analytical framework to visualize and understand GANs at the unit, object, and scene levels:

First, identify a set of interpretable units closely related to object concepts using a segmentation-based network dissection method;
Then, quantify the causal effects of interpretable units by measuring their “intervention” capabilities;
Finally, examine the relationships between these units and their surrounding environments by inserting the discovered object concepts into new images.
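As a rough illustration of the first step, the agreement between a unit and an object concept can be scored as the intersection-over-union (IoU) between the region where the unit fires strongly and the region a segmentation model labels as that object. The sketch below is a simplified single-image version under stated assumptions: the activation map is assumed to be already resized to the segmentation resolution, and the threshold value is an arbitrary choice for the example. The name `unit_iou` is hypothetical.

```python
import numpy as np

def unit_iou(activation, class_mask, threshold):
    """IoU between the area where a unit exceeds the threshold and a class segmentation mask."""
    fired = activation > threshold
    mask = class_mask.astype(bool)
    union = np.logical_or(fired, mask).sum()
    if union == 0:
        return 0.0                 # neither the unit nor the class is present
    return float(np.logical_and(fired, mask).sum() / union)

# Toy 3x3 example: the unit fires on three of the four "tree" pixels.
activation = np.array([[0.9, 0.8, 0.1],
                       [0.7, 0.2, 0.0],
                       [0.1, 0.0, 0.0]])
tree_mask = np.array([[1, 1, 0],
                      [1, 1, 0],
                      [0, 0, 0]])
score = unit_iou(activation, tree_mask, threshold=0.5)
```

A unit whose score for "tree" is much higher than for any other class would be a candidate "tree unit"; in practice such a score would be aggregated over many generated images rather than one.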

The researchers’ primary goal is to analyze how objects such as “trees” are encoded by the internal representations of a GAN generator:

Characterizing units through dissection

Using intervention to measure causality

Experimental Results

The researchers studied three variants of Progressive GAN trained on the LSUN scene dataset. To segment the generated images, they used a recent model (Xiao et al., 2018) trained on the ADE20K scene dataset.

The model can segment input images into 336 object classes, 29 parts of large objects, and 25 material classes. To further identify units that specialize in object parts, each object class c is expanded into additional object part classes ct, cb, cl, and cr, representing the top, bottom, left, or right half of the bounding box of a connected component.
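The part-class expansion can be sketched as follows. This is only one possible reading of the description: it assumes the input is the binary mask of a single connected component, splits the bounding box at its integer midpoint, and keeps only pixels that belong to the component. The function name `part_masks` and the midpoint rounding are assumptions made for this example.

```python
import numpy as np

def part_masks(component_mask):
    """Split one connected component into top, bottom, left and right halves of its bounding box."""
    mask = component_mask.astype(bool)
    rows = np.flatnonzero(mask.any(axis=1))   # rows that contain the component
    cols = np.flatnonzero(mask.any(axis=0))   # columns that contain the component
    if rows.size == 0:
        raise ValueError("component mask is empty")
    mid_row = (rows[0] + rows[-1] + 1) // 2   # horizontal split line of the bounding box
    mid_col = (cols[0] + cols[-1] + 1) // 2   # vertical split line of the bounding box
    row_idx = np.arange(mask.shape[0])[:, None]
    col_idx = np.arange(mask.shape[1])[None, :]
    return {
        "t": mask & (row_idx < mid_row),
        "b": mask & (row_idx >= mid_row),
        "l": mask & (col_idx < mid_col),
        "r": mask & (col_idx >= mid_col),
    }

# Toy component: a 4x4 block inside a 4x6 image.
component = np.zeros((4, 6), dtype=int)
component[0:4, 1:5] = 1
parts = part_masks(component)
```

Note that the four parts overlap by construction: every pixel belongs to one of the top or bottom halves and to one of the left or right halves.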

Diagnosing and Improving GANs

The framework can reveal how GANs successfully generate real images, as well as analyze the reasons for failures.

Figure 8a displays several annotated units responsible for visual artifacts in GAN results; Figure 8b shows that the artifacts have been successfully removed while the artifact-free pixels remain unchanged, improving the generated result.

Figure 8: (a) Examples of units responsible for visual artifacts in GAN results. There are a total of 20 units. By ablating these units, we can fix the artifacts in (b) and significantly improve the visual quality in (c).

Locating Causal Units through Ablation

Figure 9: Measuring the effects of ablating units in a GAN trained on conference room images.

As shown in Figure 10, while it is easy to remove windows in conference room images, it is more challenging to delete windows in other scenes.

Figure 10: Comparing the effects of ablating 20 window units in GANs trained on five scene categories.
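One simple way to put a number on such a comparison is to measure how much of the image a segmentation model labels as "window" before and after ablation. The sketch below is an illustrative measure under stated assumptions, not necessarily the exact metric used in the paper: it takes two lists of binary masks (hypothetical segmentation outputs for images generated without and with ablation) and returns the drop in the mean fraction of pixels labeled as the class.

```python
import numpy as np

def ablation_effect(masks_before, masks_after):
    """Mean fraction of class pixels before ablation minus the same fraction after ablation."""
    before = np.mean([m.astype(bool).mean() for m in masks_before])
    after = np.mean([m.astype(bool).mean() for m in masks_after])
    return float(before - after)

# Toy masks for two generated images (1 = pixel labeled "window").
masks_before = [np.array([[1, 1], [0, 0]]), np.array([[1, 0], [0, 0]])]
masks_after = [np.array([[0, 0], [0, 0]]), np.array([[1, 0], [0, 0]])]
effect = ablation_effect(masks_before, masks_after)
```

A value near zero would indicate that ablating the units barely changes how much of the class appears, which matches the observation that windows are harder to remove in some scene types than in others.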

Characterizing Contextual Relationships through Insertion

We can also learn how a GAN operates by forcing units on and inserting the corresponding features at specific locations in a scene. Figure 11 shows the effect of inserting 20 door units in the church scene.

Figure 11: Inserting door units by setting 20 causal units to fixed high values at a pixel in the representation.
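The insertion described in the Figure 11 caption can be sketched in a few lines. As before, this assumes a layer output stored as a NumPy array of shape (units, height, width); the function name `insert_units`, the unit indices, and the value 10.0 are hypothetical choices for the example, since the article does not state what "fixed high value" is used.

```python
import numpy as np

def insert_units(features, unit_ids, row, col, value):
    """Return a copy of a (units, height, width) feature map with the chosen units forced to `value` at one location."""
    out = features.copy()
    out[list(unit_ids), row, col] = value   # force only the selected units at this pixel
    return out

# Hypothetical layer output: 8 units on a 4x4 grid, all initially inactive.
features = np.zeros((8, 4, 4))
edited = insert_units(features, [1, 6], row=2, col=3, value=10.0)
```

Whether a door actually appears in the generated image then depends on the rest of the network, which is how the surrounding context can accept or reject the inserted object.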

Reference Links:

MIT:

http://news.mit.edu/2019/teaching-artificial-intelligence-to-create-more-common-sense-visuals-0701

Paper:

https://arxiv.org/pdf/1811.10597v1.pdf

Editor: Liu Yangke
