High-Precision Image Editing with AI: EditGAN


Author: Nathan Horrocks

Translation: Gabriel Ng
Proofreading: Zhang Ruiyi




This article is about 3,300 words long; a 5-minute read is suggested.
This article introduces how to use AI for high-precision image editing.




Editing photos of cats, cars, and even antique paintings has never been easier, thanks to a generative adversarial network (GAN) model called EditGAN. The work comes from researchers at NVIDIA, the University of Toronto, and MIT, and builds on DatasetGAN, an AI vision model that can be trained with as few as 16 manually annotated images yet performs as well as methods requiring 100 times more annotated data. EditGAN combines the strengths of these earlier models and lets users edit or manipulate an image with simple commands, such as sketching, without degrading the quality of the original image.
What is EditGAN?
According to the paper (https://arxiv.org/pdf/2111.03186.pdf), EditGAN is the first GAN-driven image editing framework that simultaneously offers the following features:
  • Extremely high-precision editing;

  • Requires very little annotated training data (and no external classifiers);

  • Can run and interact in real time;

  • Allows direct combination of multiple edits;

  • Can handle real images embedded in the GAN's latent space, GAN-generated images, and even out-of-domain images.

The model learns editing vectors that can be applied to an image interactively. In essence, it forms an intuitive understanding of a set of images and their content, which users can steer toward specific modifications and edits. Because the model learns from sets of similar images, it can recognize the different components and parts of the objects within them. A user can exploit this to direct modifications to particular regions or edit within specific areas, and the model's precision ensures the image is not distorted outside the boundaries the user sets.
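The mechanics of an editing vector can be sketched in a few lines. This is an illustrative toy, not EditGAN's actual code: the names and dimensions are assumptions, and the vectors here are random stand-ins for learned edits. The key idea it demonstrates is that an edit is just an offset in the generator's latent space, so scaling an offset controls edit strength and adding several offsets combines edits.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM = 512  # typical GAN latent size; an assumption here

w = rng.standard_normal(LATENT_DIM)            # latent code of the image to edit
delta_smile = rng.standard_normal(LATENT_DIM)  # stand-in for a learned "add smile" vector
delta_gaze = rng.standard_normal(LATENT_DIM)   # stand-in for a learned "shift gaze" vector

def apply_edits(w, edits):
    """Combine multiple (vector, strength) edits by simple addition in latent space."""
    w_edited = w.copy()
    for delta, alpha in edits:
        w_edited = w_edited + alpha * delta
    return w_edited

# A mild smile plus a stronger gaze change, composed in one step;
# the edited latent would then be fed back through the generator.
w_out = apply_edits(w, [(delta_smile, 0.5), (delta_gaze, 1.2)])
```

Because the combination is a simple sum, the same learned vectors can be reused on other images' latent codes, which is what lets one edit transfer across a whole set of pictures.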


Fig 1: EditGAN in action. The trained model allows users to make changes to the original image, sometimes dramatic ones.

“The framework allows us to learn an arbitrary number of editing vectors, which can then be applied directly to other images at interactive rates,” the researchers explained in their study. “We show that EditGAN can manipulate images with a level of detail and freedom previously unattainable, while preserving full image quality. We can also easily combine multiple edits and perform plausible edits beyond those seen in EditGAN’s training data. We demonstrate EditGAN on a wide variety of image types and quantitatively show that it outperforms several previous editing methods on standard editing benchmark tasks.”

From adding smiles, changing the direction of a person’s gaze, and creating a new hairstyle, to giving a car a better set of wheels, the researchers demonstrate how much fine-grained detail the model can extract with minimal data annotation. Users sketch a rough outline or mask over the parts they want to edit, guiding the model toward the intended change, such as enlarging a cat’s ears or making a car’s headlights sleeker. The AI then renders the edited image with high accuracy while preserving the quality of the original. Afterwards, the same edits can be applied to other images.


Fig 2: An example of assigning pixels to different parts of the image. The AI identifies these parts and edits them based on human input.

How Does This GAN Work?
EditGAN assigns each pixel of the image to a category, such as tire, windshield, or frame. These pixels are controlled within the AI’s latent space and adjusted according to user input, giving users easy, flexible control over each category’s region. EditGAN modifies only the pixels relevant to the requested change. Because the AI learns what each pixel means from the other images in its training set, you cannot, say, add cat ears to an image of a car and expect a sensible result. But when the edit is within the model’s domain, EditGAN performs exceptionally well, producing outstanding image editing results.
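The idea of changing only the relevant pixels can be made concrete with a small sketch. This is a simplified illustration of a region-restricted editing loss, not EditGAN's official implementation; the function name, the weighting `lam`, and the toy arrays are all assumptions. It shows how an objective can ask the segmentation to match the user's edited mask inside the edit region while penalizing any appearance change outside it, so untouched parts of the image stay fixed.

```python
import numpy as np

def edit_loss(rgb_new, rgb_old, seg_new, seg_target, edit_region, lam=10.0):
    """Toy region-restricted editing objective.

    edit_region: boolean mask of pixels the user wants changed.
    Inside the region, the predicted segmentation should match the
    user's target; outside it, the RGB appearance should stay put.
    """
    inside = edit_region
    outside = ~edit_region
    seg_term = np.mean((seg_new[inside] - seg_target[inside]) ** 2)
    keep_term = np.mean((rgb_new[outside] - rgb_old[outside]) ** 2)
    return seg_term + lam * keep_term

H, W = 8, 8
region = np.zeros((H, W), dtype=bool)
region[2:5, 2:5] = True            # say the user scribbled over the wheels
rgb_old = np.zeros((H, W, 3))      # toy image
seg_target = np.ones((H, W))       # user's edited part-segmentation

# A "perfect" edit: segmentation matches the target inside the region,
# and the RGB values outside the region are unchanged, so the loss is zero.
loss = edit_loss(rgb_old, rgb_old, seg_target, seg_target, region)
```

In the actual method, an objective of this flavor would be minimized over the latent code, which is what yields the reusable editing vectors described above.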


Fig 3: EditGAN can be trained on a wide range of image sets, from animals to environments, allowing it to form a detailed understanding of their content.
The Potential of EditGAN
AI-driven photo and image editing has the potential to streamline the workflows of photographers and content creators and to push creative expression and digital art to new heights. EditGAN also enables novice photographers and editors to produce high-quality content, or the occasional viral meme.
“This AI may change the way we edit, not just for images, but potentially for videos as well. It allows a person to take an image and make changes using simple text commands. If you have a photo with a car and you want the wheels larger, just input ‘make the wheels larger,’ and, poof, you get an image exactly like the photo but with larger wheels.” – Fortune Magazine
EditGAN may also find other important applications in the future. For instance, its editing capabilities could be used to generate large datasets with specific desired characteristics. Such customized datasets can be valuable, for example, for training downstream machine learning models on different computer vision tasks.
Moreover, the EditGAN framework may influence the development of future generations of GANs. While the current version of EditGAN focuses on image editing, similar approaches could potentially be used to edit 3D shapes and objects, a capability that would be highly valuable for creating virtual 3D content for games, movies, or the metaverse.

For more information about this remarkable method, please see the paper: https://arxiv.org/pdf/2111.03186.pdf.

NVIDIA is always at the forefront of technology; visit NVIDIA Research (https://www.nvidia.com/en-us/research/) to learn more about its innovative work.
Original link:
https://developer.nvidia.com/blog/high-precision-image-editing-with-ai-editgan/
Original title:
High-precision Image Editing with AI: EditGAN
Editor: Wang Jing
Proofreader: Gong Li

Translator’s Profile


Gabriel Ng is an undergraduate majoring in probability and statistics at Tsinghua University, passionate about data analysis, language learning, and music. His days revolve around study, fitness, and music. He enjoys exploring the essence of problems through data mining and understanding different cultures through language, and he hopes, through accumulated learning and experience, to analyze problems rationally and understand them from multiple perspectives.

Translation Team Recruitment Information

Job Content: A meticulous mind is needed to translate selected foreign articles into fluent Chinese. If you are an international student in data science, statistics, or a computer-related field, work abroad in a related field, or are confident in your language skills, you are welcome to join the translation team.

What You Can Get: Regular translation training to improve volunteers’ translation skills and deepen their understanding of cutting-edge data science; overseas members can stay connected with technology developments at home. The THU Data Team’s industry-academia-research background also provides good development opportunities for volunteers.

Other Benefits: You will work alongside data science professionals from well-known companies and students from prestigious universities such as Peking University and Tsinghua University, as well as overseas institutions.

Click the “Read Original” at the end of the article to join the Data Team~

Reprint Notice

If you need to reprint, please indicate the author and source prominently at the beginning (Reprinted from: Data Team, ID: DatapiTHU) and place the Data Team’s QR code prominently at the end of the article. For articles marked as original, please send [Article Name – Name and ID of the public account requesting authorization] to the contact email to apply for whitelist authorization, and edit as required.

After publication, please send the link back to the contact email (see below). Unauthorized reprints and adaptations will be pursued legally.

Click “Read Original” to join the organization.