Midjourney’s New Describe Feature: AI Image to Text Conversion

Are prompts generated by AI better than those written by humans?
In today’s world, where various AIGC products are emerging, getting AI to create an image is no longer a challenge.
Despite the advantages of “no need to draw” and a “zero threshold for creation”, many people still struggle to get the artwork they want from AI. The difficulty lies in the prompt.
Prompting is a craft in its own right: without a precise description, the generated results can be quite random and may deviate significantly from expectations. Very few people know how to write an ideal prompt, how to get the model to produce output that matches what the user has in mind, or how to reduce the effort spent on prompts.
As a result, a specialized profession has emerged in the AIGC industry: the prompt engineer. Recently, the San Francisco startup Anthropic advertised a “prompt engineer and librarian” role, offering a salary of up to $335,000.
Perhaps this process can also be handled by AI? Can AI perform better than humans?
Recently, the AI drawing tool Midjourney introduced a new feature: /describe.
“Today, we launched a /describe command that allows you to convert images into text.”
Specifically, Midjourney has learned to reverse-engineer prompts from images. If you upload an image to Midjourney, it will provide four versions of descriptions that you can directly use and adjust to generate the image variations you desire.
Midjourney is an AI drawing tool released in March 2022, which recently launched its fifth version. Its striking results attracted such a large influx of users that the company closed its free registration channels.
This update is significant:
  • Improved accessibility: Image descriptions used as alt text on web pages make digital content easier to access for people who are visually impaired or have reading difficulties;
  • Enhanced searchability: Descriptions let search engines index images and return them in search results;
  • Captions: Descriptions can serve as captions that clarify what an image shows;
  • Detailed prompts: Descriptions can be used to write more detailed prompts for new variations, providing inspiration for prompt engineering.
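The accessibility point can be illustrated with a minimal Python sketch. It assumes a description has already been copied out of Midjourney by hand; the function name `img_tag` and the file name `brain-tree.png` are hypothetical and not part of any Midjourney interface. The sketch only shows how such a description could be placed safely into an HTML `alt` attribute.

```python
from html import escape


def img_tag(src: str, description: str) -> str:
    """Build an <img> element whose alt attribute carries the description.

    Hypothetical helper for illustration; not a Midjourney API.
    """
    # Collapse line breaks and repeated spaces so the alt text is one line.
    alt = " ".join(description.split())
    # quote=True escapes quotation marks so the attribute stays well-formed.
    return f'<img src="{escape(src, quote=True)}" alt="{escape(alt, quote=True)}">'


if __name__ == "__main__":
    # Example file name and description text are placeholders.
    print(img_tag("brain-tree.png", "An image of an abstract brain tree with roots"))
```

Escaping matters here because generated descriptions may contain quotation marks or angle brackets, which would otherwise break the markup.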
How to use it?
Users just need to type “/describe”, and Midjourney provides a field for uploading an image:
After uploading the image, press Enter:
Then, Midjourney returns four descriptions based on the image:
The four numbered buttons at the bottom each correspond to one of the descriptions. Clicking a number remixes the image based on that description.
You can also modify the prompt before remixing:
This is the original prompt for creating the example image:
an illustration of a brain with tree roots, psychedelic art, vibrant, by Alex Grey, by Amanda Sage, by Robert Venosa, neon colors
This is one of the descriptions produced by Midjourney, used here for the remix:
An image of an abstract brain tree with roots, in the style of mark henson, luminous colors, dark symbolism, detailed anatomy, bold lines, vibrant color, psychological phenomena illustrations, chiaroscuro woodcuts
The new generation result is as follows:
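Both example prompts are comma-separated lists of terms, so mixing and modifying them by hand amounts to editing such a list. The following Python sketch illustrates that idea only: the helper names `split_terms` and `mix_prompts` are hypothetical, nothing here calls Midjourney, and the merged string would still have to be pasted into Midjourney manually.

```python
def split_terms(prompt: str) -> list[str]:
    """Split a comma-separated prompt into trimmed, non-empty terms."""
    return [term.strip() for term in prompt.split(",") if term.strip()]


def mix_prompts(base: str, extra: str, drop: tuple[str, ...] = ()) -> str:
    """Merge two prompts, keeping first occurrences and skipping dropped terms.

    Hypothetical helper for illustration; comparison is case-insensitive.
    """
    dropped = {term.lower() for term in drop}
    seen: set[str] = set()
    merged: list[str] = []
    for term in split_terms(base) + split_terms(extra):
        key = term.lower()
        if key in seen or key in dropped:
            continue
        seen.add(key)
        merged.append(term)
    return ", ".join(merged)


if __name__ == "__main__":
    original = "an illustration of a brain with tree roots, psychedelic art, vibrant, neon colors"
    described = "luminous colors, dark symbolism, detailed anatomy, bold lines, vibrant"
    # Drop one term from the original and append the described terms.
    print(mix_prompts(original, described, drop=("neon colors",)))
```

Keeping the original terms first and appending only new ones is one possible design choice; the article does not say how term order affects Midjourney's output.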
User Experience
As soon as the feature was released, it caught the attention of AI art enthusiasts, many of whom immediately started experimenting with it.
Below is a new image generated from a photo of a NASA lunar mission astronaut (right), which indeed looks remarkably realistic:
The image on the left is the original, and the one on the right is the new result generated after /describe:
Some have also thought about using /describe to refresh brand logos, which keeps the brand recognizable while giving it a fresh look.
The new Starbucks logo feels completely natural.
Apple's classic logo can also be colorful:
This is Adidas:
Pepsi could consider this new design:
Whether it’s “text-to-image” or “image-to-text”, Midjourney's results are likely to keep improving as user interaction increases. Some researchers speculate that Midjourney has been conducting large-scale reinforcement learning from human feedback (RLHF), possibly the largest such effort for text-to-image generation to date. The more users it has, the more feedback it collects for RLHF, which improves the results and in turn attracts even more users.
Source: Machine Heart
Reference link: https://medium.com/the-generator/midjourneys-crazy-new-describe-feature-a96cc09203cc


Shenzhen Longgang Intelligent Audiovisual Research Institute
