Unlocking and Deep Analysis of ChatGPT’s Image Recognition Capabilities

Reported by New Intelligence

Source: Lao Luo Doesn’t Speak
Author: Luo Yuchen
Editor: Hao Kun
[New Intelligence Guide] ChatGPT can in fact recognize images! Just enter the image URL and make sure OpenAI's servers can access the image without restrictions.

Because the ChatGPT web page has no image upload button, most people do not realize that it can recognize images. It can: enter the image's URL and make sure the image is reachable from a US server without a login or any other restriction.

At present, it has no problem reading images from Wikipedia and CNN.
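The access requirement above can be checked before pasting a URL into ChatGPT. The following is a minimal sketch, not part of the original article: the function names, the list of file extensions, and the `User-Agent` string are all illustrative assumptions. It only tests whether an anonymous request from your own machine gets an image back, which approximates, but does not guarantee, what OpenAI's servers would see.

```python
from urllib.parse import urlparse
import urllib.request

# Assumed set of common image extensions; extend as needed.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".webp")


def looks_like_direct_image_url(url: str) -> bool:
    """Offline check: http(s) scheme, a host, and an image file extension."""
    parts = urlparse(url)
    return (
        parts.scheme in ("http", "https")
        and bool(parts.netloc)
        and parts.path.lower().endswith(IMAGE_EXTENSIONS)
    )


def is_publicly_fetchable(url: str, timeout: float = 10.0) -> bool:
    """Network check: does an anonymous HEAD request return an image?

    A redirect to a login page typically ends in an HTML response,
    so it is reported as not fetchable.
    """
    try:
        request = urllib.request.Request(
            url,
            method="HEAD",
            headers={"User-Agent": "image-url-check"},  # hypothetical client name
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            content_type = response.headers.get("Content-Type", "")
            return response.status == 200 and content_type.startswith("image/")
    except (OSError, ValueError):
        # OSError covers HTTP errors, DNS failures, and timeouts;
        # ValueError covers strings that are not valid URLs.
        return False
```

Calling `looks_like_direct_image_url(url)` first avoids a network round trip for links that point at a gallery page rather than at the image file itself. Some servers reject `HEAD` requests, so a `False` result from `is_publicly_fetchable` is a hint rather than proof that the image is restricted.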

So what can it recognize?

Ordinary news images

ChatGPT did not identify the specific model of the equipment, but the scene it described was very close to the real one. It has a strong grasp of the composition's perspective.

Images of people

For this image of Macron, ChatGPT recognized neither the face nor the text on the screen. It could still describe the content of the image in general terms, although with some errors. The notable part is its guess about the setting and its impression of the person; for example, it said Macron seems to be a very polite person.

The two images above were recognized with only average accuracy; the images below have stronger visual features and were recognized more accurately.

Specific historical images

ChatGPT described this photo of US troops searching for enemies during the Vietnam War in detail, giving more information about the scene, the people, and the details than it did for the previous two images (it seems ChatGPT has not been specifically optimized for particular objects or faces). The description is therefore very vivid, although it contains slight errors. It also picks up abstract information, such as people's attitude toward their work.

It is worth noting that ChatGPT mentions police here. The image comes from the Wikipedia entry on the Vietnam War. If ChatGPT had obtained its information by reading the text associated with the image, it would not have identified the people carrying out the task as police. It can therefore be concluded that its description is derived entirely from the image itself.

Globally famous images

This is a famous painting. Here ChatGPT fully recognizes the figures and the event depicted. Its description also includes many abstract notions, such as dynamism.

Human-computer interaction regarding images

Initially, ChatGPT did not understand this painting. Once given more information, it was able to make sense of the painting and described the emotions of the scene. ChatGPT's understanding of images is therefore interactive rather than a one-shot output.

Landmark images

London landmarks recognized successfully.

Red Square recognized accurately.

Here it fabricated a so-called landmark; no such landmark could be found.

Video content recognition

ChatGPT says it can recognize videos, but its output contains errors when it is asked about content from sites such as YouTube and CNN. This is because the service has not been opened up yet; it even understood that my question was about a position on the video's progress bar.

If I had not asked a follow-up question, I would not have known whether it had really watched the video.

I will continue to keep an eye on ChatGPT.
