5 Things You Need to Know About GPT-4 and Wenxin Yiyan

Opinion / Liu Run | Main Writer / You An | Editor / Li Sang
This is the 1841st original article from Liu Run’s public account.
Yesterday, as we woke up, OpenAI in the United States officially announced its new large model, GPT-4.
Only 105 days have passed since the launch of ChatGPT, which was built on GPT-3.5.
Once again, the speed and force of technological progress are astonishing.
We had barely finished marveling at GPT-3.5 passing a Google programmer interview when GPT-4 scored in the top 10% on the U.S. bar exam.
On the SAT, it can even post scores good enough for Harvard.
Moreover, it has demonstrated tremendous potential in image recognition.
We keep telling ourselves there is no rush, that there are still plenty of things AI cannot do.
Yet AI keeps ticking those things off, one after another, at levels beyond what we imagined.
As a result, many people are starting to feel anxious and rushing to find opportunities.
But before you go hunting for opportunities, I suggest you calm down and take another look.
Understand what this GPT, which has swept over us wave after wave, really is.

GPT-4

Everyone knows GPT-4 should be more powerful than GPT-3.5. But more powerful in what way, exactly?
Before GPT-4 was officially released, there were two popular speculations:
1. GPT-4's parameter count would jump from 175 billion to 100 trillion. The gap in capability would be like a person earning 1,750 yuan a month versus one earning 1 million yuan a month.
2. GPT-4 would support multi-modal input, meaning it could handle text, images, audio, and even video. The gap in capability would be like a person who is deaf and blind versus one who can hear and see.
These are quite shocking speculations.
However, Sam Altman, OpenAI's CEO, dismissed the speculation as "complete bullshit."
Is it really complete bullshit?
Judging from the officially published results, GPT-4's capabilities have indeed improved.
However:
1. The parameter count was not disclosed. OpenAI only described this iteration qualitatively: "GPT-4 is more reliable, more creative, and able to handle far more nuanced and complex instructions than GPT-3.5."
2. It does support multi-modal input, but for now only images have been added, and GPT-4's output is still text only.
So while the extent of the progress is unclear, the speculated directions were roughly right.
To truly understand GPT, I suggest you get a real grasp of these two directions: large models and multi-modality.
Let's start with GPT-4's most eye-catching update: multi-modality.
Is multi-modality really that impressive?

Multi-Modal

What is multi-modal?
Imagine a colleague suddenly sends you a WeChat message: "Shh, Mr. Wang is behind you."
The moment you read it, you turn around, see Mr. Wang, and greet him: "Mr. Wang."
Is it impressive to achieve this?
It is indeed impressive. In just a few seconds, you processed multiple types of data: text, sound, and image.
"Multi-modal" simply refers to the different modes and forms that information can take.
Different types of data, such as the text you read in the message, the scene you see when you turn around, and the sound you make when you speak, can all point to the same piece of information: "Mr. Wang."
For you, recognizing and correlating the same “Mr. Wang” in a multi-modal context is quite simple, almost unremarkable.
But for GPT, it represents a grand ideal: to traverse multiple modalities and handle more complex tasks.
Why is this ideal grand?
Because to achieve this, you first need to have eyes and ears.
The pre-upgrade ChatGPT could only communicate with you through text.
But after the upgrade to GPT-4, ChatGPT has gained "eyes."
You send it an image, and it can "see" it, understand it, and use it to help you with more tasks. For example, it could help you edit "Mr. Wang" out of a photo.
This is the significance of multi-modality for GPT-4: it has evolved eyes.
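To make those "eyes" concrete, here is a minimal sketch, in Python, of what a text-plus-image request can look like through the OpenAI Python SDK. The model name, the image URL, and the question are placeholders, and it assumes an account with access to an image-capable GPT-4 model.

```python
# Minimal sketch: sending text plus an image in a single request.
# Assumes the OpenAI Python SDK (v1) and access to an image-capable GPT-4 model;
# the model name and image URL are placeholders, not recommendations.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder: any GPT-4-level model that accepts images
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Who is standing behind me in this photo?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/office.jpg"}},
            ],
        }
    ],
)

# Even with an image as input, the reply still comes back as text.
print(response.choices[0].message.content)
```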
Impressive. But this is not all of GPT-4's evolution.

Large Models

Another frequently mentioned advancement is GPT-4's core skill: language ability.
In just 105 days, GPT's language model has gone through another round of evolution.
For one thing, the amount of text it can handle at once has grown to roughly 25,000 words.
Previously, if you gave it a slightly longer article, it might fail to process the whole thing; now many people simply throw entire web pages or PDFs at it.
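If you want a rough feel for how much text actually fits, here is a small sketch using tiktoken, OpenAI's open-source tokenizer. The sample text is a placeholder, and the context limits mentioned in the comments depend on which GPT-4 variant you call.

```python
# Rough sketch: estimating whether a long document fits into the model's context window.
# Uses tiktoken, OpenAI's open-source tokenizer; the sample text below is a placeholder.
import tiktoken

encoding = tiktoken.encoding_for_model("gpt-4")
document = "Your long article or extracted PDF text goes here. " * 2000  # placeholder

token_count = len(encoding.encode(document))
print(f"The document is about {token_count} tokens long.")

# GPT-4 variants ship with different context windows (for example 8K or 32K tokens),
# so compare the count above against the limit of the model you actually call.
```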
Moreover, when it comes to understanding and answering questions, especially complex ones, GPT-4 also seems smarter.
How did it become smarter?
Simply put, GPT-4’s “brain” has become larger, and it has more “neurons.”
In 2018, the first-generation GPT had 117 million parameters.
In 2019, GPT-2 had 1.5 billion parameters.
In 2020, GPT-3, and later the ChatGPT built on GPT-3.5, had 175 billion parameters.
As for the GPT-4 released yesterday, the official parameter count was not disclosed, but no one would dare underestimate it.
GPT now operates at the scale of hundreds of billions of parameters, and that scale has kept multiplying year after year.
This is why when mentioning GPT, people often talk about “large models.”
A large number of parameters, like a large number of neurons, often means a greater probability of “guessing the answer you want.”
People often refer to this characteristic as being “smart.”
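To get a feel for what "large" means here, a back-of-the-envelope sketch of the memory needed just to store 175 billion parameters follows; it assumes 2 bytes per parameter (half precision), which is an illustrative assumption rather than an official figure.

```python
# Back-of-the-envelope: memory needed just to hold the weights of a 175-billion-parameter model.
# Assumes 2 bytes per parameter (16-bit floats); real deployments vary.
params = 175_000_000_000      # GPT-3 / GPT-3.5 scale, per the figures above
bytes_per_param = 2           # fp16 / bf16
total_gib = params * bytes_per_param / 1024**3

print(f"~{total_gib:.0f} GiB just to hold the weights")  # roughly 326 GiB
# No single mainstream GPU holds that much memory, which is one reason training
# and serving models at this scale takes clusters of GPUs rather than one card.
```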
(Image: GPT-4's partial report card)
Keep that parameter scale in mind, then look again at the exam scores where GPT-4 "crushed" humans, and the results become a little easier to accept.
Still, a question nags: is a sufficiently large parameter count really all it takes to be smart enough to outscore 90% of humans?
It’s not that simple.

Training

Have you ever wondered how, when you open the ChatGPT dialogue box and type in a question, the ChatGPT on the other side of the screen actually works the problem out?
For example, suppose you type: "'Before my bed, the bright moonlight.' What is the next line?"
As an artificial intelligence rather than a human, ChatGPT approaches the question in a completely different way from you, right from the start:
It does not recall the poem the way you do. Instead, it draws on the massive body of text it was trained on and looks at the contexts in which the phrase "Before my bed, the bright moonlight" appears.
Then it uses its "brain," one that can weigh hundreds of billions of parameters at once, to calculate:
What words are likely to follow this line based on the context? What is the probability of these words being the correct answer?
Then it selects the text with the highest probability and responds to you in the dialogue box.
Finding combinations, calculating probabilities, comparing sizes.
You asked a language question, but it’s doing a math problem.
Because doing math problems is the essence of the model.
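Here is a toy sketch of that math problem. The candidate lines and their probabilities below are made up for illustration; a real model derives them from its hundreds of billions of learned parameters.

```python
# Toy sketch of next-text prediction: made-up candidates and probabilities,
# standing in for what a real language model computes from its parameters.
candidates = {
    "I wonder if it is frost upon the ground": 0.92,  # the poem's actual next line
    "I lower my head and think of home": 0.05,
    "I raise my head and gaze at the bright moon": 0.02,
    "an unrelated sentence about the weather": 0.01,
}

# "Finding combinations, calculating probabilities, comparing sizes":
best_line, best_prob = max(candidates.items(), key=lambda item: item[1])
print(f"Reply: {best_line} (estimated probability {best_prob:.0%})")
```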
Will it make mistakes?
Yes. If the text it learned from happens to include a lot of error-ridden pirated editions, it may well reply with the wrong line.
Clearly, the model doesn’t care about right or wrong, but humans do.
At this point, humans will do one thing: optimize it, train it.
How to train?
Think back to how you solved this problem.
First, you must have seen this poem in your textbook.
Then, you did some exercises. After finishing, you checked the answers and someone told you whether you were right or wrong.
If you got it right, you gained points, felt happy, and remembered the answer.
If you got it wrong, you lost points, felt bad, and went looking for the correct answer.
GPT is quite similar.
It first “reads” by inputting massive data.
Then it “takes exams” and receives “feedback.”
If correct, it gains points. If incorrect, it loses points.
The difference is that AI doesn’t feel happy or sad.
But who doesn’t like high scores?
Hundreds of billions of parameters and their corresponding weights are adjusted according to this feedback, over and over again.
Until its answers get ever closer to being right.
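A deliberately oversimplified sketch of that loop follows. One made-up "weight" stands in for hundreds of billions of real parameters, and a toy scoring rule stands in for human feedback; real systems adjust their parameters with gradients computed from that feedback.

```python
import random

# Deliberately oversimplified: one weight stands in for billions of parameters,
# and a fixed "target" stands in for the answers that graders reward.
weight = random.random()       # the model's current "belief"
target = 0.8                   # the answer the feedback rewards
learning_rate = 0.1

for _ in range(100):
    error = target - weight            # feedback: positive means "score higher next time"
    weight += learning_rate * error    # nudge the parameter toward higher-scoring answers

print(f"After training, weight is about {weight:.3f} (target was {target}).")
```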
Doesn’t it sound perfect? But there is a significant difficulty.

Cost

Training comes at a cost.
The more astounding the parameter count, the more astounding the cost.
What does it cost to run a "brain" that weighs hundreds of billions of parameters, or more, all at once?
First, a single training run can cost as much as several apartments in Shanghai.
By some estimates, one training run for the GPT-3.5-based ChatGPT costs between 4.6 and 5 million USD.
On top of that, the "entrance ticket" just to play at GPT's level is extremely expensive.
For example, to reach the necessary computing power, training the earlier GPT-3 alone reportedly required some 10,000 NVIDIA GPUs.
Taking the currently mainstream NVIDIA A100 chip as an example, each costs about 80,000 yuan.
At 10,000 chips and 80,000 yuan each, the chips alone require roughly 800 million yuan up front.
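The arithmetic behind that entrance ticket is easy to check. The sketch below simply re-runs the figures quoted above; the figures themselves are the article's estimates, not official numbers.

```python
# Re-running the "entrance ticket" arithmetic with the figures quoted in the text.
gpu_count = 10_000              # NVIDIA GPUs reportedly used to train GPT-3
price_per_gpu_yuan = 80_000     # rough price of one A100, per the text
chip_cost_yuan = gpu_count * price_per_gpu_yuan
print(f"Chips alone: {chip_cost_yuan / 1e6:.0f} million yuan")  # 800 million yuan

low_usd, high_usd = 4_600_000, 5_000_000   # estimated cost of one GPT-3.5-scale training run
print(f"One training run: {low_usd / 1e6:.1f} to {high_usd / 1e6:.1f} million USD")
```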
If you still have the nerve to consider it, look at OpenAI, the company behind ChatGPT.
For the GPT-3.5-based ChatGPT alone, OpenAI has already spent at least two large sums of money:
Money paid to the “data center” for computing power and data was nearly 2 billion yuan.
Money paid to “people” such as scientists and engineers was nearly 500 million yuan.
And that does not count electricity bills that can run to tens of thousands of dollars a day.
As for the newly released GPT-4, a "smarter" brain, its computing costs are likely at another level entirely.
These are the prices of being “smart.”
Despite being so expensive, there are always those willing to pay.

Final Words

Take Microsoft, for instance, which has invested nearly 90 billion yuan in OpenAI, the company behind ChatGPT.
Today, Bing's share of the global search engine market is in the single digits, while Google, ahead of it, holds more than 90%.
However, after integrating ChatGPT, Bing proudly set this phrase in its search box:
Ask me anything…
Yesterday, after the release of GPT-4, Microsoft proudly announced:
The new Bing has been running on GPT-4 all along; what many people experienced over the past five weeks was already powered by GPT-4.
Today, Baidu also officially released Wenxin Yiyan.
Will it grow to become China’s OpenAI?
Currently, it remains uncertain.
Will China have its own GPT-4?
At present, it is also unknown.
However, it is clear that it will not be easy.
Data is hard to come by, and chips are also difficult to obtain.
But aren’t we facing such difficulties already?
Just as we dealt with challenges in the past, we will do the same now.
Best wishes.
