How Advanced Is Baidu’s Wenxin Yiyan AI?

Developing and training large language models is extremely challenging, and training Chinese large language models is even harder, for several reasons.
On one hand, Chinese accounts for a relatively small share of the information on the global internet. In systematic knowledge such as academic papers and specialized industry websites, its share is even smaller. In terms of the richness of the corpus available to “feed” artificial intelligence, Chinese starts at a disadvantage.
On the other hand, the degree to which real-world information in Chinese has been digitized is still relatively low. Whether for humans or artificial intelligence, it is quite difficult to understand the real China through the internet.
For example, when I wanted to check the latest “Dietary Guidelines for Chinese Residents” while writing a popular health science article, I found that the official website of the Chinese Nutrition Society offers neither a search tool nor a PDF version of the guidelines, only a link to purchase the printed book. In contrast, the dietary guidelines of some English-speaking countries can be easily accessed in electronic format.
Therefore, it is naturally not easy for an AI based on internet information to help us answer real-world problems encountered in the Chinese context.
Baidu’s Wenxin Yiyan is the first large language model in the Chinese-speaking world to hand in its answer sheet, so a gap between it and ChatGPT is to be expected.
What I care more about is this: how intelligent has Wenxin Yiyan become through training, and how far is it from being able to help us solve real-world problems?
Poster of the movie “Artificial Intelligence”
With this goal in mind, I tested Wenxin Yiyan with a set of middle school level questions I designed myself, to see how well it could score in Chinese language, math, English, physics, chemistry, and history.
I was surprised to find that it performed best on the history questions.
1. Language Questions
I chose a somewhat unusual idiom, “空穴来风” (kōng xué lái fēng), to test Wenxin Yiyan.
To my surprise, Wenxin Yiyan gave a perfect answer, listing both the original meaning of the idiom and its widely misused meaning, along with two easy-to-understand examples. The structure of the answer as a whole was also very pleasing.
In comparison, I searched the same question on Baidu and found that the result was far inferior to Wenxin Yiyan’s answer.
In this scenario, Wenxin Yiyan aggregated and analyzed information, which is where artificial intelligence has the edge over traditional search engines.
Next, I tested its essay writing ability with the classic topic “An Unforgettable Day,” specifying some of the characters and the date.
For this topic, Wenxin Yiyan gave a passable answer that I would score at only 50 points. On one hand, it correctly understood the meaning of “An Unforgettable Day”; on the other hand, it did not grasp why I had specified February 14th, nor did it notice that a 15-year-old protagonist is too young to enter a bar.
If I wanted to solve this problem with Baidu search, I would have to search for model essays and then piece a modified essay together from them. There would be no originality, but it would not make mistakes like sending a 15-year-old into a bar.
2. Math Questions
I first asked a question that I thought should be easy for artificial intelligence.
The conditions I gave were very clear, and the concept of prime numbers is unambiguous, but Wenxin Yiyan stumbled badly on this question. Its answer was neither correct nor complete, and even after I pointed out the problem, it still “refused to repent”.
Baidu search actually performed better on this question.
But this was not because Baidu search was smarter; rather, someone had manually compiled the corresponding prime number table, and it could be retrieved only thanks to that human work. Moreover, to obtain the final answer I still had to select and process the search results myself.
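A prime number table of this kind does not have to be compiled by hand; a few lines of code can generate it deterministically. The following is only a minimal sketch using the sieve of Eratosthenes. The function name `primes_in_range` and the range 1 to 50 are illustrative assumptions, not the conditions of the question I actually asked.

```python
def primes_in_range(low, high):
    """Return all primes p with low <= p <= high (sieve of Eratosthenes)."""
    if high < 2:
        return []
    # is_prime[n] stays True until n is crossed out as a multiple of a smaller prime.
    is_prime = [True] * (high + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= high:
        if is_prime[p]:
            # Cross out every multiple of p, starting from p * p.
            for multiple in range(p * p, high + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n in range(max(low, 2), high + 1) if is_prime[n]]


# Hypothetical range, chosen only for illustration.
print(primes_in_range(1, 50))
```

Unlike a language model's free-text answer, the result here is both complete and correct for any range, which is why a question like this is a useful probe of whether a model is reasoning or merely producing plausible text.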
Next, I tested a math question that was not too difficult but was worded in a more complicated way.
Unfortunately, Wenxin Yiyan again provided an incorrect answer, and it was quite outrageous.
Baidu search could not answer this question directly either, but it did offer manually developed calculation tools.
As a human, I can find some comfort in this scenario; although machine capabilities are growing rapidly, when it comes to solving practical problems, there is still some space left for humans.
3. English Questions
I first tested a fairly conventional sentence translation question.
The answer is satisfactory but not excellent; I would rate Wenxin Yiyan’s performance at 70 points.
Compared to standalone translation software, this score is not outstanding, but the advantage of artificial intelligence lies in its ability to understand human natural language directly: it knows that I want only the latter part translated, rather than all the text I entered.
For humans, this is a friendlier application scenario than a “translator”.
4. Physics Questions
For the physics question, I did not ask about physics knowledge directly. Instead, I made the question harder to understand by creating a scenario that does not exist in the real world.
If artificial intelligence is to accurately answer this question, it needs to complete two tasks: first, find the calculation method for the gravitational constant; second, find the parameters that I did not write down but are readily available online, such as the mass and radius of the Earth and the Moon.
The result shows that Wenxin Yiyan completed only the first task, finding the calculation method; it could not yet find the corresponding parameters and calculate the result.
As I see it, for a long time to come, the help that artificial intelligence can give us will be limited to a similar level: it can help us solve some problems and improve efficiency, but it cannot provide accurate and reliable final results.
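To make the two tasks concrete, here is a rough sketch of what a complete answer involves. It assumes the quantity in question is the surface gravitational acceleration, g = G·M / r², which is the quantity that depends on a body's mass and radius; the exact scenario I posed is not reproduced here. The figures are commonly cited approximate values, and the names `BODIES` and `surface_gravity` are made up for this example.

```python
# Newtonian gravitational constant, in m^3 kg^-1 s^-2 (approximate).
G = 6.674e-11

# Task 2: parameters that are readily available online (approximate values).
BODIES = {
    "Earth": {"mass_kg": 5.972e24, "radius_m": 6.371e6},
    "Moon": {"mass_kg": 7.342e22, "radius_m": 1.7374e6},
}


def surface_gravity(mass_kg, radius_m):
    """Task 1: the calculation method, g = G * M / r^2, in m/s^2."""
    return G * mass_kg / radius_m ** 2


for name, body in BODIES.items():
    g = surface_gravity(**body)
    print(f"{name}: g = {g:.2f} m/s^2")
```

With these inputs the formula gives roughly 9.8 m/s² for the Earth and 1.6 m/s² for the Moon. Writing down the formula is the easy half; looking up the parameters and carrying the arithmetic through is the part the model left undone.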
The machine is still growing, and humans still have time; the question is how much time is left for humanity…
5. Chemistry Questions
Here, I asked an open-ended question that required aggregating information, to see how far artificial intelligence would go in answering it.
Wenxin Yiyan’s answer here is incorrect, but it still has its merits.
More importantly, it is evident that the answer was not simply copied from a single source; rather, Wenxin Yiyan synthesized information from different sources to arrive at it. The structure of the answer was also very user-friendly, providing the relevant chemical formulas and additional important information.
6. History Questions
Unlike natural sciences, history questions often carry a certain subjectivity, and sometimes there is no uniquely correct answer. Such questions can test the AI’s preferences in information selection.
This answer was also quite satisfactory; it first briefly gave an affirmative answer, then supplemented it with comparative information and the reasons behind it. I did not ask for the latter two parts, but the AI guessed that I would want to know and provided them.
This is a characteristic that makes Wenxin Yiyan more human-like than machine-like, and it is also the most challenging aspect of language models. From this perspective, Wenxin Yiyan is not yet good enough, but it is already worth looking forward to.
Based on the questions from the six subjects above, how would you rate Wenxin Yiyan’s overall performance?
