Machine Translation: Born in the Cold War, Yet Rebuilding Babel for Humanity

This article is the seventh installment of the series "AI Legends" by Mr. Chen Zongzhou, editor-in-chief and president of Global Science. In it, Mr. Chen reviews the development of machine translation over the past half century: how did a field that fell into a slump shortly after its birth achieve such a leap that it may break down the language barriers between nations in the near future?


Chen Zongzhou is the president of Global Science magazine and the founder of Computer News.

In March 2017, during the national "Two Sessions," Premier Li Keqiang visited the Anhui delegation. Liu Qingfeng, chairman of iFlytek, picked up a small device resembling a mobile phone from the table and repeated the encouragement Premier Li had once given iFlytek: "Let the world hear our voice." The machine immediately rendered it in fluent English. He then said, "This cantaloupe is very sweet," and the machine instantly translated it into fluent Uighur. The small device, called the Xiaoyi multilingual translator, is an iFlytek product.

One day in November 2016, Jun Rekimoto, a professor at the University of Tokyo and an expert in human-computer interaction, noticed a message on social media: Google Translate had improved dramatically. He went to the Google Translate page to try it for himself and was stunned.

He compared several sentences from "The Great Gatsby," as rendered by two Japanese translators, with the output of Google Translate, and found Google's Japanese translation to be very fluent and even easier to understand than the translators' versions.

He then fed the Japanese version of a work by the American writer Hemingway into Google Translate to render it back into English, and found that the machine translation bore an astonishing resemblance to Hemingway's original English text.

Both scenes above involve machine translation, the former speech translation and the latter text translation, and the core issue in both is natural language understanding.


The Early Development of Machine Translation

Machine translation (MT) is the process of automatically translating one natural language (source language) into another natural language (target language) using a computer. Machine translation is a star technology in AI, as it is the most powerful assistant for achieving barrier-free communication between different ethnic groups and language communities. Successfully solving the challenges of machine translation will realize the dream of rebuilding the Tower of Babel.

Just as the computer was born of war, the idea of machine translation is also tied to military needs. Shortly after ENIAC, the first computer, appeared in 1946, the American scientist Warren Weaver of the Rockefeller Foundation and others began thinking about its future applications, recalling the tremendous success of Turing's machines in breaking codes during World War II. They believed language translation was similar to code-breaking: both convert one set of symbols into another, and both could therefore be done by machines. Following this line of thought, Weaver published his "Translation" memorandum in 1949, formally proposing the idea of machine translation.


The Pioneer of Machine Translation—Warren Weaver

Once the idea of machine translation was proposed, it immediately attracted attention. The United States and the Soviet Union were deep in the Cold War, and demand for translating Russian intelligence material was high. In 1954, a laboratory jointly run by Georgetown University and IBM demonstrated the first machine translation system. This system, little better than a toy by today's standards, translated Russian into English: with a vocabulary of 250 words and six grammatical rules, it could handle only 49 carefully selected sentences. Even so, it was a remarkable achievement, enough to ignite public enthusiasm. Reporters excitedly wrote that the electronic brain had translated Russian into English for the first time, and U.S. defense agencies and computer scientists optimistically expected machine translation to be a solved problem within five years.

Machine translation also attracted research interest from the Soviet Union, Japan, and European countries. Governments worldwide began to allocate funds, and a global machine translation craze emerged.

However, the good times did not last: progress in machine translation research slowed and widespread skepticism set in. In 1964, to evaluate that progress, the U.S. National Academy of Sciences established the Automatic Language Processing Advisory Committee (ALPAC), which spent two years investigating and testing. In November 1966 the committee released the ALPAC report, "Language and Machines," which comprehensively denied the feasibility of machine translation: ten years of research had failed to meet expectations, there was no prospect of a practical machine translation system in the near or foreseeable future, and funding should be cut off. The report dealt a heavy blow to the once-thriving field, which quickly fell into a slump.

Research progressed slowly because no substantial breakthrough in natural language understanding could be achieved at the time. Natural Language Understanding (NLU) is an important AI discipline concerned with getting machines to comprehend spoken and written language. Although speech translation and text translation each pose their own technical challenges, the core challenge they share is natural language understanding. This is a lofty, even ultimate, goal, which is why many researchers prefer the term Natural Language Processing (NLP) for the discipline, emphasizing the process rather than the goal.

Human languages, formed over the long course of social development, constitute a very complex system, and early researchers underestimated that complexity. Methodologically, they hoped to quickly find the rules of language, just as discovering the coding rules of a cipher makes it easy to break: find the rules of language, they reasoned, and machines could understand natural language, and machine translation would be solved.

However, the rules of language are far too complex. Taking grammar alone, some have calculated that covering just 20% of real sentences would require at least tens of thousands of grammatical rules, and that pushing coverage to 50% would mean adding several new rules for every new sentence encountered. Because language keeps developing and changing, real sentences vary without limit, so the grammatical rules can never be exhausted.

From the perspective of computational complexity, the Turing Award winner Donald Knuth established the theoretical relationship between grammar and computation: for context-free grammars, the cost of analysis grows as the square of the sentence length (that is, the number of words), while for context-sensitive grammars it grows as the sixth power. Merely parsing a sentence of twenty to thirty words would take minutes even on today's high-performance computers, so fully analyzing the grammar of an article or a long speech, with all its contextual dependencies, by a rules-based approach would require an unimaginable amount of computing time. In the 1970s, even IBM, with its large mainframes, could not analyze real sentences with a rules-based approach.
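A quick back-of-the-envelope calculation makes the gap concrete. The sketch below is only illustrative: the unit of "cost" is arbitrary, and only the quadratic and sixth-power growth rates come from the figures cited above.

```python
# Rough illustration of how parsing cost scales with sentence length n,
# assuming cost ~ n^2 for context-free grammars and ~ n^6 for
# context-sensitive grammars, as cited above. The absolute numbers are
# arbitrary; only the growth rates matter.
for n in (10, 25, 30):
    print(f"{n:>2} words: n^2 = {n**2:>6,}   n^6 = {n**6:>13,}")

# A 25-word sentence gives n^2 = 625 but n^6 = 244,140,625, a gap of
# roughly 400,000x, which is why context-sensitive analysis of long,
# interrelated sentences quickly becomes intractable.
```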

Of course, the analysis above assumes an unrestricted language environment. In practice, language use is always constrained: different cultures, disciplines, and contexts each have their own characteristic ways of using language, and within such a restricted domain the problem becomes much simpler. Rules-based machine translation therefore continued to push forward and achieved certain results, while another approach, statistical machine translation, began to emerge.

The Rise of Statistical Translation

In the fifth installment we met Frederick Jelinek of the IBM Watson lab, who in the 1970s proposed the theoretical framework of statistical speech recognition, succinctly reducing the problem to two hidden Markov models: an acoustic model and a language model. This framework profoundly influenced both speech and language processing, and from then on natural language processing began to adopt statistical methods.
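To spell out the formulation the column only alludes to (a standard textbook summary, not a quotation from Jelinek): recognizing a word sequence W from an acoustic signal A means choosing the W that maximizes P(W | A), which by Bayes' rule is proportional to P(A | W) × P(W); the first factor is the acoustic model and the second the language model. Statistical machine translation later reused exactly the same pattern, with a translation model taking the place of the acoustic model.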

A rules-based machine translation system needs large numbers of linguists specializing in the languages involved to compile big dictionaries and to write numerous rules covering grammar, syntax, and semantics. The dictionaries and grammar-rule libraries make up the translation knowledge base, and the machine translates by consulting them, much as a person understands language and translates with the help of dictionaries and grammar books. The rules are extremely complex: for a vocabulary of several hundred thousand words, a translation system may contain tens of thousands of grammatical rules.
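As a toy illustration of that pipeline, the sketch below pairs a four-entry dictionary with a single hand-written reordering rule. Every entry and rule here is invented for illustration; real systems combine dictionaries of hundreds of thousands of entries with tens of thousands of rules.

```python
# Toy rules-based translation: dictionary lookup plus one reordering rule.
# All entries and rules below are invented solely for illustration.
DICTIONARY = {"i": "je", "love": "aime", "red": "rouge", "apples": "pommes"}
ADJECTIVES, NOUNS = {"red"}, {"apples"}

def reorder(tokens):
    """Rule: an English adjective before a noun moves after it (as in French)."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and tokens[i] in ADJECTIVES and tokens[i + 1] in NOUNS:
            out += [tokens[i + 1], tokens[i]]
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

def translate(sentence):
    tokens = reorder(sentence.lower().split())
    return " ".join(DICTIONARY.get(t, t) for t in tokens)

print(translate("I love red apples"))  # -> "je aime pommes rouge"
# Idiomatic French would be "j'aime les pommes rouges": even this tiny example
# already demands extra rules for elision, articles, and agreement.
```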

Statistical translation, by contrast, sidesteps language rules. Jelinek, the founder of statistical language processing, famously said: "Every time I fire a linguist, the accuracy of speech recognition improves by 1%." The extreme remark reflects his disregard for hand-written language rules.

Statistical translation uses large amounts of bilingual text to build a parallel corpus of the two languages. During translation, words are matched against the corpus (an approach later extended to phrases, clauses, and even whole sentences), and the candidate translations are scored and selected according to their matching probabilities.
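Concretely, the corpus yields a table of candidate translations with estimated probabilities, and the system picks the highest-scoring combination. The sketch below uses a hand-made phrase table with made-up probabilities standing in for counts gathered from a real parallel corpus; a real decoder would also weigh a language model and search over alternative segmentations.

```python
# Toy statistical translation: choose, for each source phrase, the candidate
# with the highest probability from a hand-made phrase table. The phrases and
# probabilities are invented for illustration only.
PHRASE_TABLE = {
    "machine translation": [("traduction automatique", 0.7), ("traduction machine", 0.3)],
    "is": [("est", 0.9), ("se trouve", 0.1)],
    "useful": [("utile", 0.8), ("pratique", 0.2)],
}

def translate(phrases):
    output, score = [], 1.0
    for phrase in phrases:
        candidate, prob = max(PHRASE_TABLE[phrase], key=lambda c: c[1])
        output.append(candidate)
        score *= prob
    return " ".join(output), score

translation, prob = translate(["machine translation", "is", "useful"])
print(translation, round(prob, 3))  # -> traduction automatique est utile 0.504
```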

Another statistical method, example-based translation, builds a bilingual example library, an even larger corpus, and translates by matching the input against these stored examples.

Statistical translation avoids the complexity of grammar rules, which is evidently simpler and is in line with Weaver's original idea in the "Translation" memorandum. However, it requires a large-scale corpus, which was not easy to obtain at the time, so the shift of natural language processing from rules to statistics was not straightforward and took a long time. Rules-based natural language processing, which later absorbed new techniques, still plays a role; but as the internet spread and large-scale corpora were gradually assembled, statistical translation eventually took center stage.

Systran, one of the earliest developers and software vendors in the machine translation industry, is a living example of this process. A commercial representative of the older generation of rules-based technology, Systran was founded in 1968 by Peter Toma, a scientist who had worked in the Georgetown University machine translation group mentioned above and who built the company around the university's R&D team. After the ALPAC report, government funding dropped sharply, yet Systran survived as one of the few machine translation companies left standing. In 1986 it was sold to a French family and later went public in France; in 2014 it was sold to a Korean company.


Systran Company

This small company of a few dozen employees has carried its 1960s-vintage technology through to the present, gradually extending it to support multilingual translation and embedding its products in the translation systems of companies such as Yahoo, Google, and AOL. Systran's annual sales are only a little over US$10 million, yet in the roughly US$10 billion machine translation market it once held the dominant share of embedded translation engines. "Our company is so small, yet we are the biggest," Systran's chairman proudly declared.

The decisive battle broke out in 2005. Google, by then an internet search giant, had been using Systran's rules-based translation technology but had always wanted to exploit its own huge corpus. Starting in 2002, Google recruited the statistical language processing genius Franz Och to build a machine translation team. In the summer of 2005, the Google translation system Och designed, still experimental, won a sweeping victory in the machine translation evaluation organized by NIST (the National Institute of Standards and Technology). The task was to translate 100 news articles from Arabic or Chinese into English, and Google's system came first in every category, defeating all competitors, including IBM.

After the evaluation, Och revealed that they had fed the system a text corpus equivalent to one million books to learn from, which proved to be the key to improving translation quality. He also compared the Systran-based English-Chinese system Google was then using with the experimental statistical system and concluded that the latter was significantly better.

This competition was seen as a milestone marking the official rise of statistical machine translation systems.

Will the Tower of Babel Be Completed?

In October 2007, Google ended its collaboration with Systran and switched to its own statistical machine translation system. In 2010, Systran had to move to a hybrid system combining rules and statistics, and later introduced deep neural network technology. Systran's turnabout shows how statistical, and subsequently deep-neural-network-based, language processing and translation became the mainstream; statistical translation, for its part, has also begun to attend to details of grammar, syntax, and semantics to improve its systems.

Since then, machine translation has advanced rapidly, continuously developing into various applications, and has become a benchmark for measuring the AI capabilities of major tech companies.

Google Translate is the best-known machine translation product. Since its launch in 2006 it has grown to support 103 languages and to handle 18 million translations, totaling 140 billion words, every day, and it remains the industry benchmark. On September 28, 2016, Google released its new neural machine translation system, GNMT, in both PC and mobile versions. It overcomes the weakness of traditional methods that chop a sentence into segments to be translated separately: the sentence is encoded and decoded as a whole, making full use of contextual information, and the resulting translations are markedly more fluent. The new technique is said to have cut translation errors by 60% or more, and it brings notable gains on the notoriously difficult Chinese-English pair.
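To make the "encode the whole sentence, then decode" idea tangible, here is a minimal, untrained sketch with random weights and toy vocabularies chosen purely for illustration; it reflects the general encoder-decoder pattern, not Google's actual GNMT architecture, which uses deep stacked LSTMs, attention, and massive training corpora.

```python
# Minimal encoder-decoder sketch: the encoder folds the entire source sentence
# into one context vector, and the decoder generates target tokens conditioned
# on that vector, so no fixed segmentation into independent pieces is needed.
# Weights are random and untrained, so the output is gibberish; only the
# structure is the point.
import numpy as np

rng = np.random.default_rng(0)
src_vocab = ["<s>", "</s>", "今天", "天气", "很好"]           # toy source vocabulary
tgt_vocab = ["<s>", "</s>", "the", "weather", "is", "nice", "today"]  # toy target vocabulary
d = 16                                                        # hidden size

E_src = rng.normal(size=(len(src_vocab), d))   # source embeddings
E_tgt = rng.normal(size=(len(tgt_vocab), d))   # target embeddings
W_enc = 0.1 * rng.normal(size=(d, d))          # encoder recurrence
W_dec = 0.1 * rng.normal(size=(d, d))          # decoder recurrence
W_out = rng.normal(size=(d, len(tgt_vocab)))   # output projection

def encode(tokens):
    """Fold the whole source sentence into a single context vector."""
    h = np.zeros(d)
    for tok in tokens:
        h = np.tanh(W_enc @ h + E_src[src_vocab.index(tok)])
    return h

def decode(context, max_len=10):
    """Greedily emit target tokens conditioned on the sentence-level context."""
    h, out, tok = context, [], "<s>"
    for _ in range(max_len):
        h = np.tanh(W_dec @ h + E_tgt[tgt_vocab.index(tok)])
        tok = tgt_vocab[int(np.argmax(W_out.T @ h))]
        if tok == "</s>":
            break
        out.append(tok)
    return out

print(decode(encode(["今天", "天气", "很好"])))  # random tokens until trained
```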

The Google Brain research team even stated that although the system still makes significant errors a human translator would never make, such as dropping words or translating sentences in isolation without regard to context, the quality of translation with the new technology approaches that of an average human translator.

Much as the Japanese expert mentioned above tested the new system's English-Japanese translation, in January 2017, at the time of President Trump's inauguration, the Chinese AI media outlet "New Intelligence" tested the new system's English-Chinese translation. Fed the English text of Trump's inauguration speech, Google Translate rendered the whole text into Chinese within a minute. New Intelligence concluded that, overall, Google's translation was impressively accurate, at roughly 70% to 80%; for texts that do not demand strict precision, the result is basically usable.

Microsoft has long had a sizeable natural language processing team. Unlike Google's, this team initially focused on rules-based translation, though it has since adopted statistical translation based on deep neural networks. Microsoft's machine translation system underpins many products in its line, such as Bing and Skype. In December 2014, the preview version of Skype Translator was released, initially supporting English-Spanish translation during calls, and it caused a stir; by April 2015 it already supported Mandarin Chinese. Although Skype's translation is still maturing and its accuracy needs further improvement, it has already fired imaginations about a future in which people converse freely across language barriers. In December 2016, Microsoft released what it billed as the world's first universal translator: besides speech recognition, photo recognition, and direct text input, it can even run real-time translated conversations among up to 100 people, a veritable translation marvel.

China has also excelled in machine translation.

iFlytek has consistently been at the world's forefront in speech synthesis, speech recognition, and semantic understanding, taking first place in the Chinese-English task of the international spoken language translation evaluation (IWSLT) twice, in 2014 and 2015. In 2015 its spoken-language machine translation system also won the NIST international evaluation. In the 2016 international knowledge graph construction competition (KBP), iFlytek took both first and second place in the core tasks on its first attempt, fully demonstrating world-class strength in natural language understanding and knowledge reasoning. Its multilingual real-time translation technology is at the global forefront, and it has its own translation marvel, the Xiaoyi multilingual translator. With such AI capabilities, iFlytek is ready to face any competition.

Like Google, Baidu, which began as a search engine and has a large corpus, is keen on machine translation. Launched in July 2011, Baidu Translate currently supports 28 languages on both PC and mobile. In May 2015 it officially launched its neural machine translation (NMT) system, becoming the world's first practical NMT system, more than a year ahead of Google. That same year Baidu Translate won a second prize of the State Science and Technology Progress Award, making Baidu the first Chinese internet company to receive the honor.

Baidu Translate has distinctive features of its own, such as object translation, smudge translation, and classical Chinese translation, which conveniently meet the translation needs of Chinese users anytime and anywhere, making it a handy assistant for work, life, travel, and study.

In November 2016, at the third World Internet Conference in Wuzhen, Li Yanhong optimistically predicted that in the coming years language barriers will be completely broken down, and that those who now work as simultaneous interpreters may well find themselves out of a job.

Coincidentally, the American futurist and singularity proponent Ray Kurzweil predicted in an interview with The Huffington Post that by 2029 machine translation will reach the quality of human translation.

Natural language processing and machine translation have achieved remarkable success, and the day when the dream of rebuilding the Tower of Babel comes true, allowing people of different ethnic groups and languages to communicate without barriers, is not far off.

Review of the AI Legends Column:

Sixth Installment | Speech Synthesis: Students Write the Legend of iFlytek

Fifth Installment | Deep Learning Takes the Stage in Speech Recognition

Fourth Installment | The Wings of Assistance

Third Installment | “Father of Deep Learning” Geoffrey Hinton

Second Installment | The Hot and Cold Winters of AI

First Installment | 2016: The Spring of AI



