Machine Translation: Born in the Cold War, Rebuilding Babel for Humanity

Source: Scientific American

Author: Chen Zongzhou

This article is about 5,200 words; the recommended reading time is 5 minutes.

This article reviews the history of machine translation and analyzes how this field, once in decline, has made a leap forward and may soon break down language barriers between different nations.

In March 2017, during the national “Two Sessions”, Premier Li Keqiang visited the Anhui delegation. Liu Qingfeng, chairman of iFlytek, picked up a small device resembling a mobile phone from the table and spoke into it the encouragement the Premier had once given iFlytek: let the world hear our voices. The machine immediately rendered it in fluent English. He then said, “This cantaloupe is very sweet,” and the machine instantly translated it into fluent Uyghur. The small device, called the Xiaoyi multilingual translator, is an iFlytek product.

One day in November 2016, Jun Rekimoto, a professor at the University of Tokyo and an expert in human-computer interaction, noticed a message on social media: Google Translate had improved dramatically. He visited the Google Translate page to try it for himself and was stunned.

He compared several sentences from two Japanese translators’ renderings of “The Great Gatsby” with Google Translate’s output, and found Google’s Japanese very fluent and, for him, easier to understand than the human translations.

He then fed the Japanese edition of Hemingway’s work into Google Translate to render it back into English, and found that the machine’s output bore an astonishing resemblance to Hemingway’s original English.

Both scenes involve machine translation: the former is speech translation, the latter text translation. The core problem in both is natural language understanding.

Early Development of Machine Translation

Machine translation (MT), also known as automatic translation, is the process of using a computer to convert one natural language (the source language) into another (the target language). It is a star technology in AI because it is the most powerful aid to barrier-free communication among people of different languages and ethnicities; solving machine translation would bring the dream of rebuilding the Tower of Babel within reach.

Just as the computer was born of war, the idea of machine translation also has military roots. Shortly after ENIAC, the first computer, appeared in 1946, the American scientist Warren Weaver and his colleagues at the Rockefeller Foundation pondered the future applications of computers. Recalling Alan Turing’s great success in breaking codes during World War II, they reasoned that translation resembled code-breaking, since both convert one set of symbols into another, and so could likewise be done by machine. Following this line of thought, Weaver published his memorandum “Translation” in 1949, formally proposing the idea of machine translation.

Pioneer of machine translation: Warren Weaver

After the idea of machine translation was proposed, it quickly gained attention; during the Cold War there was heavy demand for translating Russian intelligence material. In 1954, a laboratory jointly run by Georgetown University and IBM built the first machine translation demonstration system, which translated Russian into English. By today’s standards it was little more than a toy: it contained 250 words, followed six grammar rules, and could translate only 49 carefully selected sentences. Still, it was a remarkable achievement that ignited public enthusiasm. Reporters wrote excitedly: today, for the first time, an electronic brain has translated Russian into English. American defense agencies and computer scientists optimistically expected machine translation to be a reality within five years.

Machine translation also drew research interest in the Soviet Union, Japan, and Europe. Governments soon allocated funding, and a worldwide machine translation craze took hold.

The good times did not last, however. Progress slowed, and the field began to face widespread skepticism. In 1964, the U.S. National Academy of Sciences set up the Automatic Language Processing Advisory Committee (ALPAC) to evaluate machine translation research, and the committee spent two years investigating and testing. In November 1966 it released the ALPAC report, titled “Language and Machines,” which flatly denied the feasibility of machine translation: ten years of research had failed to meet expectations, there was no hope of a practical machine translation system in the near or foreseeable future, and funding should be stopped. The report dealt a heavy blow to the fledgling field, which quickly fell into a slump.

Why did machine translation research move so slowly? Because natural language understanding could not achieve a substantial breakthrough at the time. Natural Language Understanding (NLU) is an important AI discipline that deals with understanding speech and text; in plain terms, it tackles comprehension in listening and reading. Speech translation and text translation each have their own technical difficulties, but the core problem both face is natural language understanding. That is a very high, even ultimate, goal, so many researchers prefer another term for the discipline, Natural Language Processing (NLP), which emphasizes the process rather than the goal.

Language, formed over humanity’s long journey, is an extremely complex system, and early researchers underestimated that complexity. Methodologically, they hoped to find the rules of language quickly, just as a cipher becomes easy to break once its encoding rules are found: discover the rules, and you can understand natural language and thereby solve machine translation.

The rules of language, however, are staggeringly complex. Take grammar: it has been estimated that covering just 20% of real sentences requires at least tens of thousands of grammar rules, and pushing coverage to 50% means writing several new rules for every new sentence added. Because language continually evolves, real sentences vary almost without limit, and grammar rules can never be exhausted.

From the perspective of computational complexity, the Turing Award winner Donald Knuth showed theoretically how grammar relates to computational cost. If the grammar is context-free, the cost grows as the square of sentence length (the number of words); if it is context-sensitive, it grows as the sixth power. Parsing a sentence of twenty or thirty words can take several minutes even on today’s high-performance computers, so fully parsing an article or a long stretch of speech with a context-sensitive grammar is computationally infeasible. In the 1970s, even IBM, with its mainframes, could not analyze real sentences by a grammar-based approach.
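To get a feel for these growth rates, here is a quick back-of-the-envelope computation in Python using the exponents quoted above; the results are abstract operation counts, not seconds, since constant factors and real hardware are ignored:

```python
# Growth of parsing cost with sentence length n, using the exponents
# quoted above: n^2 for context-free grammars, n^6 for context-sensitive.
for n in (10, 20, 30):
    print(f"n={n:2d} words: n^2 = {n**2:>5,}   n^6 = {n**6:>13,}")
# A 30-word sentence already implies 729,000,000 units of work in the
# context-sensitive case, versus 900 in the context-free case.
```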

Of course, the analysis above assumes an unrestricted language environment. In practice, language use is always restricted: the language of a given culture, discipline, or context has its own characteristics, and within a restricted environment the problem is greatly simplified. Rule-based machine translation therefore continued to make progress and achieved real results, while another approach, statistical machine translation, began to emerge.

The Rise of Statistical Translation

In the fifth part of this series, we mentioned that in the 1970s Frederick Jelinek of IBM’s Watson laboratory proposed the theoretical framework of statistical speech recognition, succinctly reducing speech recognition to two Markov-based statistical models: an acoustic model and a language model. This framework has had a profound influence on both speech and language processing, and from then on natural language processing began to adopt statistical methods.
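In the standard noisy-channel formulation behind this framework (a textbook statement, paraphrased here rather than quoted from this article), recognition means finding the word sequence W that is most probable given the acoustic signal A, and Bayes’ rule splits that search into exactly the two models just named:

```latex
% P(A) is fixed for a given utterance, so it drops out of the argmax.
\hat{W} = \arg\max_{W} P(W \mid A)
        = \arg\max_{W} \underbrace{P(A \mid W)}_{\text{acoustic model}} \cdot \underbrace{P(W)}_{\text{language model}}
```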

A rule-based machine translation system requires large numbers of linguists to compile extensive dictionaries for the languages involved and to write a great many rules covering grammar, syntax, and semantics. The dictionaries and rule libraries constitute the system’s translation knowledge base, and the machine translates by consulting them, much as a human understands language and translates by consulting dictionaries and grammar books. The rules are extremely complex: for a vocabulary of a few hundred thousand words, a translation system may contain tens of thousands of grammar rules.
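To make that pipeline concrete, here is a deliberately tiny Python sketch of rule-based translation: a bilingual dictionary plus one grammar rule. The vocabulary and the rule are invented for illustration; a real system such as Systran carried vastly larger dictionaries and tens of thousands of rules:

```python
# Toy rule-based translation: dictionary lookup plus one reordering rule.
DICT = {"la": "the", "casa": "house", "blanca": "white", "es": "is", "grande": "big"}
ADJ = {"white", "big"}  # target-side adjectives, used by the reordering rule

def translate(sentence):
    # Step 1: word-for-word dictionary lookup.
    words = [DICT.get(w, w) for w in sentence.lower().split()]
    # Step 2: grammar rule: Spanish noun-adjective order becomes
    # English adjective-noun order (verbs and articles are left alone).
    i = 0
    while i < len(words) - 1:
        if words[i + 1] in ADJ and words[i] not in ADJ | {"the", "is"}:
            words[i], words[i + 1] = words[i + 1], words[i]
            i += 2
        else:
            i += 1
    return " ".join(words)

print(translate("la casa blanca es grande"))  # -> "the white house is big"
```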

Statistical translation, by contrast, sidesteps language rules. Jelinek, the founding figure of statistical language processing, famously quipped: “Every time I fire a linguist, the accuracy of speech recognition improves by 1%.” This extreme remark underscores his disregard for hand-written language rules.

Statistical translation uses large quantities of bilingual text to build a parallel corpus of the two languages. During translation, words are matched against the corpus (later systems matched phrases, short sentences, and even whole sentences), and the candidate translations are scored and selected according to their matching probabilities.
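As a toy illustration of “match against the corpus, then score by probability,” here is a word-level sketch in Python. The probability tables stand in for statistics a real system would estimate from a parallel corpus; every word and number below is invented:

```python
import itertools

TRANS = {  # P(english | french), as estimated from bilingual text
    "maison": {"house": 0.7, "home": 0.3},
    "bleue": {"blue": 0.9, "sad": 0.1},
}
LM = {  # P(english sentence), standing in for a real n-gram language model
    "blue house": 0.030, "blue home": 0.008,
    "sad house": 0.001, "sad home": 0.001,
}

def translate(french):
    best, best_p = None, 0.0
    # Enumerate every combination of word translations.
    for choice in itertools.product(*(TRANS[w].items() for w in french.split())):
        p_trans = 1.0
        for _, p in choice:
            p_trans *= p
        english = [e for e, _ in choice]
        # French places the adjective after the noun; try both orders
        # and let the language model prefer the fluent one.
        for cand in (" ".join(english), " ".join(reversed(english))):
            p = p_trans * LM.get(cand, 1e-6)
            if p > best_p:
                best, best_p = cand, p
    return best

print(translate("maison bleue"))  # -> "blue house"
```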

Another statistical method is to build a bilingual example library, an even larger corpus, and to translate by matching against the stored examples; a minimal sketch follows.
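A minimal sketch of this example-based idea, assuming a tiny invented example library and plain string similarity as the matcher (real systems use far more sophisticated fuzzy matching and recombine fragments of several examples):

```python
import difflib

# Invented bilingual example pairs; a real library holds millions.
EXAMPLES = {
    "how old are you": "quel age as-tu",
    "where are you from": "d'ou viens-tu",
}

def translate(sentence):
    # Retrieve the stored source sentence most similar to the input.
    match = difflib.get_close_matches(sentence.lower(), list(EXAMPLES), n=1, cutoff=0.5)
    return EXAMPLES[match[0]] if match else None

print(translate("How old are you?"))  # -> "quel age as-tu"
```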

By avoiding complex grammar rules, statistical translation is evidently simpler, and it was in fact the original idea of Weaver’s “Translation” memorandum. But statistical translation demands a large-scale corpus, which was not easy to obtain at the time, so the shift from rule-based to statistical natural language processing was not simple and took a long while; rule-based processing also continued to play a role after absorbing new techniques. As the internet spread, however, large-scale corpora were gradually assembled, and statistical translation finally took center stage.

Systran, the machine translation industry’s earliest developer and software vendor, is a living witness to this process. The commercial standard-bearer of the old generation of rule-based translation technology, Systran was founded in 1968 by Peter Toma, a scientist who had worked in the Georgetown University machine translation group mentioned above, together with the university’s machine translation R&D team. After the ALPAC report, government funding collapsed, yet Systran survived, one of the few machine translation companies to do so. In 1986 it was sold to a French family and later went public in France; in 2014 it was sold to a Korean company.

Systran Company

This small company of a few dozen employees has lived on its technology from the 1960s to the present, gradually expanding to support multilingual translation, its products embedded in the translation systems of companies such as Yahoo, Google, and AOL. Systran’s annual sales are only a little over ten million dollars, yet in the ten-billion-dollar machine translation market it once held a dominant share of embedded translation engines. “Our company is so small, yet we are the largest,” its chairman proudly said.

The decisive battle finally broke out in 2005. Google, by then the internet search giant, had been using Systran’s rule-based translation technology but had long wanted to put its vast corpus to full use. Starting in 2002, Google recruited the statistical language processing prodigy Franz Och to build a machine translation team. In the summer of 2005, Google’s still-experimental translation system, designed by Och, won a resounding victory in the machine translation evaluation organized by NIST (the National Institute of Standards and Technology), in which 100 news articles had to be translated from Arabic or Chinese into English. Google’s system came first in every category, beating all competitors, including IBM.

Och later revealed that they had fed the system text equivalent to a million books, which proved to be the key to improving translation quality. He also compared the Systran Chinese-English system in use at the time with their statistical experimental system and found the latter clearly superior.

The competition was seen as the official coronation of statistical machine translation.

Will the Tower of Babel Be Completed?

In October 2007, Google ended its cooperation with Systran and switched to its own statistical machine translation system. In 2010, Systran had to move to a hybrid rule-and-statistics system, later incorporating deep neural network technology as well. Systran’s change of course shows that statistical translation and deep-neural-network natural language processing had become the mainstream; at the same time, statistical translation began attending to details of grammar, syntax, and semantics to improve its systems.

Since then, machine translation has advanced rapidly, spawning one application after another, and has become a benchmark for measuring the AI strength of the major technology companies.

Google Translate is the most famous machine translation product. Since its launch in 2006, it has grown to support 103 languages, processing 18 million translations, 140 billion words, every day, and it has long been the industry benchmark. On September 28, 2016, Google released its new neural machine translation system, GNMT, launching PC and mobile versions simultaneously. Instead of the traditional method of splitting a sentence into pieces for translation, the system uses the full context to encode and decode each sentence as a whole, producing smoother translations. Reportedly, the new technology cut translation errors by at least 60%, and it markedly improved the notoriously difficult Chinese-English pair.
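To make the “encode the whole sentence, then decode with its context” idea concrete, here is a minimal sequence-to-sequence sketch in Python with PyTorch. It is a generic illustration of the encoder-decoder pattern, not Google’s actual GNMT (which stacked deep LSTMs with attention); the sizes and random inputs are invented:

```python
import torch
import torch.nn as nn

SRC_VOCAB, TGT_VOCAB, EMB, HID = 1000, 1000, 32, 64

class Seq2Seq(nn.Module):
    def __init__(self):
        super().__init__()
        self.src_emb = nn.Embedding(SRC_VOCAB, EMB)
        self.tgt_emb = nn.Embedding(TGT_VOCAB, EMB)
        self.encoder = nn.LSTM(EMB, HID, batch_first=True)
        self.decoder = nn.LSTM(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, TGT_VOCAB)

    def forward(self, src, tgt):
        # Encode the whole source sentence into a single hidden state...
        _, state = self.encoder(self.src_emb(src))
        # ...then decode the target sentence conditioned on that state,
        # so every output token sees the full sentence context.
        dec_out, _ = self.decoder(self.tgt_emb(tgt), state)
        return self.out(dec_out)  # per-position scores over the target vocabulary

model = Seq2Seq()
src = torch.randint(0, SRC_VOCAB, (1, 7))  # a 7-token "source sentence"
tgt = torch.randint(0, TGT_VOCAB, (1, 5))  # a 5-token "target prefix"
print(model(src, tgt).shape)               # torch.Size([1, 5, 1000])
```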

The Google Brain research team even stated that although the system still makes significant errors a human translator never would, such as dropping words or translating sentences in isolation from their context, the new technology brought translation quality close to that of average human translators.

Much like the Japanese expert’s test of the new system’s English-Japanese translation, in January 2017, at the time of President Trump’s inauguration, the Chinese AI media outlet “Xinzhi Yuan” put Google’s new system through an English-Chinese test. Fed the English text of Trump’s inauguration speech, Google Translate delivered a complete Chinese translation within a minute. Xinzhi Yuan’s verdict: overall, Google Translate’s accuracy was very impressive, roughly 70% to 80%, and for text that does not demand extreme precision it is basically usable.

Microsoft has a sizable natural language processing team and, unlike Google, initially focused on rule-based translation, though it has since adopted deep-neural-network statistical translation. Microsoft’s machine translation system underpins many products in its line, such as Bing and Skype. In December 2014, Microsoft launched a preview of Skype Translator, which at first supported only English-Spanish translation during calls yet caused a sensation; by April 2015 it also supported Mandarin Chinese. Although Skype’s call translation is still maturing and its accuracy needs further improvement, it has opened up the beautiful prospect of people who speak different languages conversing freely, without barriers. In December 2016, Microsoft released what it billed as the world’s first universal translator: besides speech recognition, photo recognition, and direct text input, it can even support real-time translated conversation among up to 100 people, a veritable translation wonder.

China has also performed excellently in machine translation.

iFlytek has long been at the global forefront of speech synthesis, speech recognition, and semantic understanding, taking first place in the international spoken language translation evaluation (IWSLT) for Chinese-English twice, in 2014 and 2015. In 2015, its spoken-language machine translation system won the NIST international evaluation championship. In the 2016 knowledge base population evaluation (KBP), iFlytek took both first and second place in the core tasks on its first attempt, amply demonstrating world-class strength in natural language understanding, knowledge reasoning, and related fields. Its multilingual real-time translation technology is at the global forefront, and it has a translation wonder of its own, the Xiaoyi multilingual translator. With such AI strength, iFlytek can face any competition.

Like Google, Baidu began as a search engine and commands a vast corpus, and it too is unwilling to fall behind in machine translation. Baidu Translate launched in July 2011 and currently supports 28 languages on both PC and mobile. In May 2015, Baidu Translate brought its neural machine translation (NMT) system online, the world’s first practical NMT system, more than a year ahead of Google. In the same year, Baidu Translate won China’s State Science and Technology Progress Award (second class), making Baidu the first Chinese internet company to receive the honor.

Baidu Translate also has distinctive features of its own, pioneering functions such as object translation, smear translation, and classical Chinese translation, which meet the translation needs of Chinese users anytime, anywhere, a fine assistant for work, life, travel, and study.

In November 2016, at the third World Internet Conference in Wuzhen, Li Yanhong (Robin Li) optimistically predicted: in the coming years, language barriers will be completely broken, and today’s simultaneous interpreters may well be out of a job.

Coincidentally, the futurist Ray Kurzweil, proponent of the Singularity theory, predicted in an interview with The Huffington Post that by 2029 the quality of machine translation will reach the level of human translation.

Natural language processing and machine translation have achieved remarkable things; the day when Babel is rebuilt and people of different ethnicities, speaking different languages, communicate without barriers may not be far off.

Chen Zongzhou

President of Scientific American magazine, founder of Computer Weekly.

