New Intelligence Report
Author: Hu Xiangjie
【New Intelligence Overview】 With AlphaGo’s victory over Ke Jie, the panic triggered by AI has spread beyond the Go community into almost every field, with translation particularly hard hit. The emergence of deep learning has greatly transformed machine translation: since 2013, neural network-based machine translation has lifted translation speed and accuracy to new heights. Amid fierce competition among tech giants and a flourishing body of academic research, machine translation continues to improve, and surpassing human levels may be merely a matter of time. In this new intelligent era, will the “ancient” profession of translation disappear?
“Translators are likely to see some job opportunities continuously disappear, and they must get used to a kind of ‘entrepreneurial thinking’.”
On May 27, Chinese Go master Ke Jie lost the last game against AlphaGo, concluding the match with a score of 0:3. The panic triggered by AI has spread not only in the Go community but also into almost every field, with translation being particularly impacted. Currently, companies like Google provide free translation services worldwide, and they can already deliver “understandable” translation results.
Recently, Oxford University completed a large survey of machine learning researchers regarding their views on AI progress. Synthesizing these researchers’ predictions, it is anticipated that within the next decade, AI will outperform humans in many activities, including language translation (by 2024), as detailed in the table below:
In recent years, the most significant impact of deep learning on translation has come from neural machine translation (NMT), a technology that has greatly improved the accuracy of machine translation.
When it was launched ten years ago, Google Translate used phrase-based machine translation (PBMT). A few years ago, the Google Brain team began using recurrent neural networks (RNNs) to learn the mapping from an input sequence directly to an output sequence. PBMT translates a sentence by breaking it into words and phrases and translating each largely independently, while NMT treats the entire input sentence as a single unit to be translated. This approach greatly reduces the adjustments needed during translation.
When NMT technology first emerged, it achieved results comparable to PBMT on moderately sized public datasets. Since then, researchers in machine translation have proposed numerous methods to improve NMT, including using attention mechanisms to align inputs and outputs, breaking words into smaller units, or mimicking external alignment models to handle rare words. Nevertheless, NMT’s performance was still not sufficient for large-scale deployment in products.
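One of those improvements, breaking words into smaller units, is commonly implemented with byte-pair encoding (BPE): the most frequent adjacent symbol pair is repeatedly merged into a new subword unit. A toy Python sketch of the merge-learning step (illustrative only; production NMT systems learn tens of thousands of merges from large corpora):

```python
from collections import Counter

def learn_bpe_merges(words, num_merges):
    """Learn merge rules: repeatedly fuse the most frequent adjacent symbol pair."""
    # Start by representing each word as a tuple of single characters.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs across the vocabulary.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite every word, fusing each occurrence of the chosen pair.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

merges = learn_bpe_merges(["lower", "lowest", "low", "low"], num_merges=2)
print(merges)  # [('l', 'o'), ('lo', 'w')]
```

After enough merges, frequent words survive intact while rare words decompose into known subwords, which is what lets the model handle vocabulary it has never seen whole.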
The animated image below demonstrates how GNMT performs Chinese-to-English translation. First, the network encodes the Chinese characters (input) into a sequence of vectors, each representing the meaning read so far (e.g., e3 represents “Knowledge is”, e5 represents “Knowledge is power”). After reading the entire sentence, the decoder begins generating the English output one word at a time.
To generate each translated English word, the decoder attends to a weighted distribution over the encoded Chinese vectors, focusing on those most relevant to the word being generated (the transparent blue lines above the decoder d); the more attention the decoder pays to an encoded vector, the darker its blue line.
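The attention step described here can be sketched numerically: the current decoder state is scored against each encoder vector, and a softmax turns the scores into the weighted distribution that determines how dark each line is drawn. A toy dot-product attention sketch in plain Python (not the actual GNMT implementation, which computes these scores with trained parameters):

```python
import math

def attention_weights(decoder_state, encoder_vectors):
    """Dot-product attention: score each encoder vector against the
    current decoder state, then normalize the scores with a softmax."""
    scores = [sum(d * e for d, e in zip(decoder_state, vec))
              for vec in encoder_vectors]
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]

def context_vector(weights, encoder_vectors):
    """Weighted sum of encoder vectors -- what the decoder actually 'reads'."""
    dim = len(encoder_vectors[0])
    return [sum(w * vec[i] for w, vec in zip(weights, encoder_vectors))
            for i in range(dim)]

# Toy example: three encoder vectors; the second is most similar to the decoder state.
enc = [[1.0, 0.0], [0.9, 0.9], [0.0, 1.0]]
dec = [1.0, 1.0]
w = attention_weights(dec, enc)
print(w)  # the middle weight is the largest
```

The weights always sum to 1, so they behave as the “distribution” the text describes: the darkest line corresponds to the largest weight.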
Using side-by-side human evaluation, the translations generated by the GNMT system improved markedly over the previous system: on several major language pairs, GNMT reduced translation errors by 55%-85%.
Additionally, the Google Brain team announced the launch of the GNMT Chinese-English and English-Chinese trial version. Currently, the mobile and web versions of Google Translate for Chinese-English are the first to use GNMT, handling 18 million translation tasks daily.
The Google Brain team stated that the launch of GNMT was made possible by TensorFlow and the deep learning-specific accelerators, Tensor Processing Units (TPUs), especially the latter, which provided sufficient computational power to deploy these powerful GNMT systems while meeting Google’s strict latency requirements. The team indicated that more language services will be continuously rolled out to users in the coming months.
The challenges of machine translation remain. GNMT can still make errors a human translator never would, such as dropping words, mistranslating proper nouns or rare terms, and translating sentences in isolation without considering the meaning of the paragraph or the full text. In short, there is still plenty of room for GNMT to improve, but it nonetheless represents a significant milestone. The team thanked the researchers and engineers inside and outside Google who have contributed to this work in various forms over the past few years.
Google’s latest technology has achieved an accuracy rate of up to 87% when translating English to Spanish.
Google Translate is now available in China, which is seen as a precursor to Google’s plan to return to China.
It is not only Google that sees the huge value of machine translation; Chinese companies like Baidu, Huawei, Alibaba, and Tencent are also conducting research, and giants like Facebook and Microsoft are not falling behind. This competitive landscape will greatly accelerate the commercialization of machine translation, making it available to more people.
1. Baidu: Leading in Interpretation, a Year Ahead of Google
On December 21, Baidu held an open day for machine translation technology. Dr. Wu Hua, the person in charge, stated that Google Translate does well in statistical machine translation and is in a leading position, but Baidu is ahead in neural network-based machine translation. Moreover, Google Translate is centered around English, while Baidu’s focus is on Chinese. Additionally, Baidu is somewhat ahead in voice translation.
In an exclusive interview with New Intelligence, she said: “Google Translate is in a leading position, but our advantage lies in that we are somewhat ahead in neural network technology. Google Translate’s press release references many of our previously published articles, which can be checked if one pays attention. We are ahead in neural networks, although they still lead in statistical translation.”
She also added: “In the online translation system, we have clearly surpassed Google in spoken translation; anyone can try it out.”
2. Huawei: On Par with Google Translate, Enhancing Translation Quality
Huawei’s Noah’s Ark Lab proposed a new neural machine translation (NMT) model in a paper accepted by AAAI 2017, introducing a reconstruction-based fidelity metric, which showed that the model effectively improved machine translation performance. Researchers from Huawei’s Noah’s Ark Lab stated that their NMT technology is on par with Google.
Researchers evaluated Google’s, Microsoft Bing’s, and Noah’s Ark’s systems on the same test dataset (Baidu Translate could not be compared directly because its system had already recorded the test set), with results shown in the figure below. The metric used is the industry-standard BLEU score; human translations generally score in the 50-70 range.
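BLEU, the metric used in this comparison, is roughly the geometric mean of clipped n-gram precisions against a reference translation, scaled by a brevity penalty that punishes overly short output. A simplified single-sentence, single-reference sketch (real evaluations are corpus-level and typically smoothed):

```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of clipped n-gram
    precisions for n = 1..max_n, times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        # Clipped matches: a candidate n-gram counts at most as often
        # as it appears in the reference.
        matches = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        if matches == 0:
            return 0.0          # unsmoothed BLEU collapses to 0 here
        log_prec += math.log(matches / sum(cand_ngrams.values())) / max_n
    bp = min(1.0, math.exp(1 - len(ref) / len(cand)))  # brevity penalty
    return 100 * bp * math.exp(log_prec)

print(round(bleu("knowledge is power indeed", "knowledge is power indeed"), 1))  # 100.0
```

A perfect match scores 100; in practice even good human translations land well below that, which is why the 50-70 range cited above counts as human-level.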
3. Facebook Uses CNN Technology Instead of Traditional RNN, Translation Speed is 9 Times Faster than Google
Facebook today released a new machine translation technology that uses CNNs in place of the traditional RNNs. Its accuracy surpasses that of Google’s machine translation, previously hailed as one of the top ten AI breakthroughs of 2016, and it runs 9 times faster; Facebook claims this sets a new world record. The technology has been open-sourced.
Facebook stated in its official blog that, compared with RNNs, its CNN-based model reached a new state of the art on the public benchmark datasets from the Workshop on Machine Translation (WMT). In particular, on WMT 2014 English-French, the most widely used dataset for evaluating machine translation accuracy, the CNN model beat the previous record by 1.5 BLEU; on WMT 2014 English-German the improvement was 0.4 BLEU, and on WMT 2016 English-Romanian it was 1.8 BLEU.
For neural network-based machine translation to be practical, one key consideration is how long the system takes to return a translation once a sentence is entered. FAIR’s CNN model is computationally very efficient, running 9 times faster than the strongest RNN system. Much research has focused on raising speed further through weight quantization or distillation; these methods can also be applied to CNN models, potentially yielding even larger gains. This indicates that CNNs have enormous potential.
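Weight quantization, one of the speed-up techniques mentioned, stores model weights at lower precision (e.g. 8-bit integers) so that inference arithmetic is cheaper and the model is smaller. A minimal symmetric-quantization sketch in plain Python (real systems quantize whole tensors per layer, often with calibration data):

```python
def quantize(weights, num_bits=8):
    """Symmetric linear quantization: map floats in [-max_abs, max_abs]
    onto signed integers, returning (int_weights, scale)."""
    qmax = 2 ** (num_bits - 1) - 1          # 127 for 8 bits
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / qmax                  # real value represented by one step
    return [round(w / scale) for w in weights], scale

def dequantize(int_weights, scale):
    """Recover approximate float weights from the integer representation."""
    return [q * scale for q in int_weights]

w = [0.51, -1.27, 0.0, 0.98]
q, scale = quantize(w)
restored = dequantize(q, scale)
# Each restored value differs from the original by at most half a scale step.
print(max(abs(a - b) for a, b in zip(w, restored)))
```

The trade-off is exactly the one the paragraph implies: a small, bounded rounding error in exchange for integer arithmetic that runs much faster on most hardware.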
4. Alibaba: 250 Billion Calls a Year, Saving $2.5 Billion
Since October 2016, the Alibaba translation team has been developing its own NMT model; in November 2016 it first applied the NMT system’s output in external evaluations for Chinese-English messaging scenarios, achieving good results and significantly improving translation quality.
In April 2017, the distributed NMT system greatly improved training speed during the English-Russian e-commerce translation quality optimization project, reducing model training time from 20 days to 4 days, saving significant time costs for the overall project iteration and advancement.
The academic community’s interest in neural machine translation (NMT) remains strong. As of May this year, the number of research papers on NMT published on the open-access paper site arXiv.org is nearly equivalent to the total number of papers on this topic published in 2016. The fervor in the research field provides the strongest technical support for commercially viable translation technologies.
As of May 7, there are 137 papers in the arXiv.org repository that include NMT in their titles or abstracts, with only 7 published in 2014, increasing to 11 in 2015. A breakthrough occurred in 2016, with 67 papers published.
Tencent has contributed two papers this year. One is from its AI Lab in Shenzhen (“Modeling Source Syntax for Neural Machine Translation”); the other is from Tencent’s mobile internet department (“Deep Neural Machine Translation with Linear Associative Unit”), a joint study with Soochow University, the Chinese Academy of Sciences, and University College Dublin.
Microsoft Research Asia in Beijing also began research on NMT this year. They uploaded two papers this month (“Adversarial Neural Machine Translation” and “MAT: A Multimodal Attentive Translator for Image Captioning”).
- Google Paper: https://arxiv.org/abs/1703.03906
- Harvard University Paper: https://arxiv.org/abs/1701.02810
- Facebook Paper: https://s3.amazonaws.com/fairseq/papers/convolutional-sequence-to-sequence-learning.pdf
- Tencent Paper: https://arxiv.org/abs/1705.01020
- China Mobile Paper: https://arxiv.org/abs/1705.00861
- Microsoft Paper: https://arxiv.org/abs/1704.06933
On the same day that Ke Jie played his third match against AlphaGo, Professor Jung Jae-seung from the Korea Advanced Institute of Science and Technology spoke at a forum titled “The Future of Artificial Intelligence and Translation”, stating that AI-driven translation will take on a significant portion of the work currently done by human translators.
“If understanding the culture between different languages and generating the best corresponding text can be defined as good translation, then AI-driven translation, which can collect vast amounts of data from different cultures, will definitely surpass humans in the end,” he said.
He also mentioned a human-machine translation competition organized by the International Interpretation and Translation Association in February this year. The results of that competition indicated that if speed and cost are disregarded, humans currently have a slight edge over machines in translation accuracy.
“We should not assume that this gap will persist in the future,” Jung said, “Although it is hard to accept, considering the ample data available, tech companies like Google will have a huge advantage. Just as AlphaGo defeated Lee Sedol, we do not know whether it truly understands the rules of the game. AI-driven translation may also leap over the stage of understanding sentences and surpass humans in translation.”
Currently, the biggest advantages of AI-driven translation are its free nature and speed. “If AI-driven translation can achieve 93% accuracy, with almost no cost and extremely fast speed, then people will use it in most translation scenarios,” Jung said.
Similar disruptive changes are occurring in interpretation, where speed matters far more than in written translation. AI-driven translation can render speech almost word by word in real time, producing both audio and text; machines can currently handle dozens of languages.
However, Jung also indicated that AI-driven translation has some positive aspects as it can aid the development of human translation. “By analyzing the various features of AI-driven translation and identifying its strengths and weaknesses, translators can work more effectively. Translation and interpretation departments need to proactively incorporate AI-driven translation into their curricula,” he said.
Where will translation head in the future? Jung summarized that “personalization” and “entrepreneurial thinking” will be essential. For high-level uses of language, such as literature, AI-driven translation may underperform due to a lack of data. Translators should emphasize their humanity, adding personalized elements to translations to enhance readability.
He said: “We are in an era where machines continuously and ruthlessly question us—what is the value of your work? Is it creative? Translators need to find new ways to contribute, which is very different from before.”