
The short-term goal of machine translation is to achieve basic correctness of transformation at the language level. The urgent problems to be solved are the linguistic interpretation of neural machine translation systems, the full use of implicit corpus knowledge and external knowledge, and machine translation learning methods. The corpus interconnection path, the corpus cognitive-pragmatics processing path and the context neural learning path are the necessary paths toward intelligent machine translation.
The experimental results show that these linguistic paths can greatly improve the quality of machine translation, and they confirm that it is feasible to intervene in neural machine translation from a linguistic perspective. This article presents the importance of the corpus cognitive-pragmatics processing path and the problems it currently faces.
Corpus cognitive-pragmatics processing path

Through cognitive-pragmatics processing, a machine can recognize and retain the lexical, syntactic, grammatical, logical and pragmatic information contained in a corpus, look up and translate out-of-vocabulary words online, and reconstruct language structure.
The continuous improvement of the relevant data models points to the future direction of corpus research and linguistics. Cognitive-pragmatics processing built on corpus interconnection can make full use of this interconnection information, and it is the only way to realize intelligent machine translation. At present, corpus cognitive-pragmatics processing faces three obstacles:
The first is the cognitive-pragmatics barrier of corpus structure. The corpora currently used in machine translation are organized by “line”, whereas the basic unit of linguistic research is the “sentence”. When computing word vectors, machine translation does not take the pragmatic factors of natural language into account, so the unit it processes is not necessarily a normal, complete language fragment.
For example, the sentence “我们去上班” (“We are going to work”) may yield an unnatural pragmatic structure such as “我们去上”, which cuts the word “上班” (“go to work”) in half. Without guidance from the cognitive structure of a linguistic corpus, the structures that machine translation works with are incomplete, and the resulting translations are hard to understand.
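The fragment problem in this example can be illustrated with a minimal Python sketch. It assumes a hand-supplied word segmentation of “我们去上班” and a fixed-size character window. Both of these, along with the function names `word_boundaries` and `split_fragments`, are hypothetical choices made for illustration and are not taken from the source paper or from any particular machine translation system. The sketch simply lists the windows that cut through a word.

```python
from itertools import accumulate


def word_boundaries(words):
    """Character offsets at which a word starts or ends (assumed segmentation)."""
    return {0, *accumulate(len(w) for w in words)}


def split_fragments(words, size):
    """Return fixed-size character windows that cut through at least one word."""
    text = "".join(words)
    bounds = word_boundaries(words)
    fragments = []
    for start in range(len(text) - size + 1):
        end = start + size
        # A window is a natural fragment only if both edges sit on word boundaries.
        if start not in bounds or end not in bounds:
            fragments.append(text[start:end])
    return fragments


# Hypothetical segmentation, supplied by hand for illustration only.
words = ["我们", "去", "上班"]
print(split_fragments(words, 4))  # ['我们去上', '们去上班']
```

With a window of four characters, every window over this sentence breaks a word, including the “我们去上” fragment mentioned above. A window of three characters leaves “我们去” and “去上班” intact and breaks only “们去上”.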
The second obstacle is the lack of a breakthrough in corpus cognitive-pragmatics theory. Whether a system “obeys” the syntactic structure of the source-language sentence or copies from an existing bilingual corpus, once it ignores the specific factors of the specific translation context, the result will be a very strange translation. At present, no effective cognitive-pragmatics theory exists to guide the machine translation process.
The third is how to integrate cognitive-pragmatics theory into machine translation. The philosophical presuppositions of cognitive linguistics as a whole have become a “thought alien” that is difficult for machine translation to understand.
As a result, the field of machine translation can only turn a blind eye to that work. How to make sense of the “thought alien” of existing cognitive-pragmatics theory within machine translation is therefore a problem facing both the computing and linguistics communities.
Overcoming these obstacles requires three things together: a corpus context reconstruction scheme consistent with the cognitive-pragmatics path, a new breakthrough in corpus cognitive-pragmatics theory, and a concrete method for integrating that theory into machine translation. The real meaning of language can only be defined by the pragmatic context in which we ordinarily understand it.
Source: Zhao Huijun, Lin Guobin, 2020, Research on the Linguistic Path to Intelligent Machine Translation, Foreign Language Electronic Teaching [A], (2): 42-47.
Translated by: Editor
Guided by: Wang Xiaoxi
Edited by: Qi Jiawei
Reviewed by: Wang Yuting