How does machine translation work?

Not long ago, a video of real-time translation went viral online. In it, Skype's real-time translation feature translated what English speakers said as they talked. A scene that once existed only in science fiction is now a reality, and it is all thanks to machine translation technology.

So what is machine translation? Machine translation, also known as automatic translation, is the process of using computers to transform one natural language into another.

Implementation of machine translation

With the rapid development of science, technology, and the economy, global interconnectedness has become an irresistible trend. So how can people in different countries communicate effectively and at low cost? Human translation is expensive. Perhaps the best solution is to make full use of machine translation technology to provide intelligent, automatic translation services. Machines do not get tired, and they learn quickly. A single system can handle translation among more than a dozen languages at the same time, and it may not have the blind spots that human translators do.

How machine translation works

But the complexity of language is well known. When even people misunderstand one another, how can a cold machine translate a language? Does it think?

Let's explore the implementation of machine translation technology.

The current mainstream approach to machine translation is called statistical machine translation.

The basic principle of statistical machine translation is to automatically learn translation knowledge from a large number of translation examples in a corpus, and then to use that knowledge to translate new sentences automatically.

For example, for a machine to translate between Chinese and English, we first need to collect a large number of Chinese-English sentence pairs, and then have a computer count and learn translation knowledge from those sentence pairs.
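To make "counting and learning from sentence pairs" concrete, here is a minimal sketch. The article does not say which learning algorithm is used, so this assumes IBM Model 1, a classic word-alignment model trained with expectation-maximization. The three-sentence corpus, the pre-segmented Chinese tokens, and the function names are all hypothetical and chosen only for illustration; a real system learns from millions of sentence pairs.

```python
from collections import defaultdict

# Hypothetical toy corpus of pre-segmented Chinese-English sentence pairs.
PAIRS = [
    (["这", "房子"], ["this", "house"]),
    (["这", "书"], ["this", "book"]),
    (["一本", "书"], ["a", "book"]),
]


def train_ibm_model1(pairs, iterations=20):
    """Estimate t(english_word | chinese_word) with EM (IBM Model 1, no NULL word)."""
    target_vocab = {e for _, tgt in pairs for e in tgt}
    uniform = 1.0 / len(target_vocab)
    t = defaultdict(lambda: uniform)  # start with no preference at all
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (e, f) being aligned
        total = defaultdict(float)  # expected count of f being aligned to anything
        for src, tgt in pairs:
            for e in tgt:
                # E-step: share one unit of "credit" for e among the source words.
                norm = sum(t[(e, f)] for f in src)
                for f in src:
                    frac = t[(e, f)] / norm
                    count[(e, f)] += frac
                    total[f] += frac
        # M-step: renormalize the expected counts into probabilities.
        t = defaultdict(float, {(e, f): c / total[f] for (e, f), c in count.items()})
    return dict(t)


def best_translation(t, source_word):
    """Return the English word with the highest learned probability."""
    candidates = {e: p for (e, f), p in t.items() if f == source_word}
    return max(candidates, key=candidates.get)


table = train_ibm_model1(PAIRS)
for word in ["这", "房子", "书", "一本"]:
    print(word, "->", best_translation(table, word))
```

No dictionary is given to the program. It works out which words belong together only because "书" keeps appearing alongside "book" while its neighbors change, which is the sense in which the knowledge is "counted" from examples.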

Seeing this, you may think machine translation does not sound difficult. Isn't it just a matter of collecting enough words and example sentences?

Of course not!

Getting a machine to learn translation knowledge is not easy.

Human language is highly complex. First, many words and expressions are ambiguous, and their meaning depends on the specific context in which they are used. Even the same sentence can mean different things in different contexts. In such cases it is not only foreign learners who get confused; the machine can be expected to be confused as well.

Second, word order differs between languages. For example, in a phrase such as "one of the best friends", Chinese puts the quantifier at the end while English puts it at the front, so the translation of "one" has to be moved forward.

Furthermore, the same sentence may have many correct translations, which increases the uncertainty of the machine learning process. For example, a simple greeting can be translated as "Hello", "How do you do", and so on.

Therefore, an excellent machine translation system must master translation knowledge at every level: words, phrases, grammatical structures, and semantics.

Take Chinese-to-English translation as an example. The system must first acquire translation knowledge of words, phrases, and grammatical structures between Chinese and English. Using this knowledge, it cuts the Chinese sentence into combinations of words, phrases, or grammatical structures, translates each unit separately, and finally combines the results into the English translation. In this process there are thousands of possible segmentations, and each unit also has multiple candidate translations.
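The segment, translate, and combine steps can be sketched as follows. This is a simplified illustration and not the method of any particular system. The phrase table, its probabilities, and the function names are hypothetical, units are translated in their original order with no reordering, and each candidate is scored only by the product of its unit probabilities, whereas real systems also use reordering and language-model scores.

```python
from itertools import product

# Hypothetical phrase table: each Chinese unit maps to candidate English
# translations with made-up probabilities.
PHRASE_TABLE = {
    "我": [("i", 1.0)],
    "喜欢": [("like", 0.7), ("love", 0.3)],
    "机器": [("machine", 0.9), ("machines", 0.1)],
    "翻译": [("translation", 0.6), ("translate", 0.4)],
    "机器翻译": [("machine translation", 0.95)],
}


def segmentations(sentence, table):
    """Yield every way to cut `sentence` into units that appear in `table`."""
    if not sentence:
        yield []
        return
    for end in range(1, len(sentence) + 1):
        unit = sentence[:end]
        if unit in table:
            for rest in segmentations(sentence[end:], table):
                yield [unit] + rest


def candidates(sentence, table):
    """Yield (english_text, score, units) for every segmentation and choice."""
    for units in segmentations(sentence, table):
        # Pick one candidate translation per unit, in every possible combination.
        for choice in product(*(table[u] for u in units)):
            text = " ".join(words for words, _ in choice)
            score = 1.0
            for _, prob in choice:
                score *= prob
            yield text, score, units


def translate(sentence, table):
    """Return the highest-scoring candidate (raises ValueError if there is none)."""
    return max(candidates(sentence, table), key=lambda c: c[1])


text, score, units = translate("我喜欢机器翻译", PHRASE_TABLE)
print(units, "->", text)
```

Even this seven-character sentence with a five-entry table produces several competing candidates. With a realistic table the number of combinations grows very quickly, which is why practical decoders rely on dynamic programming and pruning instead of listing every candidate as this sketch does.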

You might not have expected it: in the blink of an eye, the system has worked through thousands of such possibilities.