User:A Pirard/Google Translate en


Translation methodology

Google Translate does not apply grammatical rules, since its algorithms are based on statistical analysis rather than traditional rule-based analysis. Indeed, the system's original creator, Franz Josef Och, has criticized the effectiveness of rule-based algorithms in favor of empirical approaches.[1] Google Translate is based on a method called statistical machine translation and, more specifically, on research by Och, who won the DARPA contest for speed machine translation in 2003. He is now the head of Google's machine translation group.[2]

Understanding how Google Translate works helps achieve better translations.
Google does not usually translate directly from one language to another (L1 -> L2). Most often it translates first into English and then into the target language (L1 -> EN -> L2) [3] [4] [5] [6]. This, of course, reduces the number of dictionaries needed from about 2500 (70×70÷2) to about 70.
But English, like all languages and even more so, is ambiguous and depends on context. This causes translation errors. For example, translating vous from French to Russian gives vous -> you -> ты OR вы [7]. If Google used an unambiguous artificial language as the intermediary, it would give vous -> you2 -> вы and tu -> you1 -> ты. Suffixing words in this way disambiguates their different meanings.
Hence, publishing in English, using unambiguous words, providing context, and using expressions such as "you all" often yield a better one-step translation.
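
The effect of the intermediate language can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy dictionaries, the function names translate_via and dictionaries_needed, and the you1/you2 pivot words taken from the example above. It says nothing about how Google actually stores its data. The exact number of language pairs for n languages is n×(n−1)÷2, which for 70 languages is close to the rough figure quoted above.

```python
# Toy dictionaries; every entry is an illustrative assumption.
FR_TO_EN = {"tu": ["you"], "vous": ["you"]}
EN_TO_RU = {"you": ["ты", "вы"]}

# Hypothetical unambiguous intermediary with suffixed words.
FR_TO_PIVOT = {"tu": ["you1"], "vous": ["you2"]}
PIVOT_TO_RU = {"you1": ["ты"], "you2": ["вы"]}


def translate_via(word, first, second):
    """Return every candidate reachable through the intermediate language."""
    candidates = []
    for middle in first[word]:
        for target in second[middle]:
            if target not in candidates:
                candidates.append(target)
    return candidates


def dictionaries_needed(n_languages, pivot):
    """Count bidirectional dictionaries with or without a pivot language."""
    if pivot:
        # One dictionary between the pivot and each other language.
        return n_languages - 1
    # One dictionary per unordered pair of languages.
    return n_languages * (n_languages - 1) // 2


print(translate_via("vous", FR_TO_EN, EN_TO_RU))        # ambiguous result
print(translate_via("vous", FR_TO_PIVOT, PIVOT_TO_RU))  # single result
print(dictionaries_needed(70, pivot=False), dictionaries_needed(70, pivot=True))
```

Through English, the distinction between tu and vous is lost at the intermediate step and cannot be recovered; through the suffixed intermediary it survives.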

Overlooking the grammar of the language can cause mistakes. For example, consider the following sentence:
Пишет (3rd person: it writes) вам (dative: to you (all)) письмо (letter) семья (family) Дарьи (genitive: of Daria).
Based on the word order, Google translates: You wrote a letter to family Darya[8].
Based on declensions (word functions), it means: [it's] Daria's family [that] writes you a letter, exactly the opposite.
Google took "to you" for "you", "of Daria" for "Daria", and "the family" for "to the family".
When translating back to Russian, however, Google says: Семья Дарьи пишет вам письмо[9].
That's correct because Google understood the English word order.
Respecting the same word order as in English or publishing in English as above may help.
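
The difference between reading by word order and reading by declension can be sketched in Python. This is only an illustration under stated assumptions: the case tags are annotated by hand, and the functions roles_by_position and roles_by_case are hypothetical names, not a description of Google's system.

```python
# The example sentence, hand-annotated: (word, English gloss, tag).
SENTENCE = [
    ("Пишет", "writes", "verb"),
    ("вам", "you", "dative"),
    ("письмо", "a letter", "accusative"),
    ("семья", "the family", "nominative"),
    ("Дарьи", "Daria", "genitive"),
]

# Role each case plays in this sentence.
CASE_TO_ROLE = {
    "nominative": "subject",
    "accusative": "object",
    "dative": "recipient",
    "genitive": "possessor",
}


def roles_by_position(tokens):
    """Naive reading: assign roles in a fixed order, ignoring the cases."""
    order = ["subject", "object", "recipient", "possessor"]
    glosses = [gloss for _, gloss, tag in tokens if tag != "verb"]
    return dict(zip(order, glosses))


def roles_by_case(tokens):
    """Reading by declension: the case tag decides the role."""
    return {CASE_TO_ROLE[tag]: gloss for _, gloss, tag in tokens if tag != "verb"}


print(roles_by_position(SENTENCE))
print(roles_by_case(SENTENCE))
```

The positional reading makes "you" the subject and "the family" the recipient, as in the faulty translation, while the case-based reading makes "the family" the subject and "you" the recipient.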

According to Och, a solid base for developing a usable statistical machine translation system for a new pair of languages from scratch would consist of a bilingual text corpus (or parallel collection) of more than a million words and two monolingual corpora, each of more than a billion words.[1] Statistical models built from this data are then used to translate between those languages.
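
How a statistical model can be derived from a bilingual corpus can be illustrated with a small Python sketch of IBM Model 1, a classic textbook word-translation model. It is not claimed to be the model Google uses. The three-sentence German-English corpus and the function names train_ibm_model1 and best_translation are assumptions made for the example, and the monolingual corpora mentioned above (used for a language model) are left out.

```python
from collections import defaultdict

# Tiny made-up parallel corpus (German, English), split on whitespace.
CORPUS = [
    ("das haus", "the house"),
    ("das buch", "the book"),
    ("ein buch", "a book"),
]


def train_ibm_model1(corpus, iterations=20):
    """Estimate t(e | f) with expectation-maximization (no NULL word)."""
    pairs = [(src.split(), tgt.split()) for src, tgt in corpus]
    # Start with the same score for every co-occurring word pair.
    t = {(e, f): 1.0 for src, tgt in pairs for f in src for e in tgt}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for src, tgt in pairs:
            for e in tgt:
                # Share each target word among the source words of its sentence.
                norm = sum(t[(e, f)] for f in src)
                for f in src:
                    share = t[(e, f)] / norm
                    count[(e, f)] += share
                    total[f] += share
        # Re-estimate the probabilities from the expected counts.
        t = {(e, f): c / total[f] for (e, f), c in count.items()}
    return t


def best_translation(t, source_word):
    """Return the target word with the highest probability for source_word."""
    candidates = {e: p for (e, f), p in t.items() if f == source_word}
    return max(candidates, key=candidates.get)


model = train_ibm_model1(CORPUS)
for word in ("das", "haus", "buch", "ein"):
    print(word, "->", best_translation(model, word))
```

No dictionary or grammar rule is supplied: the word pairings emerge only from which words appear together across sentence pairs, which is why such models need very large corpora to be reliable.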

To acquire this huge amount of linguistic data, Google used United Nations documents.[10] The UN typically publishes documents in all six official UN languages, which has produced a very large 6-language corpus.

Google representatives have been involved with domestic conferences in Japan, where they have solicited bilingual data from researchers.[11]

Translation mistakes and oddities

Because Google Translate uses statistical matching to translate rather than a dictionary/grammar rules approach, translated text can often include apparently nonsensical and obvious errors,[12] often swapping common terms for similar but nonequivalent common terms in the other language,[13] as well as inverting sentence meaning.[14]

References

  1. {{{1}}}
  2. "Franz Josef Och", Google: "Franz Josef Och joined Google in 2004 as a research scientist, where he leads the machine translation group."
  3. French to Russian translation carries the untranslated non-French word "obvious" from the pivot (intermediate) English into Russian: le mot 'obvious' n'est pas français -> «очевидными» слово не французское
  4. We pretend that this English article is German when asking Google to translate it to French. Because Google does not find the English words in the German dictionary, it leaves those words unchanged, as one can show with this spelllling misssstake. But it translates them to French nonetheless. That's because Google translates German -> English -> French and the unchanged English words undergo the second translation. The word "außergewöhnlich", however, will be translated twice.
  5. Google Translate performs two-step translation through English
  6. Wrong translation into Ukrainian caused by going through English
  7. Google Translate mixes up "tu" and the plural or polite "vous": Je vous aime. Tu es ici. You are here. -> Я люблю тебя. Вы здесь. Вы здесь.
  8. The meaning of the English translation is the inverse of the Russian sentence ... Пишет вам письмо семья Дарьи -> You wrote a letter to family Darya
  9. ... but the English to Russian translation is correct Daria's family writes you a letter -> Семья Дарьи пишет вам письмо
  10. Google seeks world of instant translations (Reuters)
  11. Google was an official sponsor of the annual Computational Linguistics in Japan Conference ("Gengoshorigakkai") in 2007. Google also sent a delegate from its headquarters to the meeting of the members of the Computational Linguistic Society of Japan in March 2005, promising funding to researchers who would be willing to share text data.
  12. Google Translate Tangles With Computer Learning, Lee Gomes, Forbes Magazine, 9/8/2010
  13. Google Translates Ivan the Terrible as "Abraham Lincoln", google.blognewschannel.com
  14. Google Lost in Translation, Alyona Topolyanskaya, www.mn.ru, 28/01/2010