1.5.1 The emergence of machine translation (MT) and its integration into the translation workflow

Although the idea of a “translating machine” is hundreds of years old, the modern-day history of MT goes back to 1933, when a French-Armenian and a Russian researcher filed patents for translating machines independently of each other (Nitzke & Hansen-Schirra, 2021). Since then, MT has undergone remarkable development, particularly in the theoretical approaches to automating the translation process. Initially, automated systems relied on rule-based approaches, which were replaced by statistical machine translation (SMT) in the 1990s (Yang, 2018; Nitzke & Hansen-Schirra, 2021). The latest approach to MT is neural machine translation (NMT), which became widespread around the middle of the 2010s. It should be noted that although LLM-based translation is becoming increasingly popular in the translator community, the longitudinal study presented here began well before the LLM boom; therefore, LLMs and their capacity to translate will not be discussed in this volume. As participants in the study had to post-edit NMT output, NMT will be described in somewhat more detail than the other two approaches (rule-based and statistical).

Rule-based machine translation (RBMT) came in several varieties, but in general, these systems treated language as a systematic collection of rules and converted the ST into the TT by applying these rules. The process usually comprised an analysis phase and a synthesis phase. Rule-based approaches are considered outdated nowadays; users and translators are unlikely to encounter them (Nitzke & Hansen-Schirra, 2021; Yang, 2018).

Statistical machine translation and neural machine translation are both data-based MT architectures. SMT had been in use for over two decades before NMT appeared, and it worked very well for some language pairs. The main idea behind SMT is that large corpora (both monolingual corpora and bilingual parallel corpora) are collected and translation systems are trained on them. In the training phase, the system learns equivalence relations, and then, in the translation phase, it calculates the most probable equivalent of a given word or phrase. An advantage of SMT was that its errors could be reliably predicted for a given language pair, making post-editing easier and easily learnable (Čulo et al., 2014). Additionally, SMT’s style was often awkward, which directed the user’s attention to potential problems. SMT did not work very well for morphologically rich languages like Hungarian. From 2006 to 2016, Google Translate employed an SMT system. SMT and RBMT can also be combined into what are called hybrid systems (Yang, 2018).
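
The core SMT mechanism described above, selecting the candidate with the highest learned probability, can be sketched in a few lines of Python. The word pairs and probabilities below are invented for illustration only; a real SMT system would estimate such values from large parallel corpora and combine them with a language model.

```python
# Toy illustration of SMT's core idea: choose the most probable
# target-language equivalent based on probabilities learned from corpora.
# The English-Hungarian phrase table below is entirely invented.
phrase_table = {
    "house": {"ház": 0.72, "otthon": 0.20, "épület": 0.08},
    "dog": {"kutya": 0.90, "eb": 0.10},
}

def most_probable_equivalent(source_word: str) -> str:
    """Return the target word with the highest learned probability."""
    candidates = phrase_table[source_word]
    return max(candidates, key=candidates.get)

print(most_probable_equivalent("house"))  # -> ház
```

Because the choice is made word by word (or phrase by phrase) from fixed probabilities, predictable, systematic errors arise, which is precisely the property that made SMT output comparatively easy to post-edit.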

The most popular and most frequently used approach to MT is neural machine translation (NMT). The idea of artificial neural networks dates back to the 1940s; however, because of hardware limitations (graphics processors were not powerful enough), such systems could not be fully worked out and applied to translation before the 2010s. NMT relies on large artificial neural networks that loosely mimic biological neural networks. Three types of layers are involved in the translation process: the input layer, where the source text is received; the output layer, where the target text is produced; and the hidden layers, where the transformation takes place. During training, NMT systems use deep learning to develop the structure of the hidden layers. This involves assigning weights and mathematical representations to text segments in the training corpora. However, as the machine learns automatically, the content of the hidden layers is not directly interpretable (Nitzke & Hansen-Schirra, 2021).
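
The layered structure described above can be illustrated with a minimal sketch. The layer sizes and random weights below are arbitrary placeholders; a real NMT system is a vastly larger encoder-decoder network whose weights are learned from training corpora and whose inputs are numeric embeddings of source-text segments.

```python
import numpy as np

# Minimal sketch of the three-layer structure: input -> hidden -> output.
# All sizes and weights are arbitrary stand-ins, not a real NMT model.
rng = np.random.default_rng(0)

input_size, hidden_size, output_size = 4, 8, 5  # arbitrary toy dimensions

# In a real system these weight matrices would be learned during training.
W_hidden = rng.normal(size=(input_size, hidden_size))
W_output = rng.normal(size=(hidden_size, output_size))

def forward(source_vector: np.ndarray) -> np.ndarray:
    """Pass a numeric representation of a source segment through the layers."""
    hidden = np.tanh(source_vector @ W_hidden)  # hidden layer: transformation
    return hidden @ W_output                    # output layer: target representation

source_vector = rng.normal(size=input_size)  # stand-in for an embedded source segment
target_vector = forward(source_vector)
print(target_vector.shape)  # (5,)
```

The point of the sketch is the opacity noted above: the numbers in the hidden layer are mathematically well defined, yet they carry no humanly interpretable meaning.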

NMT systems perform remarkably well for most language pairs; as a result, they are widely used these days. Google switched to NMT in 2016, and DeepL introduced its NMT-based system in 2017. Moreover, several other free NMT systems are available to the public. It must be noted, however, that NMT requires much greater memory and processing capacity, larger training materials and longer training time than SMT. As with SMT, NMT can only handle domain-specific, technical texts well if it has been trained on such texts, which can be very difficult due to the increased technical requirements mentioned above (Nitzke & Hansen-Schirra, 2021).

It is well known that NMT output is usually of good quality: it is easy to read and flows well (Nitzke & Hansen-Schirra, 2021). Some authors refer to this quality as “deceptive fluency” (Klimova et al., 2023; Á. Lesznyák, 2019), as the smoothly flowing text may mask serious accuracy/adequacy errors (Castilho et al., 2017; Martindale & Carpuat, 2018; Nitzke & Hansen-Schirra, 2021). Paradoxically, the fluency of NMT output can pose problems for post-editors, who may find it more difficult to detect errors in a perfectly flowing text than in an awkward one (Nitzke & Hansen-Schirra, 2021; Vieira, 2019; Yamada, 2019). Yamada (2019) suggested that the following error types are most likely to occur in NMT-translated texts: terminology errors, mistranslations, omissions and additions, structural (syntactic) inconsistencies and coherence issues. As for coherence, it must be noted that NMT operates at the sentence level, which may explain why it outperforms SMT in fluency, but also why even NMT-generated translations may show deficiencies in text-level coherence.

Interestingly, the way translators seem to handle this deceptive fluency is overediting, that is, introducing unnecessary changes (Daems & Macken, 2021; Koponen & Salmi, 2017; Nitzke & Gros, 2021). This is sometimes coupled with leaving errors in the final version of the target text (Sycz-Opoń & Gałuskina, 2017). In addition, Daems and Macken (2021) found that translators were more likely to introduce changes to the text when they thought that they were revising the “output” of a human translator. This finding suggests that translators have different attitudes towards human-translated and machine-translated texts.

These characteristics of NMT and the post-editing process linked to it should be remembered when interpreting our findings, as the MT output used in the data collection was generated by DeepL using NMT.