Interactive machine translation

Interactive machine translation (IMT) is a specific sub-field of computer-aided translation. Under this translation paradigm, the computer software that assists the human translator attempts to predict the text the user is going to input by taking into account all the information it has available. Whenever such a prediction is wrong and the user provides feedback to the system, a new prediction is made considering the new information available. This process is repeated until the translation provided matches the user's expectations.

Interactive machine translation is especially interesting when translating texts in domains where it is not admissible to output a translation containing errors, hence requiring a human user to amend the translations provided by the system. In such cases, interactive machine translation has been shown to benefit potential users. Nevertheless, few commercial software products implement interactive machine translation, and work in the field is mostly restricted to academic research.

History

Historically, interactive machine translation was born as an evolution of the computer-aided translation paradigm, in which the human translator and the machine translation system were intended to work in tandem. This first work was extended within the TransType research project, funded by the Canadian government. In this project, human interaction was aimed at producing the target text for the first time by embedding data-driven machine translation techniques within the interactive translation environment, with the goal of achieving the best of both actors: the efficiency of the automatic system and the reliability of human translators.

Later, a larger-scale research project, TransType2, funded by the European Commission, extended this work by analyzing the incorporation of a complete machine translation system into the process, with the goal of producing a complete translation hypothesis that the human user can amend or accept. If the user decides to amend the hypothesis, the system then attempts to make the best use of such feedback in order to produce a new translation hypothesis that takes into account the modifications introduced by the user.

More recently, CASMACAT, also funded by the European Commission, aimed at developing novel types of assistance for human translators and integrated them into a new workbench consisting of an editor, a server, and analysis and visualisation tools. The workbench was designed in a modular fashion and can be combined with existing computer-aided translation tools. Furthermore, the CASMACAT workbench can learn from the interaction with the human translator by updating and adapting its models instantly based on the translation choices of the user.

Recent work involving an extensive evaluation with human users revealed that interactive machine translation may even be used by users who do not speak the source language to achieve near-professional translation quality. It also showed that an interactive scenario is more beneficial than a classic post-editing scenario.

The approaches described above rely on a tightly coupled underlying corpus-based machine translation system (usually a statistical machine translation system) that is used as a glass box. They therefore inherit the shortcomings of the translation system, which limits the usage of interactive machine translation in some scenarios. For this reason, an approach was developed that uses any kind of bilingual resource (not limited to machine translation) as a black box to provide interactive machine translation. Because of the black-box nature of the interaction, this approach cannot extract as much information from the bilingual resources used, but it can use any resource available to the user. Forecat is a black-box interactive machine translation implementation that is available both as a web application (which includes a webpage and a web services interface) and as a plugin for OmegaT (Forecat-OmegaT).

Process

The interactive machine translation process starts with the system suggesting a translation hypothesis to the user. The user may then accept the complete sentence as correct, or modify it if they consider that it contains an error. Typically, when a given word is modified, the prefix up to that word is assumed to be correct, leading to a left-to-right interaction scheme. Once the user has changed the word considered incorrect, the system proposes a new suffix, i.e. the remainder of the sentence. This process continues until the translation provided satisfies the user.

Although explained at the word level, the previous process may also be implemented at the character level, in which case the system provides a suffix whenever the human translator types a single character. In addition, there is ongoing effort towards changing the typical left-to-right interaction scheme in order to make human-machine interaction easier.

A similar approach is used in the Caitra translation tool.

Evaluation

Evaluation is a difficult issue in interactive machine translation. Ideally, evaluation should take place in experiments involving human users. However, given the high monetary cost this would imply, this is seldom the case. Moreover, even when human translators are involved in order to perform a true evaluation of interactive machine translation techniques, it is not clear what should be measured in such experiments, since there are many variables that should be taken into account and cannot be controlled, such as the time the user takes to get used to the process. In the CASMACAT project, some field trials have been carried out to study some of these variables.

For quick evaluations in laboratory conditions, interactive machine translation is measured by using the key stroke ratio or the word stroke ratio. Such criteria attempt to measure how many key strokes or words the user needed to introduce before producing the final translated document.

Differences with classical computer-aided translation

Although interactive machine translation is a sub-field of computer-aided translation, the main attraction of the former with respect to the latter is its interactivity. In classical computer-aided translation, the translation system suggests at most one translation hypothesis, and the user is then required to post-edit that hypothesis. In contrast, in interactive machine translation the system produces a new translation hypothesis each time the user interacts with the system, i.e. after each word (or letter) has been entered.
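
Example: simulating the interaction and its evaluation

The left-to-right prefix/suffix interaction described under Process, and the stroke ratios mentioned under Evaluation, can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not the method of any of the projects named in this article. The "system" is a hypothetical stand-in that picks the first stored candidate compatible with the validated prefix, the simulated user always corrects the first token that differs from a reference translation, and each ratio is taken to be the number of corrections divided by the length of the reference. The names make_toy_system and simulate_session are invented for this example.

```python
from typing import Callable, List, Sequence, Tuple

Tokens = List[str]  # words, or single characters for character-level interaction


def make_toy_system(candidates: Sequence[Tokens]) -> Callable[[Tokens], Tokens]:
    """Hypothetical stand-in for a translation engine (an assumption of this
    sketch): return the first stored candidate that starts with the validated
    prefix, or the prefix itself when no candidate is compatible."""
    def suggest(prefix: Tokens) -> Tokens:
        for cand in candidates:
            if list(cand[:len(prefix)]) == prefix:
                return list(cand)
        return list(prefix)  # no suffix to offer
    return suggest


def simulate_session(reference: Tokens,
                     suggest: Callable[[Tokens], Tokens]) -> Tuple[Tokens, int]:
    """Simulate a left-to-right session against a reference translation.

    Returns the final translation and the number of corrections the
    simulated user had to introduce."""
    prefix: Tokens = []
    corrections = 0
    while True:
        hypothesis = suggest(prefix)
        # Guard: a system that ignores the validated prefix offers no suffix.
        if hypothesis[:len(prefix)] != prefix:
            hypothesis = list(prefix)
        # Extend the agreement between hypothesis and reference as far as possible.
        agree = len(prefix)
        while (agree < len(hypothesis) and agree < len(reference)
               and hypothesis[agree] == reference[agree]):
            agree += 1
        if agree == len(reference):
            # The user accepts; any extra suggested tokens are simply dropped.
            return list(reference), corrections
        # The user fixes the first wrong token; everything before it is validated.
        prefix = reference[:agree + 1]
        corrections += 1


reference = "the green house is big".split()
candidates = [
    "the house green is large".split(),
    "the green home is big".split(),
    "the green house is big".split(),
]

# Word-level interaction: 2 corrections over 5 reference words.
final, word_corrections = simulate_session(reference, make_toy_system(candidates))
wsr = word_corrections / len(reference)

# Character-level interaction: the same loop, run over single characters.
ref_chars = list(" ".join(reference))
char_system = make_toy_system([list(" ".join(c)) for c in candidates])
_, key_corrections = simulate_session(ref_chars, char_system)
ksr = key_corrections / len(ref_chars)

print(f"word stroke ratio: {wsr:.2f}, key stroke ratio: {ksr:.3f}")
```

In this toy run the user types two words ("green", then "house") before the suggested suffix completes the sentence. Real definitions of the key stroke ratio may also count acceptance actions or mouse movements, which this sketch ignores.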
See also

Machine translation
Statistical machine translation
Computer-aided translation
Computational linguistics
Postediting
Translation

References

Casacuberta, Francisco; Civera, Jorge; Cubel, Elsa; Lagarda, Antonio L.; Lapalme, Guy; Macklovitch, Elliott; Vidal, Enrique (2009). "Human interaction for high quality machine translation". Communications of the ACM. 52 (10): 135–138. doi:10.1145/1562764.1562798. Archived from the original on 2011-07-06.
Herbig, Nico; Pal, Santanu; van Genabith, Josef; Krüger, Antonio (2019). "Integrating Artificial and Human Intelligence for Efficient Translation". arXiv:1903.02978.
Barrachina, Sergio; Bender, Oliver; Casacuberta, Francisco; Civera, Jorge; Cubel, Elsa; Khadivi, Shahram; Lagarda, Antonio L.; Ney, Hermann; Tomás, Jesús; Vidal, Enrique (2009). "Statistical approaches to computer-assisted translation". Computational Linguistics. 25 (1): 3–28. doi:10.1162/coli.2008.07-055-r2-06-29.
Foster, George; Isabelle, Pierre; Plamondon, Pierre (1997). "Target-text mediated interactive machine translation". Machine Translation. 12 (1): 175–194. doi:10.1023/a:1007999327580.
Alabau, Vicent; Buck, Christian; Carl, Michael; Casacuberta, Francisco; Garcia-Martinez, Mercedes; Germann, Ulrich; Gonzalez-Rubio, Jesus; Hill, Robin; Koehn, Philipp; Leiva, Luis; Mesa-Lao, Barto; Ortiz, Daniel; Saint-Amand, Herve; Sanchis, German; Tsoukala, Chara (April 2014). "CASMACAT: A Computer-assisted Translation Workbench". Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics. Los Angeles, California: Association for Computational Linguistics. pp. 25–28.
Ortiz-Martinez, Daniel; Garcia-Varea, Ismael; Casacuberta, Francisco (June 2010). "Online Learning for Interactive Statistical Machine Translation". Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the ACL. Association for Computational Linguistics. pp. 546–554.
Martinez-Gomez, Pascual; Sanchis-Trilles, German; Casacuberta, Francisco (September 2012). "Online adaptation strategies for statistical machine translation in post-editing scenarios". Pattern Recognition. 45 (9). Elsevier: 3193–3203. Bibcode:2012PatRe..45.3193M. doi:10.1016/j.patcog.2012.01.011. hdl:10251/37324.
Koehn, Philipp (June 2010). "Enabling Monolingual Translators: Post-Editing vs. Options". Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics (HLT/NAACL). Los Angeles, California: Association for Computational Linguistics. pp. 537–545.
Pérez-Ortiz, Juan Antonio; Torregrosa, Daniel; Forcada, Mikel (2014). "Black-box integration of heterogeneous bilingual resources into an interactive translation system". Proceedings of the EACL 2014 Workshop on Humans and Computer-assisted Translation. Los Angeles, California: Association for Computational Linguistics. pp. 57–65.
Sanchis-Trilles, Germán; Ortiz-Martínez, Daniel; Civera, Jorge; Casacuberta, Francisco; Vidal, Enrique; Hoang, Hieu (October 2008). "Improving Interactive Machine Translation via Mouse Actions". Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing (EMNLP). Honolulu, Hawaii: Association for Computational Linguistics. pp. 485–494.
González-Rubio, Jesús; Ortiz-Martínez, Daniel; Casacuberta, Francisco (July 2010). "Balancing User Effort and Translation Error in Interactive Machine Translation via Confidence Measures". Proceedings of the ACL 2010 Conference Short Papers (ACL). Uppsala, Sweden: Association for Computational Linguistics. pp. 173–177.
Underwood, Nancy; Mesa-Lao, Bartolomé; García-Martínez, Mercedes; Carl, Michael; Alabau, Vicent; González-Rubio, Jesús; Leiva, Luis; Sanchis-Trilles, Germán; Ortiz-Martínez, Daniel; Casacuberta, Francisco (May 2014). "Evaluating the Effects of Interactivity in a Post-Editing Workbench". Proceedings of the 29th edition of the Language Resources and Evaluation Conference (LREC). Reykjavik, Iceland. pp. 553–559.
Ortiz-Martínez, Daniel; González-Rubio, Jesús; Alabau, Vicent; Sanchis-Trilles, Germán; Casacuberta, Francisco (August 2015). "Integrating Online and Active Learning in a Computer-Assisted Translation Workbench". New Directions in Empirical Translation Process Research: Exploring the CRITT TPR-DB. Springer. pp. 54–73.
Alabau, Vicent; Carl, Michael; Casacuberta, Francisco; García-Martínez, Mercedes; Mesa-Lao, Bartolomé; Ortiz-Martínez, Daniel; González-Rubio, Jesús; Sanchis-Trilles, Germán; Schaeffer, Moritz (August 2015). "Learning Advanced Post-editing". New Directions in Empirical Translation Process Research: Exploring the CRITT TPR-DB. Springer. pp. 95–111.

External links

Lilt's Interactive Machine Translation demo
Interactive Machine Translation demo
TransType project web page
TransType2 project web page
MIPRCV project web page
Forecat
Forecat-OmegaT

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.
