Practical Deployment of Machine Translation

The Department of Computational Linguistics develops machine translation systems for the software industry. This work is closely tied to our research on domain-specific statistical machine translation, which aims to develop translation systems and adapt them to specific domains.

Machine Translation of Help Desk Tickets

In cooperation with Finnova, a leading Swiss provider of banking software, we are developing a translation system for help desk tickets. These are technical documents used in customer support, which are available to all customers and which Finnova aims to provide in various languages. In a first phase, we are working on automatic translation from German to English.

Project head:

Researchers:

  • Rico Sennrich
  • Mark Fishel

The project is funded by Finnova and has been running since March 2012.

Domain-specific Statistical Machine Translation for the Automobile Industry

In this CTI project, carried out in collaboration with SemioticTransfer, we worked on automatically translating German marketing and technical texts from the automobile industry into French and Italian. Particular points of research interest were the following:

  • Domain adaptation: can we improve translation quality by using openly available parallel corpora (e.g. Europarl) in addition to the in-domain data?
  • Pivot translation: is it more effective to translate German into French via Italian (or into Italian via French) than to translate directly?
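
To make the domain adaptation question more concrete, the following is a minimal sketch of one common way to combine an in-domain and a general-domain translation model: linear interpolation of their translation probabilities. It is not taken from the project. The function name `interpolate`, the weight of 0.8, and the toy probabilities are all assumptions chosen for illustration.

```python
def interpolate(p_in, p_out, weight_in=0.8):
    """Linearly interpolate two translation distributions for one source phrase.

    p_in and p_out map target phrases to probabilities; weight_in is the
    (hypothetical) weight given to the in-domain model.
    """
    targets = set(p_in) | set(p_out)
    return {
        t: weight_in * p_in.get(t, 0.0) + (1.0 - weight_in) * p_out.get(t, 0.0)
        for t in targets
    }


# Invented toy distributions for the German word "Haube" (DE-FR).
in_domain = {"capot": 0.9, "capuche": 0.1}                        # automobile texts
general_domain = {"capuche": 0.7, "capot": 0.1, "coiffe": 0.2}    # general texts

combined = interpolate(in_domain, general_domain)
best = max(combined, key=combined.get)
print(best)
```

In this toy setting the in-domain reading stays the preferred translation, while the general-domain model contributes options the in-domain data never saw.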

The machine translation systems have been deployed at SemioticTransfer to increase its translation efficiency and to maintain its competitive advantage in the translation market.

Results

  • Machine Translation systems for DE–FR and DE–IT:
    • Data basis: domain-specific parallel texts from the automobile industry
    • Domain adaptation using Europarl v7 and OpenSubtitles 2011 (see Läubli et al. 2013a)
    • Automatic splitting of German compounds in order to translate their parts individually (see Läubli et al. 2013b)
  • Incorporation of the translation systems into SemioticTransfer's workflow
  • Controlled experiment on using the translation systems for post-editing: time savings of 17.4% with consistent quality (see Läubli et al. 2013c)
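
The compound splitting mentioned above can be illustrated with a small sketch. It uses a widely known frequency-based heuristic: among all ways of cutting a word into known parts, prefer the one whose parts have the highest geometric mean corpus frequency. This is an assumption for illustration, not a description of the method used in the project. The function names, the linking elements "s" and "es", and the toy frequency counts are hypothetical.

```python
from math import prod


def segmentations(word, freq, min_len=3, fillers=("", "s", "es")):
    """Yield every way to cut `word` into known parts, allowing a linking element."""
    if word in freq:
        yield [word]
    for i in range(min_len, len(word) - min_len + 1):
        head, rest = word[:i], word[i:]
        if head not in freq:
            continue
        for filler in fillers:
            if not rest.startswith(filler):
                continue
            tail = rest[len(filler):]
            if len(tail) < min_len:
                continue
            for parts in segmentations(tail, freq, min_len, fillers):
                yield [head] + parts


def split_compound(word, freq, min_len=3):
    """Return the segmentation with the highest geometric mean part frequency."""
    word = word.lower()
    candidates = list(segmentations(word, freq, min_len))
    if not candidates:
        return [word]  # unknown word: leave it unsplit
    return max(candidates, key=lambda ps: prod(freq[p] for p in ps) ** (1.0 / len(ps)))


# Invented toy corpus counts.
freq = {"sicherheit": 120, "gurt": 40, "sicherheitsgurt": 3, "motor": 300, "haube": 25}

print(split_compound("Sicherheitsgurt", freq))
print(split_compound("Motorhaube", freq))
```

Because the unsplit word competes with its own segmentations, a frequent compound can remain intact, while a rare one is broken into parts that the translation system is more likely to have seen.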

Publications

  • Samuel Läubli, Mark Fishel, Martin Volk, and Manuela Weibel. 2013a. Combining domain-specific translation memories with general-domain parallel corpora in statistical machine translation systems. In Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), pages 331–341, Oslo, Norway, 2013.

  • Samuel Läubli, Mark Fishel, Manuela Weibel, and Martin Volk. 2013b. Statistical machine translation for automobile marketing texts. In Proceedings of the Fourteenth Machine Translation Summit (MT Summit XIV), Nice, France, 2013.

  • Samuel Läubli, Mark Fishel, Gary Massey, Maureen Ehrensberger-Dow, and Martin Volk. 2013c. Assessing post-editing efficiency in a realistic translation environment. In Proceedings of the Second Workshop on Post-editing Technology and Practice (WPTP2), Nice, France, 2013.

Staffing

  • Martin Volk (project head)
  • Mark Fishel (researcher)
  • Samuel Läubli (researcher)

The project was funded by CTI and SemioticTransfer and ran from July 2012 to July 2013.