
Institut für Computerlinguistik

Course Content

On all topics, participants will learn about findings from applied translation research as well as contribute their personal experiences and ideas.


Module 1: Computer-Aided Translation (CAT) and Basic Programming

Daily: Introduction to Python Programming

In the first module, participants learn how to automate everyday translation work using programs they write themselves. The basis for this is an introduction to the programming language Python. Through practical exercises in class and voluntary take-home assignments, participants will learn in particular how large files such as translation memories can be adapted to their own needs. No previous knowledge of programming is required.
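To give a flavour of what adapting a translation memory with Python can look like, here is a minimal sketch. It assumes the memory is stored in the TMX exchange format; the sample data and the function name `extract_pairs` are invented for illustration and are not taken from the course material.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under its full namespace name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Invented sample data: a tiny TMX document with two translation units.
SAMPLE_TMX = """<tmx version="1.4">
  <header srclang="en" datatype="plaintext" segtype="sentence"
          adminlang="en" creationtool="example"
          creationtoolversion="1.0" o-tmf="none"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Save the file.</seg></tuv>
      <tuv xml:lang="de"><seg>Speichern Sie die Datei.</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>Cancel</seg></tuv>
      <tuv xml:lang="de"><seg></seg></tuv>
    </tu>
  </body>
</tmx>"""


def extract_pairs(tmx_text, src_lang, tgt_lang):
    """Return (source, target) pairs, skipping units with an empty side."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segments = {}
        for tuv in tu.iter("tuv"):
            seg = tuv.find("seg")
            text = (seg.text or "").strip() if seg is not None else ""
            segments[tuv.get(XML_LANG, "").lower()] = text
        source, target = segments.get(src_lang), segments.get(tgt_lang)
        if source and target:
            pairs.append((source, target))
    return pairs


pairs = extract_pairs(SAMPLE_TMX, "en", "de")
```

Here the second unit is dropped because its German segment is empty. Real translation memories may use inline tags or regional language codes that this sketch does not handle.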

Day 1: Latest Developments in CAT Tools

The most important new functionalities in common CAT tools will be presented. Special attention will be paid to functionalities that are directly related to the use of machine translation (MT) when working in the CAT tool: Which MT systems can be integrated into CAT tools? How can MT proposals and CAT tool hits be combined and used productively? What do CAT tools offer in terms of MT quality and post-editing effort? Participants will have the opportunity to apply what they have learned in practical exercises.

Day 2: Style and Quality Checking, Project Management

Style and quality checking affect not only the target text that is being created, but especially the Translation Memory (TM) in which it is stored. Participants will learn about the basic methods, criteria, and automated workflows involved in style and quality checking. Further topics include the handling of common free and CAT-related tools as well as automated TM maintenance, among other things with the help of the regular expressions (Regex) learned on the previous course day.
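As an illustration of regex-based TM maintenance, the following sketch normalises whitespace and flags translation units whose source and target contain different numbers. The checks and the function names are assumptions chosen for the example; they do not describe the specific tools or workflows used in the course.

```python
import re


def normalise_spaces(segment):
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return re.sub(r"\s+", " ", segment).strip()


def numbers_match(source, target):
    """Compare the digit groups on both sides, ignoring their order.

    Splitting on digit groups means '2.5' and '2,5' count as equal,
    which tolerates locale-specific decimal separators.
    """
    return sorted(re.findall(r"\d+", source)) == sorted(re.findall(r"\d+", target))


def flag_units(units):
    """Return the (source, target) pairs whose numbers disagree."""
    return [(s, t) for s, t in units if not numbers_match(s, t)]


# Invented sample units for demonstration.
units = [
    ("Version 2.5 costs 30 EUR", "Version 2,5 kostet 30 EUR"),
    ("Press 3 times", "Drücken Sie 2 Mal"),
]
suspicious = flag_units(units)
```

A check like this only produces candidates for review: numbers written out as words, or dates reordered between languages, would need additional rules.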

Day 3: File Formats and Encoding, Query Languages and Regular Expressions

The most important file formats for processing and exchanging linguistic data will be presented, as well as relevant query languages for the retrieval and processing of information from different data sources. We will also expand on our knowledge of Regex by filtering and processing relevant data in large datasets based on certain patterns and criteria.
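Encoding problems are a common obstacle when exchanging linguistic data. The sketch below tries a list of candidate encodings in order and reports which one succeeded. The candidate list (UTF-8, then Windows-1252) and the function name are assumptions for illustration only.

```python
def decode_with_fallback(data, encodings=("utf-8", "cp1252")):
    """Decode bytes with the first candidate encoding that works.

    Returns a (text, encoding_name) tuple.
    """
    for name in encodings:
        try:
            return data.decode(name), name
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# The same word stored in two different encodings.
utf8_bytes = "Übersetzung".encode("utf-8")
legacy_bytes = "Übersetzung".encode("cp1252")

text_a, enc_a = decode_with_fallback(utf8_bytes)
text_b, enc_b = decode_with_fallback(legacy_bytes)
```

Trying UTF-8 first is a deliberate choice: text in a legacy single-byte encoding is rarely valid UTF-8 by accident, so a successful UTF-8 decode is a fairly reliable signal, whereas a single-byte codec accepts almost any input.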

Day 4: Term Databases and Term Extraction

Beginning with the basic concepts of terminology, participants will learn about the different methods and tools that can be used to extract terminology from texts and text collections. Additional topics include the basic properties of terminology databases and their components, as well as established methods with regard to the construction of these databases. Following this theoretical background, participants will consolidate their knowledge with short practical exercises.
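One very simple approach to term extraction is to count recurring word sequences that contain no function words. The following sketch does this for two-word candidates. The stopword list, the thresholds, and the function name `candidate_terms` are invented for the example, and the method is a basic frequency heuristic rather than any particular tool presented in the course.

```python
import re
from collections import Counter

# A deliberately tiny, illustrative stopword list.
STOPWORDS = {"the", "a", "an", "is", "of", "and", "each", "in", "to"}


def candidate_terms(text, min_count=2):
    """Return two-word term candidates with their frequencies.

    A bigram qualifies if neither word is a stopword and it occurs
    at least min_count times in the text.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    bigrams = Counter(
        f"{first} {second}"
        for first, second in zip(tokens, tokens[1:])
        if first not in STOPWORDS and second not in STOPWORDS
    )
    return [(term, n) for term, n in bigrams.most_common() if n >= min_count]


sample = (
    "The translation memory stores segments. "
    "A translation memory is reused. "
    "Each translation memory needs maintenance."
)
terms = candidate_terms(sample)
```

On this sample, only "translation memory" recurs often enough to be proposed. Production term extractors typically add part-of-speech patterns and statistical association measures on top of raw counts.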

Day 5: Audiovisual Translation (AVT)

Within the rich and varied discipline of AVT, we will focus on the theory and practice of subtitling. Basic linguistic and technical aspects will be explored through a combination of theoretical examples and hands-on exercises. Other sessions will touch on automation and machine learning in the context of subtitling, and provide an overview of industry training and working opportunities. Access to film material and software will be provided.


Module 2: Machine Translation (MT) and Artificial Intelligence (AI)

Day 6: Introduction to AI - Machine Learning and Neural Networks

Simple examples will be used to demonstrate how artificial neural networks are constructed and can be used to learn in various applications when suitable training material exists. Focusing on natural language processing, we address how words and sentences can be represented in neural networks, and what impact the type of representation has on machine learning. Fundamental concepts such as embeddings and recurrent networks will also be introduced.
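The idea behind embeddings can be illustrated with a toy example: each word is represented as a vector of numbers, and words with similar meanings should end up with similar vectors. The three-dimensional vectors below are made up by hand purely for demonstration; real embeddings are learned from data and have hundreds of dimensions.

```python
import math

# Hand-made toy vectors, NOT learned embeddings.
TOY_EMBEDDINGS = {
    "translate": [0.9, 0.1, 0.0],
    "interpret": [0.8, 0.2, 0.1],
    "banana": [0.0, 0.1, 0.9],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


related = cosine_similarity(TOY_EMBEDDINGS["translate"], TOY_EMBEDDINGS["interpret"])
unrelated = cosine_similarity(TOY_EMBEDDINGS["translate"], TOY_EMBEDDINGS["banana"])
```

With these toy values, `related` is much closer to 1 than `unrelated`, mirroring how a trained model would place "translate" nearer to "interpret" than to "banana".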

Day 7: Neural MT - Technology, Applications, and Limits

MT has made remarkable progress in recent years thanks to improvements in neural modeling. We explain the technical background without going into the mathematical details, focusing instead on how neural networks are used to translate texts automatically. We then describe various use cases, ranging from fully automatic translation for assimilation to incorporation into workflows for professional translation. We discuss the integration of company-specific terminology and the adaptability of MT to country-specific variants. In light of the recent research focus on human-machine parity, we also discuss the limitations of the technology and future directions in MT research.

Day 8: Post-Editing Techniques - Scenarios and Experiences

We discuss the role and influence of post-editing in a translator's work and examine different scenarios in which post-editing can be used. We also discuss how the post-editing of automatically translated texts differs from editing human translations, and for which types of texts it is particularly suited. The theoretical knowledge of methods, best practices, and their integration into work with other tools will be applied in practical exercises adapted to different language versions.

Day 9: Translation Technology and AI for Accessibility - Simplified and Sign Languages

MT can also be used as an assistive technology in two areas: automatic text simplification and automatic translation to and from sign language. A theoretical introduction to the concepts of easy language and sign language will be followed by an overview of experimental linguistics studies for a better understanding of the needs of target groups (e.g., reception studies to test the applicability of different recommendations for producing simplified language). We then examine state-of-the-art methods in CAT and fully automatic translation in the contexts of simplified and sign languages. Particular attention will be paid to identifying the AI content of these technologies, highlighting their limitations, and working together to identify meaningful areas of application. 

Day 10: Spoken Language Processing - Speech Recognition, Speaker Recognition, and Automatic Interpreting

Spoken language has taken on an increasingly important role in automatic translation. We first address what types of information in the speech signal can easily be processed automatically. The basic steps for the automatic processing of spoken language and relevant challenges will then be illustrated with simple examples and visualised with different programs. This theoretical basis serves to better assess the plausibility and expected success of various practical applications such as speech recognition (automatic transcription of the speech signal), speech synthesis (automatic generation of spoken language), and speaker recognition (automatic recognition of the speaker), as well as more specific methods for automatic translation such as 'respeaking' or automatic interpreting.

The final lecture day aims to empower professional translators in their roles as reflective practitioners in using language technology. We address the personal and organisational opportunities and risks of using complex translation systems, including ethical and legal aspects such as professional ethics and data protection, and the successful implementation of these systems for all relevant stakeholders. We also discuss what it takes for professional translators who provide post-editing to be able to deal with the high cognitive load of this task in an ergonomically sound way.

Day 12: Participants present CAS projects

Further Information

CAS Translation Technology AI 2022

Seminar Event

Date & Time: Sept. 6, 18:00-19:00

Location: Online (link TBA)

Registration: Email


Join us online for a sneak peek at some of the topics that will be covered in the course, presented by leading researchers in the field.

Information Event

Date & Time: Sept. 15, 18:00

Location: Online


To register, please send an email to