Institut für Computerlinguistik

Colloquium Schedule, Fall Semester 2020 (HS 2020)

Colloquium HS 2020: Reports on current research at the institute, Bachelor's and Master's theses, programming projects, guest talks

Time & place: every two weeks on Tuesdays, 10:15 to 12:00, BIN-2.A.01

Organizer: Duygu Ataman


Speakers & Topics


  • Introduction to the colloquium, Duygu Ataman & Martin Volk
  • Nicolas Spring, BA Thesis presentation: "Probing Tasks for Noised Back-Translation"
  • Prof. Dr. Lena Jäger, "What eye movements can and cannot tell us: Challenges in eye-tracking research"
  • Olga Sozinova, PhD research presentation: "Geometry of linguistic morphology"
  • Adrian van der Lek, MA Thesis presentation: "Evaluating the cognitive plausibility of sentence-level embeddings"


  • Invited Talk by Dr. Reto Gubelmann: “Negation Understanding of Transformer-Based NNLP-Systems: A Predominantly Qualitative Analysis”
  • Patrick Haller, "Investigating variability in letter - speech sound association learning. A model-based fMRI approach."
  • Invited Talk by Prof. Dr. Jan Niehues, Maastricht University, "(Simultaneous) Speech Translation: Challenges and Techniques"


  • Marek Kostrzewa, "Monolingual sentence alignment: towards data-driven text simplification"

  • Jan Deriu, "Spot The Bot: A Robust and Efficient Framework for the Evaluation of Conversational Dialogue Systems"


  • Invited Talk by Dr. Garrett Smith, University of Potsdam, "Sentence comprehension as a self-organizing process"
  • Tannon Kew, "ReAdvisor Project Update: Seq2seq review response generation for the hospitality domain"


  • Noëmi Aepli, "NLP for Low-resource Language Variations"
  • Tatiana Ruzsics, "Towards Interpretable Neural Inflection Models"


Nicolas Spring, "Probing Tasks for Noised Back-Translation", 15.09.2020, 10:15
When using back-translation, adding artificial noise to the synthetic source data leads to better model performance. This led to the hypothesis that noise can serve as an implicit label, signaling to the model that a given sentence has been back-translated. In this thesis, we use probing tasks, an adaptable model introspection technique, to verify this hypothesis, and we investigate how the outputs of a model trained with explicitly tagged back-translation change when a source sentence is decoded with and without an explicit label. We show that sentences with noise can be distinguished from genuine sentences using the information present in model states, thus confirming that noise can serve as an implicit label. Furthermore, we discover that decoding sentences with an explicit label yields translations that are lexically more diverse, while no clear changes can be observed in word order with respect to the source sentences. BLEU scores of hypotheses produced with an explicit label are lower than those of their standard-decoding counterparts, indicating that the interaction between lexical diversity and translation quality measured in BLEU is not yet fully understood.
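
The probing idea in this abstract can be illustrated with a small sketch: a simple classifier is trained on frozen model states to predict whether a sentence was noised. This is not the thesis implementation. The logistic-regression probe, the function names, and the synthetic vectors standing in for real model states are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    # Clamp the logit so math.exp cannot overflow on extreme inputs.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_probe(states, labels, epochs=50, lr=0.1):
    """Fit a logistic-regression probe with plain stochastic gradient descent."""
    dim = len(states[0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in zip(states, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def probe_accuracy(weights, bias, states, labels):
    """Fraction of states whose predicted label matches the true label."""
    correct = 0
    for x, y in zip(states, labels):
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        correct += int((p >= 0.5) == bool(y))
    return correct / len(states)


def fake_states(n, rng, dim=4):
    """Synthetic stand-in for model states: label 1 = noised, 0 = genuine."""
    states, labels = [], []
    for i in range(n):
        y = i % 2
        x = [rng.gauss(0.0, 0.5) for _ in range(dim)]
        x[0] += 1.0 if y else -1.0  # the class signal lives in dimension 0
        states.append(x)
        labels.append(y)
    return states, labels


rng = random.Random(0)
train_x, train_y = fake_states(200, rng)
test_x, test_y = fake_states(200, rng)
w, b = train_probe(train_x, train_y)
accuracy = probe_accuracy(w, b, test_x, test_y)
```

If the held-out accuracy is well above chance, the states carry information about the label. With real encoder states, `fake_states` would be replaced by vectors extracted from the translation model.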

Prof. Dr. Lena Jäger, "What eye movements can and cannot tell us: Challenges in eye-tracking research", 15.09.2020, 11:15
Eye-tracking methodology is considered the gold standard in psycholinguistic reading research (Rayner et al., 2006). Moreover, eye-tracking has been attracting increasing attention from researchers developing various kinds of technological applications, including language technology. Unfortunately, the usage of eye-tracking data as a dependent variable for psycholinguistic research or as (additional) features for language models entails methodological challenges at the level of the hardware, the data preprocessing, the representation of the data, the model architecture, as well as the statistical practices. In this talk, I will discuss the potential, the challenges and the limitations of using eye-tracking for linguistic research.

Olga Sozinova, "Geometry of linguistic morphology", 29.09.2020, 10:15
A lexicographic tree is a mathematical object that can potentially represent text. It shows a sequence of choices between a set of symbols, where words are constructed starting from the root. In the limit as the size of the tree goes to infinity, we get an approximation of a language, i.e. a potentially very large text, which results in a fractal tree representation. Benoit Mandelbrot, the father of fractal geometry, observed that one of the parameters in the generalised Zipf's law, a model of word frequency distributions, can be interpreted as a fractal dimension of a lexicographic tree. Following his proof, I propose to regard this parameter (D) as a new measure of morphological diversity. First results of my experiments on 44 languages show that D behaves similarly to the previously proposed text-based measures of morphological complexity, such as entropy and type-token ratio, at the word level. However, in contrast to previous measures, it allows us to study the phenomena at the subword level more closely and to apply geometric notions to linguistic description.
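
The two word-level baselines named in this abstract, type-token ratio and entropy, can be sketched as below. This is a minimal illustration rather than the speaker's code: the function names and the toy sentence are assumptions, entropy is taken here to mean the Shannon entropy of the unigram distribution, and the fractal dimension D itself is not computed.

```python
import math
from collections import Counter


def type_token_ratio(tokens):
    """Number of distinct word types divided by the number of tokens."""
    return len(set(tokens)) / len(tokens)


def unigram_entropy(tokens):
    """Shannon entropy (in bits) of the empirical word frequency distribution."""
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical toy text, whitespace-tokenised.
tokens = "the cat saw the dog and the dog saw the cat".split()
ttr = type_token_ratio(tokens)      # 5 types over 11 tokens
entropy = unigram_entropy(tokens)
```

Both measures depend on text length, so comparisons across languages are usually made on samples of equal size.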

Adrian van der Lek, "Evaluating the cognitive plausibility of sentence-level embeddings", 29.09.2020, 11:15
In order to assess different approaches to obtaining word and sentence embedding vectors, and to form a basis upon which new methods can be developed, formal evaluation is necessary. Since their inception, embeddings have been evaluated using extrinsic means, i.e. in downstream tasks, which serve as proxies for real-world applications. More recently, intrinsic evaluation methods have been proposed, which strive to investigate inherent properties of embedding vectors, but typically only investigate very specific phenomena and can be subject to individual bias. An alternative is to leverage processes occurring in the human brain whilst reading or processing speech, in order to obtain a measure of the cognitive plausibility of embedding approaches. Hollenstein et al. (2019) have shown that word embedding vectors can be evaluated in a neural regression setting, where embeddings predict cognitive signals aggregated at the word level. Such signals can be obtained through recordings of physiological monitoring methods such as eye-tracking, EEG and fMRI. Tested approaches differ significantly in how well they predict cognitive signals, and rankings correlate between datasets, modalities, as well as with results of extrinsic evaluations. In my thesis, I applied the approach by Hollenstein et al. (2019) to sentence embeddings and sentence-level cognitive signals, with necessary adaptations. I evaluated eight sentence embedding approaches of varying complexity, using cognitive datasets which offer sufficient data on the sentence level. Between approaches, I observed distinct rankings, which differ considerably between the modalities eye-tracking and EEG. I also informally assessed correlation between cognitive and previous intrinsic and extrinsic evaluation results. Results point toward a potential relationship between EEG and tasks measuring semantic relatedness and textual similarity, and to a lesser extent, between eye-tracking and linguistic probing tasks.

Dr. Reto Gubelmann, "Negation Understanding of Transformer-Based NNLP-Systems: A Predominantly Qualitative Analysis", 13.10.2020, 10:15
In this talk, we investigate the ability of several transformer-based NNLP models to cope with negations. To do so, we first evaluate several vanilla models (German and English) on a simple MLM task using a large number of sentences containing negations. Then, we finetune the most promising candidates and rerun the evaluation. After that, we conduct a manual evaluation of the models on a very small, handcrafted dataset. The results of this third evaluation differ from the first and the second one in interesting ways.

Patrick Haller, "Investigating variability in letter - speech sound association learning. A model-based fMRI approach", 13.10.2020, 11:15
Reading fluency relies on the ability to integrate letters and speech sounds efficiently. While there is persuasive evidence pointing at a combination of impaired letter - speech sound integration and dysfunctions in subcortical learning systems in individuals suffering from developmental reading disorder (dyslexia), it is unclear how audio-visual integration of letters and speech sounds and associative learning processes relate to individual reading abilities in healthy readers. The neural basis of such learning processes can be captured by observing neural activity of participants performing a letter - speech sound association learning task. However, most conventional statistical analysis methods for fMRI data fail to capture the complex processes at work during such tasks. For this reason, we aimed to disentangle latent cognitive components with a model-based cognitive neuroscience approach. Specifically, we modelled the task performance using a generative reinforcement-learning drift-diffusion (RLDD) architecture. The RLDD model consists of two components, which make it possible to model learning processes and decision-making processes simultaneously. In combination with the fMRI data, we obtained insights about the neural networks involved in the cognitive processes reflected by the model parameters. Our results showed significant correlations between the fMRI data and certain model parameters. Associative strength, for instance, was correlated with activity in occipital areas such as the visual word form area (VWFA), which has been shown to play a crucial role in letter processing. Moreover, additional multiple regression analyses revealed that good readers showed increased activity during prediction error processing in the inferior frontal gyrus, a structure involved in the processing of letter-like units.
Considering the fact that we did not find significant correlations between the neural data and the raw task performance data, our results illustrate the usefulness of applying the RLDD model to extract relevant information from raw task performance data, making it a useful tool to investigate letter-speech sound processing as a function of individual reading abilities. From a practical point of view, we hope that harnessing similar approaches might ultimately lead to improvements in characterising different sub-types of dysfluent readers and the development of specialized intervention methods.
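
The abstract does not give the model equations. The sketch below assumes a common RLDD formulation: a delta-rule learner updates associative strengths, and their scaled difference sets the drift rate of a drift-diffusion decision process. All parameter names and values, the starting strengths of 0.5, and the deterministic feedback are illustrative assumptions, not the study's settings.

```python
import math
import random


def simulate_ddm_trial(drift, boundary, non_decision, rng, dt=0.001, max_t=5.0):
    """Simulate one drift-diffusion decision with Euler-Maruyama steps.

    Evidence starts midway between the bounds at 0 and `boundary`; the noise
    standard deviation is fixed to 1. Returns (hit_upper, reaction_time).
    A trial that times out at `max_t` is counted as a lower-bound response.
    """
    evidence, t = boundary / 2.0, 0.0
    while 0.0 < evidence < boundary and t < max_t:
        evidence += drift * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t += dt
    return evidence >= boundary, t + non_decision


def simulate_rldd(n_trials, learning_rate, drift_scale, boundary,
                  non_decision, seed=0):
    """Simulate learning of one letter - speech sound pairing.

    Upper bound = choosing the correct pairing. Feedback is deterministic
    (an assumption): reward 1 for a correct choice, 0 otherwise.
    """
    rng = random.Random(seed)
    q_correct, q_incorrect = 0.5, 0.5  # assumed initial associative strengths
    history = []
    for _ in range(n_trials):
        # Learning component feeds the decision component via the drift rate.
        drift = drift_scale * (q_correct - q_incorrect)
        chose_correct, rt = simulate_ddm_trial(drift, boundary, non_decision, rng)
        if chose_correct:
            prediction_error = 1.0 - q_correct
            q_correct += learning_rate * prediction_error
        else:
            prediction_error = 0.0 - q_incorrect
            q_incorrect += learning_rate * prediction_error
        history.append((chose_correct, rt, prediction_error))
    return q_correct, q_incorrect, history


q_c, q_i, history = simulate_rldd(
    n_trials=50, learning_rate=0.1, drift_scale=3.0,
    boundary=1.5, non_decision=0.3,
)
```

In a model-based fMRI analysis, trial-wise quantities such as `prediction_error` would typically serve as regressors for the neural data, which is the role the abstract describes for the model parameters.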

Prof. Dr. Jan Niehues, "(Simultaneous) Speech Translation: Challenges and Techniques", 27.10.2020
In today’s globalized world, we have the opportunity to communicate with people all over the world. However, the language barrier often still poses a challenge and prevents communication. Machines that automatically translate speech from one language into another are a long-standing dream of humankind. In this presentation, we will start with an overview of the different use cases and difficulties of speech translation. We will continue with a review of state-of-the-art methods to build speech translation systems. First, we will review the cascaded approach to spoken language translation, which chains an automatic speech recognition system and a machine translation system, and we will highlight the challenges that arise when combining both systems. Secondly, we will discuss end-to-end speech translation, which has attracted rising research interest with the success of neural models in both areas. In the final part of the lecture, we will highlight several challenges of simultaneous speech translation (latency, sentence segmentation, and stream decoding) and present techniques that address these challenges.

Marek Kostrzewa, "Monolingual sentence alignment: towards data-driven text simplification", 10.11.2020, 10:15
Automatic text simplification (ATS) is an attractive research problem that draws upon various NLP areas, such as machine translation or summarization. It is also socially relevant, as it makes textual information accessible to people with poor literacy or with cognitive or linguistic impairments, as well as to non-native speakers and second language learners. The essence of text simplification is to identify and measure complexity relative to specific target audiences in order to generate a paraphrased and simplified version that retains the information's essence. Recent advances in deep learning methods, with new architectures such as encoder-decoder models and Transformers with attention, allow for approaching the text simplification task holistically. However, the main obstacle is the lack of datasets with aligned pairs of complex and simple sentences. The acquisition of these pairs is challenging due to the intrinsically asymmetric nature of ATS and the need to handle n:m alignments, often with a null alignment between sentences. In this presentation, I will start with an overview of different approaches to retrieve and align simple and complex sentences. I will then give a review of state-of-the-art methods for data-driven automatic text simplification. Subsequently, I will draw analogies with comparable problems in computer vision and will present solutions that can be transferred to the textual modality. Lastly, I will describe a CNN-based model for extracting complex and simple passages from unstructured documents and will highlight several challenges of monolingual n:m alignments.

Jan Deriu, "Spot The Bot: A Robust and Efficient Framework for the Evaluation of Conversational Dialogue Systems", 10.11.2020, 11:15
The lack of time-efficient and reliable evaluation methods hampers the development of conversational dialogue systems (chatbots). Evaluations requiring humans to converse with chatbots are time- and cost-intensive, put high cognitive demands on the human judges, and yield low-quality results. In this work, we introduce Spot The Bot, a cost-efficient and robust evaluation framework that replaces human-bot conversations with conversations between bots. Human judges then only annotate for each entity in a conversation whether they think it is human or not (assuming there are human participants in these conversations). These annotations then allow us to rank chatbots regarding their ability to mimic the conversational behavior of humans. Since we expect that all bots are eventually recognized as such, we incorporate a metric that measures which chatbot can uphold human-like behavior the longest, i.e., Survival Analysis. This metric has the ability to correlate a bot's performance to certain of its characteristics (e.g., fluency or sensibleness), yielding interpretable results. The comparably low cost of our framework allows for frequent evaluations of chatbots during their evaluation cycle. We empirically validate our claims by applying Spot The Bot to three domains, evaluating several state-of-the-art chatbots, and drawing comparisons to related work. The framework is released as a ready-to-use tool.
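
The abstract names survival analysis without specifying an estimator. As one standard choice, a Kaplan-Meier estimator is sketched below. It is an illustration under assumptions, not the released tool: the function name, the unit of "exchanges", and the toy data are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival curve.

    durations: number of exchanges a bot lasted before the event or censoring.
    observed:  True if the bot was spotted at that point, False if the
               conversation ended first (a censored observation).
    Returns a list of (time, survival_probability), one per event time.
    """
    event_times = sorted({t for t, seen in zip(durations, observed) if seen})
    survival, curve = 1.0, []
    for t in event_times:
        # Everyone who lasted at least until t is still at risk at t.
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, seen in zip(durations, observed) if seen and d == t)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical data: exchanges survived by one bot across six conversations.
durations = [2, 3, 3, 5, 5, 5]
observed = [True, True, False, True, False, False]
curve = kaplan_meier(durations, observed)
```

A bot whose curve stays higher for longer keeps up human-like behavior longer. Censored conversations still count toward the at-risk set until they end, which is why they are not simply discarded.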

Dr. Garrett Smith, "Sentence comprehension as a self-organizing process", 24.11.2020, 10:15
Successfully understanding a sentence relies on building a syntactic parse of the words in the sentence. Traditionally, psycholinguistic theories assume that parses are built using a parsing mechanism that ensures that the emerging structures are globally consistent with the grammar, e.g., using a probabilistic context-free grammar along with a left-corner parser. These theories have been widely successful in understanding how people parse sentences and why they sometimes experience difficulty in doing so. There is a class of sentences, though, where locally coherent but ungrammatical parses seem to compete with globally grammatical ones. These coherence effects present a challenge for traditional, rule-based theories of human sentence comprehension. An alternative theory, based on principles of self-organization, can explain these effects in a natural way. Under this theory, parses assemble themselves via local interactions between words without any overseer that checks for global grammaticality. Previous implementations of self-organized parsing have suffered from opaque mathematical formalisms and limited coverage of empirical phenomena. Here, I present a new formalization of self-organized parsing that overcomes these issues and provides a natural explanation for local coherence effects and other important psycholinguistic findings. The results suggest that self-organization can be a useful perspective for understanding human sentence comprehension.

Tannon Kew, "ReAdvisor Project Update: Seq2seq review response generation for the hospitality domain", 24.11.2020, 11:15
In this talk, we present a recent study on automated response generation for online reviews in the hospitality domain. This study investigates an extended seq2seq approach developed by Gao et al. (2019) for response generation in the domain of mobile app reviews. Applying the proposed approach to our target domain leads to a considerable drop in performance according to automatic metrics. As a result, we conduct an empirical investigation into the two types of data (app reviews vs. hospitality reviews) and find a number of discrepancies that account for the drop in model performance. We also present an overview of ongoing work that aims to further develop review response generation for the hospitality domain and improve upon a basic attentional encoder-decoder model baseline.

Noëmi Aepli, "NLP for Low-resource Language Variations", 11:15
Within her research project, Noëmi Aepli is working on NLP for low-resource language varieties. The project addresses several issues with which state-of-the-art NLP systems struggle when dealing with any language other than the 23 standard languages (such as English, Chinese, and Spanish). For most of the ~7000 known languages on our planet, the available data does not suffice to create NLP systems. The original purpose of working on NLP problems is to build systems that break down language barriers and enable people to access important information written in a different language. This is especially important for regions where minority languages are primarily spoken. Hence, one goal is to develop more data-efficient methods that can cope with less data. Furthermore, non-standard languages feature high variability, posing problems for any system based on statistics. Thus, reducing this variability is essential to reduce the sparsity issues. The plan is to address this by finding a normalized representation for dialectal variations.