Colloquium Schedule, Spring Semester 2018

Colloquium, Spring Semester 2018: reports on current research at the institute, BA/MA theses, programming projects, guest talks

Time/Place: Roughly every 14 days on Tuesdays, 10:15 to 12:00, BIN - 2.A.01

Organizers: Fabio Rinaldi and Martin Volk

Contact: Fabio Rinaldi

[Subject to change]


Speakers / Topic

20 February 2018

Samuel Läubli: Multi-encoder Sequence-to-sequence Models as a Fast Alternative to Constrained Decoding

Fabio Rinaldi: Current activities in bio-medical text mining

6 March 2018

Invited speaker: Martin Krallinger

Biomedical and clinical text mining: challenges, tasks and use cases

20 March 2018

Invited speaker: Kevin Cohen

What happened when I tried to reproduce my own experiments

10 April 2018

Peter Makarov: Neural Transition-based String Transduction for Limited-Resource Setting in Morphology

Mathias Müller: Does neural machine translation benefit from larger context?


24 April 2018

Invited speaker: Nicolas Fischer

Terminology management at Roche

15 May 2018

Anna Jancso: Using a neural network to correct the output of a lexicon-based biomedical NER System

Ann-Sophie Gnehm: Text Zoning for Job Advertisements with Bidirectional LSTMs

Maria Kliesch: New Languages and Old Brains: La Mise en Place

22 May 2018

Johannes Graën: How (not) to build publicly available NLP web services

29 May 2018

Natalia Korchagina: Temporal entity extraction from historical texts

Phillip Ströbel: Cross-Lingual Topic Modeling: Is Transfer Learning Any Good?


Feb 20

Samuel Läubli: Multi-encoder Sequence-to-sequence Models as a Fast Alternative to Constrained Decoding

Many sequence-to-sequence tasks have use cases where we want to put specific constraints on the output that is being produced. An example is interactive machine translation, where constraints come from a human user; other forms of constraints include rule-based translation of terminology, names and technical terms. Previous work has introduced constrained decoding algorithms, but introducing non-trivial constraints into beam search decoding carries a steep cost in computational complexity. We investigate a method to enforce constraints in a simple beam search, using an auxiliary encoder to represent the constraints, and a model trained to produce the non-constrained parts of the output. Empirical results in a constrained neural machine translation task show translation quality similar to grid beam search, while being an order of magnitude more efficient.

Fabio Rinaldi: Current activities in bio-medical text mining

The OntoGene/BioMeXt group has recently acquired several new projects in the area of bio-medical text mining. I will give an overview of our current and future activities, including recently concluded projects. I will present some of the related problems that pose interesting research challenges. I will describe some of our tools and techniques.

March 20:

Kevin Cohen: What happened when I tried to reproduce my own experiments

There has recently been a lot of discussion in both the scientific and the popular press about failures to reproduce the results of research in a wide range of fields. It is becoming apparent that the computational sciences are not immune to this, despite the fact that in theory, we have all of the tools that we need to maximize the reproducibility of our work: publicly available source code repositories, shared data sets, and markdown languages. This talk will present some stories from the trenches that bear on the question of how we should conceptualize these problems in informatics, and some analyses of published work that suggest a different approach to thinking about results in machine learning.

Apr 10

Mathias Müller: Does neural machine translation benefit from larger context?

Preliminary experiments have shown that neural machine translation systems benefit from document-level context. However, there is uncertainty as to what exactly the context should be: two obvious choices are previous source segments (source-side context) and previously translated segments (target-side context). We propose a way to extend a standard encoder-decoder model with a mechanism that encodes additional context, while only slightly increasing the number of model parameters. Since benefits from larger-context translation might be imperceptible when measured by coarse sentence-level metrics, we perform a more focused evaluation: automatically constructing a challenge set that tests a model’s ability to discriminate between good and bad translations. We introduced this idea in last semester’s colloquium; now we present the results. If time permits, we will also discuss current activities in the Contra project.

Peter Makarov: Neural Transition-based String Transduction for Limited-Resource Setting in Morphology

We present a neural transition-based model that uses a simple set of edit actions (copy, delete, insert) for morphological transduction tasks such as inflection generation, lemmatization, and reinflection. In a large-scale evaluation on four datasets and dozens of languages, our approach consistently outperforms state-of-the-art systems on low and medium training-set sizes and is competitive in the high-resource setting. Learning to apply a generic copy action enables our approach to generalize quickly from a few data points. We successfully leverage minimum risk training to compensate for the weaknesses of MLE parameter learning and neutralize the negative effects of training a pipeline with a separate character aligner.
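To make the action set concrete, the following minimal sketch applies a hand-written sequence of copy, delete and insert actions to a source string. It only illustrates what such a sequence does; it is not the neural model from the talk, which would predict the actions. The function name `apply_actions`, the tuple encoding of actions and the example word pair are assumptions made for this illustration.

```python
from typing import List, Optional, Tuple

# An action is (operation, character). The character is only used by "insert".
# This encoding is an assumption made for the illustration.
Action = Tuple[str, Optional[str]]


def apply_actions(source: str, actions: List[Action]) -> str:
    """Apply copy/delete/insert actions left to right over `source`."""
    output = []
    pos = 0  # index of the next unread source character
    for op, char in actions:
        if op in ("copy", "delete"):
            if pos >= len(source):
                raise ValueError(f"'{op}' action but no source characters left")
            if op == "copy":
                output.append(source[pos])  # copy the current character to the output
            pos += 1  # both copy and delete consume one source character
        elif op == "insert":
            if char is None:
                raise ValueError("'insert' action needs a character")
            output.append(char)  # insert writes without consuming the source
        else:
            raise ValueError(f"unknown action: {op}")
    if pos != len(source):
        raise ValueError("actions did not consume the whole source string")
    return "".join(output)


# Hypothetical example: German lemma "fliegen" to the past-tense form "flog".
fliegen_to_flog = [
    ("copy", None), ("copy", None),      # f, l
    ("delete", None), ("delete", None),  # drop i, e
    ("insert", "o"),                     # write o
    ("copy", None),                      # g
    ("delete", None), ("delete", None),  # drop e, n
]
print(apply_actions("fliegen", fliegen_to_flog))
```

Because unchanged characters are handled by a single generic copy action, the action sequence stays short whenever source and target share most of their characters.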

Apr 24

Nicolas Fischer and Chiara Baffelli: Terminology management at Roche

Terminology management is a key factor in harmonizing and aligning word usage among different departments, and roles in a company, and in providing consistent documentation. In this talk, we will illustrate the application of multilingual terminology management in large companies, using Roche Diagnostics International as an example. We will present key problems of uncontrolled vocabulary, the economical necessity and benefit of terminology management, as well as the implementation and processes we currently use to ensure a consistent and clear communication. Finally, we will close our talk with a look back on terminology management at Roche Diagnostics International, and what we are planning for the future.

May 15:

Anna Jancso: Using a neural network to correct the output of a lexicon-based biomedical NER System

In this presentation, I will mainly talk about the work I did in my bachelor thesis, which consisted of building a feed-forward neural network that filters and re-classifies the output of a lexicon-based biomedical NER system. I will outline how different features and other properties of the system influenced the performance of the neural network. Then, I will describe the experiments that we plan to do with this hybrid system, such as feeding it different data sets.

Ann-Sophie Gnehm: Text Zoning for Job Advertisements with Bidirectional LSTMs

The thesis at hand presents an approach to text zoning for job advertisements with bidirectional LSTMs (BiLSTMs). Text zoning refers to segmenting job advertisements into eight zones that differ from each other regarding content. It aims at capturing text parts dedicated to particular subjects (e.g. the publishing company, qualifications of the person wanted, or the application procedure) and hence facilitates subsequent information extraction. As we have 38,000 job advertisements in German from 1950 to 2014 available (Swiss Job Market Monitor corpus), each labeled with text zones, we benefit from a large amount of training data for supervised machine learning. We use BiLSTMs, a class of recurrent neural networks particularly suited for sequence labeling, as they integrate information well over long sequences and consider context on both sides of the actual label for classification decisions. Our best model reaches a token-level accuracy of 89.8%, which is 2 percentage points above results from previous approaches with CRFs and implies an error rate reduction of 16%. Models with task-specific embeddings perform better than models with pretrained word embeddings, which is probably due to the large amount of labeled training data. When optimizing the model for future application on recently published job advertisements, the inclusion of older training data lowers performance, as some sort of out-of-domain effect counteracts the effect of more training data. Ensembling, i.e. aggregating the classification decisions of five models, brings the largest improvement of all optimization steps, raising accuracy by 0.5 percentage points. In conclusion, we succeeded in building a high-performing solution to automatic text zoning for job ads based on neural networks.
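The two numeric claims above can be illustrated with a short sketch. The first function recomputes the relative error rate reduction from the stated accuracies (89.8% versus 87.8%, i.e. 2 percentage points lower). The second shows one common way to aggregate per-token decisions of several models, a simple majority vote. The abstract does not say how the five models were combined, so the majority vote, the function names and the zone labels below are assumptions made for this illustration.

```python
from collections import Counter
from typing import List


def relative_error_reduction(baseline_acc: float, new_acc: float) -> float:
    """Share of the baseline's errors that the new model removes."""
    baseline_err = 1.0 - baseline_acc
    new_err = 1.0 - new_acc
    return (baseline_err - new_err) / baseline_err


def majority_vote(predictions: List[List[str]]) -> List[str]:
    """Combine per-token labels from several models by majority vote.

    `predictions` holds one label sequence per model, all of equal length.
    Ties go to the label that appears first among the models.
    """
    voted = []
    for labels_for_token in zip(*predictions):
        voted.append(Counter(labels_for_token).most_common(1)[0][0])
    return voted


# Accuracies from the abstract: 89.8% vs. 87.8% (2 percentage points lower).
reduction = relative_error_reduction(0.878, 0.898)
print(f"relative error reduction: {reduction:.1%}")

# Hypothetical zone labels for three tokens, predicted by five models.
model_outputs = [
    ["company", "qualifications", "application"],
    ["company", "qualifications", "application"],
    ["company", "company", "application"],
    ["qualifications", "qualifications", "application"],
    ["company", "qualifications", "qualifications"],
]
print(majority_vote(model_outputs))
```

The error rate drops from 12.2% to 10.2%, and 2 / 12.2 is roughly 0.16, which matches the 16% stated in the abstract.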

Maria Kliesch: New Languages and Old Brains: La Mise en Place

Abstract: The unprecedented growth in the number of older adults in Switzerland and elsewhere calls for measures to stave off age-related cognitive decline and contribute to healthy and active aging. Second language (L2) learning is a promising way of achieving both, being a cognitively challenging activity that has been shown to promote neural plasticity, and to foster social interactions and individual mobility. However, research on L2 learning and senescence is only beginning to establish itself, and the relationship between cognitive capacities, the aging brain, motivational aspects and L2 learning in the third age is largely unknown. In a dense longitudinal study design, older adults between 65 and 75 years of age will participate in a 30-week multimodal Spanish training, combining computerised training and communicative language sessions, and will be compared against an active and a passive control group. By doing so, I aim to answer 1) when L2 development in older learners increases significantly, 2) whether cognition, electrophysiology and socio-affective factors can predict the learning outcome and 3) how the impact of L2 learning on cognitive trajectories compares to that of other training types. In my talk, I will provide both an overview of the project itself and a brief outline of the tasks at hand.

May 22:

Johannes Graën: How (not) to build publicly available NLP web services

The outcome of CL research encompasses methods, data and applications. When we have produced data (e.g., corpora or language models for a particular application), we can simply provide it for download. The same applies to applications (e.g., NLP tools, corpus query and visualization services), but we typically also want to showcase them so that interested users have the opportunity to discover their potential. My talk will point out the pitfalls of providing publicly available NLP web services and present a generic solution that circumvents most of the risks while achieving an overall better performance.

May 29:

Natalia Korchagina: Temporal entity extraction from historical texts

Spelling and lexical variation in historical texts, resulting in inconsistent data, obstructs the automatic processing of historical corpora. The direct application of existing NLP tools to this data does not lead to good results. Previous work has shown that for successful NLP processing of historical corpora, it is necessary to either adapt (=standardize) the corpora for the tool, or the other way around. Within the framework of my dissertation, I explored both approaches, searching for a method for the effective extraction of temporal entities from Early New High German texts. This talk will summarise my efforts. In particular, I will discuss the benefits of machine learning and deep learning methods applied to a non-standardized corpus in a low-resource setting.

Phillip Ströbel: Cross-Lingual Topic Modeling: Is Transfer Learning Any Good?

Abstract: In both supervised and unsupervised machine learning we generally train models to solve very specific tasks. If the task or the domain changes ever so slightly, we need to retrain our models. This not only requires additional computational power, but also a more or less severe reconceptualisation of the model, tweaking features and parameters. Transfer learning can alleviate these issues. The idea of transfer learning is to store the knowledge of a model that has been trained for a specific purpose and to use this knowledge in order to solve a different, but related task. In NLP, this technique has not yet found many applications, although there are potentially many areas in which transfer learning could successfully be implemented. One such area might be cross-lingual topic modeling. Since newspapers are usually not translated, finding similar articles across different languages is not a trivial task. However, my hypothesis is that the distribution of topics in newspapers across languages is very similar. By learning a topic model using data of only one language, it should somehow be possible to map this model to another language. In my dissertation, I want to examine different methods for achieving this.