Text Technology/Digital Linguistics Colloquium, Fall Semester 2022

Colloquium, Fall Semester 2022: reports on current research at the institute, Bachelor's and Master's theses, programming projects, guest talks

Time & Location: every two weeks on Tuesdays, 10:15 to 12:00, BIN-2.A.10

Online participation is also possible via the MS Teams team "CL Colloquium".

Person responsible: Dr. Mathias Müller

Colloquium Schedule

Date: Tuesday, 20.09.2022
Speaker: Marek Kostrzewa
Topic: A graph neural network approach for simple-complex sentence alignment
Speaker: Anastassia Shaitarova
Topic: The impact of machine-generated language on natural language: A corpus linguistic exploration of commercial MT systems

Date: Tuesday, 04.10.2022
Speaker: Prof. Dr. Mascha Kurpicz-Briki (Bern University of Applied Sciences)
Topic: Natural Language Processing for Clinical Burnout Detection

Date: Tuesday, 18.10.2022
Speaker: David Reich
Speaker: Patrick Haller

Date: Thursday, 27.10.2022 (17:15)
Speaker: Department of Computational Linguistics hosts an invited talk in the IFI Kolloquium: Dr. Sepideh Alassi (University of Basel)

Date: Tuesday, 01.11.2022 (16:15)
Speaker: Dr. Jesse Dodge, Allen Institute for AI (over Zoom, a link will appear on this page)

Date: Tuesday, 15.11.2022
Speaker: Dr. Mathias Müller
Topic: Recent developments in sign language machine translation
Speaker: Tannon Kew

Date: Tuesday, 29.11.2022
Speaker: Jason Armitage
Topic: Learning to Navigate with Trajectory Plans and Environment Features
Speaker: Jan Brasser

Date: Tuesday, 13.12.2022
Speaker: Dr. Pedro Ortiz Suarez (University of Mannheim)

Abstracts

Marek Kostrzewa: A graph neural network approach for simple-complex sentence alignment

Language has an intrinsically compositional and hierarchical structure, and existing embedding approaches cannot fully exploit this property when learning embedding vectors with deep neural networks. Graphs are universal data structures that represent complex systems and their interrelations, allowing both the properties of individual objects and the relationships between them to be modelled. Representing linguistic documents as graphs is a promising way to overcome some limitations of vector-space representations.
In order to fully capture and exploit a text's compositional and hierarchical structure, we propose to convert documents into heterogeneous multiplex graphs that introduce sequential, syntactic and semantic relations explicitly. In this presentation, I will show the results of applying graph-structure representations to the monolingual alignment task and their effectiveness in simple-complex sentence alignment.

Anastassia Shaitarova: The impact of machine-generated language on natural language: A corpus linguistic exploration of commercial MT systems

Machine Translation (MT) has become an integral part of daily life for millions of people, and the scope of human exposure to MT output is growing. Since MT output can be fluent, users often remain unaware that they are exposed to machine-produced text, and there is concern about the effects of increased MT exposure on human language. To address this problem, it is necessary to study the particularities of MT-produced texts.
Commercial MT technology advances continuously, necessitating regular updates in the evaluation of MT engines. We work with three publicly available systems (DeepL, Microsoft Azure, PONS) on several corpora from different domains and investigate their output using a number of established metrics. Additionally, we look at one specific lexical feature, namely the distribution of anglicisms in the German translations. We find that MT output still falls short of human translation in terms of lexical and syntactic diversity, though sometimes only marginally. PONS shows the highest lexical variability among the investigated commercial systems. DeepL uses the fewest anglicisms, fewer even than human translation.

Prof. Dr. Mascha Kurpicz-Briki: Natural Language Processing for Clinical Burnout Detection

In clinical intervention, burnout is identified using so-called inventories: psychological tests in which the person concerned fills out a questionnaire. This approach, currently used both in practice and in most studies, has some limitations. Due to the overhead of manual evaluation, the state of the art does not use free-text questions or interview transcripts, even though there have been promising approaches in the literature. In our research, we investigate how methods from natural language processing (NLP) can be applied to enable new directions in clinical psychology and psychiatry.