Study Advisor
For information about my role as study advisor for Computational Linguistics, please see the student consultation page.
Projects and Work
NFP 77 project "Task and Skill Profiles in the Digital Economy"; Sinergia project impresso; Citizen Linguistics Project on Swiss German Dialects and Swiss French Citizen Linguistics, www.tonaccent.ch, www.dindialaekt.ch; Exploitation of linguistically annotated multilingual multiparallel corpora: SPARCLING; Multilingual Sentiment Analysis: KTI project together with Eurospider; Biomedical Text Mining: MANTRA, SASEBio; Political Text Mining: COSA; German Sentiment Analysis: IGGSA; Sentence Extension Tests (SET); Web-based Virtual Laboratory for Computational Linguistics: CLab
Publications
See my publication lists on ORCID, Google Scholar, ACL Anthology, or ZORA.
NLP Shared Tasks activities
Shared tasks are a good way to collaborate with other researchers: they let students work on a well-defined task in a competitive setting and assess methodological competence in solving practical problems. From 2017 to 2026, we participated successfully in many shared tasks and helped to co-organize several as well:
- CLEF-HIPE 2026 (Identifying Historical People, Places, and other Entities): Evaluating Accurate and Efficient Person–Place Relation Extraction from Multilingual Historical Texts
- ICDAR-HIPE-2026 – LLM-Assisted OCR Post-Correction for Historical Documents: Shared Task on improving OCRized texts for information retrieval.
- CLEF-HIPE-2022 (Identifying Historical People, Places and other Entities): Shared Task on Named Entity Recognition and Linking in Multilingual Historical Documents.
- Second SIGMORPHON Shared Task on Grapheme-to-Phoneme Conversions 2021 for different resource settings and languages, in collaboration with Peter Makarov. We co-organized the shared task, annotated data, provided the baseline system, and also submitted a system that extends the baseline solution.
- CLEF-HIPE-2020 (Identifying Historical People, Places and other Entities) evaluation campaign on named entity processing on historical newspapers in French, German and English; co-organization of the task, providing a NER baseline system and the evaluation framework.
- CoNLL-SIGMORPHON-2020 Shared Task 1 on Multilingual Grapheme-to-Phoneme Conversion on 15 languages in collaboration with Peter Makarov; our neural solution achieved 2nd rank. Our solution is described in a short paper at the SIGMORPHON workshop.
- CoNLL-SIGMORPHON-2018 Shared Task on morphological inflection on 103 languages in collaboration with Peter Makarov; our neural solution achieved 1st rank in all settings in Task I and was competitive in Task II (our 2018 system paper). Our innovative solution for imitation-learning-based training was also published as a short paper at EMNLP 2018.
- VarDial Shared Task (co-located with EACL 2017) on identification of written Swiss German dialects (CGI) in collaboration with Peter Makarov; our solution achieved 3rd rank (our system paper).
- CoNLL-SIGMORPHON-2017 Shared Task on morphological inflection on 52 languages in collaboration with Tatyana Ruzsics and Peter Makarov; our neural solution achieved overall 1st rank (our system paper)
- ICDAR2017 Competition on Post-OCR Text Correction (English and French) in collaboration with Chantal Amrhein; our neural solution was the best performing system in the Error Correction Task and performed well on Error Detection (official competition paper)
- TAC KBP 2017 Event Nugget Task (Text Analysis Conference) (English, Spanish and Chinese) in collaboration with Peter Makarov; our neural solution ranked 1st for Spanish and Chinese in all subtasks, and 1st for English in the subtasks that included realis value predictions (our system paper)
Teaching
Lectures (to be updated):
- Text Mining: Spring 2020 to Spring 2026
- Machine Learning for Natural Language Processing I & II (MA): HS 2019 to Spring 2026
- Text Mining: Semantische Rollen und relationale Fakten (BA): FS 2019, FS 2017
- Deep Learning in der Sprachtechnologie (MA): HS 2018, HS 2016
- Sentimentanalyse und Media Monitoring (BA): FS 2018, FS 2016
- Aktuelle Forschungsmethodik in der Computerlinguistik (MA): HS 2015
- Maschinelle Lernverfahren für die Sprachtechnologie (MA): FS 2018, FS 2016, FS 2014
- Programmiertechniken in der CL I (BA): HS 2021, HS 2020, HS 2019, HS 2018 (partly), HS 2017 (partly), HS 2016 (partly), HS 2015 (partly), HS 2014 (partly), HS 2013 (partly), HS 2012 (partly), HS 2011 (partly)
- Topic "Automatische Erschliessungsverfahren" in the advanced module "Information Retrieval" of the MAS Bibliotheks- und Informationswissenschaften (FS 2011)
- Einführung in die Computerlinguistik I (BA): HS 2021 (partly), HS 2020 (partly), HS 2019 (partly), HS 2018 (partly), HS 2017 (partly), HS 2016 (partly), HS 2015 (partly), HS 2014 (partly), HS 2013 (partly), HS 2012 (partly), HS 2011 (partly), HS 2010, HS 2009, HS 2008, HS 2007, WS 2006
- Finite-State-Methoden in der Sprachtechnologie (BA/MA): FS 2019, FS 2017, FS 2015, FS 2013, FS 2011, FS 2010
- Morphologie und Lexikographie (FS 2009, FS 2008, SS 2007, SS 2006)
- Programmiertechniken in der CL (only final versions): WS 2005, SS 2005
Seminars and colloquia:
- FS 2015: Crowd-sourcing für Sprachtechnologie
- HS 2013: Modernes Information Retrieval und Computerlinguistik
- (involved as an assistant: SS 2005, SS 2003, SS 2201, SS 2000, WS 2000)
Further education: Lectures on Computational Linguistics and Text Mining in the MAS "Bibliotheks- und Informationswissenschaft" and the DAS "Datenmanagement und Informationstechnologien"
Presentations and Talks
(to be updated)
- COLING 2018: Neural Transition-based String Transduction for Limited-Resource Setting in Morphology
- KONVENS 2018: A Simple and Effective biLSTM Approach to Aspect-Based Sentiment Analysis in Social Media Customer Feedback
- NLPCS 2013: Disambiguation of the Semantics of German Prepositions: a Case Study
- RANLP 2013
- 2nd CALBC Workshop (16/17/18 March 2011, EBI, Hinxton): OntoGene at CALBC II and Some Thoughts on the Need of Document-Wide Harmonization
- "Electoral campaigns, relation mining, and dependency parsing: extracting semantic network data from Swiss newspaper articles", together with B. Wüest and D. Laupper, Computer-aided methods of textual analysis RECON WP 6 Workshop, Berlin, 27-28 May 2010
- WASSA 2010 (1st Workshop on Computational Approaches to Subjectivity and Sentiment Analysis) ECAI 2010: Evaluation and Extension of a Polarity Lexicon for German
- TACOS 2010: Erweiterung des Adjektivbestands eines Polaritätslexikons für Deutsch
- Lecture in the IR module of the MAS Bibliotheks- und Informationswissenschaften, 2009
- (Colloquium talk in Konstanz from July 2007)
- Koordination und syntaktische Disambiguierung
- Computerlinguistik in Information und Dokumentation (2006)
- Automatische Termextraktion (2004 Terminologiekurs ZHW)
- Markov Models And PoS Tagging with Markov Models (in English)
- Probabilistic Context-Free Grammars (in English)
- GermaNet und UniNet – Anknüpfen an semantische Netze