Open Topics for BA and MA Theses as well as Programming Projects

Open topics for final theses

Open topics for programming projects


Automatic Classification of Large Text Collections (Gerold Schneider)

  • Introduction: The aim of this programming project or bachelor thesis is to classify large numbers of newspaper articles. The project is conducted in cooperation with a Swiss media database. Interfaces provide the documents in XML or JSON format. Open-source tools such as WEKA, RapidMiner, Date or others are to be examined.
  • In addition to testing various algorithms and methods, the project focuses on evaluating them. The media database provides a comprehensive gold standard. Depending on the time available, the standard solutions can also be specifically adapted.
  • Aim and Purpose: Testing open-source tools for document classification
  • Evaluation
  • Adjustments, if necessary
  • If the task is successfully completed, it may lead to the development of a new solution with our industrial partner.
  • Requirements: Programming skills in Python, Perl or Java. Experience with XML and tools like WEKA is an advantage, but not mandatory. An interest in automatic media content analysis is also recommended.
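
To give a rough idea of the evaluation step in this project, the following minimal Python sketch compares predicted article categories against gold-standard labels and reports precision, recall and F1 per category. The function name `evaluate`, the category names and the sample labels are hypothetical illustrations; they do not come from the media database or from any of the tools mentioned above.

```python
from collections import Counter


def evaluate(gold, predicted):
    """Return ({label: (precision, recall, f1)}, accuracy) for two label lists."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correct prediction for label g
        else:
            fp[p] += 1   # label p was predicted wrongly
            fn[g] += 1   # label g was missed

    scores = {}
    for label in sorted(set(gold) | set(predicted)):
        predicted_count = tp[label] + fp[label]
        gold_count = tp[label] + fn[label]
        precision = tp[label] / predicted_count if predicted_count else 0.0
        recall = tp[label] / gold_count if gold_count else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f1)

    accuracy = sum(tp.values()) / len(gold) if gold else 0.0
    return scores, accuracy


# Hypothetical toy data: five articles with assumed category names.
gold = ["politics", "sports", "sports", "economy", "politics"]
predicted = ["politics", "sports", "economy", "economy", "sports"]

scores, accuracy = evaluate(gold, predicted)
for label, (precision, recall, f1) in scores.items():
    print(f"{label}: P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
print(f"accuracy: {accuracy:.2f}")
```

In an actual project, the predictions would come from the classifiers under test, and the same comparison could be repeated for each tool to make the results comparable.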


Sign Language (Sarah Ebling)

Individual topics on request


Sentiment Analysis (Manfred Klenner)

Individual topics, e.g.:

  • Sentiment Inference and Attitude Prediction
  • Sentiment Analysis for Social Media
  • Sentiment Analysis in literary texts
  • Distributional Semantics for Sentiment Analysis
  • Text classification: identifying controversial texts
  • Sentiment Analysis: modality and negation
  • Hate speech detection

Contact persons for specific topics:

  • Language Technology for Heritage Texts (Martin Volk)
  • Machine Translation (Martin Volk, Rico Sennrich, Annette Rios, Samuel Läubli, Mathias Müller)
  • Toponym Recognition (Martin Volk)
  • Speech (Volker Dellwo)
  • Biomedical Text Mining (Fabio Rinaldi)
  • Machine Learning (Simon Clematide)
  • Large News Corpora (Simon Clematide)
  • Digital Humanities (Gerold Schneider, Simon Clematide)
  • Morphology (Simon Clematide, Tanja Samardžić)
  • Corpus Linguistics (Gerold Schneider, Tanja Samardžić)
  • CALL (Computer-assisted Language Learning) (Gerold Schneider, Manfred Klenner)
  • Automatic Processing of Sign Language (Sarah Ebling)
  • Language Technology for Accessibility (Sarah Ebling)
  • Processing of Swiss German (Tanja Samardžić)
  • Coreference Resolution (Manfred Klenner)
  • Argumentation Mining (Manfred Klenner)
  • Sentiment Analysis (SA) (Manfred Klenner)