Spoken language chunking for Swiss German

Student: N.N.

Supervisor: Tanja Samardžić


Transcripts of spoken language typically contain no sentence boundaries, yet clauses must be identified in order to study syntactic phenomena. The goal of this thesis is to systematically analyse the existing segmentation of the ArchiMob corpus and to propose a method for regrouping its segments into clauses.


Requirements: familiarity with syntactic parsing.