Spoken language chunking for Swiss German
Student: N.N.
Supervisor: Tanja Samardžić
Introduction
Transcripts of spoken language typically lack sentence boundaries, yet syntactic phenomena can only be studied once clauses have been identified. The goal of this thesis is to systematically analyze the current segmentation of ArchiMob, a corpus of spoken Swiss German, and to propose a method for regrouping its segments into clauses.
Requirements
Familiarity with syntactic parsing.