Spoken language chunking for Swiss German

Student: N.N.

Supervisor: Tanja Samardžić

Introduction

Transcripts of spoken language typically do not mark sentence boundaries. Studying syntactic phenomena, however, requires identifying clauses. The goal of this thesis is to systematically analyse the existing segmentation in the ArchiMob corpus and to propose a method for regrouping its segments into clauses.

Requirements

Familiarity with syntactic parsing.