Towards automatic subtitling for persons with hearing impairments
Supervisor(s): Prof. Dr. Sarah Ebling, Peter Jud (ZHAW)
Summary
While automatic transcription via automatic speech recognition (ASR) works fairly well for many languages, automatic subtitling is a special case of this task that additionally involves spotting (determining the start and end times of a subtitle) based on audio, speaker identification, music and sound recognition, segmentation, etc.
The aim of this programming project is to establish a pipeline for semi-automatic subtitling for persons with hearing impairments that employs both rule-based and deep-learning-based components and is tested in a real-world scenario.
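To illustrate one of the subtasks mentioned above, the following is a minimal rule-based sketch of spotting and segmentation: grouping hypothetical word-level ASR timestamps into subtitle blocks. The input format, character limit, and duration limit are illustrative assumptions, not part of the project specification.

```python
# Minimal sketch: deriving subtitle timings ("spotting") from hypothetical
# word-level ASR output (word, start_sec, end_sec). Limits are assumptions.

MAX_CHARS = 37      # common guideline for one subtitle line
MAX_DURATION = 6.0  # maximum seconds a subtitle stays on screen

def spot_subtitles(words):
    """Group (word, start, end) triples into subtitle blocks.

    A new block starts when adding a word would exceed the character
    limit or the maximum on-screen duration.
    """
    blocks, current = [], []
    for word, start, end in words:
        if current:
            text_len = len(" ".join(w for w, _, _ in current)) + 1 + len(word)
            duration = end - current[0][1]
            if text_len > MAX_CHARS or duration > MAX_DURATION:
                blocks.append(current)
                current = []
        current.append((word, start, end))
    if current:
        blocks.append(current)
    # Each subtitle: (text, start of first word, end of last word)
    return [(" ".join(w for w, _, _ in b), b[0][1], b[-1][2]) for b in blocks]

# Example with invented timestamps:
words = [("Automatic", 0.0, 0.6), ("subtitling", 0.6, 1.2),
         ("is", 1.2, 1.4), ("more", 1.4, 1.7), ("than", 1.7, 1.9),
         ("plain", 1.9, 2.3), ("transcription", 2.3, 3.1)]
print(spot_subtitles(words))
```

In the envisioned pipeline, rule-based heuristics like this could serve as a baseline, with deep-learning-based components replacing individual steps such as audio-based spotting.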
Requirements
Very good Python skills; familiarity with and interest in training deep learning models to solve some of the subtasks (e.g., spotting based on audio)