Supervisors: Mathias Müller, Annette Rios
Current NMT models are so-called encoder-decoder models. They have a bipartite structure with a clear division of labour: the encoder reads an input sentence in the source language, and the decoder generates a sentence in the target language.
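The two-stage structure can be sketched in a few lines. This is a toy illustration only (a dummy word-for-word lexicon stands in for a trained model; all names and entries are made up), but it shows the key property: the encoder consumes the entire source sentence before the decoder emits its first target token.

```python
def encode(source_tokens):
    """Read the whole source sentence and return a representation of it.
    In a real NMT model this would be a sequence of hidden states;
    here it is simply the token list itself."""
    return list(source_tokens)

def decode(encoding, lexicon):
    """Generate target tokens one by one, conditioned on the full encoding.
    Note that decoding cannot begin until encode() has returned."""
    return [lexicon.get(tok, tok) for tok in encoding]

# Hypothetical German->English lexicon, for illustration only.
lexicon = {"das": "the", "haus": "haus" and "house", "ist": "is", "klein": "small"}
lexicon = {"das": "the", "haus": "house", "ist": "is", "klein": "small"}

print(decode(encode(["das", "haus", "ist", "klein"]), lexicon))
# ['the', 'house', 'is', 'small']
```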
Crucially, the decoder does not start to produce a translation until the encoder has read the entire source sentence. This results in high latency and may not be necessary. The authors of https://arxiv.org/pdf/1810.13409.pdf propose an eager translation model that starts outputting tokens of the translation immediately. Its translation quality is similar to that of an encoder-decoder reference model.
Still, the proposed model is not fully simultaneous: it relies on beam search, which delays output until several competing hypotheses have been scored. Doing away with beam search would lower latency considerably further. In this project, you will work on ways to remove beam search without reducing translation quality.
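The latency/quality trade-off at the heart of the project can be illustrated with a toy next-token model (all tokens and probabilities below are invented for illustration). Greedy decoding commits to one token per step and can emit it immediately; beam search keeps several partial hypotheses and only settles on an output once the search finishes, which is why it can find a higher-scoring translation that greedy misses.

```python
import math

# Toy next-token model: log-probabilities of the next token given the
# previous one. Chosen so that greedy and beam search disagree.
LOGP = {
    "<s>": {"a": math.log(0.55), "b": math.log(0.45)},
    "a":   {"x": math.log(0.3),  "</s>": math.log(0.7)},
    "b":   {"x": math.log(0.9),  "</s>": math.log(0.1)},
    "x":   {"</s>": math.log(1.0)},
}

def greedy(max_len=5):
    """Pick the single most likely next token at every step.
    One hypothesis, emitted as soon as it is chosen: minimal latency."""
    seq = ["<s>"]
    while seq[-1] != "</s>" and len(seq) < max_len:
        dist = LOGP[seq[-1]]
        seq.append(max(dist, key=dist.get))
    return seq

def beam(k=2, max_len=5):
    """Keep the k highest-scoring partial hypotheses at each step.
    The final output is known only after the whole search finishes."""
    beams = [(0.0, ["<s>"])]
    for _ in range(max_len - 1):
        candidates = []
        for score, seq in beams:
            if seq[-1] == "</s>":
                candidates.append((score, seq))  # finished: carry over
                continue
            for tok, lp in LOGP[seq[-1]].items():
                candidates.append((score + lp, seq + [tok]))
        beams = sorted(candidates, key=lambda c: -c[0])[:k]
    return beams[0][1]

print(greedy())  # ['<s>', 'a', '</s>']          -- locally best first token
print(beam())    # ['<s>', 'b', 'x', '</s>']     -- globally higher-scoring
```

Here the total probability of `b x </s>` (0.45 * 0.9 = 0.405) beats `a </s>` (0.55 * 0.7 = 0.385), so beam search wins on score while greedy wins on latency; closing this gap without beam search is exactly the problem the project tackles.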
- Familiarity with NMT and attention mechanisms