Non-autoregressive NMT models
Student: N.N.
Supervisor: Mathias Müller, Annette Rios
Introduction
Current NMT models are autoregressive during translation: they produce the translation one token at a time, and each generated token is fed back into the decoder as input while the next token is predicted. As a result, decoding latency grows roughly linearly with the length of the output sequence.
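As a minimal sketch of this decoding loop (with a hypothetical toy stand-in for a trained decoder, since no concrete model is specified here), greedy autoregressive decoding feeds each predicted token back in, so the number of sequential decoder calls equals the output length:

```python
import numpy as np

VOCAB_SIZE = 8
BOS_ID = 0
EOS_ID = 1

def decoder_step(prefix):
    # Hypothetical stand-in for a trained NMT decoder: maps the
    # prefix generated so far to logits over the target vocabulary.
    rng = np.random.default_rng(len(prefix))
    return rng.normal(size=VOCAB_SIZE)

def greedy_autoregressive_decode(max_len=20):
    prefix = [BOS_ID]
    calls = 0
    for _ in range(max_len):
        logits = decoder_step(prefix)  # one sequential decoder call
        calls += 1
        token = int(np.argmax(logits))
        prefix.append(token)           # feed the prediction back in
        if token == EOS_ID:
            break
    return prefix, calls

tokens, calls = greedy_autoregressive_decode()
# Latency is linear in output length: one decoder call per token.
assert calls == len(tokens) - 1
```

Each iteration must wait for the previous one, which is the source of the linear latency mentioned above.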
During training, in contrast, NMT models already operate non-autoregressively: with teacher forcing, all target positions are predicted in parallel from the ground-truth prefix. This suggests they could perhaps translate non-autoregressively as well. A naive solution would be to estimate the distributions over the target vocabulary for all positions of the target sequence at the same time. Example paper: https://arxiv.org/pdf/1711.02281.pdf
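A minimal sketch of this naive parallel scheme, assuming the target length is known in advance (the decoder stub below is again a hypothetical stand-in): a single pass produces logits for every position, and all tokens are read off at once with an argmax.

```python
import numpy as np

VOCAB_SIZE = 8

def parallel_decoder(target_len):
    # Hypothetical stand-in for a non-autoregressive decoder: one
    # call emits logits for all target positions simultaneously.
    rng = np.random.default_rng(0)
    return rng.normal(size=(target_len, VOCAB_SIZE))

def non_autoregressive_decode(target_len=6):
    logits = parallel_decoder(target_len)  # a single decoder call
    # argmax over the vocabulary axis decodes every position at once
    return logits.argmax(axis=-1).tolist()

tokens = non_autoregressive_decode()
assert len(tokens) == 6
```

Note that the per-position distributions are predicted independently of each other here, which tends to hurt translation quality; the paper linked above addresses this with fertility-based latent variables.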
Requirements
- completed the machine translation course
- familiarity with neural MT models
- (programming skills: Python)