Non-autoregressive NMT models
Student: N.N.
Supervisor: Mathias Müller, Annette Rios
Introduction
Current NMT models are autoregressive during translation: they produce the translation one token at a time, and each generated token is fed back into the decoder as input while the next token is predicted. As a result, decoding latency grows roughly linearly with the length of the output sequence.
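As a minimal sketch of this decoding loop (with a hypothetical toy stand-in for a trained decoder, since no concrete model is specified here), greedy autoregressive decoding feeds each predicted token back in, so the number of sequential decoder calls equals the output length:

```python
import numpy as np

VOCAB_SIZE = 8
BOS_ID = 0
EOS_ID = 1

def decoder_step(prefix):
    # Hypothetical stand-in for a trained NMT decoder: maps the
    # prefix generated so far to logits over the target vocabulary.
    rng = np.random.default_rng(len(prefix))
    return rng.normal(size=VOCAB_SIZE)

def greedy_autoregressive_decode(max_len=20):
    prefix = [BOS_ID]
    calls = 0
    for _ in range(max_len):
        logits = decoder_step(prefix)  # one sequential decoder call
        calls += 1
        token = int(np.argmax(logits))
        prefix.append(token)           # feed the prediction back in
        if token == EOS_ID:
            break
    return prefix, calls

tokens, calls = greedy_autoregressive_decode()
# Latency is linear in output length: one decoder call per token.
assert calls == len(tokens) - 1
```

Each iteration must wait for the previous one, which is the source of the linear latency mentioned above.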
During training, in contrast, NMT models already operate non-autoregressively: with teacher forcing, all target positions are predicted in parallel from the ground-truth prefix. This suggests they could perhaps translate non-autoregressively as well. A naive solution would be to estimate the distributions over the target vocabulary for all positions of the target sequence at the same time. Example paper: https://arxiv.org/pdf/1711.02281.pdf
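A minimal sketch of this naive parallel scheme, assuming the target length is known in advance (the decoder stub below is again a hypothetical stand-in): a single pass produces logits for every position, and all tokens are read off at once with an argmax.

```python
import numpy as np

VOCAB_SIZE = 8

def parallel_decoder(target_len):
    # Hypothetical stand-in for a non-autoregressive decoder: one
    # call emits logits for all target positions simultaneously.
    rng = np.random.default_rng(0)
    return rng.normal(size=(target_len, VOCAB_SIZE))

def non_autoregressive_decode(target_len=6):
    logits = parallel_decoder(target_len)  # a single decoder call
    # argmax over the vocabulary axis decodes every position at once
    return logits.argmax(axis=-1).tolist()

tokens = non_autoregressive_decode()
assert len(tokens) == 6
```

Note that the per-position distributions are predicted independently of each other here, which tends to hurt translation quality; the paper linked above addresses this with fertility-based latent variables.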
Requirements
- completed the machine translation course
- familiarity with neural MT models
- (programming skills: Python)