Supervisors: Mathias Müller, Annette Rios
Coverage models are extensions to standard NMT models that are designed to improve the coverage of the translation. They penalize the model if a source token is left untranslated or if it is translated more than once. In other words, a coverage model constrains the decoder to attend to each position in the source sequence exactly once. See for example this paper for a more detailed description.
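To make the idea concrete, here is a minimal sketch in plain Python. It is not Sockeye's implementation: the function names `coverage_vector` and `coverage_penalty` and the squared-deviation penalty are assumptions chosen for illustration. The sketch sums the attention each source position receives over all decoding steps and penalizes deviations from 1.

```python
def coverage_vector(attention):
    """Sum the attention weights each source position receives.

    `attention` is a list of rows, one per target step; each row holds
    the attention weights over the source positions for that step.
    """
    if not attention:
        return []
    source_len = len(attention[0])
    return [sum(row[j] for row in attention) for j in range(source_len)]


def coverage_penalty(attention):
    """Illustrative penalty: squared deviation of each coverage value from 1.

    Values below 1 suggest an under-translated source token,
    values above 1 suggest a token translated more than once.
    """
    return sum((c - 1.0) ** 2 for c in coverage_vector(attention))


# Hypothetical attention matrix: 2 target steps, 3 source tokens.
# The first source token is attended to twice, the third one never.
attention = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
]

print(coverage_vector(attention))
print(coverage_penalty(attention))
```

In this toy example the first source token accumulates far more than 1 in total attention and the last one receives none, so the penalty is large. An attention matrix that attends to each source position exactly once would give a penalty of 0.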
The goal of this project is to analyze the different types of coverage models implemented in Sockeye. We have experimented with coverage models in a cross-domain setting, but did not observe any improvements. A further in-depth analysis should reveal whether coverage models can have a positive effect on translation quality.
To help with this task, a visualization tool is available that can automatically analyze attention scores.
- Familiarity with NMT and attention mechanisms