Analysis of Coverage in NMT

Student: N.N.

Supervisors: Mathias Müller, Annette Rios

Introduction

Coverage models are extensions of standard NMT models, designed to improve how completely a translation covers its source sentence. They penalize the model if a source token is left untranslated or is translated more than once. In other words, a coverage model constrains the decoder to attend to each position in the source sequence exactly once. See, for example, this paper for a more detailed description.
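The idea can be illustrated with a small sketch. It assumes a simplified view in which coverage is the attention mass each source position accumulates over all decoder steps, and in which the penalty is the absolute deviation of that mass from 1. The function names and the penalty formula are illustrative assumptions; this is not how Sockeye implements its coverage models.

```python
from typing import List


def coverage_vector(attention: List[List[float]]) -> List[float]:
    """Sum attention weights over decoder steps for each source position.

    attention[t][j] is the weight that decoder step t puts on source
    position j (each row is assumed to sum to 1).
    """
    if not attention:
        return []
    coverage = [0.0] * len(attention[0])
    for step_weights in attention:
        for j, weight in enumerate(step_weights):
            coverage[j] += weight
    return coverage


def coverage_penalty(attention: List[List[float]]) -> float:
    """Hypothetical penalty: total deviation of coverage from 1 per source token.

    Under-attended tokens (coverage < 1) and over-attended tokens
    (coverage > 1) both contribute to the penalty.
    """
    return sum(abs(1.0 - c) for c in coverage_vector(attention))


# Each source token attended exactly once: no penalty.
ideal = [[1.0, 0.0], [0.0, 1.0]]

# First token attended twice, third token never attended: penalized.
skewed = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0]]

print(coverage_penalty(ideal), coverage_penalty(skewed))
```

In this toy setting, a source position with accumulated attention well below 1 suggests a dropped token, and a position well above 1 suggests a repeated translation.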

The goal of this project is to analyze the different types of coverage models implemented in Sockeye. We experimented with coverage models in a cross-domain setting but did not observe any improvements. A more in-depth analysis should reveal whether or not coverage models can have a positive effect on translation quality.

To help with this task, a visualization tool is available that can automatically analyze attention scores.

Requirements

  • Python
  • Familiarity with NMT and attention