Nematus - an attention-based encoder-decoder model for neural machine translation
subword-nmt - scripts for subword segmentation that we use for neural machine translation, including byte-pair encoding (BPE).
Zmorge - Zurich Morphological Lexicon for German
Bleualign - an MT-based sentence alignment tool
clevertagger - morphologically informed POS-tagging
wmt2014-scripts - scripts and configuration files to (partially) reproduce systems submitted to the WMT2014/5 shared translation task for English-German.
Human parity evaluation - human judgements collected for evaluating whether NMT has achieved human parity in document-level evaluation.
ContraPro, a large-scale test set for the evaluation of context-aware pronoun translation in neural machine translation.
WMT 2017 systems - pre-trained neural models and training scripts for the WMT 2017 shared translation task.
ContraWSD, a test set for NMT evaluation of word sense disambiguation.
code docstring corpus, a parallel corpus of Python functions and documentation strings.
LingEval97, a test set of contrastive translation pairs for NMT evaluation.
WMT 2016 systems - pre-trained neural models for the WMT 2016 shared translation task.
WMT 2016 backtranslations - synthetic parallel data (back-translated monolingual data) used at WMT 2016.
WMT 2015 German treebank - dependency parses (with ParZu) of the WMT 2015 training data.