Nematus - an attention-based encoder-decoder model for neural machine translation
subword-nmt - subword segmentation scripts for neural machine translation, including byte-pair encoding (BPE).
Zmorge - Zurich Morphological Lexicon for German
clevertagger - morphologically informed POS-tagging
Bleualign - an MT-based sentence alignment tool
x-stance, a multilingual multi-target dataset for stance detection
ContraPro, a large-scale test set for the evaluation of context-aware pronoun translation in neural machine translation.
WMT 2017 systems - pre-trained neural models and training scripts for the WMT 2017 shared translation task.
ContraWSD, a test set for NMT evaluation of word sense disambiguation.
code docstring corpus, a parallel corpus of Python functions and documentation strings.
LingEval97, a test set of contrastive translation pairs for NMT evaluation.
WMT 2016 systems - pre-trained neural models for the WMT 2016 shared translation task.
WMT 2016 backtranslations - synthetic parallel data (back-translated monolingual data) used at WMT 2016.
WMT 2015 German treebank - dependency parses (with ParZu) of the WMT 2015 training data.