Experiment with pretrained embeddings, e.g.:
- pretrain (subword) embeddings for NMT on monolingual data
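- initialize the NMT model with the pretrained embeddings and train normally (embeddings are updated)
- initialize the NMT model with the pretrained embeddings and freeze them during training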
- maybe: train with dual updates (more advanced experiment)
- evaluate influence of learning rate
- compare word vs. subword embeddings
- compare vanilla embeddings vs. NMT-trained embeddings
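
The "initialize and train" and "initialize and freeze" variants can be sketched in a few lines. This is only a minimal illustration in plain Python, not tied to any NMT toolkit: the names `build_embedding_matrix` and `sgd_step`, the uniform fallback range of ±0.1, and the toy vectors are all assumptions made for the example. In practice the toolkit's own embedding layer and optimizer would play these roles.

```python
import random


def build_embedding_matrix(vocab, pretrained, dim, seed=0):
    """Build one embedding row per vocabulary entry.

    Tokens found in `pretrained` (a dict token -> list of floats) are copied;
    all others get a small random vector (assumed range: uniform in [-0.1, 0.1]).
    Returns (matrix, covered), where covered[i] says whether row i was pretrained.
    """
    rng = random.Random(seed)
    matrix, covered = [], []
    for token in vocab:
        vec = pretrained.get(token)
        if vec is not None and len(vec) == dim:
            matrix.append(list(vec))  # copy so the pretrained dict is not mutated
            covered.append(True)
        else:
            matrix.append([rng.uniform(-0.1, 0.1) for _ in range(dim)])
            covered.append(False)
    return matrix, covered


def sgd_step(matrix, grads, lr, frozen=False):
    """Apply one plain SGD update in place; do nothing if the embeddings are frozen."""
    if frozen:
        return
    for row, grad_row in zip(matrix, grads):
        for j, g in enumerate(grad_row):
            row[j] -= lr * g


# Hypothetical toy data: three subword units, two of them covered by pretraining.
vocab = ["the", "cat", "s"]
pretrained = {"the": [1.0, 2.0], "s": [0.5, -0.5]}
matrix, covered = build_embedding_matrix(vocab, pretrained, dim=2)
print("pretrained coverage:", sum(covered), "of", len(vocab))
```

Calling `sgd_step(..., frozen=True)` corresponds to the frozen setup and `frozen=False` to normal training; the `lr` argument is where the learning-rate comparison would plug in. The coverage count is a cheap sanity check when comparing word and subword vocabularies.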