Resources

Mostly an unordered collection of teaching material, writing, and code. Feel free to poke around!

If you use any of the material below, please cite this source. Thanks!

Educational Material


Introduction to Neural Networks

I recently gave a general introduction to feed-forward neural networks. Mostly technical: bare-bones explanations and code, no high-level libraries.

Slide sets | Colab notebooks


YouTube Playlist: Gradient-based Learning

I recently started a YouTube channel! There are three videos so far, originally made for my students in a 2019 course. The videos are meant to give an intuition for what gradient-based learning is.

Watch Playlist
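The idea the videos build intuition for can be sketched in a few lines: repeatedly step against the gradient of a function to find its minimum. A minimal, hypothetical example (the function, step size, and names are my own choices, not taken from the videos):

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Minimize a function by repeatedly stepping against its gradient."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)  # move downhill, scaled by the learning rate
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(minimum)  # converges to 3
```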


Introduction to Machine Learning

Selected slide sets and exercises from my introductory course on machine learning. Requirements: high-school math and statistics, and basic Python programming. This course was taught together with Phillip Ströbel - thanks, Phillip!

  Topic                                               Slide set              Exercises Notebook
1 Basic concepts of machine learning                  1.pdf (PDF, 8624 KB)   1.ipynb (IPYNB, 239 KB)
2 First classification algorithm: KNN                 2.pdf (PDF, 4198 KB)   2.ipynb (IPYNB, 9 KB)
3 First regression algorithm: linear regression       3.pdf (PDF, 1396 KB)   3.ipynb (IPYNB, 51 KB)
4 Cross-validation, Hyperparameter Search             4.pdf (PDF, 9472 KB)   4.ipynb (IPYNB, 19 KB)
5 Feature Extraction                                  5.pdf (PDF, 3070 KB)   5.ipynb (IPYNB, 21 KB)
6 Overview of important classification algorithms     6.pdf (PDF, 2437 KB)   6.ipynb (IPYNB, 11 KB)
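As a taste of what the exercises cover, the k-nearest-neighbour classifier from topic 2 fits into a few lines of plain Python. This is a generic sketch of the algorithm, not the course's reference solution; all names and the toy data are my own:

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    dists = sorted(
        (math.dist(p, x), label) for p, label in zip(train_X, train_y)
    )
    top_labels = [label for _, label in dists[:k]]
    return Counter(top_labels).most_common(1)[0][0]

X = [(0, 0), (0, 1), (5, 5), (6, 5)]
y = ["a", "a", "b", "b"]
print(knn_predict(X, y, (0.5, 0.5)))  # → a
```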


Introduction to Neural Machine Translation

Selected slide sets and exercises from my introductory course on neural machine translation. Requirements: fundamentals of machine learning, high-school math, statistics and basics of Python programming.

Some of the materials were developed together with Samuel Läubli.

   Topic                                              Slide set
 1 Introduction                                       1.pdf (PDF, 14092 KB)
 2 Evaluation (this slide set by Samuel Läubli)       2.pdf (PDF, 3341 KB)
 3 Preprocessing                                      3.pdf (PDF, 3398 KB)
 4 Statistical Machine Translation                    4.pdf (PDF, 5935 KB)
 5 Linear Algebra, Differential Calculus              5.pdf (PDF, 4817 KB)
 6 Linear Models                                      6.pdf (PDF, 3708 KB)
 7 Feed-forward Neural Networks                       7.pdf (PDF, 7416 KB)
 8 Recurrent Neural Networks                          8.pdf (PDF, 4962 KB)
 9 Tensorflow                                         9.pdf (PDF, 4137 KB)
10 Encoder-Decoder Models                             10.pdf (PDF, 8732 KB)
11 Attention Networks                                 11.pdf (PDF, 5840 KB)
12 Decoding (this slide set by Samuel Läubli)         12.pdf (PDF, 663 KB)
13 Current Research / Recent Improvements             13.pdf (PDF, 4667 KB)
14 Summary                                            14.pdf (PDF, 4260 KB)
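As a pointer to what topic 11 covers: the core of an attention mechanism is a softmax over similarity scores between a decoder query and the encoder states, used to form a weighted average of those states. A minimal, library-free sketch of the dot-product variant (my own toy formulation, not the slides' exact notation):

```python
import math

def attend(query, encoder_states):
    """Dot-product attention: softmax-weighted average of encoder states."""
    # Similarity score of the query with each encoder state.
    scores = [sum(q * s for q, s in zip(query, state)) for state in encoder_states]
    # Softmax (shifted by the max for numerical stability).
    exp_scores = [math.exp(sc - max(scores)) for sc in scores]
    total = sum(exp_scores)
    weights = [e / total for e in exp_scores]
    # Context vector: attention-weighted sum of the encoder states.
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(len(query))
    ]
    return weights, context

weights, context = attend([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```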

Educational NMT Tool "daikon"

Try our educational (= slow, unstable, but insightful) NMT tool, daikon. It's written in Tensorflow, and you will need a GPU to train models. The main authors are Samuel Läubli and me.

GitHub Repositories


WhatsApp Author Identification

Take text messages from your favorite WhatsApp group and train a system that identifies which of your friends wrote a given message!
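The underlying task is plain text classification. A deliberately naive sketch of the idea (unigram counts with a simple overlap score; the actual project's model and names will differ):

```python
from collections import Counter, defaultdict

def train(messages):
    """messages: list of (author, text) pairs. Returns per-author word counts."""
    counts = defaultdict(Counter)
    for author, text in messages:
        counts[author].update(text.lower().split())
    return counts

def predict(counts, text):
    """Score each author by how often they used the message's words."""
    words = text.lower().split()
    return max(counts, key=lambda a: sum(counts[a][w] for w in words))

model = train([("alice", "see you at the gym"), ("bob", "the deadline is tonight")])
print(predict(model, "going to the gym"))  # → alice
```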


Recipes for Sentence Classification with DyNet

Code that exemplifies neural-network solutions to classification tasks with DyNet. On top of that, the code demonstrates how to implement a custom classifier compatible with scikit-learn's API.
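The scikit-learn compatibility mentioned above boils down to a small interface contract: hyperparameters set only in `__init__`, `fit(X, y)` returning `self`, `predict(X)`, and `get_params`/`set_params`. A library-free sketch of that contract with a trivial majority-class model (the DyNet code in the repository is of course more involved; the class and parameter names here are my own):

```python
from collections import Counter

class MajorityClassifier:
    """Minimal classifier following scikit-learn's estimator conventions."""

    def __init__(self, smoothing=0.0):
        self.smoothing = smoothing  # hyperparameters are stored in __init__ only

    def fit(self, X, y):
        # Learned attributes get a trailing underscore, as in scikit-learn.
        self.majority_ = Counter(y).most_common(1)[0][0]
        return self  # fit must return self so calls can be chained

    def predict(self, X):
        return [self.majority_ for _ in X]

    def get_params(self, deep=True):
        return {"smoothing": self.smoothing}

    def set_params(self, **params):
        for name, value in params.items():
            setattr(self, name, value)
        return self

clf = MajorityClassifier().fit([[0], [1], [2]], ["spam", "ham", "spam"])
print(clf.predict([[9]]))  # → ['spam']
```

Sticking to this contract is what lets a custom model plug into scikit-learn utilities such as cross-validation and grid search.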


RNN Recipes

Forward passes of several flavours of recurrent neural networks, implemented in Numpy and Tensorflow.
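The simplest of these flavours, a vanilla (Elman) RNN, computes h_t = tanh(W_xh x_t + W_hh h_{t-1} + b) at each time step. A minimal sketch of that forward pass in plain Python (my own toy weights and names, not the recipes' code):

```python
import math

def rnn_forward(inputs, W_xh, W_hh, b_h):
    """Vanilla RNN forward pass over a sequence of input vectors."""
    h = [0.0] * len(b_h)  # initial hidden state: zeros
    states = []
    for x in inputs:
        h = [
            math.tanh(
                sum(W_xh[i][j] * x[j] for j in range(len(x)))    # input term
                + sum(W_hh[i][j] * h[j] for j in range(len(h)))  # recurrent term
                + b_h[i]
            )
            for i in range(len(b_h))
        ]
        states.append(h)
    return states

# Toy dimensions: 2-dim inputs, 2-dim hidden state.
states = rnn_forward(
    inputs=[[1.0, 0.0], [0.0, 1.0]],
    W_xh=[[1.0, 0.0], [0.0, 1.0]],
    W_hh=[[0.5, 0.0], [0.0, 0.5]],
    b_h=[0.0, 0.0],
)
```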


Moses Scripts Only

A stripped-down version of the Moses repository, containing only the preprocessing scripts that most people still use.


Feed-forward neural networks with Numpy

Implementation of feed-forward networks using only Numpy. Thanks, Joel, for the idea!
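A forward pass in this style takes only a few lines of Numpy: each layer is an affine transform followed by a nonlinearity. A minimal sketch of the idea (my own toy layer sizes, activation, and names, not the repository's code):

```python
import numpy as np

def forward(x, layers):
    """Forward pass: each layer is a (weight matrix, bias vector) pair."""
    for W, b in layers:
        x = np.tanh(W @ x + b)  # affine transform, then nonlinearity
    return x

rng = np.random.default_rng(0)
layers = [
    (rng.standard_normal((4, 3)), rng.standard_normal(4)),  # 3 -> 4 units
    (rng.standard_normal((2, 4)), rng.standard_normal(2)),  # 4 -> 2 units
]
out = forward(np.array([1.0, -1.0, 0.5]), layers)
```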


Daikon Toy Models

Scripts that show how to train and use models with daikon, our educational NMT system (https://github.com/zurichnlp/daikon).


Sockeye Toy Models

Scripts that show how to train and use models with Sockeye.

Writing, Talks, Thoughts


Guide to Scientific Writing (PDF, 265 KB)

Guide to writing a scientific thesis. Caution and disclaimer: this guide is unfinished, and I will probably never get around to working on it again (or write a thesis the way I suggest in it! :-).


Crowd-sourcing and English-centric research in NLP (PDF, 1675 KB)

Thoughts on whether crowd-sourcing facilitates research in non-English NLP. Presented at the Jožef Stefan Institute, Ljubljana.


Report on Feasibility of Stand-off Markup in TEI Documents (PDF, 307 KB)

Technical report on how exactly to organize Text+Berg annotation layers into several XML files.


Cost-effectiveness of Games with a Purpose for Collecting NLP Annotations (PDF, 5187 KB)

Are games with a purpose a cost-effective method of collecting annotations for NLP research?


Treatment of Aphasia with Melodic Intonation Therapy in Tone Languages (PDF, 199 KB)

A seminar paper describing the setup and premise for an experiment that would investigate the merit of melodic intonation therapy to treat aphasia in speakers of tone languages.


Acquisition of Negation in English (link coming soon)

Discussing hypotheses about the acquisition of negation by English speakers.


Schreien im Labor (PDF, 6156 KB)

Summary of our research at the Phonetics Lab, presented at a conference for acousticians. In German ("Schreien im Labor" means "Screaming in the Lab").


Classifying Audience Reactions from Text (PDF, 270 KB)

Using the awesome CORPS corpus of speeches to classify text into audience reactions: for instance, look at a piece of text and try to determine whether the audience laughed after hearing it. The methods in the paper are questionable, but I think the idea itself is valid and still largely uncharted territory.


Typology of Nominal Plural Marking (link coming soon)

Looking at a sample of typologically diverse languages to analyze whether and how they mark plural on nominal constructions.