Supervisor: Manfred Klenner
Named entity recognition (NER) aims to recognize names (people, companies, products, places, etc.) in text. This requires combining various resources, e.g. name lists, Wikipedia data, and morphology, and training a supervised learner. A shared task on NER for German (GermEval 2014) was held at KONVENS 2014; annotated data for training and evaluation are available.
Aim and purpose
Marking and classification of named entities in texts.
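To illustrate what marking and classification amount to, the following minimal sketch turns token-level BIO labels (B-PER, I-PER, O, ...) into typed entity spans. It assumes a BIO labelling scheme; the function name bio_to_spans and the example sentence are hypothetical and not taken from the shared task data.

```python
def bio_to_spans(tokens, labels):
    """Collect (entity_text, entity_type) pairs from token-level BIO labels."""
    spans = []
    current_tokens, current_type = [], None

    def flush():
        # Store the entity collected so far, if there is one.
        if current_tokens:
            spans.append((" ".join(current_tokens), current_type))

    for token, label in zip(tokens, labels):
        if label.startswith("I-") and label[2:] == current_type:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        elif label.startswith("B-") or label.startswith("I-"):
            # A new entity starts (a stray I- tag is treated like a B- tag).
            flush()
            current_tokens, current_type = [token], label[2:]
        else:
            # "O": the token is outside any entity.
            flush()
            current_tokens, current_type = [], None
    flush()
    return spans


# Invented example sentence, for illustration only.
tokens = ["Anna", "Schmidt", "wohnt", "in", "Berlin", "."]
labels = ["B-PER", "I-PER", "O", "O", "B-LOC", "O"]
print(bio_to_spans(tokens, labels))
```

Comparing such spans between gold and predicted labels is one common basis for span-level evaluation.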
Prerequisites
- Programming knowledge in Python
- Course in quantitative methods in Computational Linguistics
Tasks
- Using a Python module for Wikipedia access and machine learning
- Preparation of the annotated data for the training
- Feature generation using various resources
- Training and evaluation
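The data preparation and feature generation steps could look roughly like the sketch below. It assumes a tab-separated input with one token per line (index, token, outer label, inner label), comment lines starting with "#", and blank lines between sentences; this should be checked against the downloaded files. The function names, the feature set, the tiny first-name list, and the sample sentence are all hypothetical.

```python
def read_sentences(lines):
    """Group tab-separated token lines into sentences of (token, label) pairs."""
    sentences, current = [], []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            continue  # comment/metadata line
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        cols = line.split("\t")
        current.append((cols[1], cols[2]))  # token and outer label only
    if current:
        sentences.append(current)
    return sentences


def token_features(tokens, i, first_names):
    """Build a feature dictionary for the token at position i."""
    token = tokens[i]
    return {
        "lower": token.lower(),
        "is_capitalized": token[:1].isupper(),
        "suffix3": token[-3:].lower(),           # crude morphological cue
        "in_first_names": token in first_names,  # gazetteer lookup
        "prev_lower": tokens[i - 1].lower() if i > 0 else "<s>",
        "next_lower": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
    }


# Invented sample in the assumed format, for illustration only.
sample = """#\thypothetical example sentence
1\tAnna\tB-PER\tO
2\tSchmidt\tI-PER\tO
3\twohnt\tO\tO
4\tin\tO\tO
5\tBerlin\tB-LOC\tO
6\t.\tO\tO
"""

first_names = {"Anna"}  # stands in for a real first-name list
sentences = read_sentences(sample.splitlines())

X, y = [], []
for sentence in sentences:
    tokens = [token for token, _ in sentence]
    for i, (_, label) in enumerate(sentence):
        X.append(token_features(tokens, i, first_names))
        y.append(label)

print(X[0])
print(y)
```

The resulting feature dictionaries and labels can then be passed to whichever supervised learner is chosen for training; further resources such as Wikipedia lookups would simply add more keys to the dictionary.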
Resources
- GermEval 2014 Named Entity Recognition: https://sites.google.com/site/germeval2014ner/data
- First-name lists (Vornamensliste)
- Python module Pattern, a web mining module for the Python programming language (Twitter access, machine learning, tagging, parsing, etc.): http://www.clips.ua.ac.be/pages/pattern
Literature
D. Benikova, C. Biemann, M. Reznicek. NoSta-D Named Entity Annotation for German: Guidelines and Dataset. Proceedings of LREC 2014, Reykjavik, Iceland.