Named-Entity Recognition (NER) for German

Student: N.N.

Supervisor: Manfred Klenner

Introduction

The aim of NER is to recognize names (of people, companies, products, places, etc.) in text. This requires combining various resources, such as name lists, Wikipedia data and morphological information, and training a supervised learner. A shared task on NER was held at KONVENS 2014, and annotated data for training and evaluation are available.

Aim and purpose

Detection and classification of named entities in German texts.
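
To make the task concrete, the sketch below shows one common way to represent the output of NER: each token receives a tag, and consecutive tags are grouped into typed entity spans. It assumes a BIO tagging scheme (B- begins an entity, I- continues it, O is outside any entity) and uses illustrative labels PER, ORG and LOC; the function name bio_to_spans and the example sentence are hypothetical and not taken from the shared task data.

```python
def bio_to_spans(tags):
    """Group a BIO tag sequence into (label, start, end) spans; end is exclusive."""
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        # The current entity continues only on an "I-" tag with the same label.
        inside = tag.startswith("I-") and label == tag[2:]
        if not inside:
            # Close the entity that was open, if any.
            if label is not None:
                spans.append((label, start, i))
                start, label = None, None
            # Open a new entity; a stray "I-" without "B-" is treated as a start.
            if tag.startswith("B-") or tag.startswith("I-"):
                start, label = i, tag[2:]
    # Close an entity that runs to the end of the sentence.
    if label is not None:
        spans.append((label, start, len(tags)))
    return spans


# Illustrative sentence with hand-assigned tags (not from the shared task data).
tokens = ["Angela", "Merkel", "besuchte", "Siemens", "in", "München", "."]
tags = ["B-PER", "I-PER", "O", "B-ORG", "O", "B-LOC", "O"]

entities = [(label, " ".join(tokens[s:e])) for label, s, e in bio_to_spans(tags)]
print(entities)
```

Marking corresponds to finding the span boundaries, and classification to assigning the label of each span.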

Requirements

  • Programming knowledge in Python
  • Course in quantitative methods in Computational Linguistics
  • Experience using Python modules for Wikipedia access and machine learning

Procedure

  • Preparation of the annotated data for training
  • Feature generation using various resources
  • Training and evaluation
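
The feature generation and evaluation steps above could be sketched as follows. This is a minimal illustration in plain Python, not a prescribed design: the feature set, the tiny name list (gazetteer), and the function names token_features and span_prf are assumptions made for the example, and exact-match scoring of spans is only one possible evaluation criterion, which may differ from the official shared task metric.

```python
def token_features(tokens, i, gazetteer):
    """Build a feature dictionary for the token at position i."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "is_capitalized": tok[:1].isupper(),
        "is_all_caps": tok.isupper(),
        "has_digit": any(c.isdigit() for c in tok),
        # A crude stand-in for morphological information.
        "suffix3": tok[-3:].lower(),
        # Lookup in a name list (hypothetical resource, lowercased entries).
        "in_gazetteer": tok.lower() in gazetteer,
        # Neighbouring tokens as context.
        "prev_lower": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next_lower": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def span_prf(gold, pred):
    """Exact-match precision, recall and F1 over (label, start, end) spans."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Illustrative data: a toy name list and one sentence.
gazetteer = {"siemens", "zürich"}
tokens = ["Er", "arbeitet", "bei", "Siemens", "in", "Zürich", "."]
feats = token_features(tokens, 3, gazetteer)
print(feats)

# One correct span and one span with the wrong label.
gold = [("ORG", 3, 4), ("LOC", 5, 6)]
pred = [("ORG", 3, 4), ("PER", 5, 6)]
print(span_prf(gold, pred))
```

Feature dictionaries of this kind can be passed to a supervised learner after vectorization. Because German capitalizes all nouns, capitalization alone is a weaker cue than in English, which is one reason to combine it with name lists and context features.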

Necessary resources

Literature

D. Benikova, C. Biemann, M. Reznicek. NoSta-D Named Entity Annotation for German: Guidelines and Dataset. In Proceedings of LREC 2014, Reykjavik, Iceland.