Sentiment Analysis for Twitter: Supervised Machine Learning

Student: N/A

Supervisor: Manfred Klenner

Introduction

Tweets are a very special kind of text. Whether a tweet is positive, negative or neutral does not depend on its vocabulary alone: hashtags, emoticons and other creative constructs also carry sentiment. Supervised sentiment analysis of tweets requires feature vectors on which a machine learning model can be trained. The existing Python module Pattern can be used for this: it supports downloading tweets and provides various machine learning techniques.
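To illustrate what such a feature vector might look like, the following is a minimal sketch in plain Python 3 (it does not use Pattern). The function name `extract_features`, the feature names and the two tiny emoticon sets are assumptions made for this illustration only; choosing the actual features is part of the project.

```python
from collections import Counter

# Hypothetical, deliberately tiny emoticon sets (assumption for illustration).
POSITIVE_EMOTICONS = {":)", ":-)", ":D", ";)", "<3"}
NEGATIVE_EMOTICONS = {":(", ":-(", ":'(", ":/"}

PUNCTUATION = ".,!?;:\"'()"


def extract_features(tweet):
    """Map a tweet to a sparse feature vector (feature name -> count)."""
    features = Counter()
    for chunk in tweet.split():
        if chunk in POSITIVE_EMOTICONS:
            features["emoticon:positive"] += 1
        elif chunk in NEGATIVE_EMOTICONS:
            features["emoticon:negative"] += 1
        elif chunk.startswith("#"):
            # Hashtags are kept as their own feature type.
            tag = chunk[1:].strip(PUNCTUATION).lower()
            if tag:
                features["hashtag:" + tag] += 1
        elif chunk.startswith("@") or chunk.startswith("http"):
            # Mentions and links are ignored in this sketch.
            continue
        else:
            word = chunk.strip(PUNCTUATION).lower()
            if word:
                features["word:" + word] += 1
    exclamations = tweet.count("!")
    if exclamations:
        features["exclamations"] = exclamations
    return dict(features)
```

For example, `extract_features("Das Spiel war super :) #Freude!")` yields word features for the four words, one positive-emoticon feature, one hashtag feature and an exclamation count. A sparse dictionary is used here because most features are absent from any single tweet.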

Aim and purpose

Automatic classification of tweets as positive, negative or neutral, with German as the target language.

Requirements

  • Programming knowledge in Python
  • Foundational knowledge of machine learning, as covered in the quantitative methods course

Procedure

  • Familiarisation with sentiment analysis based on one or two articles
  • Analysis of tweets to decide which features the model should use
  • Implementation of feature extraction in Python
  • Machine learning and evaluation with Pattern
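The evaluation step could be sketched as follows. This is a plain Python 3 illustration that is independent of Pattern: it compares manually assigned (gold) labels with predicted labels and reports accuracy plus precision, recall and F1 per class. The function name `evaluate` and the label strings are assumptions made for this example.

```python
from collections import Counter


def evaluate(gold, predicted, labels=("positive", "negative", "neutral")):
    """Return (accuracy, {label: (precision, recall, f1)})."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label was wrong
            fn[g] += 1  # gold label was missed
    report = {}
    for label in labels:
        predicted_count = tp[label] + fp[label]
        gold_count = tp[label] + fn[label]
        precision = tp[label] / predicted_count if predicted_count else 0.0
        recall = tp[label] / gold_count if gold_count else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        report[label] = (precision, recall, f1)
    accuracy = sum(tp.values()) / len(gold) if gold else 0.0
    return accuracy, report
```

Per-class figures are reported in addition to accuracy because a classifier that mostly predicts the most frequent class can reach a high accuracy while failing on the other classes.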

Necessary resources

  • Pattern (Python module): Twitter access, machine learning, tagging, parsing. http://www.clips.ua.ac.be/pages/pattern
  • Approx. 7000 manually classified German tweets are available
  • A polarity dictionary is available
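One way the polarity dictionary might feed into the feature vectors is sketched below in plain Python 3. The format of the available dictionary is not specified here, so the mapping from lowercased word to a signed score, the four sample entries and the function name `polarity_features` are all assumptions made for illustration.

```python
# Hypothetical excerpt; the real dictionary's format and entries may differ.
POLARITY = {"super": 1.0, "toll": 1.0, "schlecht": -1.0, "furchtbar": -1.0}


def polarity_features(tokens, lexicon=POLARITY):
    """Summarise dictionary hits in a token list as three numeric features."""
    scores = [lexicon.get(token.lower(), 0.0) for token in tokens]
    return {
        "lex_positive": sum(1 for s in scores if s > 0),
        "lex_negative": sum(1 for s in scores if s < 0),
        "lex_score": sum(scores),
    }
```

These features can be merged with word, hashtag and emoticon features, so that the classifier can also benefit from words that never occur in the training tweets but are listed in the dictionary.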

Literature

Alexandra Balahur (2013). Sentiment Analysis in Social Media Texts. In: 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis.