
Advancing Natural Language Processing for Dialects and Linguistic Diversity

This project addresses challenges in natural language processing for dialects and linguistic diversity, with particular attention to low-resourced languages and varieties. We develop computational models that can effectively capture and process dialectal variation, focusing on innovative representation learning techniques and transfer methods for low-resource scenarios. Our approach combines advanced computational methods with linguistic insights to address four key challenges:

1. Capturing Nuances: Developing enhanced methods to detect and represent nuanced cross-dialectal variations
2. Dialectal Transfer: Creating robust techniques to leverage existing linguistic resources and tools for dialectal processing
3. Synthetic Data Generation: Designing novel data augmentation approaches that generate high-quality training data while preserving dialectal authenticity
4. Evaluation Frameworks: Developing new evaluation metrics that explicitly account for dialectal variation

Breaking the Monolith: Language Beyond the Standard

Our research outcomes are intended to make machine translation systems more resilient to dialectal variation, improve the ability of large language models to handle linguistic variation, and advance NLP applications for low-resourced languages and dialects.

Funding: This project is funded by the University of Zurich (UZH PostDoc grant) and runs for 24 months starting in November 2025.

Researchers: Sina Ahmadi, postdoc researcher mentored by Rico Sennrich

Join Us

We welcome students and researchers interested in contributing to this project. If you are passionate about linguistic diversity, low-resource NLP, or dialectal variation, we offer opportunities for Master's theses, Bachelor's theses, and programming projects. For current open opportunities and application details, please visit Sina Ahmadi's website or contact sina.ahmadi@uzh.ch.