Text Technology/Digital Linguistics Colloquium FS 2026

Time & Location: every two weeks on Tuesdays, 10:15 am to 12:00 pm, in room BIN-2-A.10.

Online participation is also possible via the MS Teams team "CL Colloquium".

Responsible: Jason Armitage

Colloquium Schedule

17.02.2026  Ahmet Yavuz Uluslu, Rajiv Bains
03.03.2026  Fatma-Zohra Rezkellah (Rania), Hanxu Hu
17.03.2026  Sophia Conrad, Kirill Semenov
31.03.2026  Chuqiao Yan, Sina Ahmadi
14.04.2026  Janis Goldzycher, Yang Tian
28.04.2026  Deborah Jakobi, Patrick Giedemann
12.05.2026  Lena Bolliger, Thyra Krosness
26.05.2026  Anja Ryser, Pius von Däniken

17 Feb 2026

Ahmet Yavuz Uluslu: Forensic Agents: Authorship Analysis in the AI Era
While large language models demonstrate remarkable potential for automating forensic text analysis, they remain prone to hallucinations, reasoning shortcuts, and bias. We examine how LLM agents can address these failure modes. We discuss specific AI alignment approaches and agentic workflows designed to transform opaque predictions into a transparent, reproducible, and robust procedure.

Rajiv Bains: Adaptive Neuromorphic Sonification of Data for BLV Users
Data sonification, the use of non-speech audio to represent quantitative information, offers a promising modality for making data accessible to blind and low-vision (BLV) users. However, existing sonification tools typically conflict with screen readers, as sonification audio competes with the text-to-speech (TTS) output that BLV users rely on to navigate digital interfaces. This project addressed the challenge of multimodal coordination in assistive technology by developing and evaluating a macOS/VoiceOver-compatible sonification system that synchronises non-speech audio with TTS to prevent auditory masking and announcement delays. The system employs a dual-level spiking neural network (SNN) architecture informed by neuromorphic computing principles: a 16-neuron real-time layer embedded in the audio callback loop modulates synthesis parameters, while a 64-neuron batch ‘dreaming’ optimiser proposes parameter updates based on interaction logs. This architecture raises questions relevant to computational approaches in the humanities: how can adaptive systems learn from sparse, qualitative user feedback, and what are the trade-offs between algorithmic optimisation and human-centred design?
Following a design-based research methodology, the work proceeded through two evaluation phases: a pilot study with an expert BLV participant revealed critical failure modes (speech–audio overlap, queued announcements, keyboard focus conflicts), prompting iterative redesign before a main study with a non-expert BLV participant. Results demonstrate that enforcing temporal exclusivity between TTS and sonification enabled sustained data exploration, whereas pilot system failures prevented meaningful interaction. A key finding concerns the tension between optimisation for behavioural efficiency and listening comfort: SNN-optimised parameters increased perceptual contrast but were reported as more fatiguing. This talk will present the coordination mechanisms developed for integrating sonification with screen readers, demonstrate the neuromorphic-inspired adaptation architecture, and discuss design implications for accessible data representation.
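
The key coordination invariant described above, that sonification audio and screen-reader speech never overlap in time, can be illustrated with a minimal sketch. All names here are hypothetical; the actual system enforces this inside a macOS audio callback rather than with a lock:

```python
import threading

class ExclusiveAudioChannel:
    """Toy model of temporal exclusivity between TTS announcements
    and sonification: only one may hold the audio channel at a time,
    so tones never mask speech and announcements are never queued
    behind ongoing audio."""

    def __init__(self):
        self._channel = threading.Lock()
        self.events = []  # ordered record of what was rendered

    def announce(self, text):
        # A TTS announcement takes the channel exclusively.
        with self._channel:
            self.events.append(("tts", text))

    def sonify(self, value):
        # Sonification blocks until no announcement is in flight.
        with self._channel:
            self.events.append(("tone", value))
```

In a real audio pipeline the lock would be replaced by callback-level scheduling, but the invariant is the same: speech and non-speech audio are strictly serialized.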

3 Mar 2026

Fatma-Zohra Rezkellah (Rania): Argument mining with LLMs: From Prompting to Policy Optimization
Argument Mining (AM) — extracting argumentative units and their relations — is key to modeling human reasoning. We study how LLMs perform AM when it is framed as an end-to-end generation task producing structured argument graphs. We compare few-shot prompting, supervised fine-tuning (SFT), and reinforcement fine-tuning (RFT) across three datasets (AAE, CMV, and AbstRCT), benchmarking against encoder-based discriminative baselines. Our results show that SFT improves over prompting but still struggles with edge prediction, while RFT is more robust across domains despite lower in-domain performance. Relation classification under domain shift remains the hardest challenge. Beyond these findings, this work opens a path for applying reinforcement learning to argument mining.
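
To make the end-to-end framing concrete: a generated argument graph can be represented as typed components plus typed relations, and a generation's output checked for well-formedness before scoring. The schema below is an illustrative sketch, not the authors' actual format:

```python
# Hypothetical serialization of a generated argument graph.
ARGUMENT_TYPES = {"MajorClaim", "Claim", "Premise"}
RELATION_TYPES = {"Support", "Attack"}

def validate_argument_graph(graph):
    """Check that nodes are well-typed and that every relation
    connects two components that actually exist in the graph."""
    ids = {node["id"] for node in graph["components"]}
    for node in graph["components"]:
        assert node["type"] in ARGUMENT_TYPES, node
    for rel in graph["relations"]:
        assert rel["type"] in RELATION_TYPES, rel
        assert rel["src"] in ids and rel["dst"] in ids, rel
    return True
```

A validator like this is also a natural building block for a verifiable reward in an RFT setting: a malformed graph can be rejected before any content-level scoring.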

Hanxu Hu: SFT Memorizes the Easy, RL Generalizes the Hard: Decoupled SFT-RL Training for General Reasoning
While Reinforcement Learning with Verifiable Rewards (RLVR) has driven remarkable success in mathematical reasoning, applying it directly to base models for general STEM reasoning often yields suboptimal results, frequently falling short of SFT. To address this, we propose DeReason, a difficulty-based decoupled training pipeline. We observe that general domains require broad knowledge acquisition (best suited to SFT on easy problems) alongside RL on hard problems to learn reasoning. Therefore, we partition training data by difficulty: we first apply SFT on a large set of easier, broader problems to build a knowledge foundation, followed by RL on a curated subset of difficult problems to push the reasoning frontier. Evaluations on general reasoning benchmarks (MMLU-Pro, GPQA-Diamond, SuperGPQA, and BBEH) demonstrate that our decoupled curriculum outperforms pure SFT, pure RL, and random SFT-then-RL baselines at 4B scale.
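
The two-stage schedule can be sketched in a few lines. This is a structural sketch only (function names and the difficulty scorer are hypothetical; DeReason's actual partitioning criterion is not specified here):

```python
def partition_by_difficulty(dataset, difficulty, threshold):
    """Split a dataset into easy and hard subsets by a difficulty score."""
    easy = [ex for ex in dataset if difficulty(ex) < threshold]
    hard = [ex for ex in dataset if difficulty(ex) >= threshold]
    return easy, hard

def decoupled_curriculum(model, dataset, difficulty, threshold,
                         sft_step, rl_step):
    """Decoupled schedule: SFT on the easy split to build a knowledge
    foundation, then RL on the hard split to push reasoning."""
    easy, hard = partition_by_difficulty(dataset, difficulty, threshold)
    model = sft_step(model, easy)  # stage 1: broad knowledge acquisition
    model = rl_step(model, hard)   # stage 2: reasoning on hard problems
    return model
```

The point of the sketch is the ordering constraint: unlike mixed or randomly interleaved SFT/RL, each stage sees only its own difficulty band.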

17 Mar 2026

Sophia Conrad: Can Small Models Perform Well in Multilingual Argument Mining?
Argument mining is the task of automatically identifying argumentative structures in text, such as claims, premises, and the relations between them. We investigate how well language models perform on two core argument mining tasks: argument component identification (major claim, claim, premise) and relation classification (support, attack).

Recent studies on argument mining typically use English data and large models to achieve good results. In contrast, our experiments are multilingual, using data in the four Swiss national languages: German, French, Italian, and (low-resource) Romansh. Specifically, we use texts from official Swiss federal voting brochures, which contain clearly structured pro and contra argument sections presented in parallel across languages. Using this dataset, we evaluate a range of open-weight language models of different parameter sizes, both with and without reasoning capabilities.

Kirill Semenov: Cross-Lingual Sentiment Analysis of International Conflicts in LLMs and their Training Data
Armed conflicts usually lead to polarized opinions within communities representing the conflict's sides. Therefore, a significant part of war-related events or personalities has two descriptions, each framed by the narrative of each side. This is best formulated in an expression: “One person’s terrorist is another person’s freedom fighter.” In conflicts where each side speaks its own language, we can expect that the distributions of sentiment toward the same events and personalities will differ greatly depending on the language used by a community. 

Will this sentiment difference be reflected in the performance of multilingual LLMs? We hypothesize that it will be observable: for example, a model, prompted in Russian and Ukrainian, will characterize the same war-related personality in a significantly different manner. In this presentation, we will discuss the experimental design aimed at measuring this difference. Specifically, we will address how to analyze the contributions of the main stages in an LLM training pipeline: pre-training corpora, pre-trained (base) models, alignment (instruction tuning), and custom post-training or optimization. The research is ongoing, so we will appreciate any suggestions and criticism, especially from the fields of political stance classification and multilingual political bias.

31 Mar 2026

Chuqiao Yan: Accessible Charts: Alt Text Generation for Screen Reader Users
Charts are used everywhere, and they are powerful tools for analyzing data, revealing patterns, and communicating insights. For sighted people, viewing a chart is intuitive. For people with visual impairments, however, charts can create a barrier that blocks access to important information when no other assistance is available. In practice, visually impaired users often rely on screen reader software such as Apple VoiceOver or JAWS. While these tools are effective for textual content, they cannot interpret non-textual elements such as charts and images. To address this issue, alternative text (alt text) can provide textual descriptions of visual content and improve accessibility. With recent advances in vision-language models, automatically generating alt text for charts has become feasible, reducing the time and effort required to write it manually. However, an important challenge remains: how to ensure the quality and usefulness of the generated descriptions. In this talk, I will first review existing methods and datasets for alt text generation. I will then present ongoing work on a tag-guided approach, together with some preliminary results. This opens further discussion on other possible solutions to improve alt text generation.
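
The general idea behind tag guidance, letting structured chart metadata steer the text that gets generated, can be illustrated with a toy template. The tag names and the template are hypothetical illustrations of the concept, not the method presented in the talk:

```python
def alt_text_from_tags(tags):
    """Toy tag-guided template: assemble a chart description from
    structured tags (tag names are illustrative only)."""
    parts = [f"{tags['chart_type']} chart of {tags['title']}"]
    if "x_axis" in tags and "y_axis" in tags:
        parts.append(f"showing {tags['y_axis']} by {tags['x_axis']}")
    if "trend" in tags:
        parts.append(f"overall trend: {tags['trend']}")
    return ", ".join(parts) + "."
```

In a vision-language-model setting, the same tags would condition the generation rather than fill a fixed template, but the quality question is identical: do the tags make the output more faithful and more useful to a screen reader user?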

Sina Ahmadi: Language Varieties as Geolects on a Geodesic Grid
Language identification is a key preprocessing step for building multilingual corpora, yet current approaches treat languages and varieties as discrete, clearly bounded categories. This obscures the continuous nature of linguistic variation: languages do not end where labels say they do, but fade into one another across space. I propose reframing variety identification as a geographically grounded task, representing varieties as geolects. Rather than classifying texts into predefined labels, I place them on a geodesic grid, where predictions are points on the earth's surface and variety distinctions emerge from spatial structure. I will discuss the approach to data collection and some preliminary experimental results.
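
Evaluating a geographic prediction differs from evaluating a label: the natural error measure is distance on the sphere, and a predicted point can be assigned to the nearest grid node. A minimal sketch (the grid here is a plain list of latitude/longitude nodes; an actual geodesic grid would be an icosahedral tessellation):

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points on the earth."""
    R = 6371.0  # mean earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * R * math.asin(math.sqrt(a))

def snap_to_grid(lat, lon, grid):
    """Assign a predicted point to the nearest node of the grid."""
    return min(grid, key=lambda node: haversine_km(lat, lon, *node))
```

With this framing, "how wrong" a variety prediction is becomes a number of kilometers rather than a binary label match, which is what lets spatial structure in the data carry the variety distinctions.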