Metric Space Magnitude for analysis of (L)LMs

Supervisors: Marius Huber, Juri Opitz, Michelle Wastl

Summary

Metric space magnitude is a geometric tool that, similarly to entropy, measures the diversity of a point cloud. In NLP, it has been used, e.g., to “fingerprint” LLMs through analysis of their latent spaces.
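For a finite point cloud, magnitude at scale t is commonly defined as the sum of all entries of the inverse of the similarity matrix Z, where Z_ij = exp(-t * d(x_i, x_j)). The following is a minimal sketch of that definition, assuming Euclidean distance and pairwise distinct points. The function name `magnitude` and the parameter `t` are illustrative and are not taken from the resource listed below.

```python
import numpy as np


def magnitude(points, t=1.0):
    """Magnitude of a finite point cloud at scale t (illustrative sketch).

    Assumes Euclidean distance and pairwise distinct points; duplicate
    points make the similarity matrix singular.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances, shape (n, n).
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diffs, axis=-1)
    # Similarity matrix Z_ij = exp(-t * d_ij).
    Z = np.exp(-t * dist)
    # The weighting w solves Z w = 1; magnitude is the sum of the weights,
    # which equals the sum of all entries of Z^{-1}.
    w = np.linalg.solve(Z, np.ones(len(points)))
    return float(w.sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cloud = rng.normal(size=(50, 8))  # stand-in for latent representations
    for scale in (0.01, 1.0, 100.0):
        print(scale, magnitude(cloud, t=scale))
```

At small t all points look alike and the magnitude approaches 1; at large t every point counts separately and the magnitude approaches the number of points. This is why magnitude is often read as an "effective number of points" at a given scale.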

The purpose of this project is to develop the necessary understanding of magnitude as a tool and, subsequently, to investigate its value for NLP by applying it to problems such as:

translation (direction) detection;
uniformity/alignment tradeoff in contrastive learning;
explainability of (L)LMs, e.g., effects of training and/or finetuning on the magnitude of latent spaces;
...

Resources:

Metric Space Magnitude for Evaluating the Diversity of Latent Representations

Requirements

Python, NLP, “Mathematical Foundations of Computational Linguistics 2” course or equivalent
