Machine Unlearning Embedding Model Biases
Supervisor(s): Andrianos Michail (contact point) & Dr. Simon Clematide
Summary
Research problem:
Embedding models implicitly learn undesirable biases through their training processes. One example of particular interest to us is over-reliance on surface-level (lexical) similarity.
RQ:
1. Which of these undesirable biases are present in current embedding models, and how can they be measured automatically?
2. Can embedding models (e.g., mGTE) be post-aligned via auxiliary objectives to unlearn these properties while preserving downstream performance?
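As a concrete starting point for RQ1, one automatic measurement is a contrastive probe: compare a model's similarity on lexically overlapping but semantically different pairs against lexically dissimilar paraphrase pairs. The sketch below is illustrative only: the pair data and the character-trigram "encoder" (a deliberately surface-level stand-in) are assumptions, not part of the proposal; in practice the encoder would be a real embedding model such as mGTE.

```python
# Illustrative sketch of an automatic lexical-bias probe (RQ1).
# The trigram encoder and all sentence pairs below are hypothetical
# stand-ins; swap in a real encoder (e.g., mGTE) for actual experiments.
import numpy as np

def embed_char_trigrams(text, dim=512):
    """Toy surface-level encoder: hashed bag of character trigrams."""
    vec = np.zeros(dim)
    padded = f"  {text.lower()}  "
    for i in range(len(padded) - 2):
        vec[hash(padded[i:i + 3]) % dim] += 1.0
    return vec

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def lexical_bias(embed, lexical_pairs, paraphrase_pairs):
    """Positive score: the model rates lexically similar but semantically
    different pairs above lexically dissimilar paraphrases."""
    mean_sim = lambda pairs: np.mean([cos(embed(a), embed(b)) for a, b in pairs])
    return mean_sim(lexical_pairs) - mean_sim(paraphrase_pairs)

# Lexically overlapping, semantically different (hypothetical examples):
lex = [("the bank of the river", "the bank approved the loan"),
       ("a light meal", "a light bulb")]
# Paraphrases with little lexical overlap (hypothetical examples):
para = [("the weather is terrible today", "it's awful outside right now"),
        ("he purchased a vehicle", "she bought a car")]

score = lexical_bias(embed_char_trigrams, lex, para)
print(f"lexical bias score: {score:.3f}")
```

A purely surface-level encoder yields a clearly positive score on such pairs; for RQ2, tracking how this score moves after post-alignment, alongside standard retrieval/STS benchmarks, would quantify the bias/performance trade-off.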
Expected outcome:
- Text embedding models with reduced biases
- Quantitative evaluation of the bias reduction
Suitable for an MA(CL/IFI) Thesis
Requirements
- Deep Learning
- Python/PyTorch