Visual Alignment for OCR Robust Text Embeddings
Supervisor(s): Andrianos Michail (contact point) & Dr. Juri Opitz & Dr. Simon Clematide
Summary
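OCR errors introduced during digitization significantly degrade semantic search performance in historical document collections. This project investigates whether aligning text embedding models with visual encoders makes them more robust to such errors.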
Research questions:
1. Can post-alignment to visual encoders (e.g., DeepSeekOCR) effectively mitigate OCR-induced performance degradation in historical text embeddings?
2. What is the optimal strategy for integrating this visual alignment approach with the denoising training methods we have previously developed?
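To make the idea of visual alignment concrete, here is a minimal, framework-free sketch of one possible alignment objective: the mean cosine distance between the embedding of an OCR'd text passage and the embedding of the corresponding page image. It is an illustration under assumptions, not the project's prescribed method. It assumes that both embeddings already share the same dimensionality (e.g., after a projection layer) and that the visual encoder is frozen. The function names `l2_normalize` and `cosine_alignment_loss` are hypothetical; an actual implementation would most likely be written in PyTorch.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_alignment_loss(text_embeddings, visual_embeddings):
    """Mean (1 - cosine similarity) over paired text/visual embeddings.

    Assumption: text_embeddings[i] and visual_embeddings[i] describe the
    same passage (OCR text vs. page image) and have equal dimensionality.
    """
    if len(text_embeddings) != len(visual_embeddings):
        raise ValueError("text and visual batches must have the same size")
    if not text_embeddings:
        raise ValueError("batches must not be empty")

    total = 0.0
    for text_vec, visual_vec in zip(text_embeddings, visual_embeddings):
        if len(text_vec) != len(visual_vec):
            raise ValueError("paired embeddings must have the same dimension")
        t_hat = l2_normalize(text_vec)
        v_hat = l2_normalize(visual_vec)
        cosine = sum(a * b for a, b in zip(t_hat, v_hat))
        total += 1.0 - cosine
    return total / len(text_embeddings)
```

The loss is 0 when every text embedding points in the same direction as its visual counterpart and grows as the two drift apart. How such a term would be weighted against a denoising objective is exactly what the second research question leaves open.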
Relevant Work: https://aclanthology.org/2025.findings-acl.609/
Expected outcome:
- Visually aligned text embedding models
- Quantitative evaluation of these models
- Qualitative examination of these models
Requirements
- Deep Learning
- Basics of Computer Vision
- Python/PyTorch