Visual Alignment for OCR Robust Text Embeddings

Supervisors: Andrianos Michail (contact point), Dr. Juri Opitz, and Dr. Simon Clematide

Summary

Semantic search performance in historical document collections is significantly reduced by OCR errors introduced during the digitization process.
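To make the problem concrete, the following is a minimal sketch of how OCR-style noise can lower the similarity between a clean passage and its digitized counterpart. It is only an illustration: the confusion table, the function names, and the character-trigram vectors (a crude stand-in for a learned text embedding) are all assumptions, not part of the project.

```python
import math
import random
from collections import Counter

# Hypothetical character confusions, loosely typical of OCR output (assumption).
CONFUSIONS = {"e": "c", "l": "1", "o": "0", "s": "f", "m": "rn"}


def add_ocr_noise(text, rate, seed=0):
    """Replace confusable characters with probability `rate` (seeded for repeatability)."""
    rng = random.Random(seed)
    out = []
    for ch in text:
        if ch in CONFUSIONS and rng.random() < rate:
            out.append(CONFUSIONS[ch])
        else:
            out.append(ch)
    return "".join(out)


def trigram_vector(text):
    """Bag of character trigrams; a toy substitute for a real embedding model."""
    padded = f"  {text.lower()}  "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


clean = "the council met to discuss the new railway"
noisy = add_ocr_noise(clean, rate=0.3)
similarity = cosine(trigram_vector(clean), trigram_vector(noisy))
print(noisy, round(similarity, 3))
```

In an actual experiment, the trigram vectors would be replaced by the embedding model under study and the synthetic noise by real OCR output.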

Research questions:
1. Can post-alignment to visual encoders (e.g., DeepSeekOCR) effectively mitigate OCR-induced performance degradation in historical text embeddings?

2. What is the best strategy for integrating this visual alignment approach with the denoising training methods that we have previously developed?
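One possible way to think about the second question is as a weighted combination of two objectives. The sketch below is a hypothetical formulation, not the method of the project or of the referenced paper: it assumes the visual embedding has already been projected into the same space as the text embeddings, uses cosine distance for both terms, and introduces an illustrative mixing weight `alpha`.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def combined_loss(noisy_text_emb, clean_text_emb, visual_emb, alpha=0.5):
    """Hypothetical objective mixing visual alignment and denoising.

    align:   pulls the embedding of the OCR-noisy text toward the visual
             encoder's embedding of the page image (assumed same dimension).
    denoise: pulls the embedding of the OCR-noisy text toward the embedding
             of the clean text.
    alpha:   illustrative weight; alpha=1 uses only alignment, alpha=0 only denoising.
    """
    align = cosine_distance(noisy_text_emb, visual_emb)
    denoise = cosine_distance(noisy_text_emb, clean_text_emb)
    return alpha * align + (1.0 - alpha) * denoise


# Toy vectors: the noisy text matches the clean text but is orthogonal to the image.
loss = combined_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], alpha=0.5)
print(loss)
```

Other integration strategies, such as applying the two objectives in separate training stages, would fit the same question; the weighted sum is only the simplest case to write down.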

Relevant Work: https://aclanthology.org/2025.findings-acl.609/

Expected outcome:

  • Visually aligned text embedding models
  • A quantitative evaluation of these models
  • A qualitative examination of their behavior

Requirements

  • Deep Learning
  • Basics of Computer Vision
  • Python/PyTorch