We are hiring: PhD student – Robust-to-noise Information Extraction (OCR & ASR)
Together with the University of La Rochelle, we are recruiting one PhD student. The position will be based at L3i, La Rochelle University (France), and carried out in collaboration with the University of Ljubljana, Faculty of Computer and Information Science.
Topic: Robust-to-noise Information Extraction (OCR & ASR)
Text produced by OCR and speech transcribed by ASR are often “noisy” due to document degradation or background interference. While these errors are usually studied separately, this PhD aims to develop a unified methodology to correct noise across both modalities. You will work at the intersection of Generative AI and Multimodal NLP to ensure data quality for downstream tasks.
Main objectives
Unified Modeling: Leveraging Transformers and LLMs to learn correction patterns across text and audio.
Hybrid Innovation: Combining machine learning with symbolic rules and domain-specific lexicons.
Advanced Evaluation: Designing new metrics that reflect the quality of information extraction beyond simple error rates.
Your profile
Master’s degree in Computer Science or a related field, with a strong background in NLP and deep learning and an interest in multimodal data (text, speech, and document images).
Deadline: 8 April 2026
All details are available at this link.
To apply, send an email to mickael.coustaty@univ-lr.fr and cyrille.suire@univ-lr.fr.
Documents to be submitted are:
– CV
– Motivation letter
– Reference letters