Research Activities
Focus Areas
Multimodal AI for document understanding, vision-language models, document retrieval, abstractive summarization, low-resource scene-text recognition, explainable multimodal systems, and privacy-aware learning.
Grants and Funding
- PI: DocInnovate (ACI JCJC 2024), 7,000 EUR
- Co-PI: FELM (EU-CONEXUS 2025-2027), 153,000 EUR
- Co-PI: FlexChain (2025-2027), 124,000 EUR
- Co-PI: ModamPadoc (RNA-LRU 2025-2027), 120,000 EUR
- Co-PI: AI for ECO-WELLNESS (PHC PERIDOT 2024-2026), 28,000 EUR
Collaborations
- CVC (UAB, Spain), KU Leuven (Belgium), L2TI (France)
- SETU (Ireland), Jožef Stefan Institute (Slovenia), USTH (Vietnam)
- Cambodia Academy of Digital Technology, MNS University (Pakistan)
- Industry: Allread ML Technologies (Spain), Siren Company (France)
Supervision and Mentoring
- Co-supervising PhD students on low-resource scene-text recognition and explainable multimodal document understanding
- Supervised M1/M2 and B3 projects across France, India, and Vietnam
Service and Leadership
- Organizer and moderator: Research Reading Group (L3i, 2023-present)
- Organizer: Seminars on deep multimodal learning
- Organizing committee: ICPR 2024 Call for Competitions; DAS 2022
- Peer review: journals (Pattern Recognition, IJCV, EAAI, Neurocomputing); conferences (ICDAR, ICPR, ICIP, LREC-COLING)
Invited Talks and Presentations
- LORIA, Nancy: Pareto-efficient multimodal document understanding
- L3i, La Rochelle: Semantic multimodal document representation learning (2022)
- CVC, Barcelona: Multimodal document understanding
Infrastructure and Tooling
- Managed a GPU cluster supporting large-scale multimodal AI experiments
- Experiment tracking with MLflow and TensorBoard; reproducible workflows with Docker and SLURM
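The workflow above can be sketched as a minimal SLURM batch script that runs training inside a pinned Docker image and logs to an MLflow server. All names here are illustrative placeholders (partition, image tag, tracking URI, and `train.py` are assumptions, not the actual cluster configuration):

```shell
#!/bin/bash
#SBATCH --job-name=multimodal-train    # hypothetical job name
#SBATCH --partition=gpu                # site-specific GPU partition
#SBATCH --gres=gpu:1                   # request one GPU
#SBATCH --time=24:00:00
#SBATCH --output=logs/%x-%j.out        # one log file per job id

# Point the MLflow client at the tracking server (URI is a placeholder)
export MLFLOW_TRACKING_URI=http://mlflow.example.org:5000

# Run training inside a versioned Docker image so the software
# environment is reproducible; image tag and entry point are placeholders.
docker run --rm --gpus all \
  -e MLFLOW_TRACKING_URI \
  -v "$PWD":/workspace -w /workspace \
  myorg/multimodal-train:1.0 \
  python train.py --config configs/base.yaml
```

On clusters where the Docker daemon is unavailable, Singularity/Apptainer is the usual drop-in substitute for the container step.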
