Research Activities
Focus Areas
Multimodal AI for document understanding, vision-language models, document retrieval, abstractive summarization, low-resource scene-text recognition, explainable multimodal systems, and privacy-aware learning.
Grants and Funding
- PI: DocInnovate (ACI JCJC 2024), 7,000 EUR
- Co-PI: FELM (EU-CONEXUS 2025-2027), 153,000 EUR
- Co-PI: FlexChain (2025-2027), 124,000 EUR
- Co-PI: ModamPadoc (RNA-LRU 2025-2027), 120,000 EUR
- Co-PI: AI for ECO-WELLNESS (PHC PERIDOT 2024-2026), 28,000 EUR
Collaborations
- CVC (UAB, Spain), KU Leuven (Belgium), L2TI (France)
- SETU (Ireland), Jožef Stefan Institute (Slovenia), USTH (Vietnam)
- Cambodia Academy of Digital Technology, MNS University (Pakistan)
- Industry: Allread ML Technologies (Spain), Siren Company (France)
Supervision and Mentoring
- Co-supervising PhD students on low-resource scene-text recognition and explainable multimodal document understanding
- Supervised M1/M2 and B3 projects across France, India, and Vietnam
Service and Leadership
- Organizer and moderator: Research Reading Group (L3i, 2023-present)
- Organizer: Seminars on deep multimodal learning
- Organizing committee: ICPR 2024 Call for Competitions; DAS 2022
- Peer review: journals (Pattern Recognition, IJCV, EAAI, Neurocomputing); conferences (ICDAR, ICPR, ICIP, LREC-COLING)
Invited Talks and Presentations
- LORIA, Nancy: Pareto-efficient multimodal document understanding
- L3i, La Rochelle: Semantic multimodal document representation learning (2022)
- CVC, Barcelona: Multimodal document understanding
Infrastructure and Tooling
- Managed a GPU cluster supporting large-scale multimodal AI experiments
- Experiment tracking with MLflow and TensorBoard; reproducible workflows with Docker and SLURM
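The workflow above can be sketched as a minimal SLURM batch script that runs training inside a pinned Docker image and logs to an MLflow server. All names here are illustrative placeholders (partition, image tag, tracking URI, and `train.py` are assumptions, not the actual cluster configuration):

```shell
#!/bin/bash
#SBATCH --job-name=multimodal-train    # hypothetical job name
#SBATCH --partition=gpu                # site-specific GPU partition
#SBATCH --gres=gpu:1                   # request one GPU
#SBATCH --time=24:00:00
#SBATCH --output=logs/%x-%j.out        # one log file per job id

# Point the MLflow client at the tracking server (URI is a placeholder)
export MLFLOW_TRACKING_URI=http://mlflow.example.org:5000

# Run training inside a versioned Docker image so the software
# environment is reproducible; image tag and entry point are placeholders.
docker run --rm --gpus all \
  -e MLFLOW_TRACKING_URI \
  -v "$PWD":/workspace -w /workspace \
  myorg/multimodal-train:1.0 \
  python train.py --config configs/base.yaml
```

On clusters where the Docker daemon is unavailable, Singularity/Apptainer is the usual drop-in substitute for the container step.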
