Research Activities

Focus Areas

Multimodal AI for document understanding, vision-language models, document retrieval, abstractive summarization, low-resource scene-text recognition, explainable multimodal systems, and privacy-aware learning.

Grants and Funding

  • PI: DocInnovate (ACI JCJC 2024), 7,000 EUR
  • Co-PI: FELM (EU-CONEXUS 2025-2027), 153,000 EUR
  • Co-PI: FlexChain (2025-2027), 124,000 EUR
  • Co-PI: ModamPadoc (RNA-LRU 2025-2027), 120,000 EUR
  • Co-PI: AI for ECO-WELLNESS (PHC PERIDOT 2024-2026), 28,000 EUR

Collaborations

  • CVC (UAB, Spain), KU Leuven (Belgium), L2TI (France)
  • SETU (Ireland), Jožef Stefan Institute (Slovenia), USTH (Vietnam)
  • Cambodia Academy of Digital Technology, MNS University (Pakistan)
  • Industry: Allread ML Technologies (Spain), Siren Company (France)

Supervision and Mentoring

  • Co-supervising PhD students on low-resource scene-text recognition and explainable multimodal document understanding
  • Supervised M1/M2 and B3 projects across France, India, and Vietnam

Service and Leadership

  • Organizer and moderator: Research Reading Group (L3i, 2023-present)
  • Organizer: Seminars on deep multimodal learning
  • Organizing committee: ICPR 2024 Call for Competitions; DAS 2022
  • Peer review: journals (PR, IJCV, EAAI, Neurocomputing); conferences (ICDAR, ICPR, ICIP, LREC-COLING)

Invited Talks and Presentations

  • LORIA-Lab, Nancy: Pareto-efficient multimodal document understanding
  • L3i, La Rochelle: Semantic multimodal document representation learning (2022)
  • CVC, Barcelona: Multimodal document understanding

Infrastructure and Tooling

  • Managed a GPU cluster supporting large-scale multimodal AI experiments
  • Experiment tracking with MLflow and TensorBoard; reproducible workflows with Docker and SLURM