GlobalDoc: A Cross-Modal Vision-Language Framework for Real-World Document Image Retrieval and Classification

Published in the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2025

Recommended citation: S. Bakkali, S. Biswas, Z. Ming, M. Coustaty, M. Rusinol, O. Ramos Terrades, J. Llados. "GlobalDoc: A Cross-Modal Vision-Language Framework for Real-World Document Image Retrieval and Classification." WACV 2025, pp. 1436-1446. [Paper]

A cross-modal vision-language framework for robust document image retrieval and classification in real-world settings.