VLCDoC: Vision-Language Contrastive Pre-training Model for Cross-Modal Document Classification

Published in Pattern Recognition, 2023

Recommended citation: S. Bakkali, Z. Ming, M. Coustaty, M. Rusiñol, O. Ramos Terrades. "VLCDoC: Vision-Language Contrastive Pre-training Model for Cross-Modal Document Classification." Pattern Recognition 139:109419, 2023. [Paper]

Vision-language contrastive pre-training for robust cross-modal document classification.