VLCDoC: Vision-Language Contrastive Pre-training Model for Cross-Modal Document Classification
Published in Pattern Recognition, 2023
Recommended citation: S. Bakkali, Z. Ming, M. Coustaty, M. Rusiñol, O. Ramos Terrades. "VLCDoC: Vision-Language Contrastive Pre-training Model for Cross-Modal Document Classification." Pattern Recognition 139:109419, 2023.
Vision-language contrastive pre-training for robust cross-modal document classification.
