PreFLMR: SoTA Open-sourced Multi-modal Knowledge Retriever from Scaling Up FLMR

[1,087 words, 5-minute read] Three products emerged from our study on scaling up multi-modal late-interaction retrievers (sketched below):

* The Multi-task Multi-modal Knowledge Retrieval benchmark (M2KR), totaling 4.4M examples for training and comprehensively evaluating knowledge retrievers on question-to-doc, image-to-doc, and question+image-to-doc tasks.
* The Pretrained Fine-grained Late-interaction Multi-modal Retriever (PreFLMR)
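For context, a late-interaction retriever scores a query against a document at the token level: each query token embedding is matched against every document token embedding, and the per-token maxima are summed (the ColBERT-style MaxSim operator that FLMR builds on). Below is a minimal sketch of that scoring step; the tensor shapes and random embeddings are illustrative only, standing in for the outputs of PreFLMR's actual text and vision encoders.

```python
import torch
import torch.nn.functional as F

def late_interaction_score(q_emb: torch.Tensor, d_emb: torch.Tensor) -> torch.Tensor:
    """ColBERT-style MaxSim: for each query token, take the max
    similarity to any document token, then sum over query tokens.

    q_emb: (num_query_tokens, dim) L2-normalized query token embeddings
    d_emb: (num_doc_tokens, dim)   L2-normalized document token embeddings
    """
    # (num_query_tokens, num_doc_tokens) token-level similarity matrix
    sim = q_emb @ d_emb.T
    # Best-matching document token per query token, summed into one score.
    return sim.max(dim=1).values.sum()

# Toy example with random embeddings (shapes chosen for illustration).
q = F.normalize(torch.randn(32, 128), dim=-1)
d = F.normalize(torch.randn(200, 128), dim=-1)
print(late_interaction_score(q, d))
```

Because the interaction happens only at scoring time, document token embeddings can be precomputed and indexed offline, which is what makes this design practical at retrieval scale.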
Jinghong Chen