Multimodal-Data

A curated list of work on multimodal data management and databases.

Kindly let us know if we have missed any great papers. Thank you!

0. Survey and Tutorial

Survey

Recent advances and trends in multimodal deep learning: a review(arXiv, 24 May 2021)-YZY
Retrieving Multimodal Information for Augmented Generation: A Survey(EMNLP2023)-YZY
Multi-Modal Hashing for Efficient Multimedia Retrieval: A Survey(TKDE2023)-YZY
Multimodal machine learning: A survey and taxonomy(TPAMI2018)-YZY
How to Bridge the Gap between Modalities: A Comprehensive Survey on Multi-modal Large Language Model(arXiv2023)-YZY
A Survey of Multimodal Large Language Model from A Data-centric Perspective(arXiv, May 2024)-YZY
Large-scale Multi-modal Pre-trained Models: A Comprehensive Survey-YZY
Data Management For Large Language Models: A Survey-YZY

Tutorial

Tutorial on MultiModal Machine Learning(ICML2023)-YZY

1. Multimodal Learning

Multimodal deep learning(ICML2011)-YZY
What Makes Multi-modal Learning Better than Single (Provably)(NeurIPS2021)-YZY
Getting Aligned on Representational Alignment-YZY
Provable Dynamic Fusion for Low-Quality Multimodal Data(ICML2023)-YZY
Towards Semantic Consistency: Dirichlet Energy Driven Robust Multi-Modal Entity Alignment(ICDE2024)-YZY
Multimodal Foundation Models: From Specialists to General-Purpose Assistants-YZY
Learning Transferable Visual Models From Natural Language Supervision (CLIP)(ICML2021)-ZM
Attention Is All You Need (Transformer)(NeurIPS)-ZM
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (Vision Transformer, ViT)-ZM
Deep Residual Learning for Image Recognition (ResNet)(CVPR2016)-ZM

2. Multimodal Retrieval

MUST: An Effective and Scalable Framework for Multimodal Search of Target Modality(ICDE2024)-YZY
Efficient and Effective Multi-Modal Queries through Heterogeneous Network Embedding(TKDE2021)-YZY, ZM
Symphony: Towards natural language query answering over multi-modal data lakes(CIDR2023)-YZY
CAESURA: Language Models as Multi-Modal Query Planners(CIDR2024)-YZY
Scalable Deep Multimodal Learning for Cross-Modal Retrieval(SIGIR2019)-YZY
Learned Data-aware Image Representations of Line Charts for Similarity Search(SIGMOD2023)-YZY
Multimodal Graph Learning for Cross-Modal Retrieval(SDM2023)-YZY
End-to-end Knowledge Retrieval with Multi-modal Queries(ACL2023)-YZY
Multimodal Chart Retrieval: A Comparison of Text, Table and Image Based Approaches(ACL2023)-YZY
MAKE: Vision-Language Pre-training based Product Retrieval in Taobao Search(WWW2023)-ZM

Benchmark

Rethinking Benchmarks for Cross-modal Image-text Retrieval(SIGIR2023)-ZM
COLA: A Benchmark for Compositional Text-to-image Retrieval(NeurIPS2023)-ZM
CoVR: Learning Composed Video Retrieval from Web Video Captions(AAAI2024)-ZM
MR2: A Benchmark for Multimodal Retrieval-Augmented Rumor Detection in Social Media(SIGIR2023)-ZM

3. Tabular Data

4. Data Centric

5. Augmentations for multimodal data

Cross-Modal Attribute Insertions for Assessing the Robustness of Vision-and-Language Learning(ACL2023)-ZM
