A collection of works on multimodal data management and databases.
Kindly let us know if we have missed any great papers. Thank you!
- Survey and Tutorial
- Multimodal Learning
- Multimodal Retrieval
- Tabular Data
- Data Centric
- Augmentations for multimodal data
- Recent advances and trends in multimodal deep learning: a review(arXiv.org 24 May 2021)-YZY
- Retrieving Multimodal Information for Augmented Generation: A Survey(EMNLP2023)-YZY
- Multi-Modal Hashing for Efficient Multimedia Retrieval: A Survey(TKDE2023)-YZY
- Multimodal machine learning: A survey and taxonomy(TPAMI2018)-YZY
- How to Bridge the Gap between Modalities: A Comprehensive Survey on Multi-modal Large Language Model(arXiv2023)-YZY
- A Survey of Multimodal Large Language Model from A Data-centric Perspective(arXiv May2024)-YZY
- Large-scale Multi-modal Pre-trained Models: A Comprehensive Survey-YZY
- Data Management For Large Language Models: A Survey-YZY
- Tutorial on MultiModal Machine Learning(ICML2023)-YZY
- Multimodal deep learning(ICML2011)-YZY
- What Makes Multi-modal Learning Better than Single (Provably)(NeurIPS2021)-YZY
- Getting Aligned on Representational Alignment-YZY
- Provable Dynamic Fusion for Low-Quality Multimodal Data(ICML2023)-YZY
- Towards Semantic Consistency: Dirichlet Energy Driven Robust Multi-Modal Entity Alignment(ICDE2024)-YZY
- Multimodal Foundation Models: From Specialists to General-Purpose Assistants-YZY
- Learning Transferable Visual Models From Natural Language Supervision (CLIP)(ICML2021)-ZM
- Attention Is All You Need (Transformer)(NeurIPS)-ZM
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (Vision Transformer, ViT)-ZM
- Deep Residual Learning for Image Recognition (ResNet)(CVPR2016)-ZM
- MUST: An Effective and Scalable Framework for Multimodal Search of Target Modality(ICDE2024)-YZY
- Efficient and Effective Multi-Modal Queries through Heterogeneous Network Embedding(TKDE2021)-YZY, ZM
- Symphony: Towards natural language query answering over multi-modal data lakes(CIDR2023)-YZY
- CAESURA: Language Models as Multi-Modal Query Planners(CIDR2024)-YZY
- Scalable Deep Multimodal Learning for Cross-Modal Retrieval(SIGIR2019)-YZY
- Learned Data-aware Image Representations of Line Charts for Similarity Search(SIGMOD2023)-YZY
- Multimodal Graph Learning for Cross-Modal Retrieval(SDM2023)-YZY
- End-to-end Knowledge Retrieval with Multi-modal Queries(ACL2023)-YZY
- Multimodal Chart Retrieval: A Comparison of Text, Table and Image Based Approaches(ACL2023)-YZY
- MAKE: Vision-Language Pre-training based Product Retrieval in Taobao Search(WWW2023)-ZM
- Rethinking Benchmarks for Cross-modal Image-text Retrieval(SIGIR2023)-ZM
- COLA: A Benchmark for Compositional Text-to-image Retrieval(NeurIPS2023)-ZM
- CoVR: Learning Composed Video Retrieval from Web Video Captions(AAAI2024)-ZM
- MR2: A Benchmark for Multimodal Retrieval-Augmented Rumor Detection in Social Media(SIGIR2023)-ZM
- TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data(ACL2020)-YZY
- Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training-YZY
- How Large Language Models Will Disrupt Data Management(VLDB2023)-YZY
- Beyond Scale: the Diversity Coefficient as a Data Quality Metric Demonstrates LLMs are Pre-trained on Formally Diverse Data(ICML2023)-YZY
- From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning(NAACL2024)-YZY
- Adaptive data augmentation for supervised learning over missing data(VLDB2021)-YZY
- Data Collection and Quality Challenges in Deep Learning: A Data-Centric AI Perspective-YZY