This repository contains the papers, datasets, and resources associated with the systematic review titled "Data-centric Approaches to Boosting MLLMs (Multimodal Large Language Models): A Systematic Review." The review focuses on exploring various data-centric strategies employed to enhance the performance, robustness, and safety of MLLMs across different domains.
Multimodal Large Language Models (MLLMs) are emerging as powerful tools for handling complex tasks that involve both visual and textual information. However, their success heavily depends on the quality and diversity of the data used during training and fine-tuning. This repository collects papers and resources related to data-centric approaches that focus on optimizing, curating, and utilizing data to improve MLLM performance in various applications such as vision-language understanding, visual question answering, and safety-critical tasks.
Our goal is to systematically review and categorize recent works that highlight data-centric methodologies for:
- Improving model generalization
- Enhancing multimodal alignment
- Ensuring safety and fairness
- Handling data noise and imbalance
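As a concrete illustration of the last point, one common data-centric technique is to counter category imbalance with inverse-frequency sampling weights, so rare example types are drawn as often as common ones during training. The sketch below is hypothetical (the function name and the category labels are assumptions, not from any reviewed paper):

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Compute a per-example sampling weight so that rare
    categories are drawn roughly as often as common ones."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]

# Toy example: "photo" dominates, so each photo gets a smaller weight.
labels = ["chart", "photo", "photo", "photo", "diagram"]
weights = inverse_frequency_weights(labels)
# weights -> [1.0, 1/3, 1/3, 1/3, 1.0]
```

These weights can then be fed to a weighted sampler (e.g. PyTorch's `WeightedRandomSampler`) when building training batches.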
```
📁 Data-centric-Approaches-MLLMs
│
├── 📁 papers/        # Folder containing the key papers reviewed
│   ├── paper1.pdf
│   ├── paper2.pdf
│   └── ...
│
├── 📁 datasets/      # Links and references to relevant datasets
│   ├── dataset1_info.md
│   ├── dataset2_info.md
│   └── ...
│
├── 📁 scripts/       # Useful scripts for processing datasets or experiments
│   └── process_data.py
│
└── README.md         # You are here!
```
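A typical first step in a dataset-processing script such as the `scripts/process_data.py` entry above is removing exact-duplicate image-caption pairs before training. The following is only a minimal sketch of that idea (the function name and sample data are assumptions, not the repository's actual script):

```python
import hashlib

def deduplicate_pairs(pairs):
    """Drop exact-duplicate (image_path, caption) pairs,
    keeping the first occurrence of each."""
    seen = set()
    unique = []
    for image_path, caption in pairs:
        # Hash the pair so the seen-set stays small even for long captions.
        key = hashlib.sha256(f"{image_path}|{caption}".encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append((image_path, caption))
    return unique

pairs = [("a.jpg", "a cat"), ("a.jpg", "a cat"), ("b.jpg", "a dog")]
cleaned = deduplicate_pairs(pairs)
# cleaned -> [("a.jpg", "a cat"), ("b.jpg", "a dog")]
```

Near-duplicate detection (e.g. embedding similarity) is a common follow-up, but exact matching alone already removes a surprising amount of web-scraped redundancy.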
Here is a selection of key papers reviewed in this repository:
- **Paper Title 1**
  Authors: A. Author, B. Author
  Summary: A comprehensive overview of data-centric techniques for improving vision-language alignment in MLLMs.
  Link to paper

- **Paper Title 2**
  Authors: C. Author, D. Author
  Summary: This paper focuses on noise-robust training methodologies for multimodal models by curating high-quality datasets.
  Link to paper

- **Paper Title 3**
  Authors: E. Author, F. Author
  Summary: A study on the impact of fine-tuning with curated safety datasets to prevent harmful content generation in MLLMs.
  Link to paper
For a full list of papers, see the `papers/` directory.
We review and compile several datasets relevant to data-centric MLLM training and evaluation. Some examples include:
- **Dataset Name 1**
  Description: A large-scale multimodal dataset for image-captioning tasks.
  Link to dataset

- **Dataset Name 2**
  Description: A benchmark dataset for evaluating safety and fairness in vision-language models.
  Link to dataset
Refer to the `datasets/` directory for detailed descriptions and links.
Contributions are welcome! If you have relevant papers, datasets, or scripts that align with the theme of this repository, feel free to open a pull request or raise an issue.
Please make sure to follow these guidelines when contributing:
1. Fork the repository.
2. Create a new branch for your changes.
3. Ensure your contribution is relevant to data-centric approaches in MLLMs.
4. Submit a pull request for review.
This repository is licensed under the MIT License. See the LICENSE file for more details.