[ICLR 2025] This repo is the official implementation of "The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs".

The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs

ICLR 2025

[Figure: LLM_Inception teaser]

Get Started

1. Clone the repository:

   ```shell
   git clone https://github.com/lihong2303/LLM_Inception.git
   cd LLM_Inception
   ```
2. Create the conda environment and install dependencies:

   ```shell
   conda create -n llm_inception python=3.10
   conda activate llm_inception

   # install PyTorch (our tested version shown as an example)
   conda install pytorch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 pytorch-cuda=11.8 -c pytorch -c nvidia

   pip install -r requirements.txt
   ```
3. Evaluate single-step association:

   ```shell
   python eval_singlestep.py \
       --data_root Data \
       --data_type pangea_data \
       --model_type "mplug3" \
       --prompt_type "task_instruction_nomem" \
       --attr_constraint "cut" \
       --expt_dir "logs" \
       --few_shot_num 3
   ```
4. Evaluate multi-step association:

   ```shell
   python eval_multistep.py \
       --data_root Data \
       --data_type ocl_attr_data \
       --model_type "llava-onevision" \
       --prompt_type "task_instruction" \
       --attr_constraint "furry,metal" \
       --expt_dir "logs" \
       --few_shot_num 3
   ```
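The two evaluation commands above differ only in the script name and flag values, so a small wrapper script can sweep settings across both. A minimal POSIX-shell sketch; the few-shot values swept and the per-run `logs/fs${k}` output directories are illustrative assumptions, not part of the repo:

```shell
#!/bin/sh
# Sketch: dry-run both evaluation scripts across a few few-shot settings.
# "echo" prints each command instead of executing it; remove the echo
# (and install the dependencies above) to actually run the evaluations.
for k in 0 1 3; do
  echo python eval_singlestep.py \
      --data_root Data --data_type pangea_data \
      --model_type "mplug3" --prompt_type "task_instruction_nomem" \
      --attr_constraint "cut" --expt_dir "logs/fs${k}" \
      --few_shot_num "$k"
  echo python eval_multistep.py \
      --data_root Data --data_type ocl_attr_data \
      --model_type "llava-onevision" --prompt_type "task_instruction" \
      --attr_constraint "furry,metal" --expt_dir "logs/fs${k}" \
      --few_shot_num "$k"
done
```

Keeping one directory per few-shot setting under `--expt_dir` makes it easy to compare logs across runs without overwriting earlier results.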

Dataset

We reconstruct two association datasets built on adjective and verb concepts. For details on downloading the datasets and their structure, please refer to Data.

Reference

```bibtex
@article{li2024labyrinth,
  title={The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs},
  author={Li, Hong and Li, Nanxi and Chen, Yuanjie and Zhu, Jianbin and Guo, Qinlu and Lu, Cewu and Li, Yong-Lu},
  journal={arXiv preprint arXiv:2410.01417},
  year={2024}
}
```

Acknowledgement

We extend our gratitude to prior outstanding work in object concept learning, particularly OCL and Pangea, which serves as the foundation for our research.
