
Training-Free Personalization via Retrieval and Reasoning on Fingerprints

ICCV 2025 arXiv Project Page

Training-Free Personalization via Retrieval and Reasoning on Fingerprints
Deepayan Das, Davide Talon, Yiming Wang, Massimiliano Mancini, Elisa Ricci
International Conference on Computer Vision (ICCV) 2025

📋 Abstract

This repository contains the official implementation of R2P (Retrieval and Reasoning for Personalization), a novel training-free framework that enables personalization using only pre-trained Vision-Language Models (VLMs). Our approach demonstrates for the first time that training-free personalization is feasible by leveraging retrieval and reasoning mechanisms on fingerprints.

🚀 Key Features

  • Training-Free: No fine-tuning required; works with pre-trained VLMs out of the box
  • Novel R2P Framework: Combines retrieval and reasoning for personalization (see the sketch below)
  • Fingerprint-Based: Uses fingerprints as the basis for personal concept inference
  • Efficient: Minimal computational overhead compared to training-based methods
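
At a high level, R2P first retrieves the stored concepts closest to a query image and then asks a VLM to verify each candidate against its fingerprint. The following is a minimal illustrative sketch of that retrieve-then-reason loop; ConceptEntry, personalize, and vlm_verify are hypothetical placeholders, not this repo's API.

# Illustrative retrieve-then-reason sketch (placeholder names, not the repo's API)
from dataclasses import dataclass
from typing import List

@dataclass
class ConceptEntry:
    name: str                 # e.g. "dog_max"
    fingerprint: List[str]    # attributes that distinguish this concept
    embedding: List[float]    # embedding of a reference image

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = (sum(a * a for a in u) ** 0.5) * (sum(b * b for b in v) ** 0.5)
    return num / den if den else 0.0

def personalize(query_emb, database, vlm_verify, k=3):
    """Retrieve the k closest concepts, then ask the VLM to check the query
    against each candidate's fingerprint; return the first accepted name."""
    ranked = sorted(database, key=lambda e: cosine(query_emb, e.embedding), reverse=True)
    for entry in ranked[:k]:               # k plays the role of --refined_k
        if vlm_verify(entry.fingerprint):  # reasoning step (e.g., with MiniCPM)
            return entry.name
    return None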

πŸ“ Repository Structure

├── LICENSE
├── README.md
├── requirements.txt       # Python dependencies
├── data/                  # Raw datasets (YoLLaVA, MyVLM, PerVA)
├── eval_files/            # Generated evaluation files
├── example_database/      # Example database for quick testing
├── scripts/               # Utility scripts
└── src/                   # Source code
    ├── main.py            # Main evaluation script
    ├── detector.py        # Detection components
    ├── retriever.py       # Retrieval system
    ├── database/          # Database creation and management
    │   ├── create_db.py
    │   ├── create_train_test_split.py
    │   ├── create_train_test_perva_split.py
    │   ├── create_ret_files.py
    │   └── mini_cpm_info.py
    ├── models/            # Model implementations
    │   ├── model_interface.py
    │   ├── model_adapters.py
    │   ├── prompt_generator.py
    │   ├── mini_cpm_reasoning.py
    │   ├── internvl_reasoning.py
    │   └── qwen_reasoning.py
    ├── evaluators/        # Evaluation metrics
    │   ├── compute_accuracy.py
    │   └── compute_confidence.py
    └── utils/             # Utility functions
        ├── defined.py
        ├── helpers.py
        └── generate_report.py

πŸ› οΈ Installation

Prerequisites

  • Python >= 3.8
  • CUDA >= 11.0 (for GPU support)
  • 16GB+ RAM recommended

Environment Setup

  1. Clone the repository

    git clone https://github.com/Deepayan137/R2P
    cd R2P
  2. Create virtual environment

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies

    pip install -r requirements.txt

Verify Installation

python -c "import torch; print(f'PyTorch version: {torch.__version__}')"
python -c "import transformers; print('Transformers installed successfully')"

📊 Dataset Setup

Download Datasets

Our experiments use three datasets for personalization evaluation:

  1. YoLLaVA Dataset
  2. MyVLM Dataset
  3. PerVA Dataset

For YoLLaVA and MyVLM, please download the data used in the paper from the official MyVLM and Yo'LLaVA repositories.

For PerVA, download the dataset from here.

Dataset Structure

After downloading, your data directory should contain:

data/
├── data-yollava/              # YoLLaVA dataset
├── data-myvlm/                # MyVLM dataset
└── data-perva/                # PerVA dataset
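
Before moving on, it can help to confirm this layout is in place. A tiny check, assuming the folder names from the tree above:

from pathlib import Path

# Report which expected dataset folders are present under data/
for d in ["data/data-yollava", "data/data-myvlm", "data/data-perva"]:
    print(f"{d}: {'found' if Path(d).is_dir() else 'MISSING'}")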

🚀 Setup and Preparation

Step 1: Create Train-Test Splits

Before running the main evaluation, you need to generate train-test splits for each dataset:

For YoLLaVA and MyVLM datasets:

# Generate train-test split for MyVLM
python src/database/create_train_test_split.py --dataset MyVLM --seed 23

# Generate train-test split for YoLLaVA  
python src/database/create_train_test_split.py --dataset YoLLaVA --seed 23

For PerVA dataset:

# PerVA uses a different split creation script
python src/database/create_train_test_perva_split.py --seed 23
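
Conceptually, a seeded split deterministically shuffles each concept's images and holds a few out for testing, which is what makes the --seed flag reproducible. The sketch below only illustrates that idea; split_concept and its n_test default are hypothetical, and the actual scripts may split differently.

import random
from pathlib import Path

def split_concept(image_dir, seed=23, n_test=2):
    """Shuffle one concept's images with a fixed seed and hold out n_test for testing."""
    images = sorted(Path(image_dir).glob("*.jpg"))
    rng = random.Random(seed)                  # fixed seed -> reproducible split
    rng.shuffle(images)
    return images[n_test:], images[:n_test]    # (train, test)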

Step 2: Create Database for Retrieval

Create the retrieval database for each dataset:

# Create database for MyVLM
python src/database/create_db.py --dataset MyVLM --seed 23 --user_defined

# Create database for YoLLaVA
python src/database/create_db.py --dataset YoLLaVA --seed 23 --user_defined

# Create database for PerVA
python src/database/create_db.py --dataset PerVA --seed 23 --user_defined
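
Conceptually, the retrieval database maps each concept to embeddings of its reference images, so a query image can later be matched against them. The sketch below illustrates this with an off-the-shelf CLIP encoder from transformers; the encoder choice, directory layout, and storage format are assumptions, not necessarily what create_db.py does.

import torch
from pathlib import Path
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

database = {}
for concept_dir in Path("data/data-yollava/train").iterdir():  # hypothetical split path
    if not concept_dir.is_dir():
        continue
    embeddings = []
    for img_path in concept_dir.glob("*.jpg"):
        inputs = processor(images=Image.open(img_path), return_tensors="pt")
        with torch.no_grad():
            embeddings.append(model.get_image_features(**inputs))
    if embeddings:
        # One mean reference embedding per concept
        database[concept_dir.name] = torch.cat(embeddings).mean(0)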

Step 3: Generate Evaluation Files

Create the evaluation files needed for testing:

# Generate evaluation files for YoLLaVA
python src/database/create_eval_files.py --dataset YoLLaVA --seed 23

# Generate evaluation files for MyVLM
python src/database/create_eval_files.py --dataset MyVLM --seed 23

# Generate evaluation files for PerVA
python src/database/create_eval_files.py --dataset PerVA --seed 23

πŸƒβ€β™‚οΈ Running Experiments

Main Evaluation Command

After completing the setup steps, run the main evaluation with the following command structure:

python src/main.py --concept_name $CONCEPT_NAME --category_name $CATEGORY_NAME \
--seed $SEED --data_name $DATA_NAME --rerank_early --refined_k 3 \
--attribute_based_step_by_step --two_step --input_type text --only_recall --model_name $MODEL_NAME

Dataset-Specific Usage

For YoLLaVA and MyVLM datasets:

# Example: Run on MyVLM dataset
python src/main.py --concept_name "person_john" --category_name "all" \
--seed 23 --data_name "MyVLM" --rerank_early --refined_k 3 \
--attribute_based_step_by_step --two_step --input_type text --only_recall --model_name mini_cpm

# Example: Run on YoLLaVA dataset  
python src/main.py --concept_name "dog_max" --category_name "all" \
--seed 23 --data_name "YoLLaVA" --rerank_early --refined_k 3 \
--attribute_based_step_by_step --two_step --input_type text --only_recall --model_name mini_cpm

For PerVA dataset:

# Example: Run on PerVA dataset with specific category
python src/main.py --concept_name "my_bag" --category_name "bag" \
--seed 23 --data_name "PerVA" --rerank_early --refined_k 3 \
--attribute_based_step_by_step --two_step --input_type text --only_recall --model_name mini_cpm

Available Categories

YoLLaVA and MyVLM: Use "all" as category_name

PerVA: Choose from the following categories:

  • bag, book, bottle, bowl, clothe, cup, decoration, headphone, pillow, plant, plate, remote, retail, telephone, tie, towel, toy, tro_bag, tumbler, umbrella, veg

Available Models

The framework supports multiple VLM models:

  • mini_cpm: Mini-CPM model (default)
  • internvl: InternVL model
  • qwen: Qwen model

Note: Please follow the instructions for installing dependencies related to qwen and internvl from their respective Hugging Face repositories. The codebase has not been extensively tested with internvl and qwen.

Key Parameters

  • --concept_name: Name of the personalized concept (e.g., "person_john", "my_bag")
  • --category_name: Category/superclass of the concept ("all" for YoLLaVA/MyVLM, specific category for PerVA)
  • --seed: Random seed for reproducibility (recommended: 23)
  • --data_name: Dataset name (YoLLaVA, MyVLM, or PerVA)
  • --refined_k: Number of top retrievals to consider (default: 3)
  • --model_name: VLM model to use for reasoning
  • --rerank_early: Enable early reranking
  • --attribute_based_step_by_step: Use attribute-based reasoning
  • --two_step: Enable two-step reasoning process
  • --input_type text: Use text input type
  • --only_recall: Focus on recall evaluation
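
If you prefer driving the evaluation from Python instead of the shell, a thin wrapper over the command above could look like the following; the concept list is a placeholder, and the flags simply mirror the documented ones.

import subprocess

# Placeholder concept names; substitute the concepts in your dataset
CONCEPTS = ["dog_max", "cat_luna"]

for concept in CONCEPTS:
    subprocess.run([
        "python", "src/main.py",
        "--concept_name", concept,
        "--category_name", "all",      # "all" for YoLLaVA/MyVLM
        "--seed", "23",
        "--data_name", "YoLLaVA",
        "--rerank_early",
        "--refined_k", "3",
        "--attribute_based_step_by_step",
        "--two_step",
        "--input_type", "text",
        "--only_recall",
        "--model_name", "mini_cpm",
    ], check=True)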

Batch Processing with SLURM

If you have access to a SLURM cluster, you can use the provided job array scripts for efficient batch processing:

# Submit YoLLaVA evaluation jobs
sbatch scripts/test_yollava.sh 23

# Submit MyVLM evaluation jobs  
sbatch scripts/test_myvlm.sh 23

# Submit PerVA evaluation jobs
sbatch scripts/test_perva.sh 23

Batch Processing without SLURM

If SLURM is not available, you can use bash loop scripts for sequential processing:

# Make scripts executable
chmod +x scripts/run_yollava_loop.sh
chmod +x scripts/run_myvlm_loop.sh  
chmod +x scripts/run_perva_loop.sh

# Run YoLLaVA evaluation
./scripts/run_yollava_loop.sh 23

# Run MyVLM evaluation  
./scripts/run_myvlm_loop.sh 23

# Run PerVA evaluation
./scripts/run_perva_loop.sh 23

Features of bash loop scripts:

  • Sequential processing of all concepts in a dataset
  • Individual log files for each concept (stored in logs/ directory)
  • Progress tracking and error handling
  • Same functionality as SLURM scripts but without job scheduling

Note: Bash loop scripts process concepts sequentially, which may take longer than SLURM parallel processing.

Parameters:

  • First argument: seed value (e.g., 23)

Manual Single-Concept Evaluation

For testing individual concepts or when SLURM is not available, use the manual commands shown above.

📊 Results and Report Generation

Output Structure

After running main.py, results are stored in the following structure:

outputs/
└── Mini_CPM_YoLLaVA_seed_23/          # Format: {MODEL_NAME}_{DATASET}_seed_{SEED}
    ├── concept1.json                  # Individual concept results
    ├── concept2.json
    └── ...

Each concept JSON file contains:

  • Recall values: Caption recall metrics
  • Recognition accuracy: Classification performance metrics
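
Individual result files can be inspected directly, for example with the small sketch below (the exact JSON keys depend on the codebase):

import json
from pathlib import Path

# Print the raw metrics stored for each concept
out_dir = Path("outputs/Mini_CPM_YoLLaVA_seed_23")
for f in sorted(out_dir.glob("*.json")):
    with open(f) as fh:
        print(f.stem, json.load(fh))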

Generate Aggregate Reports

To calculate aggregate metrics across all concepts in a dataset:

For Recall metrics:

# YoLLaVA dataset
python src/utils/generate_report.py --model_name mini_cpm --dataset YoLLaVA --seed 23 --name recall_R2P

# MyVLM dataset  
python src/utils/generate_report.py --model_name mini_cpm --dataset MyVLM --seed 23 --name recall_R2P

# PerVA dataset
python src/utils/generate_report.py --model_name mini_cpm --dataset PerVA --seed 23 --name recall_R2P

For Recognition + Recall metrics:

# YoLLaVA dataset
python src/utils/generate_report.py --model_name mini_cpm --dataset YoLLaVA --seed 23 --name reco+recall_R2P

# MyVLM dataset
python src/utils/generate_report.py --model_name mini_cpm --dataset MyVLM --seed 23 --name reco+recall_R2P

# PerVA dataset  
python src/utils/generate_report.py --model_name mini_cpm --dataset PerVA --seed 23 --name reco+recall_R2P

Generated Reports

Reports are saved in the following structure:

report_23/                                     # Format: report_{SEED}
├── YoLLaVA_mini_cpm_recall_results.json       # Recall metrics
├── YoLLaVA_mini_cpm_reco+recall_results.json  # Recognition + Recall metrics
├── MyVLM_mini_cpm_recall_results.json
├── MyVLM_mini_cpm_reco+recall_results.json
├── PerVA_mini_cpm_recall_results.json
└── PerVA_mini_cpm_reco+recall_results.json

Report Parameters

  • --model_name: Model used for evaluation (mini_cpm, internvl, qwen)
  • --dataset: Dataset name (YoLLaVA, MyVLM, PerVA)
  • --seed: Seed value used in evaluation
  • --name: Report type (recall_R2P or reco+recall_R2P)

Complete Workflow Example

# 1. Run evaluation for all concepts
sbatch scripts/test_yollava.sh 23
# OR
./scripts/run_yollava_loop.sh 23

# 2. Generate recall report
python src/utils/generate_report.py --model_name mini_cpm --dataset YoLLaVA --seed 23 --name recall_R2P

# 3. Generate recognition + recall report  
python src/utils/generate_report.py --model_name mini_cpm --dataset YoLLaVA --seed 23 --name reco+recall_R2P

# 4. Check results
cat report_23/YoLLaVA_mini_cpm_recall_results.json

📈 Results

Main Results

Our R2P framework achieves state-of-the-art performance across three personalization datasets:

Recognition Performance (Positive Accuracy)

| Method | MyVLM Dataset | YoLLaVA Dataset | PerVA Dataset |
|--------|---------------|-----------------|---------------|
| MyVLM | 96.6% | 94.9% | 66.0% |
| YoLLaVA | 97.0% | 86.0% | 75.1% |
| RAP | 94.4% | - | 92.9% |
| MiniCPM-o + prompt | 98.5% | 81.2% | 73.0% |
| R2P (Ours) | 96.3% | 91.1% | 90.2% |

Captioning Performance (Recall)

| Method | MyVLM Dataset | YoLLaVA Dataset | PerVA Dataset |
|--------|---------------|-----------------|---------------|
| MyVLM | 84.7% | 81.6% | 0.3% |
| YoLLaVA | 86.7% | 79.7% | 6.6% |
| RAP | 88.0% | - | 64.1% |
| MiniCPM-o + prompt | 87.4% | 73.9% | 65.7% |
| R2P (Ours) | 91.4% | 87.1% | 72.5% |

Detailed results and analysis can be found in our paper.

βš–οΈ License

This project is licensed under the MIT License - see the LICENSE file for details.

📖 Citation

If you find our work useful, please consider citing:

@inproceedings{das2025r2p,
  title={Training-Free Personalization via Retrieval and Reasoning on Fingerprints},
  author={Das, Deepayan and Talon, Davide and Wang, Yiming and Mancini, Massimiliano and Ricci, Elisa},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2025}
}

πŸ™ Acknowledgments

  • The codebase is based on RAP. Thanks to the authors.

📧 Contact


⭐ Star this repo if you find it helpful! ⭐
