MERLIN

Official GitHub repository of MERLIN: Multimodal Embedding Refinement via LLM-based Iterative Navigation for Text-Video Retrieval-Rerank Pipeline

🔴 I have uploaded the initial version of the code and tested it on several cases from MSRVTT-1k. You can follow how MERLIN works with the current code (some examples are in the outputs folder).
🔴 I will soon clean up the code and test it on the other datasets too.
🔴 I will update the README.md with more details about the code.
🚧 The rest of the code will be updated ASAP 🙏 🚧

Subsampled Testset

Link

Precomputed Embeddings

We provide precomputed Vertex AI embeddings for the following datasets:

MERLIN's log on ActivityNet, MSRVTT-1k, MSVD

Link

Brief of MERLIN

Three main components help explain the MERLIN pipeline:

Questioner
Reranker
Answerer (a human-simulating agent that can be replaced with an actual human). It is not part of the MERLIN pipeline itself; it stands in for the human in the loop.
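
To make the interplay of the three components concrete, here is a minimal sketch of one possible question-answer-rerank loop. It is an illustration only and is not taken from the repository: the function names (`rerank`, `refine`, `run_rounds`), the callables `ask`, `answer` and `embed`, and the simple weighted blend used to refine the query embedding are all assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(query_vec, video_vecs):
    """Reranker: order video ids by similarity to the current query embedding."""
    return sorted(
        video_vecs,
        key=lambda vid: cosine(query_vec, video_vecs[vid]),
        reverse=True,
    )


def refine(query_vec, answer_vec, weight=0.5):
    """Hypothetical refinement: blend the query and answer embeddings."""
    return [(1 - weight) * q + weight * a for q, a in zip(query_vec, answer_vec)]


def run_rounds(query_vec, video_vecs, ask, answer, embed, num_rounds=5):
    """Alternate Questioner, Answerer and Reranker for a fixed number of rounds."""
    ranking = rerank(query_vec, video_vecs)
    for _ in range(num_rounds):
        question = ask(ranking)      # Questioner: asks about the current candidates
        reply = answer(question)     # Answerer: simulated (or real) human reply
        query_vec = refine(query_vec, embed(reply))
        ranking = rerank(query_vec, video_vecs)  # Reranker: reorder with refined query
    return ranking
```

In this sketch `ask` and `answer` would wrap LLM calls and `embed` would wrap an embedding API; any of them can be swapped for a stub when experimenting offline.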

Environment Setup

Since most of the embedding computation is done through APIs, there is no strict version dependency. The code should run with most versions of the packages (torch, sklearn, numpy, etc.) you already have.
I will add environment setup files for both pixi and requirements.txt soon.

Before running MERLIN

You need to have a Google Cloud account with Vertex AI API enabled.
You need to have an OpenAI API key.
Vertex AI API usage should not cost much, but keep an eye on your OpenAI API spending.
You need to have a dataset downloaded and placed in the data folder.
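
The prerequisites above can be checked before a run with a small preflight script. This is a sketch under assumptions, not part of the repository: `preflight` is a hypothetical helper, and the variable names `OPENAI_API_KEY` and `GOOGLE_APPLICATION_CREDENTIALS` are the conventional ones for the OpenAI and Google Cloud client libraries, which may differ from the keys this project reads from its .env file.

```python
import os
from pathlib import Path


def preflight(env, data_dir):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # Assumed variable name: conventional key for the OpenAI client.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    # Assumed variable name: Google Cloud Application Default Credentials.
    if not env.get("GOOGLE_APPLICATION_CREDENTIALS"):
        problems.append("GOOGLE_APPLICATION_CREDENTIALS is not set")
    # The dataset is expected to be placed in the data folder.
    if not Path(data_dir).is_dir():
        problems.append(f"data folder not found: {data_dir}")
    return problems


if __name__ == "__main__":
    for problem in preflight(os.environ, "data"):
        print("missing prerequisite:", problem)
```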

Running MERLIN

To run MERLIN on a specific dataset:

python run_merlin.py --dataset msrvtt --data_path /path/to/data --num_rounds 5

Available datasets:

msrvtt: MSRVTT-1k dataset
msvd: MSVD dataset
anet: ActivityNet dataset

Command Line Arguments

--dataset: Dataset to process (required, choices: "msvd", "msrvtt", "anet")
--data_path: Path to dataset directory (default: "data")
--output_dir: Directory to save outputs (default: "outputs")
--num_rounds: Number of rounds for question-answering iteration (default: 5)
--model_name: OpenAI model to use (default from .env)
--max_tokens: Maximum tokens for model response (default from .env)
--debug: Enable debug mode
--env_file: Path to .env file (default: ".env.local")
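
For reference, the arguments listed above could be declared with `argparse` roughly as follows. This is a sketch that mirrors the documented flags and defaults; the actual `run_merlin.py` may be organized differently, `build_parser` is a hypothetical name, and the `None` defaults for `--model_name` and `--max_tokens` are placeholders for values loaded from the .env file.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented MERLIN command line arguments."""
    parser = argparse.ArgumentParser(description="Run MERLIN on a dataset")
    parser.add_argument("--dataset", required=True,
                        choices=["msvd", "msrvtt", "anet"],
                        help="Dataset to process")
    parser.add_argument("--data_path", default="data",
                        help="Path to dataset directory")
    parser.add_argument("--output_dir", default="outputs",
                        help="Directory to save outputs")
    parser.add_argument("--num_rounds", type=int, default=5,
                        help="Number of rounds for question-answering iteration")
    # None here stands in for "take the value from the .env file".
    parser.add_argument("--model_name", default=None,
                        help="OpenAI model to use")
    parser.add_argument("--max_tokens", type=int, default=None,
                        help="Maximum tokens for model response")
    parser.add_argument("--debug", action="store_true",
                        help="Enable debug mode")
    parser.add_argument("--env_file", default=".env.local",
                        help="Path to .env file")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```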

leegisang/MERLIN_text_to_video_search