EchoPrime: A Multi-Video View-Informed Vision-Language Model for Comprehensive Echocardiography Interpretation
This repository contains the official inference code for the following paper:
EchoPrime: A Multi-Video View-Informed Vision-Language Model for Comprehensive Echocardiography Interpretation
Milos Vukadinovic, Xiu Tang, Neal Yuan, Paul Cheng, Debiao Li, Susan Cheng, Bryan He*, David Ouyang*
Read the paper on arXiv and see the demo.
- Clone the repository and navigate to the EchoPrime directory
git clone https://github.com/echonet/EchoPrime
cd EchoPrime
- Download model data
wget https://github.com/echonet/EchoPrime/releases/download/v1.0.0/model_data.zip
wget https://github.com/echonet/EchoPrime/releases/download/v1.0.0/candidate_embeddings_p1.pt
wget https://github.com/echonet/EchoPrime/releases/download/v1.0.0/candidate_embeddings_p2.pt
unzip model_data.zip
mv candidate_embeddings_p1.pt model_data/candidates_data/
mv candidate_embeddings_p2.pt model_data/candidates_data/
- Install requirements
pip install -r requirements.txt
- Test on a sample input. The dummy tensor has shape (50, 3, 16, 224, 224): 50 videos, 3 channels, 16 frames per video, and a height and width of 224 pixels.
from echo_prime import EchoPrime
import torch

# Instantiate the model and run it on a dummy study of 50 all-zero video clips.
ep = EchoPrime()
ep.predict_metrics(ep.encode_study(torch.zeros((50, 3, 16, 224, 224))))
- Follow the EchoPrimeDemo.ipynb notebook to see how to correctly preprocess the input and run inference with EchoPrime; a rough sketch of the preprocessing idea is shown below.
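For orientation only, here is a minimal sketch of how a DICOM cine loop might be turned into the (num_videos, 3, 16, 224, 224) tensor EchoPrime expects. The helper below is hypothetical: it assumes pydicom for reading and uses plain uniform frame sampling with bilinear resizing, whereas the notebook also applies the proper masking, cropping, and normalization, so follow EchoPrimeDemo.ipynb for real studies.
# Hypothetical helper, not part of the EchoPrime API. It skips the sector masking
# and normalization performed in EchoPrimeDemo.ipynb.
import numpy as np
import pydicom
import torch
import torch.nn.functional as F

def dicom_to_clip(path, n_frames=16, size=224):
    """Return a (3, n_frames, size, size) float tensor for one echo cine loop."""
    frames = pydicom.dcmread(path).pixel_array            # (T, H, W) or (T, H, W, C)
    if frames.ndim == 3:                                   # grayscale -> repeat to 3 channels
        frames = np.repeat(frames[..., None], 3, axis=-1)
    frames = frames[..., :3]                               # keep at most 3 channels
    idx = np.linspace(0, len(frames) - 1, n_frames).astype(int)   # uniform frame sampling
    clip = torch.from_numpy(frames[idx].astype(np.float32)).permute(3, 0, 1, 2)  # C, T, H, W
    return F.interpolate(clip, size=(size, size), mode="bilinear", align_corners=False)

# Stack the per-video clips of one study into the expected (num_videos, 3, 16, 224, 224) batch:
# study = torch.stack([dicom_to_clip(p) for p in dicom_paths])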
This project is licensed under the terms of the MIT license.
The load_for_finetuning.py script shows how to load the pretrained EchoPrime video and text encoders. Make sure the required libraries are installed; requirements.txt lists the dependencies. An illustrative sketch of the loading pattern is shown below.
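The sketch below only illustrates the general pattern of re-using the video encoder for fine-tuning; the checkpoint path, the 512-dimensional projection head, and the choice of torchvision's MViTv2-S backbone are assumptions, and the text encoder is not shown, so treat load_for_finetuning.py as the source of truth.
# Minimal sketch, not the actual load_for_finetuning.py. The checkpoint filename
# ("model_data/weights/echo_prime_encoder.pt") and the 512-dim projection head
# are assumptions; check the script for the real paths, shapes, and the text encoder.
import torch
import torchvision

# Video backbone: torchvision's MViTv2-S with its classifier swapped for an
# embedding projection (output dimension assumed to be 512 here).
video_encoder = torchvision.models.video.mvit_v2_s()
video_encoder.head[-1] = torch.nn.Linear(video_encoder.head[-1].in_features, 512)

state_dict = torch.load("model_data/weights/echo_prime_encoder.pt", map_location="cpu")
video_encoder.load_state_dict(state_dict)
video_encoder.train()  # encoder is now ready to be fine-tuned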
To run EchoPrime in Docker, build the image and start a container:
docker build -t echo-prime .
docker run -d --name echoprime-container --gpus all echo-prime tail -f /dev/null
Then attach to the container (for example with docker exec -it echoprime-container bash) and run the notebook located at /workspace/EchoPrime/EchoPrimeDemo.ipynb.