VGGT Depth

@inproceedings{wang2025vggt,
  title={VGGT: Visual Geometry Grounded Transformer},
  author={Wang, Jianyuan and Chen, Minghao and Karaev, Nikita and Vedaldi, Andrea and Rupprecht, Christian and Novotny, David},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2025}
}

Overview

A simplified version of VGGT that removes the unnecessary prediction heads and keeps only the depth estimation head.

Quick Start

First, clone this repository to your local machine, and install the dependencies (torch, torchvision, numpy, Pillow, and huggingface_hub).

git clone git@github.com:hpnquoc/vggt_depth.git 
cd vggt_depth
pip install -r requirements.txt

Current models: VGGT-1B-Depth

Now, try the model with just a few lines of code:

import torch
from vggt.models.vggt_depth import VGGT
from vggt.utils.load_fn import load_and_preprocess_images

device = "cuda" if torch.cuda.is_available() else "cpu"
# bfloat16 is supported on Ampere GPUs (Compute Capability 8.0+); otherwise fall back to float16
dtype = torch.bfloat16 if device == "cuda" and torch.cuda.get_device_capability()[0] >= 8 else torch.float16

# Initialize the model and load the pretrained weights.
# This will automatically download the model weights the first time it's run, which may take a while.
model = VGGT.from_pretrained("hpnquoc/{model_card}").to(device)

# Load and preprocess example images (replace with your own image paths)
image_names = ["path/to/imageA.png", "path/to/imageB.png", "path/to/imageC.png"]  
images = load_and_preprocess_images(image_names).to(device)

with torch.no_grad():
    with torch.amp.autocast('cuda', dtype=dtype):
        # Predict attributes including depth maps, depth_conf and original images.
        predictions = model(images)
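
The predictions are returned as a dictionary. As a minimal sketch (the key names "depth" and "depth_conf" follow the comment above and are assumptions about this fork's output format), the depth maps can be accessed like this:

# Minimal sketch, assuming the output dictionary exposes "depth" and "depth_conf" keys
depth_map = predictions["depth"]        # per-pixel depth for every input view
depth_conf = predictions["depth_conf"]  # per-pixel confidence of the depth estimates
print(depth_map.shape, depth_conf.shape)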

The model weights will be automatically downloaded from Hugging Face. If you run into issues such as slow downloads, you can manually download the checkpoint from the Hugging Face model page and load it yourself, or load it directly from the URL:

# Initialize the model, then fetch the checkpoint from the URL and load the state dict
model = VGGT()
_URL = "https://huggingface.co/hpnquoc/{model_card}/resolve/main/model.pt"
model.load_state_dict(torch.hub.load_state_dict_from_url(_URL))
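
If you downloaded the checkpoint by hand instead, a minimal sketch for loading it from a local path (assuming model.pt is a plain state-dict file) looks like:

import torch
from vggt.models.vggt_depth import VGGT

# Minimal sketch, assuming a manually downloaded state-dict checkpoint at path/to/model.pt
model = VGGT()
state_dict = torch.load("path/to/model.pt", map_location="cpu")  # load onto CPU first
model.load_state_dict(state_dict)
model = model.to("cuda" if torch.cuda.is_available() else "cpu")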

Checklist

  • VGGT-1B-Depth uploaded

License

See the LICENSE file for details about the license under which this code is made available.
