This repository builds on VGGT (CVPR 2025); if you use it, please cite the original paper:
@inproceedings{wang2025vggt,
title={VGGT: Visual Geometry Grounded Transformer},
author={Wang, Jianyuan and Chen, Minghao and Karaev, Nikita and Vedaldi, Andrea and Rupprecht, Christian and Novotny, David},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2025}
}
This is a simplified version of VGGT that removes the unnecessary prediction heads and retains only the depth estimation head.
First, clone this repository to your local machine and install the dependencies (torch, torchvision, numpy, Pillow, and huggingface_hub):
git clone git@github.com:hpnquoc/vggt_depth.git
cd vggt_depth
pip install -r requirements.txt
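Alternatively, the dependencies listed above can be installed by name (a sketch, assuming the standard PyPI package names):

pip install torch torchvision numpy Pillow huggingface_hub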
Current models:
- VGGT-1B-Depth
Now, try the model with just a few lines of code:
import torch
from vggt.models.vggt_depth import VGGT
from vggt.utils.load_fn import load_and_preprocess_images
device = "cuda" if torch.cuda.is_available() else "cpu"
# bfloat16 is supported on Ampere GPUs (Compute Capability 8.0+)
dtype = torch.bfloat16 if device == "cuda" and torch.cuda.get_device_capability()[0] >= 8 else torch.float16
# Initialize the model and load the pretrained weights.
# This will automatically download the model weights the first time it's run, which may take a while.
model = VGGT.from_pretrained("hpnquoc/{model_card}").to(device)  # replace {model_card} with a model name, e.g. VGGT-1B-Depth
# Load and preprocess example images (replace with your own image paths)
image_names = ["path/to/imageA.png", "path/to/imageB.png", "path/to/imageC.png"]
images = load_and_preprocess_images(image_names).to(device)
with torch.no_grad():
    with torch.amp.autocast('cuda', dtype=dtype):
        # Predict attributes including depth maps, depth_conf and the original images.
        predictions = model(images)
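You can then read the depth outputs from predictions. A minimal sketch, assuming the model returns a dictionary keyed by output name (the exact keys and tensor shapes depend on the implementation):

depth = predictions["depth"]            # predicted depth maps (key name assumed)
depth_conf = predictions["depth_conf"]  # per-pixel depth confidence (key name assumed)
print(depth.shape, depth_conf.shape)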
The model weights are downloaded from Hugging Face automatically on first use. If you encounter issues such as slow loading, you can manually download the weights from the model page and load them yourself, or:
model = VGGT()
_URL = "https://huggingface.co/hpnquoc/{model_card}/resolve/main/model.pt"
model.load_state_dict(torch.hub.load_state_dict_from_url(_URL))
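Since huggingface_hub is already a dependency, the checkpoint can also be fetched through its cache. A minimal sketch, assuming the weights file is named model.pt as in the URL above ({model_card} remains a placeholder for the model name):

from huggingface_hub import hf_hub_download

# Download (and cache) the checkpoint, then load the state dict into a fresh model.
path = hf_hub_download(repo_id="hpnquoc/{model_card}", filename="model.pt")
model = VGGT()
model.load_state_dict(torch.load(path, map_location="cpu"))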
Updates:
- VGGT-1B-Depth uploaded
See the LICENSE file for details about the license under which this code is made available.