|
2 | 2 | <h1 align="center">Structured 3D Latents<br>for Scalable and Versatile 3D Generation</h1>
|
3 | 3 | <p align="center"><a href="https://arxiv.org/abs/2412.01506"><img src='https://img.shields.io/badge/arXiv-Paper-red?logo=arxiv&logoColor=white' alt='arXiv'></a>
|
4 | 4 | <a href='https://trellis3d.github.io'><img src='https://img.shields.io/badge/Project_Page-Website-green?logo=googlechrome&logoColor=white' alt='Project Page'></a>
|
5 |
| -<a href='https://huggingface.co/spaces/JeffreyXiang/TRELLIS'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Live_Demo-blue'></a> |
| 5 | +<a href='https://huggingface.co/spaces?q=TRELLIS'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Live_Demo-blue'></a> |
6 | 6 | </p>
|
7 | 7 | <p align="center"><img src="assets/teaser.png" width="100%"></p>
|
8 | 8 |
|
|
84 | 84 | --demo Install all dependencies for demo
|
85 | 85 | ```
|
86 | 86 |
|
87 |
| -<!-- Pretrained Models --> |
88 |
| -## 🤖 Pretrained Models |
89 |
| - |
90 |
| -We provide the following pretrained models: |
91 |
| - |
92 |
| -| Model | Description | #Params | Download | |
93 |
| -| --- | --- | --- | --- | |
94 |
| -| TRELLIS-image-large | Large image-to-3D model | 1.2B | [Download](https://huggingface.co/microsoft/TRELLIS-image-large) | |
95 |
| -| TRELLIS-text-base | Base text-to-3D model | 342M | [Download](https://huggingface.co/microsoft/TRELLIS-text-base) | |
96 |
| -| TRELLIS-text-large | Large text-to-3D model | 1.1B | [Download](https://huggingface.co/microsoft/TRELLIS-text-large) | |
97 |
| -| TRELLIS-text-xlarge | Extra-large text-to-3D model | 2.0B | [Download](https://huggingface.co/microsoft/TRELLIS-text-xlarge) | |
98 |
| - |
99 |
| -*Note: It is always recommended to use the image conditioned version of the models for better performance.* |
100 |
| - |
101 |
| -*Note: All VAEs are included in **TRELLIS-image-large** model repo.* |
102 |
| - |
103 |
| -The models are hosted on Hugging Face. You can directly load the models with their repository names in the code: |
104 |
| -```python |
105 |
| -TrellisImageTo3DPipeline.from_pretrained("microsoft/TRELLIS-image-large") |
106 |
| -``` |
107 |
| - |
108 |
| -If you prefer loading the model from local, you can download the model files from the links above and load the model with the folder path (folder structure should be maintained): |
109 |
| -```python |
110 |
| -TrellisImageTo3DPipeline.from_pretrained("/path/to/TRELLIS-image-large") |
111 |
| -``` |
112 |
| - |
113 | 87 | <!-- Usage -->
|
114 | 88 | ## 💡 Usage
|
115 | 89 |
|
@@ -199,8 +173,6 @@ python app.py
|
199 | 173 |
|
200 | 174 | Then, you can access the demo at the address shown in the terminal.
|
201 | 175 |
|
202 |
| -***The web demo is also available on [Hugging Face Spaces](https://huggingface.co/spaces/JeffreyXiang/TRELLIS)!*** |
203 |
| - |
204 | 176 |
|
205 | 177 | <!-- Dataset -->
|
206 | 178 | ## 📚 Dataset
|
@@ -239,20 +211,20 @@ TRELLIS’s training framework is organized to provide a flexible and modular ap
|
239 | 211 | - Training hyperparameters and model architectures are defined in configuration files under the `configs/` directory.
|
240 | 212 | - Example configuration files include:
|
241 | 213 |
|
242 |
| -| Config | Pretained Model | Description | |
243 |
| -| --- | --- | --- | |
244 |
| -| [`vae/ss_vae_conv3d_16l8_fp16.json`](configs/vae/ss_vae_conv3d_16l8_fp16.json) | [Encoder](https://huggingface.co/microsoft/TRELLIS-image-large/blob/main/ckpts/ss_enc_conv3d_16l8_fp16.safetensors) [Decoder](https://huggingface.co/microsoft/TRELLIS-image-large/blob/main/ckpts/ss_dec_conv3d_16l8_fp16.safetensors) | Sparse structure VAE | |
245 |
| -| [`vae/slat_vae_enc_dec_gs_swin8_B_64l8_fp16.json`](configs/vae/slat_vae_enc_dec_gs_swin8_B_64l8_fp16.json) | [Encoder](https://huggingface.co/microsoft/TRELLIS-image-large/blob/main/ckpts/slat_enc_swin8_B_64l8_fp16.safetensors) [Decoder](https://huggingface.co/microsoft/TRELLIS-image-large/blob/main/ckpts/slat_dec_gs_swin8_B_64l8gs32_fp16.safetensors) | SLat VAE with Gaussian Decoder | |
246 |
| -| [`vae/slat_vae_dec_rf_swin8_B_64l8_fp16.json`](configs/vae/slat_vae_dec_rf_swin8_B_64l8_fp16.json) | [Decoder](https://huggingface.co/microsoft/TRELLIS-image-large/blob/main/ckpts/slat_dec_rf_swin8_B_64l8r16_fp16.safetensors) | SLat Radiance Field Decoder | |
247 |
| -| [`vae/slat_vae_dec_mesh_swin8_B_64l8_fp16.json`](configs/vae/slat_vae_dec_mesh_swin8_B_64l8_fp16.json) | [Decoder](https://huggingface.co/microsoft/TRELLIS-image-large/blob/main/ckpts/slat_dec_mesh_swin8_B_64l8m256c_fp16.safetensors) | SLat Mesh Decoder | |
248 |
| -| [`generation/ss_flow_img_dit_L_16l8_fp16.json`](configs/generation/ss_flow_img_dit_L_16l8_fp16.json) | [Denoiser](https://huggingface.co/microsoft/TRELLIS-image-large/blob/main/ckpts/ss_flow_img_dit_L_16l8_fp16.safetensors) | Image conditioned sparse structure Flow Model | |
249 |
| -| [`generation/slat_flow_img_dit_L_64l8p2_fp16.json`](configs/generation/slat_flow_img_dit_L_64l8p2_fp16.json) | [Denoiser](https://huggingface.co/microsoft/TRELLIS-image-large/blob/main/ckpts/slat_flow_img_dit_L_64l8p2_fp16.safetensors) | Image conditioned SLat Flow Model | |
250 |
| -| [`generation/ss_flow_txt_dit_B_16l8_fp16.json`](configs/generation/ss_flow_txt_dit_B_16l8_fp16.json) | [Denoiser](https://huggingface.co/microsoft/TRELLIS-text-base/blob/main/ckpts/ss_flow_txt_dit_B_16l8_fp16.safetensors) | Base text-conditioned sparse structure Flow Model | |
251 |
| -| [`generation/slat_flow_txt_dit_B_64l8p2_fp16.json`](configs/generation/slat_flow_txt_dit_B_64l8p2_fp16.json) | [Denoiser](https://huggingface.co/microsoft/TRELLIS-text-base/blob/main/ckpts/slat_flow_txt_dit_B_64l8p2_fp16.safetensors) | Base text-conditioned SLat Flow Model | |
252 |
| -| [`generation/ss_flow_txt_dit_L_16l8_fp16.json`](configs/generation/ss_flow_txt_dit_L_16l8_fp16.json) | [Denoiser](https://huggingface.co/microsoft/TRELLIS-text-large/blob/main/ckpts/ss_flow_txt_dit_L_16l8_fp16.safetensors) | Large text-conditioned sparse structure Flow Model | |
253 |
| -| [`generation/slat_flow_txt_dit_L_64l8p2_fp16.json`](configs/generation/slat_flow_txt_dit_L_64l8p2_fp16.json) | [Denoiser](https://huggingface.co/microsoft/TRELLIS-text-large/blob/main/ckpts/slat_flow_txt_dit_L_64l8p2_fp16.safetensors) | Large text-conditioned SLat Flow Model | |
254 |
| -| [`generation/ss_flow_txt_dit_XL_16l8_fp16.json`](configs/generation/ss_flow_txt_dit_XL_16l8_fp16.json) | [Denoiser](https://huggingface.co/microsoft/TRELLIS-text-xlarge/blob/main/ckpts/ss_flow_txt_dit_XL_16l8_fp16.safetensors) | Extra-large text-conditioned sparse structure Flow Model | |
255 |
| -| [`generation/slat_flow_txt_dit_XL_64l8p2_fp16.json`](configs/generation/slat_flow_txt_dit_XL_64l8p2_fp16.json) | [Denoiser](https://huggingface.co/microsoft/TRELLIS-text-xlarge/blob/main/ckpts/slat_flow_txt_dit_XL_64l8p2_fp16.safetensors) | Extra-large text-conditioned SLat Flow Model | |
| 214 | +| Config | Description | |
| 215 | +| --- | --- | |
| 216 | +| [`vae/ss_vae_conv3d_16l8_fp16.json`](configs/vae/ss_vae_conv3d_16l8_fp16.json) | Sparse structure VAE | |
| 217 | +| [`vae/slat_vae_enc_dec_gs_swin8_B_64l8_fp16.json`](configs/vae/slat_vae_enc_dec_gs_swin8_B_64l8_fp16.json) | SLat VAE with Gaussian Decoder | |
| 218 | +| [`vae/slat_vae_dec_rf_swin8_B_64l8_fp16.json`](configs/vae/slat_vae_dec_rf_swin8_B_64l8_fp16.json) | SLat Radiance Field Decoder | |
| 219 | +| [`vae/slat_vae_dec_mesh_swin8_B_64l8_fp16.json`](configs/vae/slat_vae_dec_mesh_swin8_B_64l8_fp16.json) | SLat Mesh Decoder | |
| 220 | +| [`generation/ss_flow_img_dit_L_16l8_fp16.json`](configs/generation/ss_flow_img_dit_L_16l8_fp16.json) | Image conditioned sparse structure Flow Model | |
| 221 | +| [`generation/slat_flow_img_dit_L_64l8p2_fp16.json`](configs/generation/slat_flow_img_dit_L_64l8p2_fp16.json) | Image conditioned SLat Flow Model | |
| 222 | +| [`generation/ss_flow_txt_dit_B_16l8_fp16.json`](configs/generation/ss_flow_txt_dit_B_16l8_fp16.json) | Base text-conditioned sparse structure Flow Model | |
| 223 | +| [`generation/slat_flow_txt_dit_B_64l8p2_fp16.json`](configs/generation/slat_flow_txt_dit_B_64l8p2_fp16.json) | Base text-conditioned SLat Flow Model | |
| 224 | +| [`generation/ss_flow_txt_dit_L_16l8_fp16.json`](configs/generation/ss_flow_txt_dit_L_16l8_fp16.json) | Large text-conditioned sparse structure Flow Model | |
| 225 | +| [`generation/slat_flow_txt_dit_L_64l8p2_fp16.json`](configs/generation/slat_flow_txt_dit_L_64l8p2_fp16.json) | Large text-conditioned SLat Flow Model | |
| 226 | +| [`generation/ss_flow_txt_dit_XL_16l8_fp16.json`](configs/generation/ss_flow_txt_dit_XL_16l8_fp16.json) | Extra-large text-conditioned sparse structure Flow Model | |
| 227 | +| [`generation/slat_flow_txt_dit_XL_64l8p2_fp16.json`](configs/generation/slat_flow_txt_dit_XL_64l8p2_fp16.json) | Extra-large text-conditioned SLat Flow Model | |
256 | 228 |
|
257 | 229 |
|
258 | 230 | ### Command-Line Options
|
|
0 commit comments