
Commit 5320b4c: Init (initial commit, 0 parents)


79 files changed: 6,707 additions, 0 deletions

.gitignore

Lines changed: 151 additions & 0 deletions
@@ -0,0 +1,151 @@
# Custom
archive
explore/exploration-outputs/*.ply
explore/exploration-outputs/*.png
explore/exploration-outputs/*.html
.vscode
wandb
outputs
tmp*
slurm-logs
slurm_logs
experiments/figures/splash/older_versions/images
experiments/figures/splash/images
experiments/figures/rebuttal/images
sps-*
.vscode
example-logs-and-checkpoints.zip
example-logs-and-checkpoints/
example-vis.tar
example-vis.zip

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
.github

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# Lightning /research
test_tube_exp/
tests/tests_tt_dir/
tests/save_dir
default/
data/
test_tube_logs/
test_tube_data/
datasets/
model_weights/
tests/save_dir
tests/tests_tt_dir/
processed/
raw/

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/

# IDEs
.idea
.vscode

# seed project
lightning_logs/
MNIST
.DS_Store

README.md

Lines changed: 198 additions & 0 deletions
@@ -0,0 +1,198 @@
<div align="center">

## PC^2: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D Reconstruction
### CVPR 2023 (Highlight)

[![Arxiv](http://img.shields.io/badge/Arxiv-2302.10668-B31B1B.svg)](https://arxiv.org/abs/2302.10668)
[![CVPR](http://img.shields.io/badge/CVPR-2023-4b44ce.svg)](https://arxiv.org/abs/2302.10668)
</div>

## Table of Contents

- [Overview](#overview)
  * [Explanatory Video](#explanatory-video)
  * [Code Overview](#code-overview)
  * [Abstract](#abstract)
  * [Examples](#examples)
  * [Method](#method)
- [Running the code](#running-the-code)
  * [Dependencies](#dependencies)
  * [Data](#data)
  * [Training](#training)
  * [Sampling](#sampling)
  * [Pretrained checkpoints](#pretrained-checkpoints)
  * [Common issues](#common-issues)
- [Acknowledgement](#acknowledgement)
- [Citation](#citation)

## Overview

### Explanatory Video

<div align="center"> <a href="https://www.youtube.com/watch?v=kAkwpsT1pRA"><img src="https://img.youtube.com/vi/kAkwpsT1pRA/0.jpg" alt="Explanatory Video"></a></div>

### Code Overview

This repository uses [PyTorch3D](https://github.com/facebookresearch/pytorch3d) for most 3D operations. It uses [Hydra](https://hydra.cc/docs/intro/) for configuration, and the config is located at `config/structured.py`. The entrypoints for training are `main.py` for the point cloud diffusion model and `main_coloring.py` for the point cloud coloring model. There are shared utilities in `diffusion_utils.py` and `training_utils.py`. The data is [Co3Dv2](https://github.com/facebookresearch/co3d).

I substantially refactored the repository for the public release to use the `diffusers` library from HuggingFace. As a result, most of the code is different from the original code used for the paper. Only the Co3Dv2 dataset is implemented in this version of the code, but it should be easy to run on other datasets if you need to.

If you have any questions or contributions, feel free to open an issue or a pull request.
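
To make this concrete, below is a minimal sketch of a `diffusers`-style reverse diffusion (sampling) loop over point coordinates. The `DummyPointDenoiser`, its arguments, and the shapes are hypothetical stand-ins for illustration; they are not the classes used in this repository.

```python
import torch
import torch.nn as nn
from diffusers import DDPMScheduler

class DummyPointDenoiser(nn.Module):
    """Hypothetical stand-in for the projection-conditioned point network.

    A real model would also receive the input image and camera and apply
    projection conditioning at every step; this one only sees the points.
    """
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 3))

    def forward(self, points, t):
        return self.net(points)  # predicted noise, same shape as `points`

model = DummyPointDenoiser().eval()
scheduler = DDPMScheduler(num_train_timesteps=1000)
scheduler.set_timesteps(50)  # a short schedule, just for the sketch

points = torch.randn(1, 2048, 3)  # start from 3D Gaussian noise
for t in scheduler.timesteps:
    with torch.no_grad():
        noise_pred = model(points, t)
    points = scheduler.step(noise_pred, t, points).prev_sample  # one denoising step
```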

### Abstract

Reconstructing the 3D shape of an object from a single RGB image is a long-standing and highly challenging problem in computer vision. In this paper, we propose a novel method for single-image 3D reconstruction which generates a sparse point cloud via a conditional denoising diffusion process. Our method takes as input a single RGB image along with its camera pose and gradually denoises a set of 3D points, whose positions are initially sampled randomly from a three-dimensional Gaussian distribution, into the shape of an object. The key to our method is a geometrically-consistent conditioning process which we call projection conditioning: at each step in the diffusion process, we project local image features onto the partially-denoised point cloud from the given camera pose. This projection conditioning process enables us to generate high-resolution sparse geometries that are well-aligned with the input image, and can additionally be used to predict point colors after shape reconstruction. Moreover, due to the probabilistic nature of the diffusion process, our method is naturally capable of generating multiple different shapes consistent with a single input image. In contrast to prior work, our approach not only performs well on synthetic benchmarks, but also gives large qualitative improvements on complex real-world data.

### Examples

![Examples](assets/splash-figure.png)

### Method

![Diagram](assets/method-diagram-combined-v3.png)
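
The heart of the method is projection conditioning: at each denoising step, features from the input image are projected onto the partially-denoised point cloud using the known camera. A minimal, illustrative sketch of such a step with PyTorch3D cameras and `grid_sample` is shown below; the function name, shapes, and sign conventions are assumptions for illustration, not the repository's exact implementation.

```python
import torch
import torch.nn.functional as F

def project_image_features(points, image_feats, cameras):
    """Gather a per-point image feature by projecting points into the input view.

    points:      (B, N, 3) partially denoised point positions in world space
    image_feats: (B, C, H, W) feature map extracted from the input RGB image
    cameras:     a PyTorch3D camera object (e.g. PerspectiveCameras) for that view
    """
    # Project the 3D points into normalized device coordinates of the input view.
    ndc = cameras.transform_points_ndc(points)   # (B, N, 3), x/y roughly in [-1, 1]
    # PyTorch3D's NDC axes point left/up while grid_sample expects right/down,
    # hence the sign flip (conventions may differ depending on your setup).
    grid = -ndc[..., :2].unsqueeze(2)             # (B, N, 1, 2)
    sampled = F.grid_sample(image_feats, grid, align_corners=False)  # (B, C, N, 1)
    return sampled.squeeze(-1).permute(0, 2, 1)   # (B, N, C) per-point features
```

In the paper, the per-point features gathered this way condition the point denoising network at every step of the diffusion process.
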
## Running the code

### Dependencies

Dependencies may be installed with pip:
```bash
pip install -r requirements.txt
```

PyTorch and PyTorch3D are not included in `requirements.txt` because installing them with `pip` can break `conda` environments by re-installing PyTorch. I assume you have already installed them yourself. If not, you can use a command such as:

```bash
mamba install pytorch torchvision pytorch-cuda=11.7 pytorch3d -c pytorch -c nvidia -c pytorch3d
```
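
Before training, it can be worth sanity-checking the installation (version numbers will of course depend on your setup):

```python
import torch
import pytorch3d

print("torch:", torch.__version__, "| cuda available:", torch.cuda.is_available())
print("pytorch3d:", pytorch3d.__version__)
```
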
### Data

For our data, we use [Co3Dv2](https://github.com/facebookresearch/co3d). Full information about the dataset is provided on the GitHub page.

We train on individual categories, so you can just download one category or a subset of the categories (for example, hydrants or teddy bears).

Then you can set the environment variable `CO3DV2_DATASET_ROOT` to the dataset root:
```bash
export CO3DV2_DATASET_ROOT="your_dataset_root_folder"
```
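
As a quick sanity check that the download and the environment variable are set correctly, you can list the category folders under the dataset root (an illustrative snippet, not part of the codebase):

```python
import os
from pathlib import Path

root = Path(os.environ["CO3DV2_DATASET_ROOT"])
assert root.is_dir(), f"Dataset root {root} does not exist"
# e.g. ['hydrant', 'teddybear', ...] depending on which categories you downloaded
print(sorted(p.name for p in root.iterdir() if p.is_dir()))
```
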
### Training

The config is in `config/structured.py`.

You can specify your job mode using `run.job=train`, `run.job=train_coloring`, `run.job=sample`, or `run.job=sample_coloring`. By default, the mode is set to `train`.
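
Because the project uses Hydra with a structured (dataclass-based) config, arguments such as `dataset.category=hydrant` are overrides of nested config fields. The sketch below illustrates the general pattern; the class and field names are hypothetical and may not match `config/structured.py` exactly.

```python
from dataclasses import dataclass, field

import hydra
from hydra.core.config_store import ConfigStore
from omegaconf import DictConfig, OmegaConf

@dataclass
class DatasetConfig:
    category: str = "hydrant"   # overridden on the CLI as dataset.category=...

@dataclass
class RunConfig:
    job: str = "train"          # train | train_coloring | sample | sample_coloring
    name: str = "default"

@dataclass
class ProjectConfig:
    dataset: DatasetConfig = field(default_factory=DatasetConfig)
    run: RunConfig = field(default_factory=RunConfig)

cs = ConfigStore.instance()
cs.store(name="example_config", node=ProjectConfig)

@hydra.main(version_base=None, config_name="example_config")
def main(cfg: DictConfig) -> None:
    # e.g. `python this_script.py dataset.category=teddybear run.job=sample`
    print(OmegaConf.to_yaml(cfg))

if __name__ == "__main__":
    main()
```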

An example training command is:
```bash
python main.py dataset.category=hydrant dataloader.batch_size=24 dataloader.num_workers=8 run.vis_before_training=True run.val_before_training=True run.name=train__hydrant__ebs_24
```

To run multiple jobs in parallel on a SLURM cluster, you can use a script such as:
```bash
python scripts/example-slurm.py --partition ${PARTITION_NAME} --submit
```

Separately, you can train a coloring model to predict the color of points with fixed locations in 3D space.

An example command is:
```bash
python main_coloring.py run.job=train_coloring model=coloring_model run.mixed_precision=no dataset.category=hydrant dataloader.batch_size=24 run.max_steps=20_000 run.coloring_training_noise_std=0.1 run.name=train_coloring__hydrant__ebs_24
```
### Sampling

For sampling point clouds, use `run.job=sample`.

For example:
```bash
python main.py run.job=sample dataloader.batch_size=16 dataloader.num_workers=6 dataset.category=hydrant checkpoint.resume="/path/to/checkpoint/like/train__hydrant__ebs_24/2022-11-01--17-04-36/checkpoint-latest.pth" run.name=sample__hydrant__ebs_24
```

Results will be saved to your output directory.
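
To inspect the sampled point clouds afterwards, you can load them with PyTorch3D's IO utilities. The sketch below assumes the samples are written as `.ply` files somewhere under the run's output directory; adjust the path to whatever your run actually produced.

```python
from pathlib import Path

from pytorch3d.io import IO

# Hypothetical path; point this at the output directory of your sampling run.
sample_dir = Path("outputs/sample__hydrant__ebs_24")
ply_path = next(sample_dir.rglob("*.ply"))  # grabs the first .ply file found

pointcloud = IO().load_pointcloud(ply_path)
print(ply_path, pointcloud.points_packed().shape)  # (num_points, 3)
```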

Afterwards, you can predict colors using the point clouds obtained from the sampling procedure above, specifying them with the argument `run.coloring_sample_dir`.

For example:
```bash
python main_coloring.py run.job=sample_coloring dataset.category=hydrant dataloader.batch_size=8 model=coloring_model checkpoint.resume="/path/to/coloring/model/checkpoint-latest.pth" run.coloring_sample_dir="/path/to/sample/dir/like/sample__hydrant__ebs_24/2022-09-22--18-03-20/sample/" run.name=sample_coloring__hydrant__ebs_24
```

_Side note:_ although this is called "`sample_coloring`" in the code, it does not really do any sampling, because the coloring model is deterministic.

### Pretrained checkpoints

You can download example checkpoints here:
```bash
# Downloads checkpoint and logs (1.2G)
bash ./scripts/download-example-logs-and-checkpoints.sh
# Downloads visualizations over the course of training, as an example. Since
# these are large (3.5G), we have made them a separate download.
bash ./scripts/download-example-vis.sh
```
These are models newly trained with this codebase. We can train and upload models for other categories as well if you would like; just let us know.

### Common issues

(1) If you get an error of the form `Error building extension '_pvcnn_backend'`, make sure you have installed `gcc` and `g++`. Then check the path in `model/pvcnn/modules/functional/backend.py` and edit it to your desired location.
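
For example, you can quickly check that both compilers are available (on Debian/Ubuntu they can be installed with `sudo apt-get install gcc g++`):

```bash
gcc --version
g++ --version
```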
(2) I believe PyTorch3D has some large changes recently and it is possible some of their code is now broken. I am using version 0.7.3 with a patch on line 634 of `pytorch3d/implicitron/dataset/frame_data.py`.
141+
```python
142+
image_rgb = torch.from_numpy(load_image(self._local_path(path)))
143+
```
144+
145+
(3) You may also have to patch the `accelerate` library in order to properly batch the `FrameData` objects from PyTorch3D. To fix this, I replaced the following lines in `accelerate/utils/operations.py` (L91-99)
```python
elif isinstance(data, Mapping):
    return type(data)(
        {
            k: recursively_apply(
                func, v, *args, test_type=test_type, error_on_other_type=error_on_other_type, **kwargs
            )
            for k, v in data.items()
        }
    )
```
with the following lines
```python
elif isinstance(data, Mapping):
    from pytorch3d.implicitron.dataset.data_loader_map_provider import FrameData
    if isinstance(data, FrameData):
        # FrameData is a dataclass-like container: rebuild it with keyword
        # arguments rather than a single dict.
        return type(data)(
            **{
                k: recursively_apply(
                    func, v, *args, test_type=test_type, error_on_other_type=error_on_other_type, **kwargs
                )
                for k, v in data.items()
            }
        )
    else:
        return type(data)(
            {
                k: recursively_apply(
                    func, v, *args, test_type=test_type, error_on_other_type=error_on_other_type, **kwargs
                )
                for k, v in data.items()
            }
        )
```

## Acknowledgement

* The [PyTorch3D](https://github.com/facebookresearch/pytorch3d) library.
* The [diffusers](https://github.com/huggingface/diffusers) library.
* The [Co3D and Co3Dv2](https://github.com/facebookresearch/co3d) datasets.
* _Our funding:_ Luke Melas-Kyriazi is supported by the Rhodes Trust. Andrea Vedaldi and Christian Rupprecht are supported by ERC-UNION-CoG-101001212. Christian Rupprecht is also supported by VisualAI EP/T028572/1.

## Citation
```
@misc{melaskyriazi2023projection,
  doi = {10.48550/ARXIV.2302.10668},
  url = {https://arxiv.org/abs/2302.10668},
  author = {Melas-Kyriazi, Luke and Rupprecht, Christian and Vedaldi, Andrea},
  title = {PC^2: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D Reconstruction},
  publisher = {arXiv},
  year = {2023},
}
```

assets/method-diagram-combined-v3.png

330 KB

assets/splash-figure.png

2.81 MB

experiments/__init__.py

Whitespace-only changes.

experiments/config/__init__.py

Whitespace-only changes.
