Commit 2e60033

update readme
1 parent 3d35bb9 commit 2e60033

README.md

Lines changed: 43 additions & 25 deletions
@@ -32,22 +32,24 @@ pip install -r full_requirements.txt
This repository provides direct and simple access to the pretrained "deep" versions of BigGAN for 128, 256 and 512 pixel resolutions, as described in the [associated publication](https://openreview.net/forum?id=B1xsqj09Fm).

Here are some details on the models:

- `BigGAN-deep-128`: a 50.4M-parameter model generating 128x128 pixel images; the model dump weighs 201 MB,
- `BigGAN-deep-256`: a 55.9M-parameter model generating 256x256 pixel images; the model dump weighs 224 MB,
- `BigGAN-deep-512`: a 56.2M-parameter model generating 512x512 pixel images; the model dump weighs 225 MB.

Please refer to Appendix B of the paper for details on the architectures.

All models comprise pre-computed batch norm statistics for 51 truncation values between 0 and 1 (see Appendix C.1 in the paper for details).

## Usage

Here is a quick-start example using `BigGAN` with a pre-trained model.

See the [doc section](#doc) below for details on these classes and methods.

```python
import torch
from pytorch_pretrained_biggan import (BigGAN, one_hot_from_name, truncated_noise_sample,
                                       save_as_images, display_in_terminal)

# OPTIONAL: if you want more information on what's happening, activate the logger as follows
import logging
@@ -71,15 +73,16 @@ dogball = model(noise_vector, class_vector, truncation)
# Save results as png images
save_as_images(dogball)

# If you have a sixel-compatible terminal, you can display the images in the terminal
# (see https://github.com/saitoha/libsixel for details)
display_in_terminal(dogball)
```

## Doc

### Loading DeepMind's pre-trained weights

To load one of DeepMind's pre-trained models, instantiate a `BigGAN` model with `from_pretrained()` as:

```python
model = BigGAN.from_pretrained(PRE_TRAINED_MODEL_NAME_OR_PATH, cache_dir=None)
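# For example, a minimal call (assuming 'biggan-deep-256' is one of the released
# shortcut names; see the accepted values of PRE_TRAINED_MODEL_NAME_OR_PATH below):
#
#   model = BigGAN.from_pretrained('biggan-deep-256')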
@@ -105,9 +108,9 @@ where
### Configuration

`BigGANConfig` is a class to store and load BigGAN configurations. It's defined in [`config.py`](./pytorch_pretrained_biggan/config.py).

Here are some details on the attributes:

- `output_dim`: output resolution of the GAN (128, 256 or 512) for the pre-trained models,
- `z_dim`: size of the noise vector (128 for the pre-trained models).
@@ -121,25 +124,29 @@ Here are the details of the attributes:
### Model

`BigGAN` is a PyTorch model (`torch.nn.Module`) of BigGAN defined in [`model.py`](./pytorch_pretrained_biggan/model.py). This model comprises the class embeddings (a linear layer) and the generator, built as a series of convolutions and conditional batch norms. The discriminator is currently not implemented since pre-trained weights have not been released for it.

The inputs and output are **identical to the TensorFlow model inputs and outputs**.

We detail them here.

`BigGAN` takes as *inputs*:

- `z`: a torch.FloatTensor of shape [batch_size, config.z_dim] with noise sampled from a truncated normal distribution,
- `class_label`: a torch.FloatTensor of shape [batch_size, 1000] with a one-hot encoding of the ImageNet class to generate (typically built with the `one_hot_from_int` or `one_hot_from_name` utilities described below), and
- `truncation`: a float between 0 (excluded) and 1, giving the truncation of the truncated normal distribution used to create the noise vector. This truncation value also selects the set of pre-computed batch norm statistics (means and variances) to use.

`BigGAN` *outputs* an array of shape [batch_size, 3, resolution, resolution] where resolution is 128, 256 or 512 depending on the model.

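For a concrete picture of these shapes, here is a minimal sketch of a forward pass. It assumes the `'biggan-deep-256'` shortcut name and uses the utilities described in the next section:

```python
import torch
from pytorch_pretrained_biggan import BigGAN, one_hot_from_int, truncated_noise_sample

model = BigGAN.from_pretrained('biggan-deep-256')

truncation = 0.4
# Noise from a truncated normal, as a numpy array of shape (2, 128)
noise = truncated_noise_sample(batch_size=2, truncation=truncation)
# One-hot class encodings, as a numpy array of shape (2, 1000); 207 and 8 are ImageNet class indices
class_vector = one_hot_from_int([207, 8], batch_size=2)

noise = torch.from_numpy(noise)
class_vector = torch.from_numpy(class_vector)

with torch.no_grad():
    output = model(noise, class_vector, truncation)

print(output.shape)  # torch.Size([2, 3, 256, 256])
```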
### Utilities: Images, Noise, ImageNet classes

We provide a few utility methods to use the model. They are defined in [`utils.py`](./pytorch_pretrained_biggan/utils.py).

Here are some details on these methods:

- `truncated_noise_sample(batch_size=1, dim_z=128, truncation=1., seed=None)`:

    Create a truncated noise vector.
    - Params:
        - batch_size: batch size.
        - dim_z: dimension of z
@@ -148,25 +155,33 @@ Here are some details on these methods:
    - Output:
        array of shape (batch_size, dim_z)

- `convert_to_images(obj)`:

    Convert an output tensor from BigGAN into a list of images.
    - Params:
        - obj: tensor or numpy array of shape (batch_size, channels, height, width)
    - Output:
        - list of Pillow Images of size (height, width)

- `save_as_images(obj, file_name='output')`:

    Convert and save an output tensor from BigGAN as a list of saved images.
    - Params:
        - obj: tensor or numpy array of shape (batch_size, channels, height, width)
        - file_name: path and beginning of the filename to save to.
          Images will be saved as `file_name_{image_number}.png`

- `display_in_terminal(obj)`:

    Convert and display an output tensor from BigGAN in the terminal. This function uses `libsixel` and will only work in a libsixel-compatible terminal. Please refer to https://github.com/saitoha/libsixel for more details.
    - Params:
        - obj: tensor or numpy array of shape (batch_size, channels, height, width)

- `one_hot_from_int(int_or_list, batch_size=1)`:

    Create a one-hot vector from a class index or a list of class indices.
    - Params:
        - int_or_list: int, or list of ints, giving the ImageNet class indices (between 0 and 999)
        - batch_size: batch size.
@@ -175,17 +190,20 @@ Here are some details on these methods:
    - Output:
        - array of shape (batch_size, 1000)

- `one_hot_from_name(class_name, batch_size=1)`:

    Create a one-hot vector from the name of an ImageNet class ('tennis ball', 'daisy', ...). We use NLTK's WordNet search to try to find the relevant ImageNet synset and take the first one. If we can't find it directly, we look at the hyponyms and hypernyms of the class name.
    - Params:
        - class_name: string containing the name of an ImageNet object.
    - Output:
        - array of shape (batch_size, 1000)
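For example, a minimal usage sketch of the two one-hot helpers above (both return numpy arrays of shape (batch_size, 1000)):

```python
from pytorch_pretrained_biggan import one_hot_from_int, one_hot_from_name

# From explicit ImageNet class indices (between 0 and 999)
class_vector = one_hot_from_int([1, 2, 3], batch_size=3)
print(class_vector.shape)  # (3, 1000)

# From a class name, resolved through NLTK's WordNet search
class_vector = one_hot_from_name('tennis ball', batch_size=1)
print(class_vector.shape)  # (1, 1000)
```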

## Download and conversion scripts

Scripts to download and convert the TensorFlow models from TensorFlow Hub are provided in [./scripts](./scripts/).

The scripts can be used directly as:

```bash
./scripts/download_tf_hub_models.sh
./scripts/convert_tf_hub_models.sh
```
