Commit 4d20c39

Merge remote-tracking branch 'upstream/master'
2 parents e903f9d + 9bc6103 commit 4d20c39

461 files changed

+28113
-1580
lines changed

README.md

Lines changed: 30 additions & 27 deletions
@@ -46,8 +46,7 @@ to receive updates on the project.
 import tensorflow_datasets as tfds
 import tensorflow as tf
 
-# tfds works in both Eager and Graph modes
-tf.compat.v1.enable_eager_execution()
+# Here we assume Eager mode is enabled (TF2), but tfds also works in Graph mode.
 
 # See available datasets
 print(tfds.list_builders())
@@ -92,32 +91,36 @@ ds = mnist_builder.as_dataset(split='train')
 # dataset and its features
 info = mnist_builder.info
 print(info)
+```
+
+This will print the dataset info content:
 
-tfds.core.DatasetInfo(
-name='mnist',
-version=1.0.0,
-description='The MNIST database of handwritten digits.',
-homepage='http://yann.lecun.com/exdb/mnist/',
-features=FeaturesDict({
-'image': Image(shape=(28, 28, 1), dtype=tf.uint8),
-'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=10)
-},
-total_num_examples=70000,
-splits={
-'test': <tfds.core.SplitInfo num_examples=10000>,
-'train': <tfds.core.SplitInfo num_examples=60000>
-},
-supervised_keys=('image', 'label'),
-citation='"""
-@article{lecun2010mnist,
-title={MNIST handwritten digit database},
-author={LeCun, Yann and Cortes, Corinna and Burges, CJ},
-journal={ATT Labs [Online]. Available: http://yann. lecun. com/exdb/mnist},
-volume={2},
-year={2010}
-}
-"""',
-)
+```
+tfds.core.DatasetInfo(
+name='mnist',
+version=1.0.0,
+description='The MNIST database of handwritten digits.',
+homepage='http://yann.lecun.com/exdb/mnist/',
+features=FeaturesDict({
+'image': Image(shape=(28, 28, 1), dtype=tf.uint8),
+'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=10)
+},
+total_num_examples=70000,
+splits={
+'test': <tfds.core.SplitInfo num_examples=10000>,
+'train': <tfds.core.SplitInfo num_examples=60000>
+},
+supervised_keys=('image', 'label'),
+citation='"""
+@article{lecun2010mnist,
+title={MNIST handwritten digit database},
+author={LeCun, Yann and Cortes, Corinna and Burges, CJ},
+journal={ATT Labs [Online]. Available: http://yann. lecun. com/exdb/mnist},
+volume={2},
+year={2010}
+}
+"""',
+)
 ```
 
 You can also get details about the classes (number of classes and their names).
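
Note on the last context line of this hunk: the class details it mentions can be read from the label feature of the info object. The snippet below is a minimal sketch, assuming an object shaped like `mnist_builder.info`, whose `features['label']` exposes `num_classes` and `names` as `tfds.features.ClassLabel` does. The helper `label_summary` and the stand-in objects are hypothetical and exist only so the snippet runs without TFDS installed.

```python
from types import SimpleNamespace


def label_summary(info, key="label"):
    """Return (num_classes, names) for a class-label feature of a DatasetInfo-like object."""
    label = info.features[key]
    return label.num_classes, list(label.names)


# Stand-in for `mnist_builder.info` (hypothetical; mirrors the fields printed in the hunk).
fake_label = SimpleNamespace(num_classes=10, names=[str(i) for i in range(10)])
fake_info = SimpleNamespace(features={"label": fake_label})

num_classes, names = label_summary(fake_info)
print(num_classes, names[:3])
```

With TFDS installed, the same helper should accept `mnist_builder.info` directly in place of `fake_info`.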

docs/_index.ipynb

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@
 "from __future__ import division\n",
 "from __future__ import print_function\n",
 "\n",
-"import tensorflow as tf\n",
+"import tensorflow.compat.v2 as tf\n",
 "import tensorflow_datasets as tfds\n",
 "\n",
 "# tfds works in both Eager and Graph modes\n",

docs/_index.yaml

Lines changed: 3 additions & 3 deletions
@@ -23,7 +23,7 @@ landing_page:
 <a href="./datasets">list of datasets</a>.
 - code_block: |
 <pre class = "prettyprint">
-import tensorflow as tf
+import tensorflow.compat.v2 as tf
 import tensorflow_datasets as tfds
 
 # tfds works in both Eager and Graph modes
@@ -48,10 +48,10 @@ landing_page:
 items:
 - heading: Introducing TensorFlow Datasets
 image_path: /resources/images/tf-logo-card-16x9.png
-path: https://github.com/tensorflow/datasets/blob/master/docs/announce_proxy.md
+path: https://blog.tensorflow.org/2019/02/introducing-tensorflow-datasets.html
 buttons:
 - label: Read on TensorFlow Blog
-path: https://github.com/tensorflow/datasets/blob/master/docs/announce_proxy.md
+path: https://blog.tensorflow.org/2019/02/introducing-tensorflow-datasets.html
 - heading: TensorFlow Datasets on GitHub
 image_path: /resources/images/github-card-16x9.png
 path: https://github.com/tensorflow/datasets

docs/add_dataset.md

Lines changed: 8 additions & 12 deletions
@@ -30,8 +30,7 @@ isn't already added.
 * [3. Double-check the citation](#3-double-check-the-citation)
 * [4. Add a test](#4-add-a-test)
 * [5. Check your code style](#5-check-your-code-style)
-* [6. Add release notes](#6-add-release-notes)
-* [7. Send for review!](#7-send-for-review)
+* [6. Send for review!](#6-send-for-review)
 * [Define the dataset outside TFDS](#define-the-dataset-outside-tfds)
 * [Large datasets and distributed generation](#large-datasets-and-distributed-generation)
 * [Testing `MyDataset`](#testing-mydataset)
@@ -543,7 +542,7 @@ except TensorFlow uses 2 spaces instead of 4. Please conform to the
 [Google Python Style Guide](https://github.com/google/styleguide/blob/gh-pages/pyguide.md),
 
 Most importantly, use
-[`tensorflow_datasets/oss_scripts/lint.sh`](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/oss_scripts/lint.sh)
+[`tensorflow_datasets/oss_scripts/lint.sh`](https://github.com/tensorflow/datasets/tree/master/oss_scripts/lint.sh)
 to ensure your code is properly formatted. For example, to lint the `image`
 directory:
 
@@ -555,13 +554,7 @@ See
 [TensorFlow code style guide](https://www.tensorflow.org/community/contribute/code_style)
 for more information.
 
-### 6. Add release notes
-
-Add the dataset to the
-[release notes](https://github.com/tensorflow/datasets/tree/master/docs/release_notes.md).
-The release note will be published for the next release.
-
-### 7. Send for review!
+### 6. Send for review!
 
 Send the pull request for review.
 
@@ -586,7 +579,7 @@ To create this checksum file the first time, you can use the
 `tensorflow_datasets.scripts.download_and_prepare` script and pass the flags
 `--register_checksums --checksums_dir=/path/to/checksums_dir`.
 
-### 2. Adjust the fake example direcory
+### 2. Adjust the fake example directory
 
 For testing, instead of using the default
 [fake example directory](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/testing/test_data/fake_examples)
@@ -595,7 +588,7 @@ you can define your own by setting the `EXAMPLE_DIR` property of
 
 ```
 class MyDatasetTest(tfds.testing.DatasetBuilderTestCase):
-  EXAMPLE_DIR = 'path/to/fakedata'`
+  EXAMPLE_DIR = 'path/to/fakedata'
 ```
 
 ## Large datasets and distributed generation
@@ -617,6 +610,9 @@ as downloaded and extracted. It can be created manually or automatically with a
 script
 ([example script](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/testing/cifar.py)).
 
+If you're using automation to generate the test data, please include that script
+in [`testing`](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/testing).
+
 Make sure to use different data in your test data splits, as the test will
 fail if your dataset splits overlap.
 
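
Note on the final context lines of this hunk: overlap between test data splits can be caught before the dataset test runs. The snippet below is a rough sketch of such a pre-check, assuming each split's fake data sits in its own directory of files. The helpers `file_digests` and `find_overlap` and the directory layout are hypothetical and not part of TFDS, and this is not how the TFDS test itself detects overlap. It only compares raw file contents by SHA-256 digest.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digests(split_dir):
    """Map SHA-256 digest -> file name for every regular file under split_dir."""
    digests = {}
    for path in sorted(Path(split_dir).rglob("*")):
        if path.is_file():
            digests[hashlib.sha256(path.read_bytes()).hexdigest()] = path.name
    return digests


def find_overlap(train_dir, test_dir):
    """Return sorted names of test files whose content also appears in the train split."""
    train = file_digests(train_dir)
    test = file_digests(test_dir)
    return sorted(test[d] for d in train.keys() & test.keys())


# Demo on throwaway directories (hypothetical layout, not the real fake_examples tree).
with tempfile.TemporaryDirectory() as root:
    train_dir = Path(root, "train")
    test_dir = Path(root, "test")
    train_dir.mkdir()
    test_dir.mkdir()
    (train_dir / "a.png").write_bytes(b"image-1")
    (test_dir / "b.png").write_bytes(b"image-2")
    (test_dir / "c.png").write_bytes(b"image-1")  # same bytes as train/a.png
    overlap = find_overlap(train_dir, test_dir)
    print(overlap)
```

An empty result means no file content is shared between the two directories. Examples that differ in bytes but decode to the same data would not be caught by this check.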

docs/api_docs/python/tfds/ReadConfig.md

Lines changed: 2 additions & 0 deletions
@@ -20,6 +20,8 @@
 <a target="_blank" href="https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/core/utils/read_config.py">View
 source</a>
 
+<!-- Equality marker -->
+
 ## Class `ReadConfig`
 
 Configures input reading pipeline.

docs/api_docs/python/tfds/Split.md

Lines changed: 2 additions & 0 deletions
@@ -18,6 +18,8 @@
 <a target="_blank" href="https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/core/splits.py">View
 source</a>
 
+<!-- Equality marker -->
+
 ## Class `Split`
 
 `Enum` for dataset splits.

docs/api_docs/python/tfds/as_numpy.md

Lines changed: 2 additions & 0 deletions
@@ -13,6 +13,8 @@
 <a target="_blank" href="https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/core/dataset_utils.py">View
 source</a>
 
+<!-- Equality marker -->
+
 Converts a `tf.data.Dataset` to an iterable of NumPy arrays.
 
 ``` python

docs/api_docs/python/tfds/builder.md

Lines changed: 2 additions & 0 deletions
@@ -13,6 +13,8 @@
 <a target="_blank" href="https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/core/registered.py">View
 source</a>
 
+<!-- Equality marker -->
+
 Fetches a
 <a href="../tfds/core/DatasetBuilder.md"><code>tfds.core.DatasetBuilder</code></a>
 by string name.

docs/api_docs/python/tfds/core/BeamBasedBuilder.md

Lines changed: 2 additions & 0 deletions
@@ -29,6 +29,8 @@
 <a target="_blank" href="https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/core/dataset_builder.py">View
 source</a>
 
+<!-- Equality marker -->
+
 ## Class `BeamBasedBuilder`
 
 Beam based Builder.

docs/api_docs/python/tfds/core/BeamMetadataDict.md

Lines changed: 2 additions & 0 deletions
@@ -16,6 +16,8 @@
 <a target="_blank" href="https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/core/dataset_info.py">View
 source</a>
 
+<!-- Equality marker -->
+
 ## Class `BeamMetadataDict`
 
 A <a href="../../tfds/core/Metadata.md"><code>tfds.core.Metadata</code></a>
