tensorflow
diff --git a/README.md b/README.md
Lines changed: 15 additions & 1 deletion
diff --git a/docs/_book.yaml b/docs/_book.yaml
Lines changed: 4 additions & 0 deletions
diff --git a/docs/_project.yaml b/docs/_project.yaml
Lines changed: 1 addition & 1 deletion
diff --git a/docs/add_dataset.md b/docs/add_dataset.md
Lines changed: 70 additions & 21 deletions
diff --git a/docs/api_docs/python/_redirects.yaml b/docs/api_docs/python/_redirects.yaml
Lines changed: 2 additions & 0 deletions
diff --git a/docs/api_docs/python/_toc.yaml b/docs/api_docs/python/_toc.yaml
Lines changed: 12 additions & 4 deletions
@@ -24,13 +24,16 @@ TensorFlow Datasets provides many public datasets as `tf.data.Datasets`.
 ```sh
 pip install tensorflow-datasets
 
-# Requires TF 1.12+ to be installed.
+# Requires TF 1.13+ to be installed.
 # Some datasets require additional libraries; see setup.py extras_require
 pip install tensorflow
 # or:
 pip install tensorflow-gpu
 ```
 
+Join [our Google group](https://groups.google.com/forum/#!forum/tensorflow-datasets-public-announce)
+to receive updates on the project.
+
 ### Usage
 
 ```python
@@ -111,6 +114,17 @@ print(info)
   )
 ```
 
+You can also get details about the classes (number of classes and their names).
+
+```python
+info = tfds.builder('cats_vs_dogs').info
+
+info.features['label'].num_classes  # 2
+info.features['label'].names  # ['cat', 'dog']
+info.features['label'].int2str(1)  # "dog"
+info.features['label'].str2int('cat')  # 0
+```
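To illustrate what the label helpers in the snippet above do, here is a minimal pure-Python sketch of a two-way mapping between class names and integer ids. `LabelMapping` is a hypothetical stand-in written for this note, not the TFDS feature class; it only assumes that ids follow the order of `names`, as the values in the snippet suggest.

```python
class LabelMapping:
  """Hypothetical stand-in for a label feature (not the TFDS class)."""

  def __init__(self, names):
    # Assumption: ids are assigned in the order the names are listed.
    self.names = list(names)
    self._name_to_id = {name: i for i, name in enumerate(self.names)}

  @property
  def num_classes(self):
    return len(self.names)

  def int2str(self, index):
    # Integer id -> class name.
    return self.names[index]

  def str2int(self, name):
    # Class name -> integer id.
    return self._name_to_id[name]


label = LabelMapping(['cat', 'dog'])
```

With this sketch, `label.int2str(label.str2int('dog'))` round-trips back to `'dog'`, which is the property that makes the two helpers convenient for decoding model predictions.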
+
 ### NumPy Usage with `tfds.as_numpy`
 
 As a convenience for users that want simple NumPy arrays in their programs, you
 
@@ -23,6 +23,10 @@ upper_tabs:
         path: /datasets/splits
       - title: Add a dataset
         path: /datasets/add_dataset
+      - title: Add huge datasets
+        path: /datasets/beam_datasets
+      - title: Store your dataset on GCS
+        path: /datasets/gcs
     - name: API
       skip_translation: true
       contents:
 
@@ -1,5 +1,5 @@
 name: TensorFlow Datasets
-breadcrumb_name: Datasets v1.0.1
+breadcrumb_name: Datasets v1.0.2
 home_url: /datasets/
 parent_project_metadata_path: /_project.yaml
 description: >
 
@@ -5,21 +5,34 @@ Follow this guide to add a dataset to TFDS.
 See our [list of datasets](datasets.md) to see if the dataset you want isn't
 already added.
 
-* [Overview](#overview)
-* [Writing `my_dataset.py`](#writing-my-datasetpy)
-* [Specifying `DatasetInfo`](#specifying-datasetinfo)
-  * [`FeatureConnector`s](#featureconnectors)
-* [Downloading and extracting source data](#downloading-and-extracting-source-data)
-  * [Manual download and extraction](#manual-download-and-extraction)
-* [Specifying dataset splits](#specifying-dataset-splits)
-* [Writing an example generator](#writing-an-example-generator)
-  * [File access and `tf.io.gfile`](#file-access-and-tfiogfile)
-  * [Extra dependencies](#extra-dependencies)
-* [Dataset configuration](#dataset-configuration)
-* [Create your own `FeatureConnector`](#create-your-own-featureconnector)
-* [Adding the dataset to `tensorflow/datasets`](#adding-the-dataset-to-tensorflowdatasets)
-* [Large datasets and distributed generation](#large-datasets-and-distributed-generation)
-* [Testing `MyDataset`](#testing-mydataset)
+*   [Overview](#overview)
+*   [Writing `my_dataset.py`](#writing-my-datasetpy)
+    *   [Use the default template](#use-the-default-template)
+    *   [DatasetBuilder](#datasetbuilder)
+    *   [my_dataset.py](#my-datasetpy)
+*   [Specifying `DatasetInfo`](#specifying-datasetinfo)
+    *   [`FeatureConnector`s](#featureconnectors)
+*   [Downloading and extracting source data](#downloading-and-extracting-source-data)
+    *   [Manual download and extraction](#manual-download-and-extraction)
+*   [Specifying dataset splits](#specifying-dataset-splits)
+*   [Writing an example generator](#writing-an-example-generator)
+    *   [File access and `tf.io.gfile`](#file-access-and-tfiogfile)
+    *   [Extra dependencies](#extra-dependencies)
+    *   [Corrupted data](#corrupted-data)
+    *   [Inconsistent data](#inconsistent-data)
+*   [Dataset configuration](#dataset-configuration)
+    *   [Heavy configuration with BuilderConfig](#heavy-configuration-with-builderconfig)
+    *   [Light configuration with constructor args](#light-configuration-with-constructor-args)
+*   [Create your own `FeatureConnector`](#create-your-own-featureconnector)
+*   [Adding the dataset to `tensorflow/datasets`](#adding-the-dataset-to-tensorflowdatasets)
+    *   [1. Add an import for registration](#1-add-an-import-for-registration)
+    *   [2. Run download_and_prepare locally](#2-run-download-and-prepare-locally)
+    *   [3. Double-check the citation](#3-double-check-the-citation)
+    *   [4. Add a test](#4-add-a-test)
+    *   [5. Check your code style](#5-check-your-code-style)
+    *   [6. Send for review!](#6-send-for-review)
+*   [Large datasets and distributed generation](#large-datasets-and-distributed-generation)
+*   [Testing `MyDataset`](#testing-mydataset)
 
 ## Overview
 
@@ -49,6 +62,24 @@ generate on a single machine. See the
 
 ## Writing `my_dataset.py`
 
+### Use the default template
+
+If you want to
+[contribute to our repo](https://github.com/tensorflow/datasets/blob/master/CONTRIBUTING.md)
+and add a new dataset, the following script will help you get started by
+generating the required Python files. To use it, clone the `tfds` repository
+and run the following command:
+
+```sh
+python tensorflow_datasets/scripts/create_new_dataset.py \
+  --dataset my_dataset \
+  --type image  # text, audio, translation,...
+```
+
+
+Then search for `TODO(my_dataset)` in the generated files and make the
+required modifications.
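The search step above can be sketched as a small helper. `find_todos` is a hypothetical function, not part of TFDS; it assumes the generated files are `.py` files and that the markers have the literal form `TODO(<dataset name>)`. The demo runs on a temporary directory rather than a real checkout.

```python
import pathlib
import tempfile


def find_todos(root, dataset_name):
  """Returns (path, line_number, text) for each TODO(dataset_name) marker."""
  marker = "TODO({})".format(dataset_name)
  hits = []
  for path in sorted(pathlib.Path(root).rglob("*.py")):
    with open(path, encoding="utf-8") as f:
      for line_number, line in enumerate(f, start=1):
        if marker in line:
          hits.append((str(path), line_number, line.strip()))
  return hits


# Demo on a throwaway directory standing in for the generated files.
with tempfile.TemporaryDirectory() as tmp:
  demo_file = pathlib.Path(tmp) / "my_dataset.py"
  demo_file.write_text(
      "# TODO(my_dataset): add description\nx = 1\n", encoding="utf-8")
  demo_hits = find_todos(tmp, "my_dataset")
```

In a real checkout you would pass the repository directory as `root` and print each hit to get a checklist of the places that still need editing.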
+
 ### `DatasetBuilder`
 
 Each dataset is defined as a subclass of
@@ -193,15 +224,15 @@ through [`tfds.Split.subsplit`](splits.md#subsplit).
     # Specify the splits
     return [
         tfds.core.SplitGenerator(
-            name="train",
+            name=tfds.Split.TRAIN,
             num_shards=10,
             gen_kwargs={
                 "images_dir_path": os.path.join(extracted_path, "train"),
                 "labels": os.path.join(extracted_path, "train_labels.csv"),
             },
         ),
         tfds.core.SplitGenerator(
-            name="test",
+            name=tfds.Split.TEST,
             num_shards=1,
             gen_kwargs={
                 "images_dir_path": os.path.join(extracted_path, "test"),
@@ -501,17 +532,35 @@ Most datasets in TFDS should have a unit test and your reviewer may ask you
 to add one if you haven't already. See the
 [testing section](#testing-mydataset) below.
 
-### 5. Send for review!
+### 5. Check your code style
+
+Follow the [PEP 8 Python style guide](https://www.python.org/dev/peps/pep-0008),
+except that TensorFlow uses 2 spaces for indentation instead of 4. Please also
+conform to the
+[Google Python Style Guide](https://github.com/google/styleguide/blob/gh-pages/pyguide.md).
+
+Most importantly, use
+[`oss_scripts/lint.sh`](https://github.com/tensorflow/datasets/blob/master/oss_scripts/lint.sh)
+to ensure your code is properly formatted. For example, to lint the `image`
+directory:
+
+```sh
+./oss_scripts/lint.sh tensorflow_datasets/image
+```
+
+See the
+[TensorFlow code style guide](https://www.tensorflow.org/community/contribute/code_style)
+for more information.
+
+### 6. Send for review!
 
 Send the pull request for review.
 
 
 ## Large datasets and distributed generation
 
 Some datasets are so large as to require multiple machines to download and
-generate. We intend to soon support this use case using Apache Beam. Follow
-[our tracking issue](https://github.com/tensorflow/datasets/issues/10)
-to be updated.
+generate. We support this use case using Apache Beam. Please read the
+[Beam Dataset Guide](beam_datasets.md) to get started.
 
 ## Testing MyDataset
 
 
@@ -3,6 +3,8 @@ redirects:
   to: /datasets/api_docs/python/tfds/download/GenerateMode
 - from: /datasets/api_docs/python/tfds/testing/FeatureExpectationsTestCase/failureException
   to: /datasets/api_docs/python/tfds/testing/DatasetBuilderTestCase/failureException
+- from: /datasets/api_docs/python/tfds/testing/SubTestCase/failureException
+  to: /datasets/api_docs/python/tfds/testing/DatasetBuilderTestCase/failureException
 - from: /datasets/api_docs/python/tfds/testing/TestCase/failureException
   to: /datasets/api_docs/python/tfds/testing/DatasetBuilderTestCase/failureException
 - from: /datasets/api_docs/python/tfds/features/text
 
@@ -8,6 +8,10 @@ toc:
       path: /datasets/api_docs/python/tfds/as_numpy
     - title: builder
       path: /datasets/api_docs/python/tfds/builder
+    - title: disable_progress_bar
+      path: /datasets/api_docs/python/tfds/disable_progress_bar
+    - title: is_dataset_on_gcs
+      path: /datasets/api_docs/python/tfds/is_dataset_on_gcs
     - title: list_builders
       path: /datasets/api_docs/python/tfds/list_builders
     - title: load
@@ -20,6 +24,8 @@ toc:
     section:
     - title: Overview
       path: /datasets/api_docs/python/tfds/core
+    - title: BeamBasedBuilder
+      path: /datasets/api_docs/python/tfds/core/BeamBasedBuilder
     - title: BuilderConfig
       path: /datasets/api_docs/python/tfds/core/BuilderConfig
     - title: DatasetBuilder
@@ -32,6 +38,10 @@ toc:
       path: /datasets/api_docs/python/tfds/core/get_tfds_path
     - title: lazy_imports
       path: /datasets/api_docs/python/tfds/core/lazy_imports
+    - title: Metadata
+      path: /datasets/api_docs/python/tfds/core/Metadata
+    - title: MetadataDict
+      path: /datasets/api_docs/python/tfds/core/MetadataDict
     - title: NamedSplit
       path: /datasets/api_docs/python/tfds/core/NamedSplit
     - title: SplitBase
@@ -82,8 +92,6 @@ toc:
       path: /datasets/api_docs/python/tfds/features/Image
     - title: Sequence
       path: /datasets/api_docs/python/tfds/features/Sequence
-    - title: SequenceDict
-      path: /datasets/api_docs/python/tfds/features/SequenceDict
     - title: Tensor
       path: /datasets/api_docs/python/tfds/features/Tensor
     - title: TensorInfo
@@ -112,8 +120,6 @@ toc:
     section:
     - title: Overview
       path: /datasets/api_docs/python/tfds/file_adapter
-    - title: CSVAdapter
-      path: /datasets/api_docs/python/tfds/file_adapter/CSVAdapter
     - title: FileFormatAdapter
       path: /datasets/api_docs/python/tfds/file_adapter/FileFormatAdapter
     - title: TFRecordExampleAdapter
@@ -142,6 +148,8 @@ toc:
       path: /datasets/api_docs/python/tfds/testing/rm_tmp_dir
     - title: run_in_graph_and_eager_modes
       path: /datasets/api_docs/python/tfds/testing/run_in_graph_and_eager_modes
+    - title: SubTestCase
+      path: /datasets/api_docs/python/tfds/testing/SubTestCase
     - title: TestCase
       path: /datasets/api_docs/python/tfds/testing/TestCase
     - title: test_main