Commit a6a75e0

Merge branch 'master' into add-emnist-dataset
2 parents 976bb82 + 96fe8c3 commit a6a75e0


69 files changed: +1814 −158 lines

.github/ISSUE_TEMPLATE/dataset-request.md

Lines changed: 3 additions & 1 deletion

```diff
@@ -12,4 +12,6 @@ assignees: ''
 * License of dataset: <license type>
 * Short description of dataset and use case(s): <description>
 
-Folks who would also like to see this dataset in `tensorflow/datasets`, please +1/thumbs-up so the developers can know which requests to prioritize.
+Folks who would also like to see this dataset in `tensorflow/datasets`, please thumbs-up so the developers can know which requests to prioritize.
+
+And if you'd like to contribute the dataset (thank you!), see our [guide to adding a dataset](https://github.com/tensorflow/datasets/blob/master/docs/add_dataset.md).
```

CONTRIBUTING.md

Lines changed: 29 additions & 1 deletion

```diff
@@ -1,5 +1,33 @@
 # How to Contribute
 
+Thanks for thinking about contributing to our library !
+
+
+## Before you start
+* Please accept the [Contributor License Agreement](https://cla.developers.google.com) (see below)
+* [Ask here](https://github.com/tensorflow/datasets/issues/142) to be added to
+  the list of collaborators so that issues can be assigned to you.
+* Comment on the issue that you plan to work on so we can assign it to you and
+  there isn't unnecessary duplication of work.
+* When you plan to work on something larger (for example, adding new
+  `FeatureConnectors`), please respond on the issue (or create one if there
+  isn't one) to explain your plan and give others a chance to discuss.
+* If you're fixing some smaller issue - please check the list of
+  [pending Pull Requests](https://github.com/tensorflow/datasets/pulls) to
+  avoid unnecessary duplication.
+
+
+## How you can help:
+
+You can help in multiple ways:
+
+* Adding new datasets and/or requested features (see the [issues](https://github.com/tensorflow/datasets/issues))
+* Reproducing bugs reported by others: This helps us **a lot**.
+* Doing code reviews on the Pull Requests from the community.
+* Verifying that Pull Requests from others are working correctly
+  (especially the ones that add new datasets).
+
+
 ## Datasets
 
 Adding a public dataset to `tensorflow-datasets` is a great way of making it
@@ -42,7 +70,7 @@ require:
 *Note that tests for DatasetBuilders are different and are documented in the*
 *[guide to add a dataset](https://github.com/tensorflow/datasets/tree/master/docs/add_dataset.md#testing-mydataset).*
 
-# Pull Requests
+## Pull Requests
 
 All contributions are done through Pull Requests here on GitHub.
 
```

README.md

Lines changed: 5 additions & 5 deletions

````diff
@@ -1,6 +1,6 @@
 # TensorFlow Datasets
 
-TensorFlow Datasets provides many public datasets as `tf.data.Dataset`s.
+TensorFlow Datasets provides many public datasets as `tf.data.Datasets`.
 
 [![Kokoro](https://storage.googleapis.com/tfds-kokoro-public/kokoro-build.svg)](https://storage.googleapis.com/tfds-kokoro-public/kokoro-build.html)
 [![PyPI version](https://badge.fury.io/py/tensorflow-datasets.svg)](https://badge.fury.io/py/tensorflow-datasets)
@@ -77,7 +77,7 @@ mnist_builder = tfds.builder("mnist")
 mnist_builder.download_and_prepare()
 
 # Construct a tf.data.Dataset
-dataset = mnist_builder.as_dataset(split=tfds.Split.TRAIN)
+ds = mnist_builder.as_dataset(split=tfds.Split.TRAIN)
 
 # Get the `DatasetInfo` object, which contains useful information about the
 # dataset and its features
@@ -132,9 +132,9 @@ You can also use `tfds.as_numpy` in conjunction with `batch_size=-1` to
 get the full dataset in NumPy arrays from the returned `tf.Tensor` object:
 
 ```python
-train_data = tfds.load("mnist", split=tfds.Split.TRAIN, batch_size=-1)
-numpy_data = tfds.as_numpy(train_data)
-numpy_images, numpy_labels = numpy_dataset["image"], numpy_dataset["label"]
+train_ds = tfds.load("mnist", split=tfds.Split.TRAIN, batch_size=-1)
+numpy_ds = tfds.as_numpy(train_ds)
+numpy_images, numpy_labels = numpy_ds["image"], numpy_ds["label"]
 ```
 
 Note that the library still requires `tensorflow` as an internal dependency.
````
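As context for the renames in the README hunk above: loading with `batch_size=-1` and passing the result through `tfds.as_numpy` yields a dict of NumPy arrays, one per feature, with the leading axis spanning the full split. A minimal sketch with plain-NumPy stand-ins (no real `tfds` download; the MNIST image shape `(28, 28, 1)` and 60,000-example train split are assumptions about the dataset, not taken from this commit):

```python
import numpy as np

# Stand-in for the dict that `tfds.as_numpy(train_ds)` returns when the
# dataset was loaded with `batch_size=-1`: one array per feature.
numpy_ds = {
    "image": np.zeros((60000, 28, 28, 1), dtype=np.uint8),  # fake MNIST images
    "label": np.zeros((60000,), dtype=np.int64),            # fake MNIST labels
}

# Same unpacking as in the README snippet.
numpy_images, numpy_labels = numpy_ds["image"], numpy_ds["label"]
print(numpy_images.shape)  # (60000, 28, 28, 1)
print(numpy_labels.shape)  # (60000,)
```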

docs/README.md

Lines changed: 1 addition & 0 deletions

```diff
@@ -5,3 +5,4 @@
 * [API Documentation](https://www.tensorflow.org/datasets/api_docs/python/tfds)
 * [Splits](splits.md)
 * [Adding a new dataset](add_dataset.md)
+* [Using Google Cloud Storage to cache preprocessed data](gcs.md)
```

docs/add_dataset.md

Lines changed: 3 additions & 3 deletions

```diff
@@ -496,12 +496,12 @@ to be updated.
 dataset. It uses "fake examples" as test data that mimic the structure of the
 source dataset.
 
-The test data should be put in in
+The test data should be put in
 [`testing/test_data/fake_examples/`](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/testing/test_data/fake_examples/)
 under the `my_dataset` directory and should mimic the source dataset artifacts
 as downloaded and extracted. It can be created manually or automatically with a
-script ([example
-script](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/testing/cifar.py)).
+script
+([example script](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/testing/cifar.py)).
 
 Make sure to use different data in your test data splits, as the test will
 fail if your dataset splits overlap.
```
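The fake-example layout described in that hunk can be sketched as a shell session. The split file names `train.bin`/`test.bin` and the `my_dataset` directory are hypothetical; a real layout must mimic the source dataset's downloaded and extracted artifacts:

```shell
# Create the fake-example directory for a hypothetical `my_dataset`.
mkdir -p testing/test_data/fake_examples/my_dataset

# Use *different* fake bytes per split, since the test fails when splits overlap.
printf 'fake train bytes' > testing/test_data/fake_examples/my_dataset/train.bin
printf 'fake test bytes'  > testing/test_data/fake_examples/my_dataset/test.bin

ls testing/test_data/fake_examples/my_dataset
```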
