Commit 6b93a23

Merge pull request #746 from 8bitmp3:patch-1
PiperOrigin-RevId: 257064734
2 parents 234af41 + 903bfd4 commit 6b93a23

1 file changed: +11 -14 lines changed

docs/splits.md

Lines changed: 11 additions & 14 deletions
@@ -1,13 +1,10 @@
 # Splits
 
 All `DatasetBuilder`s expose various data subsets defined as
-[`tfds.Split`s](api_docs/python/tfds/Split.md)
-(typically `tfds.Split.TRAIN` and `tfds.Split.TEST`). A given dataset's
-splits are defined in
+[`tfds.Split`](api_docs/python/tfds/Split.md)s (typically `tfds.Split.TRAIN` and
+`tfds.Split.TEST`). A given dataset's splits are defined in
 [`tfds.DatasetBuilder.info.splits`](api_docs/python/tfds/core/DatasetBuilder.md#info)
-and are accessible through
-[`tfds.load`](api_docs/python/tfds/load.md)
-and
+and are accessible through [`tfds.load`](api_docs/python/tfds/load.md) and
 [`tfds.DatasetBuilder.as_dataset`](api_docs/python/tfds/core/DatasetBuilder.md#as_dataset),
 both of which take `split=` as a keyword argument.
 
@@ -27,7 +24,7 @@ Note that a special `tfds.Split.ALL` keyword exists to merge all splits
 together:
 
 ```py
-# Ds will iterate over test, train and validation merged together
+# `ds` will iterate over test, train and validation merged together
 ds = tfds.load("mnist", split=tfds.Split.ALL)
 ```
 
@@ -36,17 +33,17 @@ ds = tfds.load("mnist", split=tfds.Split.ALL)
 You have 3 options for how to get a thinner slice of the data than the
 base splits, all based on `tfds.Split.subsplit`.
 
-*Warning*: TFDS does not currently guarantee the order of the data on disk when
-data is generated, so if you regenerate the data, the subsplits may no longer be
-the same.
+*Warning*: TensorFlow Datasets does not currently guarantee the order of the
+data on disk when data is generated. Therefore, if you regenerate the data, the
+subsplits may no longer be the same.
 
 *Warning*: If the `total_number_examples % 100 != 0`, then remainder examples
 may not be evenly distributed among subsplits.
 
 ### Specify number of subsplits
 
 ```py
-train_half_1, train_half_2 = tfds.Split.TRAIN.subsplit(2)
+train_half_1, train_half_2 = tfds.Split.TRAIN.subsplit(k=2)
 
 dataset = tfds.load("mnist", split=train_half_1)
 ```
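
Reviewer note on the `subsplit(k=2)` hunk: the sketch below is one plausible reading of what an even k-way subsplit means, assuming subsplits are expressed as contiguous integer percent ranges (as the `tfds.percent[...]` slices elsewhere in this file suggest). The helper `k_way_percent_ranges` is hypothetical and is not part of TFDS; it only illustrates the arithmetic and does not call the library.

```python
def k_way_percent_ranges(k):
    """Hypothetical helper: cut [0, 100) into k contiguous percent ranges."""
    if not 0 < k <= 100:
        raise ValueError(f"k must be in 1..100, got {k}")
    step = 100 // k
    ranges = [(i * step, (i + 1) * step) for i in range(k)]
    # Stretch the last range so the whole split is covered.
    ranges[-1] = (ranges[-1][0], 100)
    return ranges


halves = k_way_percent_ranges(2)
thirds = k_way_percent_ranges(3)
print(halves)  # [(0, 50), (50, 100)]
print(thirds)  # [(0, 33), (33, 66), (66, 100)]
```

Under this assumption, any `k` that does not divide 100 leaves the last range slightly larger than the others, which is in the same spirit as the uneven-remainder warning in the documented text.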
@@ -64,7 +61,7 @@ dataset = tfds.load("mnist", split=middle_50_percent)
 ### Specifying weights
 
 ```py
-half, quarter1, quarter2 = tfds.Split.TRAIN.subsplit([2, 1, 1])
+half, quarter1, quarter2 = tfds.Split.TRAIN.subsplit(weighted=[2, 1, 1])
 
 dataset = tfds.load("mnist", split=half)
 ```
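
Reviewer note on the `weighted=[2, 1, 1]` hunk: the names `half`, `quarter1`, `quarter2` imply that the weights are normalized into shares of the split. A minimal sketch of that normalization follows, assuming the shares are rounded down to whole percents and laid out back to back. The function `weighted_percent_ranges` is hypothetical and is not the TFDS implementation.

```python
def weighted_percent_ranges(weights):
    """Hypothetical helper: turn relative weights into percent ranges."""
    if not weights or any(w <= 0 for w in weights):
        raise ValueError("weights must be a non-empty list of positive numbers")
    total = sum(weights)
    ranges = []
    start = 0
    for w in weights:
        # Integer share of 100 percent for this weight (rounded down).
        stop = start + 100 * w // total
        ranges.append((start, stop))
        start = stop
    # Stretch the last range so rounding never drops data at the end.
    ranges[-1] = (ranges[-1][0], 100)
    return ranges


half, quarter1, quarter2 = weighted_percent_ranges([2, 1, 1])
print(half, quarter1, quarter2)  # (0, 50) (50, 75) (75, 100)
```

With weights `[2, 1, 1]` the total is 4, so the shares come out as 50, 25 and 25 percent, matching the variable names used in the documented example.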
@@ -78,7 +75,7 @@ It's possible to compose the above operations:
 split = tfds.Split.TRAIN.subsplit(tfds.percent[:50]) + tfds.Split.TEST
 
 # Split the combined TRAIN and TEST splits into 2
-first_half, second_half = (tfds.Split.TRAIN + tfds.Split.TEST).subsplit(2)
+first_half, second_half = (tfds.Split.TRAIN + tfds.Split.TEST).subsplit(k=2)
 ```
 
 Note that a split cannot be added twice, and subsplitting can only happen once.
@@ -89,7 +86,7 @@ For example, these are invalid:
 split = tfds.Split.TRAIN.subsplit(tfds.percent[:25]) + tfds.Split.TRAIN
 
 # INVALID! Subsplit of subsplit
-split = tfds.Split.TRAIN.subsplit(tfds.percent[0:25]).subsplit(2)
+split = tfds.Split.TRAIN.subsplit(tfds.percent[0:25]).subsplit(k=2)
 
 # INVALID! Subsplit of subsplit
 split = (tfds.Split.TRAIN.subsplit(tfds.percent[:25]) +
