
Commit a3d66da

[docs/data] Fix shuffle section wording (#51289)
## Why are these changes needed?

## Related issue number

## Checks

- [ ] I've signed off every commit (by using the `-s` flag, i.e., `git commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests; see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - [ ] Unit tests
  - [ ] Release tests
  - [ ] This PR is not tested :(

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
1 parent 3d30db2 commit a3d66da

File tree

1 file changed (+3, -9 lines)


doc/source/data/shuffling-data.rst

Lines changed: 3 additions & 9 deletions
```diff
@@ -91,8 +91,8 @@ To perform block order shuffling, use :meth:`randomize_block_order <ray.data.Dat
     # Randomize the block order of this dataset.
     ds = ds.randomize_block_order()
 
-Shuffle all rows
-~~~~~~~~~~~~~~~~
+Shuffle all rows (Global shuffle)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 To randomly shuffle all rows globally, call :meth:`~ray.data.Dataset.random_shuffle`.
 This is the slowest option for shuffle, and requires transferring data across
@@ -128,13 +128,7 @@ to data transfer costs. This cost can be prohibitive when using very large datas
 
 The best route for determining the best tradeoff between preprocessing time and cost and
 per-epoch shuffle quality is to measure the precision gain per training step for your
-particular model under different shuffling policies:
-
-* no shuffling,
-* local (per-shard) limited-memory shuffle buffer,
-* local (per-shard) shuffling,
-* windowed (pseudo-global) shuffling, and
-* fully global shuffling.
+particular model under different shuffling policies such as no shuffling, local shuffling, or global shuffling.
 
 As long as your data loading and shuffling throughput is higher than your training throughput, your GPU should
 be saturated. If you have shuffle-sensitive models, push the
```
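For context on the shuffling policies the reworded paragraph compares, here is a minimal plain-Python sketch (not Ray code; the `local_shuffle` helper is hypothetical, for illustration only) contrasting a full global shuffle with a limited-memory "local" shuffle buffer, where each output row is drawn from a small in-memory window:

```python
import random

def local_shuffle(rows, buffer_size, seed=None):
    # Limited-memory ("local") shuffle: each output row is drawn at
    # random from a small fixed-size buffer, so rows can only move a
    # bounded distance from their original position.
    rng = random.Random(seed)
    buffer, out = [], []
    for row in rows:
        buffer.append(row)
        if len(buffer) >= buffer_size:
            out.append(buffer.pop(rng.randrange(len(buffer))))
    rng.shuffle(buffer)  # Drain whatever is left in the buffer.
    out.extend(buffer)
    return out

rows = list(range(10))

# Global shuffle: every permutation of the rows is equally likely.
global_shuffled = random.Random(0).sample(rows, k=len(rows))

# Local shuffle: cheaper, but randomization is only approximate.
local_shuffled = local_shuffle(rows, buffer_size=4, seed=0)

# Both policies preserve the multiset of rows; they differ only in
# how thoroughly the order is randomized.
assert sorted(global_shuffled) == rows
assert sorted(local_shuffled) == rows
```

This is the tradeoff the reworded paragraph points at: the local policy bounds memory and data movement at the cost of shuffle quality, while the global policy maximizes quality at the cost of transferring all data.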
