Skip to content

add Random.fork(rng::Xoshiro) to split rng into a new instance #58193

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

rfourquet
Copy link
Member

@rfourquet rfourquet commented Apr 22, 2025

When new tasks are spawned, the task-local RNG of the current task is "split", or "forked", into a new RNG for the new task. This is achieved in a fast and sound way, involving a secondary RNG "hidden" in the 5th field of the TaskLocalRNG struct.
Xoshiro was made to mirror TaskLocalRNG and also has a 5th field, mostly unused. It was useful for example when you want to save the state of TaskLocalRNG() via copy(TaskLocalRNG()), which returns a Xoshiro.

This PR re-implements in julia this forking mechanism from jl_rng_split() in "src/task.c", to allow forking Xoshiro RNGs independently of task spawning. This is in particular useful for parallel (reproducible) computations, where tasks can be spawned with a forked (explicit) RNG from an initial Xoshiro object.

There are alternatives, like "jumping", or for example seeding new RNGs objects from a master seed and a "worker id", like in Xoshiro([master_seed, worker_id]), but these techniques require some coordination. For example, it can be unsafe to locally create a new RNG via jumping, because you might have collision with another RNG created from the parent task also via jumping with the same number of steps.
In contrast, "forking" allows to make local decisions about the shape of the computation without risks of collisions.
And it's simpler: you just write forked_rng = Random.fork(src_rng).

Alternative names could be

  • spawn, which is used in Numpy, but that I find less descriptive than fork; or
  • split, perhaps what is mostly found in the literature, and also appears in the name jl_rng_split; one problem being that Base.split means something else.

"Fork" seems appropriate as it's used in the name Random.forkRand for array-filling with SIMD, and in the long comment above jl_rng_split.

@rfourquet rfourquet added randomness Random number generation and the Random stdlib feature Indicates new feature / enhancement requests labels Apr 22, 2025
@rfourquet
Copy link
Member Author

Small changes:

  • With Random: allow seeding from an RNG #51533 allowing seeding from an rng, fork!(dst, src) becomes redundant with seed!(dst, src), so implement the latter via the same forking algorithm.
  • add an array version, which is useful for example with Threads.@threads
  • remove fork(TaskLocalRNG()): it initially returned a Xoshiro(), but this was complicating the docstring; returning TaskLocalRNG() might be better if it's appropriate for all expected uses of fork. But punt on that for now.

@fingolfin
Copy link
Member

has merge conflicts now

rfourquet added 3 commits May 8, 2025 09:26
When new tasks are spawned, the task-local RNG of the current task is
"split", or "forked", into a new RNG for the new task. This is achieved
in a fast and sound way, involving a secondary RNG "hidden" in the 5th
field of the `TaskLocalRNG` struct.
`Xoshiro` was made to mirror `TaskLocalRNG` and also has a 5th field,
mostly unused. It was useful for example when you want to save the state
of `TaskLocalRNG()` via `copy(TaskLocalRNG())`, which returns a `Xoshiro`.

This PR re-implements in julia this forking mechanism from `jl_rng_split()`
in "src/task.c", to allow forking `Xoshiro` RNGs independently of task spawning.
This is in particular useful for parallel (reproducible) computations, where tasks
can be spawned with a forked (explicit) RNG from an initial `Xoshiro` object.

There are alternatives, like "jumping", or for example seeding new RNGs objects
from a master seed and a "worker id", like in `Xoshiro((master_seed, id))`, but
these techniques require some coordination. For example, it can be unsafe to
locally create a new RNG via jumping, because you might have collision with
another RNG created from the parent task also via jumping with the same number
of steps.
In contrast, "forking" allows to make local decisions about the shape of the
computation without risks of collisions.

Alternative names could be
* `spawn`, which is used in Numpy, but that I find less descriptive than `fork`; or
* `split`, perhaps what is mostly found in the literature, and also appears in the name
  `jl_rng_split`; one problem being that `Base.split` means something else.

"Fork" seems appropriate as it's used in the name `Random.forkRand` for array-filling
with SIMD, and in the long comment above `jl_rng_split`.
Now that seed! accepts an rng as 2nd arg, fork! is replaced by seed!.
Array version is useful for `@threads` (cf. docstring).
`fork(TaskLocalRNG())` is ambiguous: return a Xoshiro or TaskLocalRNG() ?
Leave that question for later.
Also, returning Xoshiro was seriously
complicating the docstring.
rfourquet added a commit that referenced this pull request May 8, 2025
We have long had methods for RNG "jumps ahead", i.e. advancing the state by
a given number of "steps", but no good API for that.

The only public API is `Future.randjump(r::MersenneTwister, steps::Integer)`,
and there are also functions for `Xoshiro` which are not public
(`Random.jump_128` and friends).

The following generic API is implemented here:
* `Random.jump(rng)` to jump by a reasonable default number of steps
* `Random.jump(rng; by::Real)` to jump by `by` steps
* `Random.jump!(rng; [by])` to equivalently jump in-place
* `Random.jump(rng, dims...)` to create an array of jumped RNGs

In old julia versions, there also existed a method of `randjump` returning an
array, but the 1st element of this array was the passed argument;
the version here does not do this aliasing.

There are two kinds of integers one would wish to pass: dimensions for the array
version, and the number of steps.
Using jumps is relatively "niche", but needing to fidle with the number of steps
is even more niche. It's expected that in the vast majority of cases,
a good default is enough.

Some APIs in other languages have `jump` (e.g. 2^128 steps) and `long_jump`
(e.g. 2^192 steps), or `leap` in java, for more complicated cases;
for example each process gets a jumped RNG via
`long_jump`, and within each process, each thread gets a jumped RNG via `jump`.
But this is not very scalable if more kind of jumps are needed: should
`huge_jump` be introduced? For these rare cases where the default number of
steps is not sufficient, it seems better to let the programmer explicitly
specify the number of steps via an integer.

There is even a third kind of integers one might want to pass: in
`Random.jump_128(x::Xoshiro, i::Integer)`, `i` represents the number of times
a jump of size `2^128` is applied; this is because `Xoshiro` doesn't support
arbitrary number of steps; this is not supported in the proposed API, because
1) it's trivial for the user to implement herself, and 2) in probably most use
cases, using the array version will be a valid alternative, and more efficient
because previous computations are not wasted
(like in `[Random.jump_128(x, i) for i=1:num_tasks]` vs
`Random.jump(x, num_tasks)`).

Another argument in favor of this API is that it mirrors the
proposed `Random.fork(rng, dims...)` function from #58193.
rfourquet added a commit that referenced this pull request May 8, 2025
We have long had methods for RNG "jumps ahead", i.e. advancing the state by
a given number of "steps", but no good API for that.

The only public API is `Future.randjump(r::MersenneTwister, steps::Integer)`,
and there are also functions for `Xoshiro` which are not public
(`Random.jump_128` and friends).

The following generic API is implemented here:
* `Random.jump(rng)` to jump by a reasonable default number of steps
* `Random.jump(rng; by::Real)` to jump by `by` steps
* `Random.jump!(rng; [by])` to equivalently jump in-place
* `Random.jump(rng, dims...; [by])` to create an array of jumped RNGs

In old julia versions, there also existed a method of `randjump` returning an
array, but the 1st element of this array was the passed argument;
the version here does not do this aliasing.

There are two kinds of integers one would wish to pass: dimensions for the array
version, and the number of steps.
Using jumps is relatively "niche", but needing to fidle with the number of steps
is even more niche. It's expected that in the vast majority of cases,
a good default is enough.

Some APIs in other languages have `jump` (e.g. 2^128 steps) and `long_jump`
(e.g. 2^192 steps), or `leap` in java, for more complicated cases;
for example each process gets a jumped RNG via
`long_jump`, and within each process, each thread gets a jumped RNG via `jump`.
But this is not very scalable if more kind of jumps are needed: should
`huge_jump` be introduced? For these rare cases where the default number of
steps is not sufficient, it seems better to let the programmer explicitly
specify the number of steps via an integer.

There is even a third kind of integers one might want to pass: in
`Random.jump_128(x::Xoshiro, i::Integer)`, `i` represents the number of times
a jump of size `2^128` is applied; this is because `Xoshiro` doesn't support
arbitrary number of steps; this is not supported in the proposed API, because
1) it's trivial for the user to implement herself, and 2) in probably most use
cases, using the array version will be a valid alternative, and more efficient
because previous computations are not wasted
(like in `[Random.jump_128(x, i) for i=1:num_tasks]` vs
`Random.jump(x, num_tasks)`).

Another argument in favor of this API is that it mirrors the
proposed `Random.fork(rng, dims...)` function from #58193.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Indicates new feature / enhancement requests randomness Random number generation and the Random stdlib
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants