Ironseed is an R package that improves seeding for R’s built in random number generators. An ironseed is a finite-entropy (or fixed-entropy) hash digest that can be used to generate an unlimited sequence of seeds for initializing the state of a random number generator. It is inspired by the work of M.E. O’Neill and others [1, 2, 3].
An ironseed is a 256-bit hash digest constructed from a variable-length sequence of 32-bit inputs. Each ironseed consists of eight 32-bit sub-digests. The sub-digests are values of 32-bit multilinear hashes [4] that accumulate entropy from the input sequence. Each input is included in every sub-digest. The coefficients for the multilinear hashes are generated by a Weyl sequence.
Multilinear hashes are also used to generate an output seed sequence from an ironseed. Each 32-bit output value is generated by uniquely hashing the sub-digests. The coefficients for the output are generated by a second Weyl sequence.
To improve the observed randomness of each hash output, bits are mixed using a finalizer adapted from SplitMix64 [5]. With the additional mixing from the finalizer, the output seed sequence passes PractRand tests [6].
# Install the released version of the package from CRAN as usual:
install.packages("ironseed")
# Or the development version from GitHub:
# install.packages("pak")
pak::pak("reedacartwright/ironseed")
Ironseed can be used at the top of a script to robustly initialize R’s builtin random number generator. The resulting ironseed is returned invisibly, and a message is generated notifying the user that initialization has occurred. This message can be logged and later used to reproduce the run.
#!/usr/bin/env -S Rscript --vanilla
ironseed::ironseed("Experiment", 20251031, 1)
#> ** Ironseed : Seed ZZaisFMror3-ZhLosGqfN3R-8B6PqHJ5hhf-CFFsWhrQvBd v0.1.0.9002
runif(10)
#> [1] 0.89705422 0.04560317 0.83527653 0.45908361 0.12400794 0.60115341
#> [7] 0.12773858 0.91012865 0.53115788 0.22756973
If your script is intended to be called multiple times as part of a large study, you can also seed based on the command line arguments.
#!/usr/bin/env -S Rscript --vanilla
args <- commandArgs(trailingOnly = TRUE)
ironseed::ironseed("A Simulation Script 1", args)
#> ** Ironseed : Seed 8cqtTGB8vrS-WYjGZ3isTH2-pHzMHbEDV38-PBfFNYcUjEG v0.1.0.9002
runif(10)
#> [1] 0.84992808 0.64469355 0.89756643 0.19331885 0.23951235 0.62736989
#> [7] 0.18903081 0.11728875 0.98641615 0.04123155
Specific command line arguments can also be used. For large, nested studies, it is useful for scripts to support seeding using multiple seeds. Ironseed makes this easy to accomplish.
#!/usr/bin/env -S Rscript --vanilla
args <- commandArgs(trailingOnly = TRUE)
ironseed::ironseed("A Simulation Script 2", args[grepl("--seed=", args)])
#> ** Ironseed : Seed CP5fKZWMYsB-ERVXpk3k8xd-6j18poR5Sgc-eKWXg7VJFPd v0.1.0.9002
runif(10)
#> [1] 0.89075876 0.48203428 0.08680504 0.51708634 0.04348393 0.50693477
#> [7] 0.38465350 0.64060167 0.26286422 0.54601453
Ironseed can also automatically initialize the random number generator
using an ironseed constructed from multiple sources of entropy. This
occurs if no data is passed to ironseed()
.
#!/usr/bin/env -S Rscript --vanilla
ironseed::ironseed()
#> ** Ironseed : Seed AfaPdekUCoA-fhRoazvKnTc-EnqCD7wKNfB-2mdwgP1zuP5 v0.1.0.9002
runif(10)
#> [1] 0.81606451 0.01563447 0.02842008 0.92643311 0.93597330 0.70551466
#> [7] 0.75512336 0.25908712 0.98250676 0.03561528
# Since RNG initializing has occurred, the next call will simply
# return the ironseed used in previous seeding.
fe <- ironseed::ironseed()
fe
#> Ironseed: AfaPdekUCoA-fhRoazvKnTc-EnqCD7wKNfB-2mdwgP1zuP5
Or achieving the same thing with one call. Note that the automatically generated seed is different from the previous run.
#!/usr/bin/env -S Rscript --vanilla
fe <- ironseed::ironseed()
#> ** Ironseed : Seed iENGNd69yVH-ue59xcZc3fF-f4R9qYhESkd-pfWMYxMmFoM v0.1.0.9002
runif(10)
#> [1] 0.60404956 0.07975961 0.03384779 0.84441315 0.53671970 0.46416535
#> [7] 0.10455209 0.33560387 0.43316807 0.27369468
fe
#> Ironseed: iENGNd69yVH-ue59xcZc3fF-f4R9qYhESkd-pfWMYxMmFoM
An ironseed can also be used directly to reproduce a previous initialization. This is most useful when automatic seeding has been used, and the previously generated seed has been logged.
#!/usr/bin/env -S Rscript --vanilla
ironseed::ironseed("RW7vjwjeiHF-QG7RYPvrntR-6tGPoi65sVc-N1n5SQi5RH4")
#> ** Ironseed : Seed RW7vjwjeiHF-QG7RYPvrntR-6tGPoi65sVc-N1n5SQi5RH4 v0.1.0.9002
runif(10)
#> [1] 0.05348978 0.02073685 0.08883963 0.94422739 0.36191244 0.25720998
#> [7] 0.61729578 0.89536005 0.86046463 0.06095049
A good hash function has good avalanche properties. If we change one bit of information in the input, our goal is to change 50% of the bits in the output. To test this we, will first build a function to construct a random pair of ironseeds that differ by a single input bit.
rand_fe_pair <- function(w) {
x <- sample(0:1, w, replace=TRUE)
n <- sample(seq_along(x), 1)
y <- x
y[n] <- if(y[n] == 1) 0L else 1L
x <- packBits(x, "integer")
y <- packBits(y, "integer")
x <- ironseed::ironseed(x, set_seed = FALSE)
y <- ironseed::ironseed(y, set_seed = FALSE)
list(x = x, y = y)
}
Next we will generate 100,000 pairs using 32-bit inputs. We will use R’s built-in seeding algorithm so that the results are independent of Ironseed’s seeding algorithm. We will also measure how many hash bits were flipped by flipping one input bit.
set.seed(20251220)
z <- replicate(100000, rand_fe_pair(32), simplify = FALSE)
dat <- sapply(z, \(a) sum(intToBits(a$x) != intToBits(a$y)))
mean(dat) # expectation: 128
#> [1] 128.0011
sd(dat) # expectation: 8
#> [1] 8.008929
hist(dat, breaks = 86:170, main = NULL)
We will repeat the same analysis for 256-bit inputs.
set.seed(20251221)
z <- replicate(100000, rand_fe_pair(256), simplify = FALSE)
dat <- sapply(z, \(a) sum(intToBits(a$x) != intToBits(a$y)))
mean(dat) # expectation: 128
#> [1] 128.0405
sd(dat) # expectation: 8
#> [1] 8.019668
hist(dat, breaks = 86:170, main = NULL)
As one can see, the avalanche behavior of the input hash is excellent.