REFERENCE.md (3 additions & 5 deletions)
@@ -403,13 +403,11 @@ While the algorithm in `regression_forest` is very similar to that of classic ra
Overall, GRF is designed to produce the same estimates across platforms when using a consistent value for the random seed through the training option `seed`. However, there are still some cases where GRF can produce different estimates across platforms. When it comes to cross-platform predictions, the output of GRF will depend on a few factors beyond the forest seed.
-
One such factor is the compiler that was used to build GRF. Different compilers may have different default behavior around floating-point rounding, and these could lead to slightly different forest splits if the data requires numerical precision. Another factor is how the forest construction is distributed across different threads. Right now, our forest splitting algorithm can give different results depending on the number of threads that were used to build the forest.
+
One such factor is the compiler that was used to build GRF. Different compilers may have different defaults for floating-point behavior and instruction optimizations, and these could lead to slightly different forest splits if the data requires numerical precision. In addition to setting the `seed` argument, rounding all input data to at most 8 significant digits may help.
-
Therefore, in order to ensure consistent results, we provide the following recommendations.
-
- Make sure arguments `seed` and `num.threads` are the same across platforms
-
- Round data to 8 significant digits
+
Even when the compiler is the same, different CPU architectures may produce slightly different output. One such example is GRF compiled with clang and run on x86 (Intel) vs. ARM (Apple Silicon).
-
Also, please note that we have not done extensive testing on Windows platforms, although we do not expect random number generation issues there to be different from Linux/Mac. Regardless of the platform, if results are still not consistent please help us by submitting a Github issue.
+
Prior to GRF version 2.4.0, another factor was how the forest construction was distributed across different threads. In those versions, the forest splitting algorithm could give different results depending on the number of threads used to build the forest, so the `num.threads` argument had to be the same for cross-platform reproducibility. To restore this behavior in current versions of GRF, set the global R option `options(grf.legacy.seed=TRUE)`, which exactly recovers results produced with past versions of the package.
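
The recommendations in the added text can be sketched in R roughly as follows. This is only an illustration under stated assumptions: the simulated `X` and `Y`, the names `X.rounded` and `Y.rounded`, and the seed value 42 are made up for the example, and rounding is done with base R's `signif`, which is one plausible way to keep at most 8 significant digits.

```r
# Illustrative sketch only: the data, variable names, and seed value
# below are hypothetical and not taken from REFERENCE.md.
set.seed(1)
n <- 200
p <- 3
X <- matrix(rnorm(n * p), n, p)
Y <- X[, 1] + rnorm(n)

# Round all input data to at most 8 significant digits.
X.rounded <- signif(X, 8)
Y.rounded <- signif(Y, 8)

if (requireNamespace("grf", quietly = TRUE)) {
  # To reproduce results from GRF versions before 2.4.0, the option
  # described above would be set first, with `num.threads` held fixed:
  # options(grf.legacy.seed = TRUE)

  # Use the same `seed` on every platform being compared.
  forest <- grf::regression_forest(X.rounded, Y.rounded, seed = 42)
  predictions <- predict(forest)$predictions
}
```

Comparing `predictions` across machines (for example with `all.equal`) is a simple way to check whether the remaining compiler or CPU differences matter for a given data set.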