Description
Here is code where I would have expected the aggregate results at the end for two identical benchmarks to be identical, but they are not. Since I am only an intermediate-level R coder, perhaps there is something wrong with my code. In any event, I am passing this along for your consideration as a possible issue in mlr3automl. As you can imagine, this code takes a while to execute, ~10 minutes on my iMac Pro.
#############################################################
# Cross-validating the regression learners
#############################################################
library("doFuture")
library("doRNG")
library("future")
library("future.apply")
library("mlr3verse")
library("mlr3automl")
library("mlr3hyperband")
# set logger thresholds
lgr::get_logger("mlr3")$set_threshold("error")
lgr::get_logger("bbotk")$set_threshold("error")
# specify regression learners
learners = list(
  lrn("regr.featureless", id = "fl"),
  lrn("regr.lm", id = "lm"),
  lrn("regr.cv_glmnet", id = "glm"),
  lrn("regr.ranger", id = "rf"),
  lrn("regr.xgboost", id = "xgb"),
  lrn("regr.svm", id = "svm")
)
learner_ids = sapply(learners, function(x) x$id)
# define regression task
task = tsk("boston_housing")
# select small subset of features
task$select(c("age", "crim", "lat", "lon"))
# specify resampling
resampling = rsmp("cv")
# specify measure
measure = msr("regr.mse")
# autotuners for models with hyperparameters
learners[[3]] = create_autotuner(
  learner = lrn("regr.cv_glmnet"),
  tuner = tnr("hyperband")
)
learners[[4]] = create_autotuner(
  learner = lrn("regr.ranger"),
  tuner = tnr("hyperband"),
  num_effective_vars = length(task$feature_names)
)
learners[[5]] = create_autotuner(
  learner = lrn("regr.xgboost"),
  tuner = tnr("hyperband")
)
learners[[6]] = create_autotuner(
  learner = lrn("regr.svm"),
  tuner = tnr("hyperband")
)
# create benchmark grid
design = benchmark_grid(
  tasks = task,
  learners = learners,
  resamplings = resampling
)
# start parallel processing
registerDoFuture()
plan(multisession, workers = availableCores() - 1)
registerDoRNG(123456)
# execute benchmark
bmr1 = mlr3::benchmark(design)
# terminate parallel processing
plan(sequential)
# start parallel processing
registerDoFuture()
plan(multisession, workers = availableCores() - 1)
registerDoRNG(123456)
# execute benchmark
bmr2 = mlr3::benchmark(design)
# terminate parallel processing
plan(sequential)
# test for reproducibility
bmr1$aggregate()$regr.mse == bmr2$aggregate()$regr.mse
Here are a couple of interesting clues. If I run this code several times, the end result is the same each time (i.e., the same mix of TRUE and FALSE results for the different stochastic learners). But if I run the same code in R and then in RStudio, I get a different mix of TRUE and FALSE results on each platform. Finally, if I substitute a different dataset, I again get a different mix of TRUE and FALSE results at the end.
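In case it helps, here is a minimal sketch of an alternative seeding approach I could try. It assumes that mlr3 seeds its parallel workers from the session RNG state through the future framework, so that set.seed() immediately before each benchmark() call would be the relevant control rather than registerDoRNG(); that is my understanding of how mlr3 handles RNG, not something I have confirmed. The bmr_a/bmr_b names are purely illustrative, and the code reuses the design object defined above.
# sketch: seed the session RNG immediately before each benchmark run
library("mlr3")
library("future")
plan(multisession, workers = availableCores() - 1)
set.seed(123456)                  # seed before the first run
bmr_a = mlr3::benchmark(design)   # reuses the design defined above
set.seed(123456)                  # reset to the same state before the second run
bmr_b = mlr3::benchmark(design)
plan(sequential)
# compare exactly and within floating-point tolerance
bmr_a$aggregate()$regr.mse == bmr_b$aggregate()$regr.mse
all.equal(bmr_a$aggregate()$regr.mse, bmr_b$aggregate()$regr.mse)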