Finetune algorithms log only train regret #76
Open
Labels
bug, wontfix
Description
All algorithms with offline-to-online finetuning log the training regret (the regret accumulated over the online interactions used for training) under both `train/regret` and `eval/regret`. As a result, we report only train regret, which differs from the Cal-QL work, where the authors report eval regret. Reporting eval regret is strange, because in practice the quantity we really want to minimize is the train regret, so this bug is not critical but should be kept in mind. I will fix it, but without rerunning all of the algorithms due to compute limitations (maybe we will rerun them later).
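For illustration, here is a minimal sketch of what separate logging could look like. It is not the repository's actual code. It assumes regret is the mean of `1 - success` over episodes, and the names `regret`, `build_logs`, `train_successes`, and `eval_successes` are hypothetical.

```python
import numpy as np


def regret(successes):
    """Mean regret over episodes, assuming regret = 1 - success (values in [0, 1])."""
    successes = np.asarray(successes, dtype=float)
    if successes.size == 0:
        return float("nan")  # no episodes collected yet
    return float(np.mean(1.0 - successes))


def build_logs(train_successes, eval_successes):
    """Compute each key from its own episode list so the two metrics cannot alias."""
    return {
        # Episodes collected online and used for training.
        "train/regret": regret(train_successes),
        # Episodes from separate evaluation rollouts (hypothetical list).
        "eval/regret": regret(eval_successes),
    }


logs = build_logs(train_successes=[0, 0, 1, 1], eval_successes=[1, 1, 1, 0])
print(logs)
```

The bug described above corresponds to passing the training episode list to both keys. Keeping two separate lists makes the distinction explicit.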