[Feature] Migrate CQL from D4RLExperienceReplay to MinariExperienceReplay + fix W&B logging and SLURM usage #3035
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3035
Note: Links to docs will display an error until the docs builds have been completed. This comment was automatically generated by Dr. CI and updates every 15 minutes.
```diff
@@ -181,13 +182,13 @@ def make_replay_buffer(
 def make_offline_replay_buffer(rb_cfg):
-    data = D4RLExperienceReplay(
+    data = MinariExperienceReplay(
```
Replaced D4RLExperienceReplay with MinariExperienceReplay, following the official deprecation of the D4RL datasets.
See: https://github.com/Farama-Foundation/d4rl
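For context, a minimal sketch of what the updated factory might look like after the swap (the diff above only shows the constructor change; the `dataset` and `batch_size` fields on `rb_cfg` are assumed names for illustration):

```python
from torchrl.data.datasets import MinariExperienceReplay


def make_offline_replay_buffer(rb_cfg):
    # Swap the deprecated D4RL loader for the Minari-backed dataset.
    # The config field names below are assumptions, not taken from this PR.
    data = MinariExperienceReplay(
        dataset_id=rb_cfg.dataset,  # e.g. "mujoco/hopper/expert-v0"
        batch_size=rb_cfg.batch_size,
    )
    return data
```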
Happy with these changes. Do you have a run log to share?
Yes, you can find the run here: It includes a run of the updated offline CQL sota-implementation using MinariExperienceReplay on the mujoco/hopper/expert-v0 dataset.
I cannot see it, unfortunately.
MinariExperienceReplay Report.pdf
You can see the W&B logs in this PDF.
This is a comparison between the training logs using the Minari and D4RL replay buffers. Although they generally display similar behavior, D4RL shows higher noise in the Q-function loss during training. See D4RL v Minari Report.pdf. While the overall training behavior remains consistent, the increased Q-function noise with D4RL suggests that Minari may provide slightly more stable learning curves.
The sota-implementations runs seem to be failing because of this (I also fixed a minor issue with redq, so I rebased your branch on main to make it look cleaner).
Can you have a look?
Compare: 4fa5618 to 9eba388
@vmoens I can see the sota-checks were successful: https://github.com/pytorch/rl/actions/runs/16122638416/workflow However, can you clarify what you mean by the sota-implementations runs failing? Is this referring to a different workflow or step beyond the sota-check?
Ok let's merge then!
Description
This PR updates the offline CQL sota-implementation by migrating from the deprecated D4RLExperienceReplay to the new MinariExperienceReplay, following the recent deprecation notice from the D4RL maintainers. It also includes fixes and improvements for tooling and script robustness.
Motivation and Context
D4RL’s official repositories have announced that the D4RL datasets are deprecated, with maintenance moving to the Farama Foundation’s Minari project.
To stay aligned with the evolving ecosystem and maintain long-term compatibility, this PR updates the offline dataset interface to use MinariExperienceReplay.
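As a quick illustration of the new interface, here is a hedged standalone snippet for loading and sampling the dataset used in the run above (the batch size is an arbitrary choice for demonstration, not this PR's configuration):

```python
from torchrl.data.datasets import MinariExperienceReplay

# Download (if not cached) and sample from the Minari hopper expert dataset.
data = MinariExperienceReplay(
    dataset_id="mujoco/hopper/expert-v0",
    batch_size=256,  # arbitrary batch size for this smoke test
    download=True,   # fetch the dataset if it is not available locally
)
batch = data.sample()  # returns a TensorDict batch of transitions
print(batch)
```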
Additionally, it fixes minor bugs affecting W&B logging and SLURM job script behavior, while also ensuring full compliance with the repo’s linting and formatting standards.
No functional changes have been introduced beyond the mentioned components.
Fixes #3034
Types of changes
What types of changes does your code introduce? Remove all that do not apply:
Checklist
Go over all the following points, and put an x in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!