Commit ea913a8

Release v2.6.0 (#2109)

1 parent 656de97 commit ea913a8

3 files changed: +17 -4 lines changed

docs/guide/rl_tips.rst

Lines changed: 2 additions & 2 deletions
@@ -233,9 +233,9 @@ If you want to quickly try a random agent on your environment, you can also do:
 **Why should I normalize the action space?**


-Most reinforcement learning algorithms rely on a Gaussian distribution (initially centered at 0 with std 1) for continuous actions.
+Most reinforcement learning algorithms rely on a `Gaussian distribution <https://araffin.github.io/post/sac-massive-sim/>`_ (initially centered at 0 with std 1) for continuous actions.
 So, if you forget to normalize the action space when using a custom environment,
-this can harm learning and can be difficult to debug (cf attached image and `issue #473 <https://github.com/hill-a/stable-baselines/issues/473>`_).
+this can `harm learning <https://araffin.github.io/post/sac-massive-sim/>`_ and can be difficult to debug (cf attached image and `issue #473 <https://github.com/hill-a/stable-baselines/issues/473>`_).

 .. figure:: ../_static/img/mistake.png
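
The paragraph edited above explains why algorithms with a Gaussian policy expect actions in a symmetric, normalized range. As a minimal sketch of the usual fix, assuming a standard Gymnasium setup (the environment id is only an example; for a custom env you would declare a ``Box(-1, 1, ...)`` action space directly):

.. code-block:: python

    import gymnasium as gym
    from gymnasium.wrappers import RescaleAction

    # Pendulum-v1's native action space is Box(-2, 2): rescale it so the
    # agent only ever sees the symmetric [-1, 1] range that a Gaussian
    # distribution centered at 0 with std 1 covers well.
    env = gym.make("Pendulum-v1")
    env = RescaleAction(env, min_action=-1.0, max_action=1.0)
    print(env.action_space)  # Box(-1.0, 1.0, (1,), float32)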

docs/misc/changelog.rst

Lines changed: 14 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3,9 +3,10 @@
 Changelog
 ==========

-Release 2.6.0a2 (WIP)
+Release 2.6.0 (2025-03-24)
 --------------------------

+**New ``LogEveryNTimesteps`` callback and ``has_attr`` method, refactored hyperparameter optimization**

 Breaking Changes:
 ^^^^^^^^^^^^^^^^^
@@ -22,12 +23,24 @@ Bug Fixes:

 `SB3-Contrib`_
 ^^^^^^^^^^^^^^
+- Renamed ``_dump_logs()`` to ``dump_logs()``
+- Fixed issues with ``SubprocVecEnv`` and ``MaskablePPO`` by using ``vec_env.has_attr()`` (pickling issues, mask function not present)

 `RL Zoo`_
 ^^^^^^^^^
+- Refactored hyperparameter optimization. The Optuna `Journal storage backend <https://optuna.readthedocs.io/en/stable/reference/generated/optuna.storages.JournalStorage.html>`__ is now supported (recommended default) and you can easily load tuned hyperparameters via the new ``--trial-id`` argument of ``train.py``.
+- Save the exact command line used to launch a training
+- Added support for special vectorized envs (e.g. Brax, IsaacSim) by allowing to override the ``VecEnv`` class used to instantiate the env in the ``ExperimentManager``
+- Allow disabling auto-logging by passing ``--log-interval -2`` (useful when logging things manually)
+- Added Gymnasium v1.1 support
+- Fixed use of the old HF API in ``get_hf_trained_models()``

 `SBX`_ (SB3 + Jax)
 ^^^^^^^^^^^^^^^^^^
+- Updated PPO to support ``net_arch``, and additional fixes
+- Fixed entropy coeff wrongly logged for SAC and derivatives
+- Fixed PPO ``predict()`` for envs that were not normalized (action spaces with limits != [-1, 1])
+- PPO now logs the standard deviation

 Deprecations:
 ^^^^^^^^^^^^^
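
The release headline above introduces the ``LogEveryNTimesteps`` callback. A minimal usage sketch, assuming the constructor takes the logging period as ``n_steps`` (check the v2.6.0 API reference for the exact signature):

.. code-block:: python

    from stable_baselines3 import PPO
    from stable_baselines3.common.callbacks import LogEveryNTimesteps

    model = PPO("MlpPolicy", "Pendulum-v1")
    # Dump logger output every 1000 environment steps, independent of
    # episode or rollout boundaries.
    model.learn(total_timesteps=10_000, callback=LogEveryNTimesteps(n_steps=1_000))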

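The ``SB3-Contrib`` fix above relies on the new ``vec_env.has_attr()`` to detect the mask function without pickling it back from a ``SubprocVecEnv`` worker. A sketch of the same pattern in user code; the ``action_masks`` name is the attribute ``MaskablePPO`` conventionally looks for and is an assumption here:

.. code-block:: python

    import gymnasium as gym
    from stable_baselines3.common.vec_env import DummyVecEnv

    vec_env = DummyVecEnv([lambda: gym.make("CartPole-v1")])
    # has_attr() only reports whether the attribute exists, so unlike
    # get_attr() nothing has to be serialized back from a worker process.
    if vec_env.has_attr("action_masks"):
        masks = vec_env.env_method("action_masks")
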
stable_baselines3/version.txt

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-2.6.0a2
+2.6.0
