Adds an initial StableBaselines3 RL environment as an example #2667
This PR adds support for the Reset API to the test fixture. Since `TestFixture` is one of the main ways to get access to the ECM from Python when writing scripts for deep reinforcement learning, I realized that without `Reset` supported in the `TestFixture` API, end users would have a very hard time using our Python APIs (which are actually quite nice). For reference, I'm hacking a demo template here: https://github.com/arjo129/gz_deep_rl_experiments/tree/ionic Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
This allows us to reset simulations without having to call into gz-transport, making the code more readable from an external API. Depends on #2647. Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
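The reset-between-episodes loop this enables can be sketched as follows. This is a hedged, self-contained sketch: `StubFixture` and its `step`/`reset_all` methods are stand-ins invented for illustration (not the actual gz-sim binding names), so the snippet runs without a Gazebo install while showing the control flow a fixture-level reset makes possible.

```python
# Sketch of the episode loop that a fixture-level Reset enables.
# The real API lives in the gz-sim Python bindings; the class below is a
# stand-in stub, and names like reset_all() are assumptions for illustration.

class StubFixture:
    """Stands in for a fixture that steps a world and can reset it."""
    def __init__(self, world_file):
        self.world_file = world_file
        self.sim_time = 0
        self.resets = 0

    def step(self, iterations=1):
        # A real fixture would advance physics; here we just count steps.
        self.sim_time += iterations

    def reset_all(self):
        # With this PR, reset goes through the fixture instead of a
        # gz-transport service request, so no transport client is needed.
        self.sim_time = 0
        self.resets += 1


def run_episodes(fixture, episodes=3, steps_per_episode=100):
    """RL-style loop: step the world, then reset between episodes."""
    for _ in range(episodes):
        for _ in range(steps_per_episode):
            fixture.step()
        fixture.reset_all()  # previously required a gz-transport call
    return fixture.resets


fixture = StubFixture("cart_pole.sdf")
print(run_episodes(fixture))  # one reset per episode
```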
A lot of things are not working yet. In particular, when `ResetAll` is called, `EnableVelocityChecks` does not trigger the physics system to repopulate the velocity components. This is a blocker for the current example. Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
So the above code should be able to train an RL model even on a potato. Currently I've gotten the algorithm to successfully balance a cart pole. There are some open issues, however, that will block this from being merged. My main concern is that I've hacked together an API for running the GUI client.
@osrf-jenkins run tests
Can someone please review this?
Cool, I was able to run this. Left a few minor comments.
Review comments (since resolved) were left on:
- examples/scripts/reinforcement_learning/simple_cart_pole/README.md
- examples/scripts/reinforcement_learning/simple_cart_pole/cart_pole_env.py
- examples/scripts/reinforcement_learning/simple_cart_pole/cart_pole.sdf
looks good to me!
This PR provides a very basic example of how to use Gazebo with StableBaselines3 for RL. The example is the classic cart pole, which is commonly used as a "getting started" task in reinforcement learning. The Python script trains a simple model to balance the cart pole, and we use the GUI to visualize it.
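For orientation, here is roughly what a cart-pole training loop boils down to, in the Gymnasium-style `reset`/`step` interface that StableBaselines3 trains against. The dynamics below are a deliberately crude closed-form stand-in so the sketch runs without a simulator; the actual example gets its physics from Gazebo, and every name and constant here is illustrative rather than taken from the PR's code.

```python
import math
import random

# Minimal cart-pole-like environment in the reset/step style that
# StableBaselines3 expects. The pendulum update below is a crude stand-in
# for Gazebo physics; constants and names are illustrative assumptions.

class CartPoleSketch:
    GRAVITY = 9.8
    DT = 0.02                          # integration step, seconds
    FORCE = 10.0                       # magnitude of either discrete action
    ANGLE_LIMIT = 12 * math.pi / 180   # episode ends past ~12 degrees

    def reset(self):
        # Small random perturbation around upright, as in classic cart-pole.
        self.theta = random.uniform(-0.05, 0.05)  # pole angle (rad)
        self.theta_dot = 0.0
        return self.theta

    def step(self, action):
        # action 0 pushes left, 1 pushes right (discrete actions).
        force = self.FORCE if action == 1 else -self.FORCE
        # Crude update: gravity topples the pole, the force counteracts it.
        theta_acc = (self.GRAVITY * math.sin(self.theta)
                     - 0.1 * force * math.cos(self.theta))
        self.theta_dot += theta_acc * self.DT
        self.theta += self.theta_dot * self.DT
        terminated = abs(self.theta) > self.ANGLE_LIMIT
        reward = 1.0  # +1 per surviving step, the standard cart-pole reward
        return self.theta, reward, terminated


random.seed(0)  # deterministic initial perturbation for this demo
env = CartPoleSketch()
obs = env.reset()
total = 0.0
for _ in range(50):
    action = 1 if obs > 0 else 0  # trivial "push against the lean" policy
    obs, reward, terminated = env.step(action)
    total += reward
    if terminated:
        break
print(total)
```

Even this trivial bang-bang policy keeps the pole within the angle limit here, which is exactly the behavior an RL algorithm like PPO would learn from the reward signal instead of being hand-coded.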