Adds an initial StableBaselines3 RL environment as an example #2667
This PR adds support for the Reset API to the test fixture. Since `TestFixture` is one of the main ways to get access to the ECM from Python when writing scripts for deep reinforcement learning, I realized that without `Reset` supported in the `TestFixture` API, end users would have a very hard time using our Python APIs (which are actually quite nice). For reference, I'm hacking a demo template here: https://github.com/arjo129/gz_deep_rl_experiments/tree/ionic Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
This allows us to reset simulations without having to call into gz-transport, making the code more readable from an external API. Depends on #2647. Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
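The reset-between-episodes loop this enables can be sketched as follows. This is a hedged, self-contained sketch: `StubFixture` and its `step`/`reset_all` methods are stand-ins invented for illustration (not the actual gz-sim binding names), so the snippet runs without a Gazebo install while showing the control flow a fixture-level reset makes possible.

```python
# Sketch of the episode loop that a fixture-level Reset enables.
# The real API lives in the gz-sim Python bindings; the class below is a
# stand-in stub, and names like reset_all() are assumptions for illustration.

class StubFixture:
    """Stands in for a fixture that steps a world and can reset it."""
    def __init__(self, world_file):
        self.world_file = world_file
        self.sim_time = 0
        self.resets = 0

    def step(self, iterations=1):
        # A real fixture would advance physics; here we just count steps.
        self.sim_time += iterations

    def reset_all(self):
        # With this PR, reset goes through the fixture instead of a
        # gz-transport service request, so no transport client is needed.
        self.sim_time = 0
        self.resets += 1


def run_episodes(fixture, episodes=3, steps_per_episode=100):
    """RL-style loop: step the world, then reset between episodes."""
    for _ in range(episodes):
        for _ in range(steps_per_episode):
            fixture.step()
        fixture.reset_all()  # previously required a gz-transport call
    return fixture.resets


fixture = StubFixture("cart_pole.sdf")
print(run_episodes(fixture))  # one reset per episode
```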
A lot of things are not working yet. In particular, when `ResetAll` is called, `EnableVelocityChecks` does not trigger the physics system to repopulate the velocity components. This is a blocker for the current example. Signed-off-by: Arjo Chakravarty <arjoc@intrinsic.ai>
So the above code should be able to train an RL model even on a potato. Currently I've gotten the algorithm to successfully balance a cart pole. There are some open issues, however, that will block this from being merged. My main concern is that I've hacked together an API for running the GUI client.
@osrf-jenkins run tests
Can someone please review this?
Cool, I was able to run this. Left a few minor comments.
Review comments (since resolved) were left on:
- examples/scripts/reinforcement_learning/simple_cart_pole/README.md
- examples/scripts/reinforcement_learning/simple_cart_pole/cart_pole_env.py
- examples/scripts/reinforcement_learning/simple_cart_pole/cart_pole.sdf
looks good to me!
This PR provides a very basic example of how to use Gazebo with StableBaselines3 for RL. The example is the classic cart pole, which is commonly used as a "getting started" task in reinforcement learning. The Python script trains a simple model to balance the cart pole, and we use the GUI to visualize it.
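For orientation, here is roughly what a cart-pole training loop boils down to, in the Gymnasium-style `reset`/`step` interface that StableBaselines3 trains against. The dynamics below are a deliberately crude closed-form stand-in so the sketch runs without a simulator; the actual example gets its physics from Gazebo, and every name and constant here is illustrative rather than taken from the PR's code.

```python
import math
import random

# Minimal cart-pole-like environment in the reset/step style that
# StableBaselines3 expects. The pendulum update below is a crude stand-in
# for Gazebo physics; constants and names are illustrative assumptions.

class CartPoleSketch:
    GRAVITY = 9.8
    DT = 0.02                          # integration step, seconds
    FORCE = 10.0                       # magnitude of either discrete action
    ANGLE_LIMIT = 12 * math.pi / 180   # episode ends past ~12 degrees

    def reset(self):
        # Small random perturbation around upright, as in classic cart-pole.
        self.theta = random.uniform(-0.05, 0.05)  # pole angle (rad)
        self.theta_dot = 0.0
        return self.theta

    def step(self, action):
        # action 0 pushes left, 1 pushes right (discrete actions).
        force = self.FORCE if action == 1 else -self.FORCE
        # Crude update: gravity topples the pole, the force counteracts it.
        theta_acc = (self.GRAVITY * math.sin(self.theta)
                     - 0.1 * force * math.cos(self.theta))
        self.theta_dot += theta_acc * self.DT
        self.theta += self.theta_dot * self.DT
        terminated = abs(self.theta) > self.ANGLE_LIMIT
        reward = 1.0  # +1 per surviving step, the standard cart-pole reward
        return self.theta, reward, terminated


random.seed(0)  # deterministic initial perturbation for this demo
env = CartPoleSketch()
obs = env.reset()
total = 0.0
for _ in range(50):
    action = 1 if obs > 0 else 0  # trivial "push against the lean" policy
    obs, reward, terminated = env.step(action)
    total += reward
    if terminated:
        break
print(total)
```

Even this trivial bang-bang policy keeps the pole within the angle limit here, which is exactly the behavior an RL algorithm like PPO would learn from the reward signal instead of being hand-coded.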