Commit e6ab08f

Merge branch 'main' of https://github.com/DojoSim/Dojo.jl into main

2 parents: 89aa1fd + b26673b

File tree

32 files changed: +327 −72 lines

README.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -85,7 +85,7 @@ pkg> add Dojo#main
 
 ## Citing
 ```
-@article{howell2022dojo,
+@article{howelllecleach2022dojo,
     title={Dojo: {A} {D}ifferentiable {S}imulator for {R}obotics},
     author={Howell, Taylor A. and Le Cleac'h, Simon and Kolter, J. Zico and Schwager, Mac and Manchester, Zachary},
    year={2022}
````

docs/make.jl

Lines changed: 2 additions & 10 deletions
```diff
@@ -22,23 +22,15 @@ makedocs(
         "define_controller.md",
     ],
 
-    "Creating Environments" => [
-        "define_environment.md",
+    "Environments" => [
         "load_environment.md",
+        "define_environment.md",
     ],
 
     "Gradients from Simulator" => [
         "gradients.md",
     ],
 
-    "Environments" => [
-        "atlas_env.md",
-        "quadruped_env.md",
-        "rexhopper_env.md",
-        "classic_env.md",
-        "gym_env.md",
-    ],
-
     "Examples" => [
         "simulation.md",
         "trajectory_optimization.md",
```
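For readers unfamiliar with `docs/make.jl`, the pages list edited above is the `pages` keyword of Documenter.jl's `makedocs`. A minimal sketch of the surrounding call, under the assumption that the site is built the standard Documenter way (keyword values here are illustrative, not taken from this commit):

```julia
using Documenter, Dojo

# build the documentation site; `pages` mirrors the structure edited above
makedocs(
    modules=[Dojo],
    sitename="Dojo",
    pages=[
        "Environments" => [
            "load_environment.md",
            "define_environment.md",
        ],
        "Gradients from Simulator" => [
            "gradients.md",
        ],
    ])
```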

docs/src/citing.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -2,7 +2,7 @@
 
 If you find Dojo useful in your project, we kindly request that you cite the following paper:
 ```
-@article{howell2022dojo,
+@article{howelllecleach2022dojo,
     title={Dojo: {A} {D}ifferentiable {S}imulator for {R}obotics},
     author={Howell, Taylor A. and Le Cleac'h, Simon and Kolter, J. Zico and Schwager, Mac and Manchester, Zachary},
    year={2022}
````

docs/src/contact_models.md

Lines changed: 2 additions & 2 deletions
```diff
@@ -11,10 +11,10 @@ Three contact models are implemented in Dojo:
 - [`LinearContact`](@ref) enforces contact with a linearized cone of friction (pyramidal cone).
 ![linearized_cone](./assets/linearized_cone.png)
 
-All 3 of these contact models implement hard contact i.e. no interpenetration. This means that for both the nonlinear and linearized cones, we concatenate the constraints resulting from friction with the impact constraints.
+All 3 of these contact models implement hard contact i.e., no interpenetration. This means that for both the nonlinear and linearized cones, we concatenate the constraints resulting from friction with the impact constraints.
 
 ### Implementation
-Dojo currently supports contact constraints occurring between a sphere and the ground i.e. a horizontal half-space of altitude 0.0. Each spherical contact is attached to a single [`Body`](@ref).
+Dojo currently supports contact constraints occurring between a sphere and the ground i.e., a horizontal half-space of altitude 0.0. Each spherical contact is attached to a single [`Body`](@ref).
 
 To create a new point of contact, we need to define:
 - the [`Body`](@ref) to which the contact constraint is attached
```
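As background for the nonlinear and linearized cones mentioned in the hunk above, the two friction models can be written in the following standard form (notation is ours, not taken from the Dojo docs: friction force ``b``, normal impulse ``\gamma \ge 0``, friction coefficient ``\mu``):

```latex
% nonlinear (second-order) friction cone
\|b\|_2 \le \mu \gamma

% pyramidal (linearized) cone: friction decomposed along
% basis directions D = [d_1, \dots, d_d]
b = D \beta, \qquad \beta \ge 0, \qquad \mathbf{1}^\top \beta \le \mu \gamma
```

Concatenating either set of constraints with the impact (non-penetration) constraints is what gives the hard-contact behavior described in the diff.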

docs/src/contributing.md

Lines changed: 14 additions & 1 deletion
```diff
@@ -1,11 +1,24 @@
 # Contributing
 
-Contributions are always welcome:
+Contributions are always welcome!
 
 * If you want to contribute features, bug fixes, etc, please take a look at our __Code Style Guide__ below
 * Please report any issues and bugs that you encounter in [Issues](https://github.com/dojo-sim/Dojo.jl/issues)
 * As an open source project we are also interested in any projects and applications that use Dojo. Please let us know via email to: thowell@stanford.edu or simonlc@stanford.edu
 
+## Potentially Useful Contributions
+Here is a list of current to-do's that would make awesome contributions:
+
+- improved parsing of URDF files
+  - joint limits, friction coefficients
+- improved collision detection
+  - body-to-body contact
+  - general convex shapes
+  - curved surfaces
+- GPU support
+- nice REPL interface
+- interactive GUI
+
 ## Code Style Guide
 
 The code in this repository follows the naming and style conventions of [Julia Base](https://docs.julialang.org/en/v1.0/manual/style-guide/#Style-Guide-1) with a few modifications. This style guide is heavily "inspired" by the guides of [John Myles White](https://github.com/johnmyleswhite/Style.jl), [JuMP](http://www.juliaopt.org/JuMP.jl/latest/style), and [COSMO](https://github.com/oxfordcontrol/COSMO.jl)
```

docs/src/define_controller.md

Lines changed: 3 additions & 3 deletions
````diff
@@ -1,6 +1,6 @@
 # Defining a Controller
 Here, we explain how to write a controller and simulate its effect on a dynamical system
-i.e. a [`Mechanism`](@ref).
+i.e., a [`Mechanism`](@ref).
 We focus on a simple pendulum swing-up.
 
 Load Dojo and use the pendulum mechanism with desired simulation time step, desired gravity and desired damping at the joint.
@@ -16,12 +16,12 @@ mechanism = get_mechanism(:pendulum,
 
 Define the controller. This is a method that takes 2 input arguments:
 - a [`Mechanism`](@ref),
-- an integer `k` indicating the current simulation step.
+- an integer `t` indicating the current simulation step.
 The controller computes the control inputs based on the current state `x`, the goal state `x_goal` and a proportional gain `K`.
 
 
 ```julia
-function controller!(mechanism, k)
+function controller!(mechanism, t)
     ## Target state
     x_goal = [1.0 * π; 0.0]
````
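The hunk above shows only the first lines of the renamed controller. A minimal end-to-end sketch, assuming the `Dojo` API names used throughout these docs (`get_mechanism`, `get_minimal_state`, `set_input!`, `simulate!`) and illustrative gains and parameters that are not taken from this commit:

```julia
using Dojo

# pendulum mechanism; timestep, gravity, and damping values are illustrative
mechanism = get_mechanism(:pendulum,
    timestep=0.01,
    gravity=-9.81,
    damping=5.0)

# proportional controller toward the upright state; `t` is the step index
function controller!(mechanism, t)
    ## Target state
    x_goal = [1.0 * π; 0.0]

    ## Current state
    x = get_minimal_state(mechanism)

    ## Gain
    K = [5.0 0.5]

    ## Control input
    u = -K * (x - x_goal)
    set_input!(mechanism, u)
end

# simulate for 2 seconds under the controller and record the trajectory
storage = simulate!(mechanism, 2.0, controller!, record=true)
```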

docs/src/define_environment.md

Lines changed: 234 additions & 1 deletion
````diff
@@ -1 +1,234 @@
-# Defining Environment
+# Defining an Environment
+
+An [`Environment`](@ref) is a convenient object for applications like reinforcement learning and trajectory optimization.
+
+To demonstrate, we create the [`Dojo.Ant`](@ref) environment. First, we load (or [create](define_mechanism.md)) a mechanism:
+
+```julia
+mechanism = get_mechanism(:ant)
+```
+
+Next, we create an environment's other attributes.
+
+Dimensions:
+```julia
+# set state dimension based on representation
+if representation == :minimal
+    nx = minimal_dimension(mechanism)
+elseif representation == :maximal
+    nx = maximal_dimension(mechanism)
+end
+# set control dimension
+nu = 8
+# set observation dimension
+no = nx
+```
+
+Space (for limiting controls and observations):
+```julia
+# limit controls to [-1.0, 1.0]
+aspace = BoxSpace(nu,
+    low=(-ones(nu)),
+    high=(ones(nu)))
+# no limits on observations
+ospace = BoxSpace(no,
+    low=(-Inf * ones(no)),
+    high=(Inf * ones(no)))
+```
+
+Random number:
+```julia
+rng = MersenneTwister(seed)
+```
+
+Dynamics data:
+```julia
+# state vector
+z = get_maximal_state(mechanism)
+x = representation == :minimal ? maximal_to_minimal(mechanism, z) : z
+# dynamics Jacobians
+fx = zeros(nx, nx)
+fu = zeros(nx, nu)
+```
+
+Control data:
+```julia
+# control vector (previous)
+u_prev = zeros(nu)
+# control map transforms inputs from control to dynamics space
+control_mask = [zeros(8, 6) I(nu)]
+control_scaling = Diagonal(timestep * 150.0 * ones(nu))
+control_map = control_mask' * control_scaling
+```
+
+Visuals:
+```julia
+# create a visualizer
+vis = Visualizer()
+```
+
+Solver options:
+```julia
+# simulation options
+opts_step = SolverOptions()
+# gradient options
+opts_grad = SolverOptions()
+```
+
+Environment:
+```julia
+TYPES = [Ant, T, typeof(mechanism), typeof(aspace), typeof(ospace), typeof(info)]
+env = Environment{TYPES...}(
+    mechanism,
+    representation,
+    aspace, ospace,
+    x, fx, fu,
+    u_prev,
+    control_map,
+    nx, nu, no,
+    info,
+    [rng],
+    vis,
+    opts_step, opts_grad)
+```
+
+With the environment instantiated, we can interact with it by overloading the following methods:
+
+Simulate environment forward one time step:
+```julia
+function step(env::Environment{Ant}, x, u;
+    diff=false)
+
+    # mechanism
+    mechanism = env.mechanism
+
+    # timestep
+    timestep = mechanism.timestep
+
+    # copy current state
+    x0 = copy(x)
+
+    # cache current control
+    env.input_previous .= u  # for rendering in Gym
+    u_scaled = env.control_map * u
+
+    # representation conversion
+    z0 = env.representation == :minimal ? minimal_to_maximal(mechanism, x0) : x0
+
+    # simulate one step
+    z1 = step!(mechanism, z0, u_scaled; opts=env.opts_step)
+
+    # representation conversion
+    env.state .= env.representation == :minimal ? maximal_to_minimal(mechanism, z1) : z1
+
+    # cost/reward
+    reward = cost(env, z1, u_scaled)
+
+    # check for done
+    done = is_done(env, z1, u_scaled)
+
+    # gradients
+    if diff
+        if env.representation == :minimal
+            fx, fu = get_minimal_gradients!(env.mechanism, z0, u_scaled, opts=env.opts_grad)
+        elseif env.representation == :maximal
+            fx, fu = get_maximal_gradients!(env.mechanism, z0, u_scaled, opts=env.opts_grad)
+        end
+        env.dynamics_jacobian_state .= fx
+        env.dynamics_jacobian_input .= fu * env.control_map
+    end
+
+    # information
+    info = Dict()
+
+    return get_observation(env), reward, done, info
+end
+```
+
+Return environment to nominal state:
+```julia
+function reset(env::Environment{Ant};
+    x=nothing)
+
+    # initialize
+    initialize!(env.mechanism, type2symbol(Ant))
+
+    if x != nothing
+        env.state .= x
+    else
+        x = get_minimal_state(env.mechanism)
+        if env.representation == :minimal
+            set_maximal_state!(env.mechanism, minimal_to_maximal(env.mechanism, x))
+            env.state .= x
+        elseif env.representation == :maximal
+            z = minimal_to_maximal(env.mechanism, x)
+            set_maximal_state!(env.mechanism, z)
+            env.state .= z
+        end
+        env.input_previous .= 0.0
+    end
+
+    return get_observation(env)
+end
+```
+
+Observation for current environment state:
+```julia
+function get_observation(env::Environment{Ant})
+    contact_force = Float64[]
+    for contact in env.mechanism.contacts
+        push!(contact_force, max(-1.0, min(1.0, contact.impulses[2][1])))
+    end
+    # include contact forces with state for observation
+    return [env.state; contact_force]
+end
+```
+
+Cost/reward associated with simulation step:
+```julia
+function cost(env::Environment{Ant}, x, u)
+    # forward reward
+    v = x[4]  # x-direction velocity
+    forward_reward = 2.0 * v
+
+    # control cost
+    ctrl_cost = (0.05 * u' * u)[1]
+
+    # contact cost
+    contact_cost = 0.0
+
+    for contact in env.mechanism.contacts
+        contact_cost += 0.5 * 1.0e-3 * max(-1.0, min(1.0, contact.impulses[2][1]))^2.0
+    end
+
+    # survive_reward = 1.0
+    survive_reward = 0.05
+
+    # total reward
+    reward = forward_reward - ctrl_cost - contact_cost + survive_reward
+end
+```
+
+Determine if simulation should terminate:
+```julia
+function is_done(env::Environment{Ant}, x, u)
+    !(all(isfinite.(env.state)) && (env.state[3] >= 0.2) && (env.state[3] <= 1.0))
+end
+```
+
+### Random controls
+
+We apply random controls to the robot via the environment interface:
+```julia
+y = [copy(env.state)]  # state trajectory
+for t = 1:100
+    step(env, env.state, randn(env.num_inputs))
+    push!(y, copy(env.state))
+end
+visualize(env, y)
+```
+
+The result should be something like this:
+```@raw html
+<img src="./../../examples/animations/ant_random.gif" width="300"/>
+```
````

docs/src/define_simulation.md

Lines changed: 4 additions & 5 deletions
````diff
@@ -1,10 +1,9 @@
 # Defining a Simulation
-Here, we explain how to simulate a dynamical system i.e. a [`Mechanism`](@ref) forward in time.
+Here, we explain how to simulate a dynamical system i.e., a [`Mechanism`](@ref) forward in time.
 The example that we are trying to replicate the Dzhanibekov effect shown below.
 
 ![dzhanibekov](./assets/dzhanibekov_nasa.gif)
 
-
 Load the `Dojo` package.
 ```julia
 using Dojo
@@ -25,13 +24,13 @@ mech = get_mechanism(:dzhanibekov,
 
 We initialize the system with a given initial angular velocity.
 ```julia
-initialize_dzhanibekov!(mech,
+initialize!(mech, :dzhanibekov,
     angular_velocity=[15.0; 0.01; 0.0])
 ```
 
-We simulate this system for 4 seconds, we record the resulting trajectory in `storage`,
+We simulate this system for 5 seconds, we record the resulting trajectory in `storage`,
 ```julia
-storage = simulate!(mech, 4.00,
+storage = simulate!(mech, 5.0,
     record=true,
     verbose=false)
 ```
````
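Putting the updated calls together, a complete script under the API shown in this file's diff might look like the sketch below. The timestep/gravity values and the trailing `visualize` call are our additions, following the pattern used elsewhere in the Dojo docs, not part of this commit:

```julia
using Dojo

# Dzhanibekov mechanism; parameter values are illustrative
mech = get_mechanism(:dzhanibekov,
    timestep=0.01,
    gravity=0.0)

# spin mostly about one axis, with a small perturbation that
# triggers the intermediate-axis (Dzhanibekov) flipping
initialize!(mech, :dzhanibekov,
    angular_velocity=[15.0; 0.01; 0.0])

# simulate for 5 seconds and record the trajectory in `storage`
storage = simulate!(mech, 5.0,
    record=true,
    verbose=false)

# play back the recorded trajectory
vis = Visualizer()
visualize(mech, storage, vis=vis)
```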

docs/src/gradients.md

Lines changed: 1 addition & 3 deletions
````diff
@@ -22,7 +22,6 @@ In case ``(\partial r / \partial w)^{-1}`` is not well defined, (e.g., not full
 
 Often, Newton's method is employed to find solutions to the implicit equation and custom linear-system solvers can efficiently compute search directions for this purpose. Importantly, the factorization of ``\partial r / \partial w`` used to find a solution can be reused to compute (1) at very low computational cost using only back-substitution. Additionally, each element of the problem-data sensitivity can be computed in parallel.
 
-
 ## Dojo's Gradient
 At a solution point, ``w^*(\theta, \kappa)``, the sensitivity of the solution with respect to the problem data, i.e., ``\partial w^* / \partial \theta``, is efficiently computed using the implicit-function theorem (1) to differentiate through the solver's residual.
 
@@ -32,11 +31,10 @@ The problem data for each simulation step include: the previous and current conf
 
 In many robotics scenarios, we are interested in gradient information through contact events. Instead of computing gradients for hard contact with zero or very small central-path parameters, we use a relaxed value from intermediate solutions ``w^*(\theta, \kappa > 0)`` corresponding to a soft contact model. In practice, we find that these smooth gradients greatly improve the performance of gradient-based optimization methods.
 
-
 ## Gradient Comparison
 
 ```@raw html
-<img src="./assets/gradient_comparison.png" width="600"/>
+<img src="./assets/gradient_comparison.png" width="500"/>
 ```
 
 Gradient comparison between randomized smoothing and Dojo's smooth gradients. The dynamics for a box in the ``XY`` plane that is resting on a flat surface and displaced an amount ``\Delta`` by an input ``F`` (top left). Its corresponding exact gradients are shown in black. Gradient bundles (right column) are computed using sampling schemes with varying covariances ``\Sigma`` and ``500`` samples. Dojo's gradients (middle column) are computed for different values of ``\kappa``, corresponding to the smoothness of the contact model. Compared to the 500-sample gradient bundle, Dojo's gradients are not noisy and are a 100 times faster to compute with a single worker.
````
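For context on the "(1)" referenced throughout this file, the implicit-function theorem applied to the solver residual ``r(w; \theta) = 0`` gives the solution sensitivity in the standard form below (symbols follow the surrounding text):

```latex
\frac{\partial w^*}{\partial \theta}
  = -\left(\frac{\partial r}{\partial w}\right)^{-1}
     \frac{\partial r}{\partial \theta}
\tag{1}
```

This is the expression whose ``\partial r / \partial w`` factorization, already computed by Newton's method, is reused via back-substitution.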
