Commit e6ab08f

Merge branch 'main' of https://github.com/DojoSim/Dojo.jl into main

2 parents: 89aa1fd + b26673b

File tree

32 files changed: +327 −72 lines

README.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -85,7 +85,7 @@ pkg> add Dojo#main
 
 ## Citing
 ```
-@article{howell2022dojo,
+@article{howelllecleach2022dojo,
     title={Dojo: {A} {D}ifferentiable {S}imulator for {R}obotics},
     author={Howell, Taylor A. and Le Cleac'h, Simon and Kolter, J. Zico and Schwager, Mac and Manchester, Zachary},
    year={2022}
````

docs/make.jl

Lines changed: 2 additions & 10 deletions
```diff
@@ -22,23 +22,15 @@ makedocs(
         "define_controller.md",
     ],
 
-    "Creating Environments" => [
-        "define_environment.md",
+    "Environments" => [
         "load_environment.md",
+        "define_environment.md",
     ],
 
     "Gradients from Simulator" => [
         "gradients.md",
     ],
 
-    "Environments" => [
-        "atlas_env.md",
-        "quadruped_env.md",
-        "rexhopper_env.md",
-        "classic_env.md",
-        "gym_env.md",
-    ],
-
     "Examples" => [
         "simulation.md",
         "trajectory_optimization.md",
```
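For readers unfamiliar with `docs/make.jl`, the pages list edited above is the `pages` keyword of Documenter.jl's `makedocs`. A minimal sketch of the surrounding call, under the assumption that the site is built the standard Documenter way (keyword values here are illustrative, not taken from this commit):

```julia
using Documenter, Dojo

# build the documentation site; `pages` mirrors the structure edited above
makedocs(
    modules=[Dojo],
    sitename="Dojo",
    pages=[
        "Environments" => [
            "load_environment.md",
            "define_environment.md",
        ],
        "Gradients from Simulator" => [
            "gradients.md",
        ],
    ])
```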

docs/src/citing.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -2,7 +2,7 @@
 
 If you find Dojo useful in your project, we kindly request that you cite the following paper:
 ```
-@article{howell2022dojo,
+@article{howelllecleach2022dojo,
     title={Dojo: {A} {D}ifferentiable {S}imulator for {R}obotics},
     author={Howell, Taylor A. and Le Cleac'h, Simon and Kolter, J. Zico and Schwager, Mac and Manchester, Zachary},
    year={2022}
````

docs/src/contact_models.md

Lines changed: 2 additions & 2 deletions
```diff
@@ -11,10 +11,10 @@ Three contact models are implemented in Dojo:
 - [`LinearContact`](@ref) enforces contact with a linearized cone of friction (pyramidal cone).
 ![linearized_cone](./assets/linearized_cone.png)
 
-All 3 of these contact models implement hard contact i.e. no interpenetration. This means that for both the nonlinear and linearized cones, we concatenate the constraints resulting from friction with the impact constraints.
+All 3 of these contact models implement hard contact i.e., no interpenetration. This means that for both the nonlinear and linearized cones, we concatenate the constraints resulting from friction with the impact constraints.
 
 ### Implementation
-Dojo currently supports contact constraints occurring between a sphere and the ground i.e. a horizontal half-space of altitude 0.0. Each spherical contact is attached to a single [`Body`](@ref).
+Dojo currently supports contact constraints occurring between a sphere and the ground i.e., a horizontal half-space of altitude 0.0. Each spherical contact is attached to a single [`Body`](@ref).
 
 To create a new point of contact, we need to define:
 - the [`Body`](@ref) to which the contact constraint is attached
```
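As background for the nonlinear and linearized cones mentioned in the hunk above, the two friction models can be written in the following standard form (notation is ours, not taken from the Dojo docs: friction force ``b``, normal impulse ``\gamma \ge 0``, friction coefficient ``\mu``):

```latex
% nonlinear (second-order) friction cone
\|b\|_2 \le \mu \gamma

% pyramidal (linearized) cone: friction decomposed along
% basis directions D = [d_1, \dots, d_d]
b = D \beta, \qquad \beta \ge 0, \qquad \mathbf{1}^\top \beta \le \mu \gamma
```

Concatenating either set of constraints with the impact (non-penetration) constraints is what gives the hard-contact behavior described in the diff.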

docs/src/contributing.md

Lines changed: 14 additions & 1 deletion
```diff
@@ -1,11 +1,24 @@
 # Contributing
 
-Contributions are always welcome:
+Contributions are always welcome!
 
 * If you want to contribute features, bug fixes, etc, please take a look at our __Code Style Guide__ below
 * Please report any issues and bugs that you encounter in [Issues](https://github.com/dojo-sim/Dojo.jl/issues)
 * As an open source project we are also interested in any projects and applications that use Dojo. Please let us know via email to: thowell@stanford.edu or simonlc@stanford.edu
 
+## Potentially Useful Contributions
+Here is a list of current to-do's that would make awesome contributions:
+
+- improved parsing of URDF files
+  - joint limits, friction coefficients
+- improved collision detection
+  - body-to-body contact
+  - general convex shapes
+  - curved surfaces
+- GPU support
+- nice REPL interface
+- interactive GUI
+
 ## Code Style Guide
 
 The code in this repository follows the naming and style conventions of [Julia Base](https://docs.julialang.org/en/v1.0/manual/style-guide/#Style-Guide-1) with a few modifications. This style guide is heavily "inspired" by the guides of [John Myles White](https://github.com/johnmyleswhite/Style.jl), [JuMP](http://www.juliaopt.org/JuMP.jl/latest/style), and [COSMO](https://github.com/oxfordcontrol/COSMO.jl)
```

docs/src/define_controller.md

Lines changed: 3 additions & 3 deletions
````diff
@@ -1,6 +1,6 @@
 # Defining a Controller
 Here, we explain how to write a controller and simulate its effect on a dynamical system
-i.e. a [`Mechanism`](@ref).
+i.e., a [`Mechanism`](@ref).
 We focus on a simple pendulum swing-up.
 
 Load Dojo and use the pendulum mechanism with desired simulation time step, desired gravity and desired damping at the joint.
@@ -16,12 +16,12 @@ mechanism = get_mechanism(:pendulum,
 
 Define the controller. This is a method that takes 2 input arguments:
 - a [`Mechanism`](@ref),
-- an integer `k` indicating the current simulation step.
+- an integer `t` indicating the current simulation step.
 The controller computes the control inputs based on the current state `x`, the goal state `x_goal` and a proportional gain `K`.
 
 
 ```julia
-function controller!(mechanism, k)
+function controller!(mechanism, t)
     ## Target state
     x_goal = [1.0 * π; 0.0]
````
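The hunk above shows only the first lines of the renamed controller. A minimal end-to-end sketch, assuming the `Dojo` API names used throughout these docs (`get_mechanism`, `get_minimal_state`, `set_input!`, `simulate!`) and illustrative gains and parameters that are not taken from this commit:

```julia
using Dojo

# pendulum mechanism; timestep, gravity, and damping values are illustrative
mechanism = get_mechanism(:pendulum,
    timestep=0.01,
    gravity=-9.81,
    damping=5.0)

# proportional controller toward the upright state; `t` is the step index
function controller!(mechanism, t)
    ## Target state
    x_goal = [1.0 * π; 0.0]

    ## Current state
    x = get_minimal_state(mechanism)

    ## Gain
    K = [5.0 0.5]

    ## Control input
    u = -K * (x - x_goal)
    set_input!(mechanism, u)
end

# simulate for 2 seconds under the controller and record the trajectory
storage = simulate!(mechanism, 2.0, controller!, record=true)
```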

docs/src/define_environment.md

Lines changed: 234 additions & 1 deletion
````diff
@@ -1 +1,234 @@
-# Defining Environment
+# Defining an Environment
+
+An [`Environment`](@ref) is a convenient object for applications like reinforcement learning and trajectory optimization.
+
+To demonstrate, we create the [`Dojo.Ant`](@ref) environment. First, we load (or [create](define_mechanism.md)) a mechanism:
+
+```julia
+mechanism = get_mechanism(:ant)
+```
+
+Next, we create an environment's other attributes.
+
+Dimensions:
+```julia
+# set state dimension based on representation
+if representation == :minimal
+    nx = minimal_dimension(mechanism)
+elseif representation == :maximal
+    nx = maximal_dimension(mechanism)
+end
+# set control dimension
+nu = 8
+# set observation dimension
+no = nx
+```
+
+Space (for limiting controls and observations):
+```julia
+# limit controls to [-1.0, 1.0]
+aspace = BoxSpace(nu,
+    low=(-ones(nu)),
+    high=(ones(nu)))
+# no limits on observations
+ospace = BoxSpace(no,
+    low=(-Inf * ones(no)),
+    high=(Inf * ones(no)))
+```
+
+Random number:
+```julia
+rng = MersenneTwister(seed)
+```
+
+Dynamics data:
+```julia
+# state vector
+z = get_maximal_state(mechanism)
+x = representation == :minimal ? maximal_to_minimal(mechanism, z) : z
+# dynamics Jacobians
+fx = zeros(nx, nx)
+fu = zeros(nx, nu)
+```
+
+Control data:
+```julia
+# control vector (previous)
+u_prev = zeros(nu)
+# control map transforms inputs from control to dynamics space
+control_mask = [zeros(8, 6) I(nu)]
+control_scaling = Diagonal(timestep * 150.0 * ones(nu))
+control_map = control_mask' * control_scaling
+```
+
+Visuals:
+```julia
+# create a visualizer
+vis = Visualizer()
+```
+
+Solver options:
+```julia
+# simulation options
+opts_step = SolverOptions()
+# gradient options
+opts_grad = SolverOptions()
+```
+
+Environment:
+```julia
+TYPES = [Ant, T, typeof(mechanism), typeof(aspace), typeof(ospace), typeof(info)]
+env = Environment{TYPES...}(
+    mechanism,
+    representation,
+    aspace, ospace,
+    x, fx, fu,
+    u_prev,
+    control_map,
+    nx, nu, no,
+    info,
+    [rng],
+    vis,
+    opts_step, opts_grad)
+```
+
+With the environment instantiated, we can interact with it by overloading the following methods:
+
+Simulate environment forward one time step:
+```julia
+function step(env::Environment{Ant}, x, u;
+    diff=false)
+
+    # mechanism
+    mechanism = env.mechanism
+
+    # timestep
+    timestep = mechanism.timestep
+
+    # copy current state
+    x0 = copy(x)
+
+    # cache current control
+    env.input_previous .= u  # for rendering in Gym
+    u_scaled = env.control_map * u
+
+    # representation conversion
+    z0 = env.representation == :minimal ? minimal_to_maximal(mechanism, x0) : x0
+
+    # simulate one step
+    z1 = step!(mechanism, z0, u_scaled; opts=env.opts_step)
+
+    # representation conversion
+    env.state .= env.representation == :minimal ? maximal_to_minimal(mechanism, z1) : z1
+
+    # cost/reward
+    reward = cost(env, z1, u_scaled)
+
+    # check for done
+    done = is_done(env, z1, u_scaled)
+
+    # gradients
+    if diff
+        if env.representation == :minimal
+            fx, fu = get_minimal_gradients!(env.mechanism, z0, u_scaled, opts=env.opts_grad)
+        elseif env.representation == :maximal
+            fx, fu = get_maximal_gradients!(env.mechanism, z0, u_scaled, opts=env.opts_grad)
+        end
+        env.dynamics_jacobian_state .= fx
+        env.dynamics_jacobian_input .= fu * env.control_map
+    end
+
+    # information
+    info = Dict()
+
+    return get_observation(env), reward, done, info
+end
+```
+
+Return environment to nominal state:
+```julia
+function reset(env::Environment{Ant};
+    x=nothing)
+
+    # initialize
+    initialize!(env.mechanism, type2symbol(Ant))
+
+    if x != nothing
+        env.state .= x
+    else
+        x = get_minimal_state(env.mechanism)
+        if env.representation == :minimal
+            set_maximal_state!(env.mechanism, minimal_to_maximal(env.mechanism, x))
+            env.state .= x
+        elseif env.representation == :maximal
+            z = minimal_to_maximal(env.mechanism, x)
+            set_maximal_state!(env.mechanism, z)
+            env.state .= z
+        end
+        env.input_previous .= 0.0
+    end
+
+    return get_observation(env)
+end
+```
+
+Observation for current environment state:
+```julia
+function get_observation(env::Environment{Ant})
+    contact_force = Float64[]
+    for contact in env.mechanism.contacts
+        push!(contact_force, max(-1.0, min(1.0, contact.impulses[2][1])))
+    end
+    # include contact forces with state for observation
+    return [env.state; contact_force]
+end
+```
+
+Cost/reward associated with simulation step:
+```julia
+function cost(env::Environment{Ant}, x, u)
+    # forward reward
+    v = x[4]  # x-direction velocity
+    forward_reward = 2.0 * v
+
+    # control cost
+    ctrl_cost = (0.05 * u' * u)[1]
+
+    # contact cost
+    contact_cost = 0.0
+
+    for contact in env.mechanism.contacts
+        contact_cost += 0.5 * 1.0e-3 * max(-1.0, min(1.0, contact.impulses[2][1]))^2.0
+    end
+
+    # survive_reward = 1.0
+    survive_reward = 0.05
+
+    # total reward
+    reward = forward_reward - ctrl_cost - contact_cost + survive_reward
+end
+```
+
+Determine if simulation should terminate:
+```julia
+function is_done(env::Environment{Ant}, x, u)
+    !(all(isfinite.(env.state)) && (env.state[3] >= 0.2) && (env.state[3] <= 1.0))
+end
+```
+
+### Random controls
+
+We apply random controls to the robot via the environment interface:
+```julia
+y = [copy(env.state)]  # state trajectory
+for t = 1:100
+    step(env, env.state, randn(env.num_inputs))
+    push!(y, copy(env.state))
+end
+visualize(env, y)
+```
+
+The result should be something like this:
+```@raw html
+<img src="./../../examples/animations/ant_random.gif" width="300"/>
+```
````

docs/src/define_simulation.md

Lines changed: 4 additions & 5 deletions
````diff
@@ -1,10 +1,9 @@
 # Defining a Simulation
-Here, we explain how to simulate a dynamical system i.e. a [`Mechanism`](@ref) forward in time.
+Here, we explain how to simulate a dynamical system i.e., a [`Mechanism`](@ref) forward in time.
 The example that we are trying to replicate the Dzhanibekov effect shown below.
 
 ![dzhanibekov](./assets/dzhanibekov_nasa.gif)
 
-
 Load the `Dojo` package.
 ```julia
 using Dojo
@@ -25,13 +24,13 @@ mech = get_mechanism(:dzhanibekov,
 
 We initialize the system with a given initial angular velocity.
 ```julia
-initialize_dzhanibekov!(mech,
+initialize!(mech, :dzhanibekov,
     angular_velocity=[15.0; 0.01; 0.0])
 ```
 
-We simulate this system for 4 seconds, we record the resulting trajectory in `storage`,
+We simulate this system for 5 seconds, we record the resulting trajectory in `storage`,
 ```julia
-storage = simulate!(mech, 4.00,
+storage = simulate!(mech, 5.0,
     record=true,
     verbose=false)
 ```
````
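Putting the updated calls together, a complete script under the API shown in this file's diff might look like the sketch below. The timestep/gravity values and the trailing `visualize` call are our additions, following the pattern used elsewhere in the Dojo docs, not part of this commit:

```julia
using Dojo

# Dzhanibekov mechanism; parameter values are illustrative
mech = get_mechanism(:dzhanibekov,
    timestep=0.01,
    gravity=0.0)

# spin mostly about one axis, with a small perturbation that
# triggers the intermediate-axis (Dzhanibekov) flipping
initialize!(mech, :dzhanibekov,
    angular_velocity=[15.0; 0.01; 0.0])

# simulate for 5 seconds and record the trajectory in `storage`
storage = simulate!(mech, 5.0,
    record=true,
    verbose=false)

# play back the recorded trajectory
vis = Visualizer()
visualize(mech, storage, vis=vis)
```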

docs/src/gradients.md

Lines changed: 1 addition & 3 deletions
````diff
@@ -22,7 +22,6 @@ In case ``(\partial r / \partial w)^{-1}`` is not well defined, (e.g., not full
 
 Often, Newton's method is employed to find solutions to the implicit equation and custom linear-system solvers can efficiently compute search directions for this purpose. Importantly, the factorization of ``\partial r / \partial w`` used to find a solution can be reused to compute (1) at very low computational cost using only back-substitution. Additionally, each element of the problem-data sensitivity can be computed in parallel.
 
-
 ## Dojo's Gradient
 At a solution point, ``w^*(\theta, \kappa)``, the sensitivity of the solution with respect to the problem data, i.e., ``\partial w^* / \partial \theta``, is efficiently computed using the implicit-function theorem (1) to differentiate through the solver's residual.
 
@@ -32,11 +31,10 @@ The problem data for each simulation step include: the previous and current conf
 
 In many robotics scenarios, we are interested in gradient information through contact events. Instead of computing gradients for hard contact with zero or very small central-path parameters, we use a relaxed value from intermediate solutions ``w^*(\theta, \kappa > 0)`` corresponding to a soft contact model. In practice, we find that these smooth gradients greatly improve the performance of gradient-based optimization methods.
 
-
 ## Gradient Comparison
 
 ```@raw html
-<img src="./assets/gradient_comparison.png" width="600"/>
+<img src="./assets/gradient_comparison.png" width="500"/>
 ```
 
 Gradient comparison between randomized smoothing and Dojo's smooth gradients. The dynamics for a box in the ``XY`` plane that is resting on a flat surface and displaced an amount ``\Delta`` by an input ``F`` (top left). Its corresponding exact gradients are shown in black. Gradient bundles (right column) are computed using sampling schemes with varying covariances ``\Sigma`` and ``500`` samples. Dojo's gradients (middle column) are computed for different values of ``\kappa``, corresponding to the smoothness of the contact model. Compared to the 500-sample gradient bundle, Dojo's gradients are not noisy and are a 100 times faster to compute with a single worker.
````
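For context on the "(1)" referenced throughout this file, the implicit-function theorem applied to the solver residual ``r(w; \theta) = 0`` gives the solution sensitivity in the standard form below (symbols follow the surrounding text):

```latex
\frac{\partial w^*}{\partial \theta}
  = -\left(\frac{\partial r}{\partial w}\right)^{-1}
     \frac{\partial r}{\partial \theta}
\tag{1}
```

This is the expression whose ``\partial r / \partial w`` factorization, already computed by Newton's method, is reused via back-substitution.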
