Skip to content

Model Batching/Domain Randomization in MuJoCoSolver #79

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
adenzler-nvidia opened this issue May 9, 2025 · 8 comments
Closed

Model Batching/Domain Randomization in MuJoCoSolver #79

adenzler-nvidia opened this issue May 9, 2025 · 8 comments

Comments

@adenzler-nvidia
Copy link
Contributor

We have this PR open for discussion right now that implements model batching/domain randomization in MjWarp: google-deepmind/mujoco_warp#231. We should figure out if we can work with this.

Summary of that proposal is that the MjWarp Model will have certain fields expanded by 1 dimension at the front by default, but use stride 0 in that dimension. So without any interaction from the user, the data will be the same scalar for all model replications. Think of something link joint limits ("range") which is a wp.array(dtype=wp.vec2i) with shape (njnt, ) right now. This would become shape (1, njnt), and use stride 0 in the first dimension so any indexing lookup will point to the first element.

The way users would randomize this now is to just replace that array with another wp.array that has shape (nworld, njnt) which contains potentially different data for each world. That array has the default strides then.

For newton as of right now, I can see a few issues I'd like to discuss:

  1. The MjWarp model is not directly exposed to users, so we cannot push the array replacement to the users. Even more so, there might be a potential remapping needed, which right now only happens at init time. My assumption is that we allow users to do this at any time.
  2. Do we need to discover randomized field during model building? Or are we ok with just letting users change the data later?
  3. Are we allowing stride 0 arrays in the selection API? How are we handling this in set_attribute where we copy?

I think this needs to work both with and without the selection API, so we cannot just build that functionality into the selection API. More so, we should not start pushing functionality other than "selection" into the API anyway.

@nvlukasz
Copy link
Contributor

nvlukasz commented May 9, 2025

Are we allowing stride 0 arrays in the selection API? How are we handling this in set_attribute where we copy?

Currently we just call wp.copy(), so stride 0 inputs probably won't work. We could add a check for stride 0 and handle it in a special way. Or maybe we can handle stride 0 arrays in wp.copy() itself, that's another possibility.

@nvlukasz
Copy link
Contributor

nvlukasz commented May 9, 2025

I think this needs to work both with and without the selection API

I tend to agree. I view the selection API as an optional feature that's there for convenience rather than core functionality.

@nvlukasz
Copy link
Contributor

nvlukasz commented May 9, 2025

One limitation of this batching approach is that it requires homogeneous environments. We could envision a more flexible approach that uses indexed arrays instead of a batch dimension. That comes with its own set of challenges, though. Indexed arrays are likely to be slower. Kernels may need to be overloaded to accept regular or indexed arrays. That's all doable, but it adds complexity.

@nvlukasz
Copy link
Contributor

nvlukasz commented May 9, 2025

Is the motivation for stride 0 arrays to save memory? We don't do this in Newton (unless I'm missing something) and we didn't do this in previous iterations of Isaac Gym/Lab. We just used regular arrays with copies. Stride 0 arrays are cool, but they also introduce some complexity.

@adenzler-nvidia
Copy link
Contributor Author

Thanks for your comments - some clarifications:

  1. yes - this requires homogeneous environments. It mostly follows what's possible/going to be possible in MjWarp, but obviously there is no decision there yet either. Everyone wants to go for heterogeneous environments, but the plan was to enable what's currently possible in MJX and then move towards a version the allows heterogeneity later. In that sense, the only thing we need to be concerned about in Newton is to not limit the API prematurely to homogeneous environments, which I think is ok with the modelBuilder approach.

  2. I did not think of an indexedArray approach, and there is no documentation about it.. Need to read some code. With that, you would essentially provide an indexArray with nWorld elements, that would then point to a specific instance of the model parameter? I guess that wouldn't change anything in how we write the kernel code, just that the index that in this version looks up the batch dim would then be the index lookup for the indexed array? Need to think about that. Could be a good solution indeed, but we should check perf.

  3. Yes, the stride 0 array approach was chosen to save memory. It was mostly influenced by the google folks as they are already used to it, and I never made any experiments to check perf difference. We tend to not care about it, but I know in PhysX/legacy IsaacLab we discussed compacting some stuff because it really was a lot of memory. I think we need experiments for this as well, yes.

In the end also for MjWarp this is a very awkward decision, which is why it has been out there for so long now, and we haven't really found something the just feels right.

One final comment that is more general and should be discussed here as well - how are we going to deal with the fact that if we change the newton model, we need to update the MjWarp model again. This is a pretty important thing to get right that is kind of independent of the actual API.

@eric-heiden
Copy link
Collaborator

One final comment that is more general and should be discussed here as well - how are we going to deal with the fact that if we change the newton model, we need to update the MjWarp model again. This is a pretty important thing to get right that is kind of independent of the actual API.

At the moment the user has to recreate the MuJoCo model from a Newton model from scratch using convert_to_mjc(). mujoco_warp relies on MuJoCo C to compile the model and compute auxiliary data structures used for simulation so until we have some helper functions that allow us to set inertia, joint properties, etc. on-the-fly directly through mujoco_warp we still need to go through the put_model route via MuJoCo C. Perhaps mujoco_warp could add a few setter functions for those common, handful of model properties that will update all related structures from Warp land (which will also make them differentiable). Maybe joint limits and some other properties don't need a complicated setter though?

@adenzler-nvidia
Copy link
Contributor Author

Yeah maybe it's time we think through some of that.. Ideally we don't need to use the MJC compiler at all anymore, but for sure we should aim for a solution where we will only need it during initial setup or if we change significant parameters. I can put together a doc of how this could work, and then we can see what needs changes in MjWarp and what parts are needed for newton.

For the continuous stuff that can be domain-randomized it would be ideal to just have the MjWarp model and the newton model come closer together, such that we could expose some of the MjWarp arrays directly in the newton model.

@adenzler-nvidia
Copy link
Contributor Author

closing, replaced by #116

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants