Efficient execution of multiple parallel Bevy App instances #5580

cswinter · 2022-08-05T02:33:38Z

I'm trying to figure out a way to achieve maximum performance when running many (thousands) parallel Bevy App instances to train AIs in https://github.com/entity-neural-network/entity-gym-rs.

On a high level, three different systems need to run in every iteration:

Obs: system that creates a snapshot of the current observable game state and sends it to the agents
Act: updates the game state with the actions taken by the agents
Physics: advances physics/game mechanics by one step

During deployment, this is very straightforward. "Obs" and "Act" can just be a single system that constructs the observations, calls a function with the observations that returns an action, and applies the action to the game state.

The issue is that when training an AI, the "Act" system performs a blocking call to await actions that are produced by an external Python process. My current solution requires one thread per App instance, and each thread blocks on a channel once per iteration to communicate with the Python process. Unfortunately, awaiting a channel and context switching between thousands of threads on each tick introduces massive overhead that drastically curtails the maximum achievable throughput.

In the ideal architecture, there would be a much smaller number of threads each owning multiple Apps. Each thread would call just the "Obs" system on every App, perform all synchronization with the Python process in a single batch, and then run the "Act" and "Physics" systems (testing this approach with a non-Bevy game yields > 20x throughput).
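To make the batched layout concrete, here is a minimal sketch in plain Rust with no Bevy involved. The `Env` trait, the `Counter` toy environment, and the `step_batch` and `run_workers` functions are all hypothetical names invented for illustration, and the `policy` closure merely stands in for the single batched round trip to the Python process.

```rust
use std::thread;

/// Hypothetical stand-in for one game instance (not a Bevy type).
trait Env {
    fn observe(&self) -> f32; // "Obs"
    fn act(&mut self, action: f32); // "Act"
    fn physics(&mut self); // "Physics"
}

/// Toy environment whose whole state is a single number.
struct Counter {
    state: f32,
}

impl Env for Counter {
    fn observe(&self) -> f32 {
        self.state
    }
    fn act(&mut self, action: f32) {
        self.state += action;
    }
    fn physics(&mut self) {
        self.state *= 2.0;
    }
}

/// One iteration for every environment owned by a single worker thread.
/// `policy` is called exactly once per iteration with all observations,
/// which models one batched exchange instead of one exchange per instance.
fn step_batch<E: Env>(envs: &mut [E], policy: impl FnOnce(&[f32]) -> Vec<f32>) {
    // Phase 1: run only "Obs" on every environment.
    let observations: Vec<f32> = envs.iter().map(|e| e.observe()).collect();
    // Phase 2: a single synchronization point for the whole batch.
    let actions = policy(&observations);
    assert_eq!(actions.len(), envs.len());
    // Phase 3: run "Act" and "Physics" on every environment.
    for (env, action) in envs.iter_mut().zip(actions) {
        env.act(action);
        env.physics();
    }
}

/// Spawns a small number of worker threads, each owning several environments,
/// and returns the final observations grouped by worker.
fn run_workers(num_workers: usize, envs_per_worker: usize, ticks: usize) -> Vec<Vec<f32>> {
    let handles: Vec<thread::JoinHandle<Vec<f32>>> = (0..num_workers)
        .map(|w| {
            thread::spawn(move || {
                let mut envs: Vec<Counter> = (0..envs_per_worker)
                    .map(|i| Counter { state: (w * envs_per_worker + i) as f32 })
                    .collect();
                for _ in 0..ticks {
                    // Placeholder policy: always returns the action 1.0.
                    step_batch(&mut envs, |obs| obs.iter().map(|_| 1.0).collect());
                }
                envs.iter().map(|e| e.observe()).collect::<Vec<f32>>()
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

With this shape, the number of blocking exchanges per tick scales with the number of worker threads rather than the number of instances. Calling `run_workers(2, 2, 1)` steps four toy environments on two threads.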

I haven't managed to find a good way to set something like this up yet. One approximation would be to reorder the systems to "Act" -> "Physics" -> "Obs", which moves the synchronization barrier in between iterations and allows multiple Apps to be single-stepped by one worker thread. This still has two issues. (1) "Obs" would not be able to observe entities created on that tick, so really we'd like the "Obs" system and the "Act" + "Physics" systems to take turns. This seems doable by skipping every other system execution, but I'm not sure how to make that ergonomic. (2) I haven't actually found a way to single-step an App. There is ScheduleRunnerSettings::run_once(), which seems to do basically what I want, but it only runs the systems on the first App::run call, and subsequent calls do nothing.
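The reordering can be pictured with a small plain-Rust model. The `Game` struct and its methods are hypothetical and deliberately not Bevy code: the sketch only shows how an externally driven `step` in "Act" -> "Physics" -> "Obs" order puts the wait for the next action between calls. It does not model issue (1), since plain method calls have no deferred entity creation.

```rust
/// Hypothetical single-instance game state (not a Bevy type).
struct Game {
    position: i32,
    trail: Vec<i32>,
}

impl Game {
    /// "Act": apply the agent's action.
    fn act(&mut self, action: i32) {
        self.position += action;
    }

    /// "Physics": advance the game by one step (here: record the position).
    fn physics(&mut self) {
        self.trail.push(self.position);
    }

    /// "Obs": snapshot of the observable state.
    fn observe(&self) -> (i32, usize) {
        (self.position, self.trail.len())
    }

    /// One externally driven step in "Act" -> "Physics" -> "Obs" order.
    /// `None` covers the very first tick, when no action exists yet.
    /// The caller waits for the next action between calls to `step`,
    /// never in the middle of one.
    fn step(&mut self, action: Option<i32>) -> (i32, usize) {
        if let Some(a) = action {
            self.act(a);
            self.physics();
        }
        self.observe()
    }
}
```

A worker thread could call `step` on each of its instances in turn, collect the returned observations, and exchange them for the next batch of actions in one go.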

Another orthogonal issue is that when creating 1024 Apps, I get a panic in bevy_tasks-0.7.0/src/task_pool.rs:152:22 because the Linux thread limit is exceeded. Raising the thread limit is possible, but it would prevent the crate from working out of the box. Ideally there would be a way to prevent Bevy from creating any additional threads. (EDIT: this doesn't seem to happen anymore after upgrading to Bevy 0.8, but also throughput is now half of what it was in 0.7 🤔)

cswinter · 2022-09-30T16:40:15Z

Notes from offline meeting with @alice-i-cecile and Nisan:

There already is an App::update method, which allows single-stepping an App by one timestep.
Each Bevy App currently spawns multiple threads/executors. Bevy's current architecture makes it difficult to implement a fully serial, externally driven runner that completely eliminates threading overhead. This will be solved by "System organization overhaul and the road to a stageless schedule" #1375.
For now, overhead could probably be reduced by setting the size of all task pools to 1.
Another potential short-term workaround might be using multiple worlds/subapps, but the implementation/API is still immature and this doesn't seem like the right solution in the long term.
