Skip to content

Introduction of parameterizable workgroups for compute shaders and dispatch sizes for the renderer #31402

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 12 commits into
base: dev
Choose a base branch
from

Conversation

Spiri0
Copy link
Contributor

@Spiri0 Spiri0 commented Jul 13, 2025

Related issue: #31393

So far, the workgroup for the compute shader consists of a scalar and the dispatchSize always uses dispatch.x up to device.limits.maxComputeWorkgroupsPerDimension before dispatch.y is used. The default workgroup is now [ 64, 1, 1 ] instead of [ 64 ]. This ensures that all previous examples are treated as before.

Example on webgpu_compute_particles_rain, usual usage:

computeParticles = computeUpdate().compute( maxParticleCount );
renderer.compute( computeParticles );

Use with workgroup and dispatchSize

computeParticles = computeUpdate().compute( [ 16, 16, 1 ] );
renderer.compute( computeParticles, [ 10, 10, 1 ] );

Since the instanceCount in the example is 25,000, workgroup = [ 16, 16, 1 ] and dispatchSize = [ 10, 10, 1 ] = 16 * 10 * 16 * 10 * 1 = 25,600 threads, enough to cover everything. If you choose smaller values for dispatchSize, e.g., [ 9, 9, 1 ], you would see that not all raindrops are computed.

You now have full control over the thread distribution for the GPU, which was not possible.
I'm setting the PR to draft for now, even though it works, because I might think of a few more things I'd like to add. Maybe warnings if you enter something incorrectly.

Copy link

github-actions bot commented Jul 13, 2025

📦 Bundle size

Full ESM build, minified and gzipped.

Before After Diff
WebGL 338.28
78.93
338.28
78.93
+0 B
+0 B
WebGPU 558.52
154.58
559.36
154.9
+837 B
+314 B
WebGPU Nodes 557.45
154.37
558.28
154.68
+837 B
+313 B

🌳 Bundle size after tree-shaking

Minimal build including a renderer, camera, empty scene, and dependencies.

Before After Diff
WebGL 469.62
113.66
469.62
113.66
+0 B
+0 B
WebGPU 634.18
171.62
635.01
171.94
+837 B
+328 B
WebGPU Nodes 589.31
160.95
590.15
161.26
+837 B
+313 B

@Spiri0 Spiri0 marked this pull request as draft July 13, 2025 07:10
@Spiri0 Spiri0 force-pushed the workgroups_and_dispatch_extension branch from 73116e1 to 910ce62 Compare July 13, 2025 09:38
@Spiri0 Spiri0 marked this pull request as ready for review July 13, 2025 10:03
Attila Schroeder added 2 commits July 14, 2025 03:27
Comment on lines 228 to 235
let count = countOrWorkgroupSize;

if ( Array.isArray( countOrWorkgroupSize ) ) {

workgroupSize = countOrWorkgroupSize;
count = null;

}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming the project agrees with the API change I'm wondering if we should deprecate the old code path where "count" is passed here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've set it up so that count is used if no workgroup is specified, so that existing user code and examples continue to run as before. But yes, I would also be in favor of removing the count from the code in the long term, with the usual transition time. This also significantly simplifies the compute call in the backend.

Comment on lines 1327 to 1329
const isValid = dispatchSize[ 0 ] > 0 && dispatchSize[ 1 ] > 0 && dispatchSize[ 2 ] > 0;

dispatchSize = isValid ? dispatchSize : computeNodeData;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's valid to set @workgroup_size to 1 or 2 values when defining the parameters inline with the remaining dimensions implicitly being set to "1" (see the spec here) so it would be nice to allow for providing workgroup size of 1 or 2 to align with the native behavior and for the sake of ergonomics:

const computeShader = wgslFn( `fn fragmentShader() -> void {}` );

// all are equivalant
const kernel = computeShader().compute( [ 16 ] );
const kernel = computeShader().compute( [ 16, 1 ] );
const kernel = computeShader().compute( [ 16, 1, 1 ] );

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know that from the W3C documentation, I'll take a look at it, thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be implemented in two ways. Since it makes no difference in performance, I prefer to check the length of the workgroupSize array before calling the wgsl code builder to fill it with ones if smaller than a three-valued array

this.computeShader = this._getWGSLComputeCode( shadersData.compute, ( this.object.workgroupSize || [ 64 ] ).join( ', ' ) );
const workgroupSize = this.object.workgroupSize || [ 64, 1, 1 ];

if ( workgroupSize.length !== 3 ) throw new Error( 'workgroupSize must have 3 elements' );
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise dispatchWorkgroups can also take 1 or 2 parameters with the remainder defaulting to 1:

// all equivelant
renderer.computeAsync( kernel, [ 10 ] );
renderer.computeAsync( kernel, [ 10, 1 ] );
renderer.computeAsync( kernel, [ 10, 1, 1 ] );

I'm curious to hear other opinions

Copy link
Contributor Author

@Spiri0 Spiri0 Jul 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's simple. I can do it tonight ( CET ). If you have any other requests by then, I can include them as well

@@ -2351,6 +2352,38 @@ class Renderer {

}

if ( dispatchSize !== null ) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave this just to compute() node, so we avoid duplicate code.

Copy link
Contributor Author

@Spiri0 Spiri0 Jul 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, you mean move the entire validation to the compute() and just pass the dispatchSize through in the Renderer like before?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's better to just use the workgroupSize from the compute() node itself. I would also suggest reverting to the default value [ 64 ], and removing support for non-array numbers. Then we can add these details in the js-doc about auto-fill by 1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants