Discussion: Goal and Tasks
The main goal is to efficiently cull objects that are fully hidden behind others, in order to avoid wasting GPU resources on unnecessary rendering.
Key tasks:
- Reduce draw calls and pixel workload.
- Organize hierarchical tests (Hi-Z) to reject large clusters of objects.
- Consider API limitations: WebGL2 lacks full asynchronous access to GPU results, while WebGPU provides more flexible access through buffers and query sets.
Methods in WebGL2
- Occlusion Query (EXT_occlusion_query_boolean / WebGL2 core)
  Reports only whether any sample passed the depth test (ANY_SAMPLES_PASSED or ANY_SAMPLES_PASSED_CONSERVATIVE), not a fragment count. Problem: results are asynchronous. They become available only in a later frame and have to be polled, so visibility decisions lag behind the camera.
  Mostly useful for coarse-level culling (e.g., “object is completely invisible / at least one pixel is visible”).
- Hierarchical Z (Hi-Z) via textures and mipmaps
  Implemented manually: render the current depth buffer into a texture, then build a mip-chain in which each level stores the farthest depth of the texels it covers (max with a standard depth range, min with reversed Z). In the vertex shader (WebGL2 has no geometry shaders), AABBs can be tested against Hi-Z. This is a fully GPU-based solution, avoiding CPU queries.
  Downside: requires an additional pipeline (depth render → mipmap generation → testing).
- Transform Feedback for GPU-based culling
  Bounding volumes are tested against Hi-Z in the vertex stage, with a per-object visibility result written back to a buffer using transform feedback. Without geometry shaders every input produces exactly one output record, so the result is a list of flags rather than a compacted list. This still lets the GPU determine visibility without CPU involvement.
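The Hi-Z build and test steps described above can be sketched on the CPU as a reference model. This is only an illustration of the logic a shader would run, not engine code: the names `DepthLevel`, `ScreenRect`, `buildHiZ` and `isOccluded` are hypothetical, and the sketch assumes a standard depth range (0 = near, 1 = far), where the conservative reduction keeps the largest (farthest) depth per tile. With reversed Z the reduction and the comparison would flip.

```typescript
// CPU reference model of a Hi-Z pyramid (assumption: depth 0 = near, 1 = far).
interface DepthLevel {
  width: number;
  height: number;
  data: Float32Array; // row-major depth values
}

// Screen-space bounds of an object in base-level pixels (integers, inclusive),
// plus the depth of its point nearest to the camera.
interface ScreenRect {
  minX: number; minY: number; maxX: number; maxY: number;
  nearestDepth: number;
}

// Each output texel keeps the farthest depth of the (up to) 2x2 texels it covers.
// Sizes are rounded up so odd dimensions never drop a row or column.
function downsampleMax(src: DepthLevel): DepthLevel {
  const width = Math.max(1, Math.ceil(src.width / 2));
  const height = Math.max(1, Math.ceil(src.height / 2));
  const data = new Float32Array(width * height);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      let farthest = 0;
      for (let dy = 0; dy < 2; dy++) {
        for (let dx = 0; dx < 2; dx++) {
          const sx = Math.min(src.width - 1, 2 * x + dx);
          const sy = Math.min(src.height - 1, 2 * y + dy);
          farthest = Math.max(farthest, src.data[sy * src.width + sx]);
        }
      }
      data[y * width + x] = farthest;
    }
  }
  return { width, height, data };
}

// Build the full chain from the base depth buffer down to a 1x1 level.
function buildHiZ(base: DepthLevel): DepthLevel[] {
  const levels: DepthLevel[] = [base];
  let last = base;
  while (last.width > 1 || last.height > 1) {
    last = downsampleMax(last);
    levels.push(last);
  }
  return levels;
}

// An object is occluded only if its nearest depth lies behind the farthest
// occluder depth in every Hi-Z texel its screen rectangle touches.
function isOccluded(hiz: DepthLevel[], rect: ScreenRect): boolean {
  const extent = Math.max(rect.maxX - rect.minX + 1, rect.maxY - rect.minY + 1);
  // Pick a coarse level so that only a handful of texels need to be read.
  const level = Math.min(hiz.length - 1, Math.max(0, Math.ceil(Math.log2(extent))));
  const mip = hiz[level];
  const x0 = Math.max(0, rect.minX >> level);
  const y0 = Math.max(0, rect.minY >> level);
  const x1 = Math.min(mip.width - 1, rect.maxX >> level);
  const y1 = Math.min(mip.height - 1, rect.maxY >> level);
  for (let y = y0; y <= y1; y++) {
    for (let x = x0; x <= x1; x++) {
      if (rect.nearestDepth <= mip.data[y * mip.width + x]) {
        return false; // possibly visible through this tile
      }
    }
  }
  return true;
}
```

The test is conservative by construction: it can report a hidden object as visible (for example when its rectangle straddles an occluder edge), but never the reverse, which is the property a culling pass needs.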
Methods in WebGPU
- Query Sets (Occlusion Queries)
  WebGPU natively supports occlusion queries through query sets. Bounding proxy objects can be drawn, and their results resolved into GPU buffers with resolveQuerySet. Reading them back on the CPU is still asynchronous (mapAsync returns a promise), but the results can also stay on the GPU, so the CPU does not have to wait on them.
- Compute Shader-based Culling (GPU-driven rendering)
  Compute shaders test bounding volumes against the Hi-Z depth buffer and write the results into an indirect argument buffer. Visible objects are then drawn with indirect draw calls (drawIndirect / drawIndexedIndirect), so the GPU decides what gets rendered. This is generally the preferred approach in modern renderers.
- Hierarchical Z (Hi-Z) on GPU
  Similar to WebGL2, but more efficient: the depth mip-chain can be generated using compute shaders and then used in further compute passes for visibility testing.
  Combined with indirect rendering, this provides a scalable solution.
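To make the indirect-argument step concrete, here is a small CPU reference model of what a culling compute pass might write. It is a sketch under assumptions, not an actual implementation: `cullToIndirect`, `CullResult` and the `isVisible` callback are hypothetical names, and the sketch assumes all instances share one mesh so a single indexed indirect record is enough. The five-value layout (indexCount, instanceCount, firstIndex, baseVertex, firstInstance) is the one WebGPU's `drawIndexedIndirect` reads.

```typescript
// CPU reference model of a GPU culling pass that fills indirect draw arguments.
interface CullResult {
  // [indexCount, instanceCount, firstIndex, baseVertex, firstInstance]
  // (baseVertex is a signed value on the GPU; it stays 0 in this sketch)
  indirectArgs: Uint32Array;
  // Compacted ids of visible instances; only the first instanceCount entries are valid.
  visibleInstances: Uint32Array;
}

function cullToIndirect(
  instanceCount: number,
  indexCount: number,
  isVisible: (instance: number) => boolean // stand-in for the Hi-Z test
): CullResult {
  const indirectArgs = new Uint32Array([indexCount, 0, 0, 0, 0]);
  const visibleInstances = new Uint32Array(instanceCount);
  for (let i = 0; i < instanceCount; i++) {
    if (isVisible(i)) {
      // On the GPU this would be an atomic add on the instanceCount field,
      // with each shader invocation handling one instance.
      const slot = indirectArgs[1]++;
      visibleInstances[slot] = i;
    }
  }
  return { indirectArgs, visibleInstances };
}
```

In a real pipeline both arrays would live in GPU buffers: the render pass would call `drawIndexedIndirect(indirectBuffer, 0)`, and the vertex shader would use the compacted list to look up per-instance data, so the CPU never learns how many objects survived culling.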
- pc.EVENT_PRERENDER
- pc.EVENT_POSTRENDER_LAYER
Shader:
using uniform highp sampler2D uSceneDepthMap;