Skip to content

Fix panic when a mesh has skinning attributes but no SkinnedMesh component #18074

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

greeble-dev
Copy link
Contributor

@greeble-dev greeble-dev commented Feb 27, 2025

PR is in draft mode as it needs to be redone following other changes. Currently waiting on #18762 and #18763 to land.


Objective

Summary

My goal was to fix #16929 - a wgpu panic when a mesh with skinning attributes is instantiated as a Mesh3d but without a SkinnedMesh component.

The problem was:

  • setup_morph_and_skinning_defs chooses a skinning bind group layout if the mesh has skinning attributes.
  • SetMeshBindGroup::render chooses a skinning bind group if the mesh is in skin_uniforms.
  • This breaks if a mesh has skinning attributes but isn't in skin_uniforms (because it doesn't have a SkinnedMesh component).

The fix is to make skin_uniforms the source of truth.

This was mostly the case already - the exception was setup_morph_and_skinning_defs. So I changed the pipeline specialization functions to check skin_uniforms, set a new MeshPipelineKey::SKINNED, and then pick that up in setup_morph_and_skinning_defs.

While working this through I noticed a few more issues.

Incorrect Pipeline Caches

The pipeline specialization used RenderMeshInstanceFlags::HAS_PREVIOUS_SKIN to decide if the mesh needed motion blur. But that flag is set by the set_mesh_motion_vector_flags system, which can run after specialization. So in some cases the wrong pipeline was cached and skinned meshes had no motion blur.

This is fixed as a side effect of using skin_uniforms as the source of truth. It also means RenderMeshInstanceFlags::HAS_PREVIOUS_SKIN can be removed, and set_mesh_motion_vector_flags no longer has to iterate over all skinned entities.

The bad news is that HAS_PREVIOUS_MORPH follows a similar pattern, so I suspect motion blur is broken on morph targets in some cases.

Previous Buffers Too Small

When testing with skins_use_uniform_buffers forced on I got a panic about SkinUniforms::prev_buffer being too small for the mapped range. I think the problem is that prepare_skins grows current_buffer without also growing prev_buffer, so it's too small for one frame.

I've changed prepare_skins to grow both buffers. However, this will cause a single frame visual glitch as the new prev_buffer is initialised with the current frame's joints. For now I think reducing a panic to a glitch is sufficient.

I suspect there's also a bug where prev_buffer has wrong transforms the first frame after a skin is added. And at the last minute I realised MeshPipelineKey::HAS_PREVIOUS_SKIN is probably obsolete, since all skins now have a previous. I might chase these down in a follow-up PR.

System Parameter Limits

When adding a skin_uniforms parameter to systems like specialize_material_meshes I hit a limit on the number of parameters. I fixed this by moving some parameters into a SystemParam shared between specialize systems (85d1a00).

New Test

The PR adds a test that repros most of the issues. It's designed to be run on CI to detect panics, but can be visually inspected as well. I haven't actually added it to the CI - I wasn't sure whether to do that in this PR or a follow-up.

image

The test currently spams warnings as it repros a bug not addressed by this PR (SkinnedMesh without skinning attributes - the inverse of #16929).

Follow Ups

#9360 made the glTF importer try to skip any meshes that would triggered the bug. Arguably this is no longer necessary, but that's best handled in a separate PR as it may be contentious.

Performance

Tested with many_foxes. The cost of set_mesh_motion_vector_flags is much reduced (8.97us -> 0.3us).

Testing

cargo run --example invalid_skinned_mesh
cargo run --example animated_mesh
cargo run --example custom_skinned_mesh

Tested on Win10/Vulkan/Nvidia, Wasm/Chrome/WebGL2/Nvidia.

…fixed forward mode missing passes required for motion blur.
- Removed render mode cycling and motion blur toggling - all renderer paths can be tested just by enabling motion blur.
- Added a text overlay documenting each mesh.
- Various tidy ups.
…h for whether a mesh is skinned. This meant adding `MeshPipelineKey::SKINNED`.

- Fixed warning spam by making `extract_skins` check if the mesh has skinning attributes.
- Fixed panic when using uniform buffers for skinning. If `SkinUniforms::current_buffer` is grown the `SkinUniforms::prev_buffer` should grow too.
size: new_size,
mapped_at_creation: false,
});
// TODO: This is wrong. We should be using last frame's data for prev_buffer.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awkward TODO - see "Previous Buffers Too Small" section.

@alice-i-cecile alice-i-cecile added C-Bug An unexpected or incorrect behavior A-Rendering Drawing game state to the screen P-Crash A sudden unexpected crash A-Animation Make things move and change over time S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Feb 27, 2025
@alice-i-cecile alice-i-cecile added this to the 0.16 milestone Feb 27, 2025
@alice-i-cecile
Copy link
Member

The fix is to make extract_skins check the attributes. Unfortunately that adds complication and probably a performance cost (see below), which is annoying when it only fixes a case that's not actually useful. I can drop/separate this change if it's controversial.

Let's drop that; the rest of this is a clear improvement that we shouldn't let get bogged down.

@greeble-dev
Copy link
Contributor Author

I've reverted the attribute checks in extract_skins. This removes the performance issue. It also means the invalid_skinned_mesh example will now spam warnings, but I'm assuming that's a problem for a follow up PR.

I'll investigate using SystemParams tomorrow.

@greeble-dev
Copy link
Contributor Author

I've changed the specialize systems to take a SpecializeParams struct. This is mainly to dodge the limit on system parameters, but is arguably a bit cleaner too. I could have put more shared params in there but decided to keep things minimal for this PR.

Note that the struct is pub because specialize_material_meshes is pub. I suspect that's wrong - it should be pub(crate) at most? But again, keeping things minimal.

@alice-i-cecile alice-i-cecile removed this from the 0.16 milestone Mar 9, 2025
@greeble-dev greeble-dev marked this pull request as draft April 4, 2025 08:19
@greeble-dev
Copy link
Contributor Author

I've pulled one of the fixes out to #18715. Also put this PR into draft mode - once the other PR shakes out I'll need to update the description and fix conflicts.

github-merge-queue bot pushed a commit that referenced this pull request Apr 4, 2025
## Objective

Fix motion blur not working on skinned meshes.

## Solution

`set_mesh_motion_vector_flags` can set
`RenderMeshInstanceFlags::HAS_PREVIOUS_SKIN` after specialization has
already cached the material. This can lead to
`MeshPipelineKey::HAS_PREVIOUS_SKIN` never getting set, disabling motion
blur.

The fix is to make sure `set_mesh_motion_vector_flags` happens before
specialization.

Note that the bug is fixed in a different way by #18074, which includes
other fixes but is a much larger change.

## Testing

Open the `animated_mesh` example and add these components to the
`Camera3d` entity:

```rust
MotionBlur {
    shutter_angle: 5.0,
    samples: 2,
    #[cfg(all(feature = "webgl2", target_arch = "wasm32", not(feature = "webgpu")))]
    _webgl2_padding: Default::default(),
},
#[cfg(all(feature = "webgl2", target_arch = "wasm32", not(feature = "webgpu")))]
Msaa::Off,
```

Tested on `animated_mesh`, `many_foxes`, `custom_skinned_mesh`,
Win10/Nvidia with Vulkan, WebGL/Chrome, WebGPU/Chrome. Note that testing
`many_foxes` WebGL requires #18715.
mockersf pushed a commit that referenced this pull request Apr 4, 2025
## Objective

Fix motion blur not working on skinned meshes.

## Solution

`set_mesh_motion_vector_flags` can set
`RenderMeshInstanceFlags::HAS_PREVIOUS_SKIN` after specialization has
already cached the material. This can lead to
`MeshPipelineKey::HAS_PREVIOUS_SKIN` never getting set, disabling motion
blur.

The fix is to make sure `set_mesh_motion_vector_flags` happens before
specialization.

Note that the bug is fixed in a different way by #18074, which includes
other fixes but is a much larger change.

## Testing

Open the `animated_mesh` example and add these components to the
`Camera3d` entity:

```rust
MotionBlur {
    shutter_angle: 5.0,
    samples: 2,
    #[cfg(all(feature = "webgl2", target_arch = "wasm32", not(feature = "webgpu")))]
    _webgl2_padding: Default::default(),
},
#[cfg(all(feature = "webgl2", target_arch = "wasm32", not(feature = "webgpu")))]
Msaa::Off,
```

Tested on `animated_mesh`, `many_foxes`, `custom_skinned_mesh`,
Win10/Nvidia with Vulkan, WebGL/Chrome, WebGPU/Chrome. Note that testing
`many_foxes` WebGL requires #18715.
github-merge-queue bot pushed a commit that referenced this pull request Apr 6, 2025
## Objective

Fix  #18714.

## Solution

Make sure `SkinUniforms::prev_buffer` is resized at the same time as
`current_buffer`.

There will be a one frame visual glitch when the buffers are resized,
since `prev_buffer` is incorrectly initialised with the current joint
transforms.

Note that #18074 includes the same fix. I'm assuming this smaller PR
will land first.

## Testing

See repro instructions in #18714. Tested on `animated_mesh`,
`many_foxes`, `custom_skinned_mesh`, Win10/Nvidia with Vulkan,
WebGL/Chrome, WebGPU/Chrome.
mockersf pushed a commit that referenced this pull request Apr 8, 2025
## Objective

Fix  #18714.

## Solution

Make sure `SkinUniforms::prev_buffer` is resized at the same time as
`current_buffer`.

There will be a one frame visual glitch when the buffers are resized,
since `prev_buffer` is incorrectly initialised with the current joint
transforms.

Note that #18074 includes the same fix. I'm assuming this smaller PR
will land first.

## Testing

See repro instructions in #18714. Tested on `animated_mesh`,
`many_foxes`, `custom_skinned_mesh`, Win10/Nvidia with Vulkan,
WebGL/Chrome, WebGPU/Chrome.
@greeble-dev greeble-dev added S-Waiting-on-Author The author needs to make changes or address concerns before this can be merged and removed S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels May 5, 2025
github-merge-queue bot pushed a commit that referenced this pull request Jul 7, 2025
## Objective

Add a test that would have caught #16929 and #18712.

## Solution

The PR adds a `test_invalid_skinned_mesh` example that creates various
valid and invalid skinned meshes. This is designed to catch panics via
CI, and can be inspected visually. It also tests skinned meshes + motion
blur.


![417958065-37df8799-b068-48b8-87a4-1f144e883c9c](https://github.com/user-attachments/assets/22b2fa42-628b-48a4-9013-76d6524cecf5)

The screenshot shows all the tests, but two are currently disabled as
they cause panics. #18074 will re-enable them.

### Concerns

- The test is not currently suitable for screenshot comparison.
- I didn't add the test to CI. I'm a bit unsure if this should be part
of the PR or a follow up discussion.
- Visual inspection requires understanding why some meshes are
deliberately broken and what that looks like.
- I wasn't sure about naming conventions. I put `test` in the name so
it's not confused with a real example.

## Testing

```
cargo run --example test_invalid_skinned_mesh
```

Tested on Win10/Nvidia, across Vulkan, WebGL/Chrome, WebGPU/Chrome.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Animation Make things move and change over time A-Rendering Drawing game state to the screen C-Bug An unexpected or incorrect behavior P-Crash A sudden unexpected crash S-Waiting-on-Author The author needs to make changes or address concerns before this can be merged
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

Bevy Panics when trying to load a single mesh from gltf file containing armature
2 participants