-
You can also use the hardware raytracing APIs for sound. See https://www.eurogamer.net/digitalfoundry-2023-avatar-frontiers-of-pandora-and-snowdrop-the-big-developer-tech-interview.
-
This is very cool technology, and if implemented well it can give many cool features "for free", like natural echoes/reverb and Doppler effects. However, building this is a huge undertaking on the scale of building the graphical renderer, but with much less prior art to learn from. That said, as JMS55 pointed out, some of the graphical rendering pipeline can be reused for audio rendering and run on the GPU.
-
I have definitely considered GPU ray tracing as an option. If a CPU implementation can be made robust and useful, then having an option to use the same system with hardware ray tracing would be a huge win for performance in some circumstances, as you get many more rays and much higher quality. But an advantage of a CPU implementation is that it can run after physics and gameplay updates, while or after draw calls are issued: while the GPU is rendering the frame, the CPU can trace sound waves as the game state is updated, making better overall use of system resources (a rough sketch of this scheduling follows below). I would also like to have propagation through surfaces at some point, but that has big performance implications if done with ray tracing. For that reason I have also considered ray marching, but for me this has many more unknowns, so I'll have to look into it. I'll spend some time reading and learning about engine architecture and previous work in the space, and I'll come back with updates on what I've learned. Feel free to post any more links or suggestions though. My biggest question is still about BVHs: if bevy already has a robust BVH implementation (likely for ray casting and physics) that can handle instancing, many objects, movable objects and so on, then being able to reuse that code would save me a lot of time and make things a lot easier.
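A minimal sketch of how that CPU-side scheduling might look in bevy. The trace_sound_paths system and its (empty) body are invented for illustration; PostUpdate and TransformSystem::TransformPropagate are real bevy items:

```rust
use bevy::prelude::*;
use bevy::transform::TransformSystem;

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        // Gameplay and physics run in Update; scheduling the audio trace in
        // PostUpdate, after transforms have propagated, lets this CPU work
        // overlap the GPU still working through the frame's render commands.
        .add_systems(
            PostUpdate,
            trace_sound_paths.after(TransformSystem::TransformPropagate),
        )
        .run();
}

// Hypothetical system: this is where you would query listeners, sources,
// and collider geometry, then trace rays against a BVH.
fn trace_sound_paths() {}
```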
-
Someone recently made a ray tracing audio plugin for Unreal and Godot; they have a short video with nice little animations of it: https://www.youtube.com/watch?v=u6EuAUjq92k I personally don't believe this should be part of Bevy's core repository, but rather some external crate/plugin, maybe even one that interacts with a physics engine crate like avian for the colliders. First-class support for this would be nice though.
-
Very interesting idea! I have practically no experience in this area, so my take is probably pretty far off, but might it be possible to make this sort of thing fast by modeling it from the point of each sound source instead of from the player/listener? That might enable some optimizations based on physics. Given a sound source and the speed of sound (assuming air as the medium), couldn't you immediately discard processing any sounds that could not have reached any listeners yet? Then, once sufficient time has passed that a sound could have propagated to a listener, cast a ray from the sound to the listener. If it intersects nothing, the listener hears the sound "as is" (given the attenuation effects of distance and mixing with other sounds). If it does intersect something between the sound and a listener, calculate how much attenuation the barrier adds. If it's enough that the sound would be inaudible, you're done. Otherwise, cast a new ray from the edge(s) of the first obstacle back to the listener and repeat. From my naive understanding, that seems like it might be a lot more efficient than casting rays from a listener in all directions and backpropagating when they intersect sound sources. I guess it might not capture reverb, where a sound bounces off several objects before reaching the listener, however. Either way, interesting idea, kinda like PBR but for sound! A rough sketch of the first couple of steps is below.
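A minimal sketch of the early-out and first occlusion test described above. SPEED_OF_SOUND, SoundEvent, and the cast_ray closure are all assumptions of mine, not existing bevy APIs:

```rust
use bevy::math::Vec3;

const SPEED_OF_SOUND: f32 = 343.0; // metres per second in air

struct SoundEvent {
    position: Vec3,
    emitted_at: f32, // game time in seconds
    volume: f32,
}

// Returns the volume the listener hears, or None if the sound is
// inaudible or its wavefront has not arrived yet.
fn audible_volume(
    event: &SoundEvent,
    listener: Vec3,
    now: f32,
    // Hypothetical raycast: returns the attenuation factor of the first
    // barrier between two points, or None if the path is clear.
    cast_ray: impl Fn(Vec3, Vec3) -> Option<f32>,
) -> Option<f32> {
    let distance = event.position.distance(listener);
    // Early out: discard sounds whose wavefront cannot have reached
    // the listener yet at the speed of sound.
    if now - event.emitted_at < distance / SPEED_OF_SOUND {
        return None;
    }
    // Simple inverse-distance attenuation.
    let mut volume = event.volume / distance.max(1.0);
    // If a barrier sits between source and listener, attenuate further.
    if let Some(barrier_factor) = cast_ray(event.position, listener) {
        volume *= barrier_factor;
    }
    (volume > 0.001).then_some(volume)
}
```

The "cast a new ray from the edge of the obstacle" step would then recurse with the already-attenuated volume, stopping once it falls below the audibility threshold.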
-
So I have been thinking about this for a little while, but before I commit to anything substantial I want to get feedback from more experienced contributors.
I know it's long, but if you could do me the kindness of reading through what I have to say and replying with your thoughts on the topic, I would greatly appreciate it.
I find the idea of real time, fully dynamic sound propagation (whenever I say "propagation" from here on I mean "occlusion, reverberation and, optionally, delay caused by the speed of sound") very exciting and enticing. Now most of you are probably thinking "Gee, that sure does sound complex", and you may have a point there, but I think that with a little bit of clever code reuse of other parts of bevy and its libraries, and the already existing knowledge in the community, the scope of the work to be done can be reduced to something quite manageable. "But ray tracing sound sounds performance intensive!" Yes, that too, buuuuut... I have some ideas.
"But why do we even need this?" Well I don't actually use bevy, but I imagine, like most game engines out there without ray traced audio or something like it, that the audio tools in bevy are limited to volumes with parameters like reverb and direct occlusion. In my opinion, especially if you are going for a immersive game or a game where audio localisation is important, this kind of audio just doesn't cut it (source Escape From Tarkov's audio system). The reason I mention tarkov is because it tries to do what is described above, with a combination of bounding volumes, direct occlusion queries and internal logic about how the volumes relate to each other. While a better implementation of such a tequique is certainly possible, if Battlestate's (Escape From Tarkov's developer) army of 200 + level designers cannot create a decent implementation of this style of spatial audio, despite constant complaints from the player base, then I think expecting the target audience of bevy, solo or very small team indie devs, to do so with such crude tools is ridiculous. So in much the same way that @JMS55 made an argument for a meshlet render on the basis of it's reduction in workload for the game developer (among numerous other benefits), so have I making the argument that a fully dynamic system for sound propagation will take a significant workload off the developer if implemented correctly and if that is something they want in their game.
Some basics about sound
This section is not required reading and is a bit much to take in all at once, but if you do read it you may have an easier time following my implementation ideas later on.
Since, in its final form, this system is ultimately intended to allow a human player to locate a sound in space, I firmly believe in using physical principles in the architecture of the system, so that a human brain that evolved for the real, physical world will be able to intuitively process the incoming sound and localise it appropriately. For that reason I am going to quickly run over some of the physical principles of sound and how it behaves, for anyone unaware.
Here is an illustration of a sound wave (image omitted; it shows a wavefront expanding and bending around a corner):
Note the following:
Sound is omnidirectional, unless it is blocked or redirected by some force or object.
Sound has a wavefront: the outer edge of the sound wave. The wavefront will continue to expand, technically, indefinitely; however, at a certain point the sound will become undetectable among the background noise, and before that it will become inaudible to the human ear.
The image here (omitted) features a wavefront from a large explosion. I use it because it clearly shows the existence of the wavefront expanding omnidirectionally where there is no interference from objects.
One last thing about waves and wavefronts: any point on a wavefront can be considered a sub-wavefront, itself expanding omnidirectionally (this is Huygens' principle). If the sound is moving through an empty space, that means very little. However, when there are obstacles involved, this fact allows waves to expand around corners: sub-fronts that were moments ago constrained by the spherical shape of the wavefront now have empty, low pressure space to start flooding into, and so they will. Refer back to the first image and you can see an illustration of this as the sound bends around the corner.
Implementation and optimization
Ok, so now I am going to lay out my high level idea of how this system would work in principle. I acknowledge that there are holes even in my high level conception that I don't have the necessary breadth of knowledge to fill, which is exactly why I am creating this post: I want dissection and suggestions, at least as long as there is interest in this idea to begin with.
Steps:
In these illustrations (omitted here) the listener is red, the rays are black, and surfaces are yellow.
1. Trace some number of rays in all directions around the camera/player/listener. These in reality would be a "Sound Path" data structure, with links to the previous and next steps in the sound's path (see the sketch after this list).
2. At some interval in the distance traveled by the ray, cast more rays omnidirectionally from that point. This allows some simulation of that "bending" behavior, discussed earlier, that real world sound exhibits.
3. If and when a ray intersects a solid object, it scatters yet more rays omnidirectionally in a hemisphere aligned with the surface normal.
4. Eventually this has to stop. What exact mechanism should be used to stop the exponential growth of sound paths I am not yet sure; likely some maximum number of subdivisions/bounces a path can make before stopping.
5. Finally, there is some kind of resolution step where data is aggregated from all of the sound paths and used to attenuate reverb, volume and so on. This step I am not so clear on at the moment, but the sound paths themselves would be traced backwards from a source once found, calculating the contribution of that source at each split in the sound path along the way to ensure correct volume.
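To make steps 1, 3 and 4 concrete, here is a rough sketch of what the "Sound Path" data and the bounce cap might look like. Everything in it is my own assumption (the names, the flat 0.5 energy loss per bounce, the hemisphere sampling), not an existing bevy API:

```rust
use bevy::math::Vec3;

const MAX_BOUNCES: u32 = 4;

struct SoundPathNode {
    position: Vec3,
    direction: Vec3,
    // Index of the node this one was spawned from (None for the listener),
    // giving the backwards links needed by the resolution step (step 5).
    parent: Option<usize>,
    // Accumulated attenuation along the path so far, in 0.0..=1.0.
    attenuation: f32,
    bounces: u32,
}

// Step 3: on hitting a surface, scatter new rays in the hemisphere
// aligned with the surface normal, capped by MAX_BOUNCES (step 4).
fn scatter_on_hit(
    nodes: &mut Vec<SoundPathNode>,
    parent_index: usize,
    hit_point: Vec3,
    surface_normal: Vec3,
    scatter_count: usize,
) {
    let parent = &nodes[parent_index];
    if parent.bounces >= MAX_BOUNCES {
        return; // step 4: stop the exponential growth of paths
    }
    let bounces = parent.bounces + 1;
    let attenuation = parent.attenuation * 0.5; // assumed flat energy loss
    for i in 0..scatter_count {
        // Naive direction choice; a real implementation would sample the
        // hemisphere properly (e.g. cosine-weighted).
        let dir = (surface_normal + jitter(i)).normalize();
        nodes.push(SoundPathNode {
            position: hit_point,
            direction: dir,
            parent: Some(parent_index),
            attenuation,
            bounces,
        });
    }
}

// Deterministic placeholder for a random unit offset (length < 1 keeps the
// resulting direction on the normal's side of the surface).
fn jitter(i: usize) -> Vec3 {
    let a = i as f32 * 2.399963; // golden-angle spacing
    Vec3::new(a.cos(), 0.5, a.sin()).normalize() * 0.8
}
```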
Optimization
So that sounds quite computationally expensive, doesn't it? Well, I have a few ideas on how to fix that.
So that, finally, is it for now. I have some other ideas, like sound functions for objects that act like materials for sound, defining the acoustic properties at various points on the object (see the sketch below for a rough idea). But for now I think some core ideas need to be nailed down, so let's do that first. I am once again asking for any and all feedback and questions, so we can figure out the scope and viability of this feature together.
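A hypothetical sketch of what such a "sound material" might look like as a bevy component, by analogy with PBR materials. The type, its fields, and the numbers are all invented for illustration:

```rust
use bevy::prelude::*;

// Per-entity acoustic properties, analogous to a PBR material.
#[derive(Component)]
struct AcousticMaterial {
    // Fraction of incoming energy absorbed on contact (0 = mirror, 1 = dead).
    absorption: f32,
    // Fraction of energy that passes through the surface, for propagation
    // through walls.
    transmission: f32,
    // How diffusely the remainder scatters (0 = specular, 1 = fully diffuse).
    scattering: f32,
}

fn spawn_brick_wall(mut commands: Commands) {
    commands.spawn(AcousticMaterial {
        absorption: 0.3,
        transmission: 0.05,
        scattering: 0.8,
    });
}
```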