Replies: 1 comment
-
This is being tracked in #2745. Another thing we should look into is offering an option to use 3 shadow splits instead of 4, and making it the default. I find that in most scenes, this is more practical from a performance standpoint. 2 splits is usually too few for desktop games, although it can work nicely in smaller scenes.
-
I have been looking into the rendering performance of Godot 4 recently, and dynamic shadows were one of the major bottlenecks.
Below is a profiler screenshot from one of the scenes I investigated. I am well aware that this scene is not perfect (e.g. LODs could be used for more models), but the same scene runs with more than double the framerate in various other engines using identical (or as close as possible) settings.
Profiler Screenshot
(I added some additional draw command labels to make my life easier)
While Godot already uses some tricks to speed up shadow rendering, it still takes almost a third of the frame time, which really shouldn't be the case. Below are several ideas I came up with while thinking about how to increase performance:
As far as I am aware, the current state of the art is to use virtual shadow maps. This has the benefit of requiring less memory while also resulting in better resolution. In addition, we can use the tiled structure of virtual shadow maps to only update part of the map every frame, allowing us to cull away a larger part of the scene while also spreading the cost over multiple frames.
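To make the "only update part of the map every frame" part concrete, here is a rough sketch of a per-frame tile budget. It is not how Godot or any particular engine implements virtual shadow maps; the class `TiledShadowMap`, its methods, and the `tile_budget` parameter are all hypothetical names I made up for illustration, and a real implementation would also need page tables and physical page allocation.

```python
from collections import deque


class TiledShadowMap:
    """Hypothetical bookkeeping for a tiled shadow map (illustration only)."""

    def __init__(self, tiles_x, tiles_y, tile_budget):
        self.tiles_x = tiles_x
        self.tiles_y = tiles_y
        self.tile_budget = tile_budget  # max tiles re-rendered per frame
        self.pending = deque()          # dirty tiles in FIFO order
        self.queued = set()             # avoids queueing a tile twice

    def mark_dirty(self, x0, y0, x1, y1):
        """Queue every tile overlapped by an inclusive tile-space rectangle."""
        for ty in range(max(y0, 0), min(y1, self.tiles_y - 1) + 1):
            for tx in range(max(x0, 0), min(x1, self.tiles_x - 1) + 1):
                if (tx, ty) not in self.queued:
                    self.queued.add((tx, ty))
                    self.pending.append((tx, ty))

    def tiles_for_this_frame(self):
        """Pop at most tile_budget tiles; the rest wait for later frames."""
        batch = []
        while self.pending and len(batch) < self.tile_budget:
            tile = self.pending.popleft()
            self.queued.discard(tile)
            batch.append(tile)
        return batch


shadow_map = TiledShadowMap(tiles_x=4, tiles_y=4, tile_budget=3)
shadow_map.mark_dirty(0, 0, 1, 1)          # a moving object touched 4 tiles
print(shadow_map.tiles_for_this_frame())   # first 3 tiles
print(shadow_map.tiles_for_this_frame())   # the remaining tile
```

Only objects overlapping the tiles in the current batch would need to be drawn, which is where the extra culling comes from.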
In a pure GPU renderer, we could alternatively cull the scene on the fly on the GPU. We would first render the shadows of some of the larger objects, then pause and compare the bounding boxes of the remaining objects against the interim depth. Any object that lies entirely behind the depth already written cannot change the shadow map and can be skipped. This would allow us to cull a far larger percentage than we currently can.
Sadly this requires a GPU-driven renderer, which is probably not something we will have in the near future.
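The bounding box test itself is simple. Below is a CPU-side sketch of it, assuming a depth convention where smaller values are closer to the light and 1.0 means empty. The function name `is_occluded` and the data layout are hypothetical; on the GPU this would more likely run in a compute shader against a downsampled depth pyramid rather than looping over texels.

```python
def is_occluded(depth, rect, nearest_depth):
    """Return True if drawing the object cannot change the shadow map.

    depth: 2D list of floats (rows of texels), smaller = closer to the light.
    rect: (x0, y0, x1, y1) inclusive texel bounds of the projected bounding box.
    nearest_depth: depth of the bounding box point closest to the light.
    """
    x0, y0, x1, y1 = rect
    farthest_stored = max(
        depth[y][x]
        for y in range(y0, y1 + 1)
        for x in range(x0, x1 + 1)
    )
    # Conservative: every covered texel must already hold a closer occluder.
    return farthest_stored < nearest_depth


# Interim depth after rendering one large occluder over the left two columns.
depth = [[0.2, 0.2, 1.0, 1.0] for _ in range(4)]

print(is_occluded(depth, (0, 0, 1, 1), 0.5))  # fully behind the occluder
print(is_occluded(depth, (1, 1, 2, 2), 0.5))  # partly over empty texels
```

The test is conservative: an object is only skipped when every texel it covers is already closer to the light, so it never removes a shadow that would have been visible.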
The final idea I had was to simply not update all the cascades every frame. While it makes sense (and is relatively cheap) to update cascades 0 and 1 every frame, doing the same for 2 and 3 is significantly more expensive since they include a far larger number of objects. So what if, instead of rendering them every frame, we only do so every second one? If we change how the shadow atlas is cleared, we could update them on alternating frames.
The downside to this is that distant shadows would now only update at half the frame rate, but honestly, I doubt anyone will notice. In my example from above, this would save me 2 ms.
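As a sketch of the scheduling, assuming 4 cascades where the first two are always refreshed and the remaining ones take turns (the function `cascades_to_update` and its parameters are hypothetical, not existing engine code):

```python
def cascades_to_update(frame_index, num_cascades=4, every_frame=2):
    """Return the cascade indices to re-render on this frame."""
    near = list(range(min(every_frame, num_cascades)))
    far = list(range(every_frame, num_cascades))
    if not far:
        return near
    # Round-robin: exactly one distant cascade is refreshed per frame.
    return near + [far[frame_index % len(far)]]


for frame in range(4):
    print(frame, cascades_to_update(frame))
```

With the defaults, cascade 2 is rendered on even frames and cascade 3 on odd frames. This only works if the atlas region of the skipped cascade keeps its contents from the previous frame, which is why the clearing would have to change.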
I would appreciate any feedback or alternative ideas as to how shadow performance could be increased.