Camera Driven Rendering #19700
tychedelia started this conversation in Ideas
Bevy's rendering APIs have been described as "camera driven" several times, but it's not always clear exactly what that means. In Bevy's rendering system, the "camera" entity has a privileged position and provides the following behaviors/data (a brief spawn sketch follows the list):

- ViewTarget
- RenderLayers
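For illustration, here is a rough sketch of spawning a camera entity that carries some of this privileged data, assuming Bevy 0.15-style required components (exact paths and component shapes vary by version):

```rust
use bevy::prelude::*;
use bevy::render::view::RenderLayers;

fn setup(mut commands: Commands) {
    // The camera entity carries the view transform, the projection, the hdr
    // hint for the internal texture, and the RenderLayers filter deciding
    // which entities it draws.
    commands.spawn((
        Camera3d::default(),
        Camera { hdr: true, ..default() },
        RenderLayers::layer(1),
        Transform::from_xyz(0.0, 2.0, 8.0).looking_at(Vec3::ZERO, Vec3::Y),
    ));
}
```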
Mixed metaphors
Some discussion in #16248 brings up a number of metaphors for what a camera is: the RenderTarget is the film, although the discussion notes the conceptual imprecision due to the hdr field, which hints at the presence of the internal texture, and further introduces the idea that the camera is split into two parts, the camera and the lens.

UI presents some other challenges to the metaphor. While UI does have an implicit orthographic projection and is superficially similar to 2d rendering in many ways, it raises questions as to what a UI camera is "looking at" and being rendered on.
Currently, a UI camera is a virtual view ("subview") that is tied to a 2d/3d camera. In this sense, if the lens determines "how" the scene looks, the UI camera is like an additional filter or color gel that is placed in front of the camera, i.e. not really a camera at all.
In #15256, when discussing world-space UI, aevyrie argues that the camera metaphor obscures possible implementations, where it could make sense to parent a render surface to some other entity already in worldspace and have things "just work."
Other proposals, such as a hypothetical CameraFullscreen, which would be a way to run a render graph with no geometry, i.e. a simple way for users to write fullscreen shaders, continue to stretch the metaphor.
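A purely hypothetical sketch of how that might look; CameraFullscreen does not exist in Bevy, and the shape of the API here is an assumption made only to illustrate the stretch:

```rust
use bevy::prelude::*;
use bevy::render::render_resource::Shader;

/// Hypothetical component, not part of Bevy: a "camera" whose render graph is
/// a single fullscreen pass running the given shader, with no scene geometry.
#[derive(Component)]
struct CameraFullscreen {
    shader: Handle<Shader>,
}

fn setup(mut commands: Commands, asset_server: Res<AssetServer>) {
    // A camera in name only: there is nothing being "looked at", just a
    // shader to run over the whole target.
    commands.spawn((
        Camera::default(),
        CameraFullscreen {
            shader: asset_server.load("shaders/fullscreen_effect.wgsl"),
        },
    ));
}
```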
The problem with compositing

I'd like to argue that the idea of camera driven rendering is fundamentally sound, but suffers from a critical conceptual ambiguity with respect to what the film medium is. More precisely, the fact that the camera both captures and composites is a significant problem for the API, particularly when using multi-camera setups.
The hidden "internal texture"
Importantly, RenderTarget is not the film; it is something more like the print the film is developed onto. CameraOutputMode is the developer/fixer. The film is, in our current API, not directly exposed to the user.

Every ExtractedView has a ViewTarget, which contains two logical textures: the "main" texture, which is used as the color attachment for most render passes, and the "out" texture, which is the RenderTarget, typically a swapchain texture. Importantly, the out texture is only used in the final step of the render graph, where the upscaling node blits (i.e. composites) the main texture to the out texture. In other words, the user never sees the main texture itself, which is why it can be said to be "internal."
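A conceptual sketch of that split; these are not Bevy's actual definitions, just the shape of the two logical textures:

```rust
// Conceptual sketch only, NOT Bevy's real types: it just illustrates the two
// logical textures a ViewTarget carries.
struct GpuTexture; // placeholder for a wgpu texture/view

struct ConceptualViewTarget {
    /// The "film": the internal color attachment most render passes write to.
    /// Never handed to the user directly, hence "internal".
    main_texture: GpuTexture,
    /// The "print": the camera's RenderTarget, typically a swapchain texture.
    /// Only touched at the end of the graph, when the upscaling node blits
    /// (composites) main -> out.
    out_texture: GpuTexture,
}
```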
Jasmine notes in #16248 that this is particularly confusing because, for example, the hdr field on Camera actually has nothing to do with the RenderTarget. This has also led to the proliferation of some more niche components like CameraMainTextureUsages that allow configuring the internal texture for other uses in the render graph.
The sharp edges of multi-cam

Users consistently run into issues when using multiple cameras. When two cameras share the same HDR and MSAA settings, the renderer will "helpfully" re-use the same cached texture for both of them, including disabling clears of the texture for every camera after the first. This is potentially a performance win, and in many cases it results in the behavior users expect, where one camera can easily draw on top of another camera's output, but it has a number of unfortunate consequences.
Additionally, this texture is generally not configurable, which poses issues for more niche uses that require different texture formats or would like to use the texture in other contexts.
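For reference, the common layering pattern today looks roughly like this (assuming a recent Bevy where clear_color lives on Camera; depending on the version, the shared texture's clear is disabled for you or has to be turned off explicitly as shown):

```rust
use bevy::prelude::*;
use bevy::render::camera::ClearColorConfig;
use bevy::render::view::RenderLayers;

fn setup(mut commands: Commands) {
    // First camera: renders the world and clears the shared texture.
    commands.spawn((Camera2d, Camera { order: 0, ..default() }));

    // Second camera: same hdr/MSAA settings, so it shares the cached main
    // texture and draws on top of the first camera's output. Disabling the
    // clear keeps it from wiping that output first.
    commands.spawn((
        Camera2d,
        Camera {
            order: 1,
            clear_color: ClearColorConfig::None,
            ..default()
        },
        RenderLayers::layer(1),
    ));
}
```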
Proposal: Camera Graph
My proposal is that we embrace camera driven rendering by understanding compositing as another kind of camera. More specifically, I want to argue that we should understand cameras as forming a kind of graph that has both inputs and outputs.
Another way to put this is that a camera should be considered a logical render pass. This is the CameraSubGraph component / the "lens" of the camera. Rather than imagining that the user should configure a single monolithic render graph that accomplishes all their needs in a single camera, we should be encouraging users to create multiple cameras.

Making the relationship between cameras itself a graph can help define how textures (and potentially other resources) should flow through rendering at a more coarse-grained level, and it makes creative decisions with respect to compositing explicit. Users who want fine-grained control for maximum performance and resource efficiency can still configure a single camera/render graph.
By having cameras accept texture inputs and making compositing a separate step, we can drastically simplify the conceptual model: cameras have film, and they can also accept film from another camera to do a double exposure. By making the actual render texture explicit, I think it will be easier to teach patterns for multi-camera rendering. And, while configuring multiple cameras may be a bit of a pain today, this kind of pattern is well suited for asset-driven configuration (BSN) and editor tooling.
API Sketch
This isn't intended as a concrete proposal but just a sketch of what an API might look like:
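One hypothetical shape this could take (every component name below is made up purely for illustration; RenderTarget is the only real Bevy type):

```rust
use bevy::prelude::*;
use bevy::render::camera::RenderTarget;

/// Hypothetical: which logical render pass ("lens") this camera runs.
#[derive(Component)]
enum CameraSubGraph {
    Core3d,
    Ui,
    Compositing,
}

/// Hypothetical: textures this camera reads, i.e. film handed to it by
/// other cameras.
#[derive(Component)]
struct CameraInputs(Vec<Entity>);

/// Hypothetical: where this camera's developed film ends up.
#[derive(Component)]
enum CameraOutput {
    /// Composite to a window surface or image (today's RenderTarget).
    Target(RenderTarget),
    /// Hand the texture to another camera as one of its inputs.
    IntoCamera(Entity),
}

fn setup(mut commands: Commands) {
    // A 3d camera renders the world into its own texture.
    let world_cam = commands
        .spawn((Camera3d::default(), CameraSubGraph::Core3d))
        .id();

    // A UI camera renders UI into its own texture.
    let ui_cam = commands.spawn(CameraSubGraph::Ui).id();

    // A compositing camera blends both textures and writes the result to the
    // primary window.
    commands.spawn((
        CameraSubGraph::Compositing,
        CameraInputs(vec![world_cam, ui_cam]),
        CameraOutput::Target(RenderTarget::default()),
    ));
}
```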
As a logical graph:
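A purely illustrative rendering of that setup as a graph, with textures flowing along the edges:

```
[3d world camera] --texture--\
                              +--> [compositing camera] --> window (RenderTarget)
[UI camera] --------texture--/
```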
Drawbacks
- … CompositingCamera in the scene, by default a camera's output goes to that input), but it makes the default case a bit more complicated.
- … CameraSubGraph and passes storage buffers into the next camera, i.e. making cameras also accept buffers as inputs/outputs.