buffer update opt: Initialize UboManager class #9331
base: main
Conversation
…)" This reverts commit 49c4a5d.

# Conflicts:
#	filament/src/details/MaterialInstance.cpp
filament/src/details/UboManager.cpp (outdated)

```cpp
// Traverse all MIs and see which of them need slot allocation.
bool needReallocation =
        updateMaterialInstanceAllocations(materialInstances, /*forceAllocateAll*/ false);
```
Instead of the boolean parameter you could use a two-state `enum class` with names that are clearer about what action will be taken.
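A minimal sketch of the suggested signature change. The name `AllocationMode` and its enumerator names are assumptions for illustration, not Filament's actual API:

```cpp
#include <cstdint>

// Hypothetical two-state enum replacing the bool `forceAllocateAll`
// parameter; the type and enumerator names are assumptions.
enum class AllocationMode : uint8_t {
    ALLOCATE_CHANGED_ONLY,  // only allocate slots for MIs that need one
    FORCE_ALLOCATE_ALL,     // reallocate a slot for every MI
};

// Call sites then document themselves:
//   updateMaterialInstanceAllocations(mis, AllocationMode::ALLOCATE_CHANGED_ONLY);
// instead of the opaque:
//   updateMaterialInstanceAllocations(mis, /*forceAllocateAll*/ false);
```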
Done.
So maybe @pixelflinger can elaborate. But did you mean the "parameter" (i.e. forceAllocateAll) or the return value (i.e. allocationResult).
I changed both the parameter and the return value to enums.
filament/src/details/UboManager.cpp (outdated)

```cpp
if (!needReallocation) {
    // No need to grow the buffer, so we can just map the buffer for writing and return.
    mMmbHandle = driver.mapBuffer(mUbHandle, 0, mUboSize,
            MapBufferAccessFlags::WRITE_BIT | MapBufferAccessFlags::INVALIDATE_RANGE_BIT,
```
the INVALIDATE_RANGE_BIT will make the whole mapped range invalid, meaning you'll have to update/write the whole range. Is it your intention here?
I misunderstood the usage of this flag. I think we don't need this flag here, thanks for the catch!
A little bit worried about the performance here. I'm not sure I remember correctly, but I've read articles saying that mapping without this flag can lead to significant performance issues (the driver has to assume you might want to read the data back).
Another idea: we cache the entire buffer content on the CPU side, so whenever we want to update the slots, we update the cached copy. Then we can still use INVALIDATE_RANGE_BIT when mapping. After all the updates are done and before unmapping, we copy the whole cached buffer into the mapped location. wdyt?
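The shadow-buffer idea above could look roughly like this. All names here (`ShadowedUbo`, `updateSlot`, `flush`) are hypothetical; `mapped` stands in for the pointer a `mapBuffer(..., WRITE_BIT | INVALIDATE_RANGE_BIT, ...)` call would return:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Sketch only: keep a full CPU copy of the UBO so that an invalidating
// map is safe, because the entire range is rewritten before unmapping.
struct ShadowedUbo {
    std::vector<std::byte> shadow;  // CPU-side copy of the whole UBO

    // Slot updates go to the shadow copy, never directly to the mapping.
    void updateSlot(std::size_t offset, const void* data, std::size_t size) {
        std::memcpy(shadow.data() + offset, data, size);
    }

    // After all updates, copy the whole shadow into the mapped pointer.
    void flush(void* mapped) const {
        std::memcpy(mapped, shadow.data(), shadow.size());
    }
};
```

The trade-off is one extra full-buffer copy per frame in exchange for never reading from (or preserving) write-mapped GPU memory.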
filament/src/details/UboManager.cpp (outdated)

```cpp
materialInstances.forEach([this, &driver](const FMaterialInstance* mi) {
    const AllocationId allocationId = mi->getAllocationId();
    assert_invariant(BufferAllocator::isValid(allocationId));
    updateSlot(driver, allocationId, mi->getUniformBuffer().toBufferDescriptor(driver));
```
Do you need to update the allocationId of each material instance here?
The allocation id of each instance is updated in updateMaterialInstanceAllocations. At this point, all we need to do is copy the data to its new location in the new UBO.
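Conceptually, that copy step reduces to a per-slot memcpy into the mapped UBO. The shapes below (`BufferDescriptor`, `slotOffset`, the free-function `updateSlot`) are simplified stand-ins, not Filament's real types:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical, simplified descriptor; Filament's real one differs.
struct BufferDescriptor {
    const void* buffer;
    std::size_t size;
};

// Copy one material instance's uniform data into its slot in the
// (already mapped) UBO; `slotOffset` would be derived from the
// AllocationId by the allocator.
void updateSlot(void* mappedUbo, std::size_t slotOffset, const BufferDescriptor& bd) {
    std::memcpy(static_cast<char*>(mappedUbo) + slotOffset, bd.buffer, bd.size);
}
```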
```cpp
    allocationIds.insert(id);
});

mFenceAllocationList.emplace_back(driver.createFence(), std::move(allocationIds));
```
So if a reallocation happens between two frames, will the ids of the first frame still be meaningful in the context of the allocator? Because line 168 will try to release them.
Wow that's a great point. Then I think we should destroy the existing fences during reallocation. Thanks!
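A sketch of that fix under assumed names (`FenceAllocationList`, `FenceHandle`, `dropAllOnReallocation` are all hypothetical): on reallocation, every recorded AllocationId refers to the old allocator layout, so the pending fences are destroyed and the list cleared rather than letting a later fence check release stale ids:

```cpp
#include <cstdint>
#include <functional>
#include <set>
#include <utility>
#include <vector>

using AllocationId = uint32_t;
struct FenceHandle { int id; };  // stand-in for the driver fence handle

struct FenceAllocationList {
    std::vector<std::pair<FenceHandle, std::set<AllocationId>>> entries;

    // Called from the reallocation path: the old ids are meaningless in
    // the new allocator, so destroy the fences and forget the ids.
    void dropAllOnReallocation(const std::function<void(FenceHandle)>& destroyFence) {
        for (auto& [fence, ids] : entries) {
            destroyFence(fence);
            (void)ids;  // intentionally discarded, never released
        }
        entries.clear();
    }
};
```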
Force-pushed 85e27e7 to 218e85c
lgtm with minor comments
filament/src/details/UboManager.cpp (outdated)

```cpp
// Migrate all MI data to the new allocated slots.
for (const auto& materialInstance : materialInstances) {
    materialInstance.second.forEach([this, &driver](const FMaterialInstance* mi) {
        if (!mi->isUsingUboBatching()) return;
```
nit: for consistency, here and elsewhere, use braces with the body on the next line for these early returns:

```cpp
if (!mi->isUsingUboBatching()) {
    return;
}
```
Done.
```cpp
// For MIs whose parameters have been updated, aside from the slot it is being
// occupied by the GPU, we need to preserve an additional slot for it.
```
why is this?
Consider the scenario:
- There are N material instances
- All instances are updated every frame
- The GPU has not finished the previous frame
- The number of free slots is now insufficient for the N new updates
If we reallocate and only add 1 * uboSize for each of the N instances, we will be in the exact same situation on the very next frame. The buffer will be full again, and we'll be forced to reallocate every single frame.
To prevent this, when a material instance fails to find a slot due to a UBO update, I allocate twice its needed space.
This is also why I don't use an early break in the loop (what you asked about in the other comment): I need to iterate through all material instances to get a total count of how many failed due to updates, so I can calculate the correct new buffer size (accommodating 2x space for all of them).
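The sizing rule described above can be written down as a one-liner; the function and parameter names here are hypothetical, and sizes are counted in slots rather than bytes for clarity:

```cpp
#include <cstddef>

// New buffer size after a failed allocation pass: MIs that failed
// *because they were updated this frame* reserve two slots (one still
// owned by the GPU, one for the new data); other failures get one.
std::size_t newBufferSlots(std::size_t currentSlots, std::size_t failedByUpdate,
                           std::size_t failedOther) {
    return currentSlots + 2 * failedByUpdate + failedOther;
}
```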
I guess it is possible that for some devices the latency is fairly high and 2 * uboSize is not enough. It is worth trying to use (number of pending fences + 1) in the future.
What if you just did something simple like always allocating max(oldSize * 1.5, newEstimatedRequiredSize)? Thinking about frame delays and needed space here can get really complicated fast imo.
But my comment is more of a personal preference. This section is basically the heuristics part of the work, so I'm open to anything that's more memory efficient.
> Thinking about frame delays and needed space here can get really complicated fast imo.

Yeah, I agree... so I would prefer we keep the *2 for updating material instances as a safety margin, see how it works in real apps, and adjust as needed.
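For reference, the simpler growth heuristic suggested above, as a sketch (the function name `grow` is hypothetical): geometric growth with a floor at the estimated requirement, which keeps reallocations amortized even when the estimate is occasionally too small:

```cpp
#include <algorithm>
#include <cstddef>

// max(oldSize * 1.5, newEstimatedRequiredSize), in integer arithmetic.
std::size_t grow(std::size_t oldSize, std::size_t newEstimatedRequiredSize) {
    return std::max(oldSize + oldSize / 2, newEstimatedRequiredSize);
}
```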
I'm uncertain under what circumstances the FenceStatus would be Error; I need some suggestions on how to handle the Error status in checkFenceAndUnlockSlots.

Other notes:
- ~~Only partial functions have been completed~~ All of them have been done.
- ~~beginFrame and finishBeginFrame are still WIP~~
- No tests for UboManager yet; I will try to figure out if there's an easier way to test the logic and add the tests later.