Possible race condition on VmaAllocator between defragmentation and regular allocations #313

Open
@sbourasse

Description

We've been using VMA extensively for memory management on projects with heavy memory load. However, after integrating the defragmentation utility, we started getting frequent corruption of internal VMA data. We narrowed the source of this corruption down to concurrent accesses to our VmaAllocator and (apparently) succeeded in preventing it by blocking allocations for the duration of the defragmentation. More context follows, with a couple of questions at the end.

We use a single VmaAllocator from two separate threads. Most of our allocations are done from the rendering thread, which is also the one that runs the defragmentation. We also allocate from the main thread, although less frequently.

Our defragmentation integration looks something like this (it has the same structure as the example in the online documentation):

vmaBeginDefragmentation(mainAllocator, ...);

for (;;)
{
	// VK_SUCCESS here means there is nothing left to move.
	if (vmaBeginDefragmentationPass(mainAllocator, ...) == VK_SUCCESS)
		break;

	CreateResourcesAndCopy();
	WaitForCompletion();

	// VK_SUCCESS here means no more moves are possible; otherwise run another pass.
	if (vmaEndDefragmentationPass(mainAllocator, ...) == VK_SUCCESS)
		break;
}

vmaEndDefragmentation(mainAllocator, ...);

When we run this code under heavy workloads on our rendering thread while our main thread is free to use mainAllocator for other allocations, we reliably trigger memory corruption, specifically in the first 4 bytes of random VmaAllocation objects referenced by our working VmaDefragmentationPassMoveInfo::pMoves array.

Regardless of this particular symptom, we managed to prevent the corruption with a mutex that covers the whole region between vmaBeginDefragmentation() and vmaEndDefragmentation(), as well as every other VMA call on mainAllocator that could run concurrently with defragmentation.

As a side note, our first working fix was to enable VMA_DEBUG_GLOBAL_MUTEX, change the type of VmaMutex::m_Mutex to std::recursive_mutex, and lock it from our side right after vmaBeginDefragmentation(), releasing it right before vmaEndDefragmentation().
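The switch to std::recursive_mutex matters because the thread that holds the lock also makes library calls that take the same lock again. The sketch below shows only that locking pattern. It assumes each library entry point locks a global mutex internally, and g_globalMutex, libraryCall and defragmentLikeRegion are hypothetical stand-ins, not VMA code.

```cpp
#include <mutex>

// Stand-in for the library-wide mutex (hypothetical, not VMA's actual object).
std::recursive_mutex g_globalMutex;

// Stand-in for any library entry point that takes the global lock internally.
int libraryCall(int value) {
    std::lock_guard<std::recursive_mutex> lock(g_globalMutex);
    return value + 1;
}

// Application code that holds the same lock across several library calls.
int defragmentLikeRegion() {
    // Taken by the application after the "begin" call.
    std::lock_guard<std::recursive_mutex> outer(g_globalMutex);
    // Each call re-locks on the same thread, which a recursive mutex allows.
    int result = libraryCall(0);
    result = libraryCall(result);
    // The outer lock is released on return, before the "end" call.
    return result;
}
```

With a plain std::mutex, the nested lock taken on the same thread would be undefined behavior (in practice usually a deadlock), which is presumably why the mutex type had to be changed.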

Does it make sense to treat the region from vmaBeginDefragmentation() to vmaEndDefragmentation() as a critical section with regard to the VmaAllocator in use? If so, is it something that could be enforced from within the library?

Labels: bug (Something isn't working), investigating (Still to be determined whether we work on this)
