Possible race condition when unhooking #12

m417z · 2025-04-04T18:41:35Z

Consider two threads running in parallel as follows:

void (*pOutputDebugStringWOriginal)(PCWSTR lpOutputString) = OutputDebugStringW;

void OutputDebugStringWHook(PCWSTR lpOutputString) {
    pOutputDebugStringWOriginal(L"B\n");
}

void thread1() {
    while (1) {
        OutputDebugStringW(L"A\n");
    }
}

void thread2() {
    Sleep(1000);
    SlimDetoursInlineHook(
        TRUE,
        (PVOID*)&pOutputDebugStringWOriginal,
        OutputDebugStringWHook
    );
    Sleep(1000);
    SlimDetoursInlineHook(
        FALSE,
        (PVOID*)&pOutputDebugStringWOriginal,
        OutputDebugStringWHook
    );
}

If all goes right, the output should be:

A
...
A
B
...
B
A
...

And indeed that's what happens most of the time, but there's a problem here.

Consider what happens when these two parts run in parallel:

Thread 1

    pOutputDebugStringWOriginal(L"B\n");

Thread 2

    SlimDetoursInlineHook(
        FALSE,
        (PVOID*)&pOutputDebugStringWOriginal,
        OutputDebugStringWHook
    );

What can happen is:

Thread 1 reads the value of pOutputDebugStringWOriginal, but doesn't yet use it for the actual call.
Thread 2 freezes all threads, sets pOutputDebugStringWOriginal back to OutputDebugStringW and freezes the trampoline.
Thread 1 is resumed and calls the trampoline, which has already been freed.

I noticed this problem only on ARM64. I think that the reason is that while in x86/x64, the call can be a single instruction:

call [pOutputDebugStringWOriginal]

On ARM64, the call is done with multiple instructions which load the address first, something like:

adrp    x16, Page(pOutputDebugStringWOriginal)
add     x16, x16, PageOffset(pOutputDebugStringWOriginal)
br      x16

On the library consumer side, one solution is to use a lock (perhaps a read/write lock) when accessing the original function pointer and when setting or removing hooks. However, this is not optimal or convenient, and currently isn't documented as a requirement.

On the SlimDetours side, I'm not sure what's the best solution.

If we want to keep the library usage unchanged, we have to make sure that the pointer to call the hook isn't freed. The only way I can think of that's truly guaranteed to be correct is to just never free the trampoline, which is of course suboptimal.
If we want to consider changing the library, perhaps we can introduce a helper function to call the original function, such as:
```
SlimDetoursCallOriginal(pOutputDebugStringWOriginal, L"A\n");
```
Then, we can handle the case of a thread being suspended within this function. This approach is also not optimal, both due to the library change and the non-trivial implementation.

Let me know if you come up with a solution I didn't think of.

The text was updated successfully, but these errors were encountered:

RatinCN · 2025-04-16T15:17:48Z

one solution is to use a lock (perhaps a read/write lock) when accessing the original function pointer and when setting or removing hooks

It's okay for me to add a function to access the original function pointer if it's optional. The caller could choose use it for more safety, or don't use it for more convenient and keep existing code. We could use lock, ref count, ... to make it safer in different scenarios if needed, what do you think?

m417z · 2025-04-26T21:54:49Z

Using a lock can be done without additional changes in SlimDetours, for example:

static FN_EqualRect* g_pfnEqualRect = NULL;
static SRWLOCK g_srwLock = SRWLOCK_INIT;

static
BOOL
WINAPI
Hooked_EqualRect(
    _In_ CONST RECT *lprc1,
    _In_ CONST RECT *lprc2)
{
    AcquireSRWLockShared(&g_srwLock);
    // ...
    BOOL ret = g_pfnEqualRect(lprc1, lprc2);
    // ...
    ReleaseSRWLockShared(&g_srwLock);
    return ret;
}

void TwiceSimpleHook()
{
    AcquireSRWLockExclusive(&g_srwLock);
    hr = SlimDetoursInlineHook(TRUE, (PVOID)&g_pfnEqualRect, Hooked_EqualRect);
    ReleaseSRWLockExclusive(&g_srwLock);
    // ...
    AcquireSRWLockExclusive(&g_srwLock);
    hr - SlimDetoursInlineHook(FALSE, (PVOID)&g_pfnEqualRect, Hooked_EqualRect);
    ReleaseSRWLockExclusive(&g_srwLock);
}

The solution I selected for my specific needs with specific constraints is a combination of #19 (pending review/merge) and m417z/thread-call-stack-scanner.

In any case, since there's no easy way to address this in the library without API changes, I think it'd be a good idea to document this limitation (and #15) in the TechWiki.

m417z mentioned this issue Apr 10, 2025

Add SlimDetoursDetachEx to allow to free the trampoline manually #19

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Possible race condition when unhooking #12

Possible race condition when unhooking #12

m417z commented Apr 4, 2025

RatinCN commented Apr 16, 2025

Uh oh!

m417z commented Apr 26, 2025

Uh oh!

Possible race condition when unhooking #12

Possible race condition when unhooking #12

Comments

m417z commented Apr 4, 2025

Thread 1

Thread 2

RatinCN commented Apr 16, 2025

Uh oh!

m417z commented Apr 26, 2025

Uh oh!