Skip to content

[GEN][ZH] Fix Mismatch in Release Build when Logging is enabled #759

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

helmutbuhler
Copy link

This PR disables some logging in Releasebuild when Logging is enabled that causes a Mismatch. It'll be nice to have logging work in release builds to debug mismatch issues.

With this fix applied, you can load a retail replay with AI players and play it back in a Release build with logging without mismatches.

@xezon
Copy link

xezon commented Apr 24, 2025

This is wild. The log is not modifying anything. Is this perhaps related to compiler optimizations that the added logging will affect?

@xezon xezon added Minor Severity: Minor < Major < Critical < Blocker Debug Is mostly debug functionality Fix Is fixing something labels Apr 24, 2025
@helmutbuhler
Copy link
Author

Yes, sin and cos are internally implemented with doubles. If the logging is not there, the optimizer removes the casts to float and uses the doubles in the final calculation. If the logging is there, it doesn't remove the cast because the float is logged, and then also uses the float in the gamelogic.
This happens in quite a few places, but luckily only this log call seems to cause a gamelogic change.

@xezon
Copy link

xezon commented Apr 24, 2025

This is outrageous lol. Nice. Can this perhaps also be fixed by moving the logging a few lines?

@helmutbuhler
Copy link
Author

I highly doubt it. Also keep in mind that some other log calls are already guarded by #if defined(_DEBUG) || defined(_INTERNAL). One of them in the same file. The original devs were possibly aware of this issue and just missed this one Log call.

@roossienb
Copy link

Does this mean that after the fix, the debug version of the game mismatches against the release version of the game? That could hamper testing efforts. Or does it mismatch already?

@helmutbuhler
Copy link
Author

Yes, VC6 Debug and Internal Version immediately mismatch against retail. Fixing that would be very difficult. I initially planned to do that and published that work here: #540
But it's so much work, it's not really feaseble.

@xezon
Copy link

xezon commented Apr 25, 2025

I tested this change with replay of https://github.com/roossienb/GeneralsWiki/blob/e52ca6618f90cef9ece00cbb7abacc293f580b12/SourceCode/Debug/files/Zerohour-skirmish.zip and it mismatched with vs6 release build + RTS_DEBUG_LOGGING=ON + RTS_DEBUG_CRASHING=ON. Did I do something wrong?

@helmutbuhler
Copy link
Author

I tested this change with replay of https://github.com/roossienb/GeneralsWiki/blob/e52ca6618f90cef9ece00cbb7abacc293f580b12/SourceCode/Debug/files/Zerohour-skirmish.zip and it mismatched with vs6 release build + RTS_DEBUG_LOGGING=ON + RTS_DEBUG_CRASHING=ON. Did I do something wrong?

Thanks for testing! You are right, looks like there is something else that causes mismatches. I'll try to find that issue as well. It's a bummer though because it's really annoying to find these things. Took me almost an entire day to find this one in this PR.

Btw I used the attached replay for this PR, in case you wanna check. Someone on Discord provided it.
tansooo.zip
TANSO AI 1v1 GLA Mirror.zip

@helmutbuhler
Copy link
Author

I tested this change with replay of https://github.com/roossienb/GeneralsWiki/blob/e52ca6618f90cef9ece00cbb7abacc293f580b12/SourceCode/Debug/files/Zerohour-skirmish.zip and it mismatched with vs6 release build + RTS_DEBUG_LOGGING=ON + RTS_DEBUG_CRASHING=ON. Did I do something wrong?

Just pushed a fix for it. This time it's likely different code due to different inlining.

throw ERROR_BAD_ARG;
} else {
DEBUG_LOG(("Located default state to recover.\n"));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this else block because it's redundant

@@ -499,8 +499,20 @@ State *StateMachine::internalGetState( StateID id )

if (i == m_stateMap.end())
{
DEBUG_CRASH(( "StateMachine::internalGetState(): Invalid state" ));
throw ERROR_BAD_ARG;
// TheSuperHackers @info helmutbuhler 26/04/2025
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replaced this with the ZH version

Copy link

@xezon xezon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you come to the conclusion that these logs cause mismatch in particular?

DEBUG_CRASH( ("StateMachine::internalGetState(): Invalid state for object %s using state %d", m_owner->getTemplate()->getName().str(), id) );
DEBUG_LOG(("Transisioning to state #d\n", (Int)id));
#if defined(_DEBUG) || defined(_INTERNAL)
DEBUG_LOG(("Transitioning to state #d\n", (Int)id));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%d

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you fix this while at it?

@@ -518,15 +518,19 @@ State *StateMachine::internalGetState( StateID id )

if (i == m_stateMap.end())
{
// TheSuperHackers @info helmutbuhler 26/04/2025
// These logs cause mismatch on Release when logging is enabled.
DEBUG_CRASH( ("StateMachine::internalGetState(): Invalid state for object %s using state %d", m_owner->getTemplate()->getName().str(), id) );
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be included?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DEBUG_CRASH isn't enabled on Release, even when logging is enabled, so it should be fine

Copy link

@xezon xezon Apr 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DEBUG_CRASH is enabled when using the RTS_DEBUG_CRASHING cmake option, which is useful to get the assert message for Replay mismatch. It also works in Release.

@xezon
Copy link

xezon commented Apr 26, 2025

https://github.com/user-attachments/files/19917484/TANSO.AI.1v1.GLA.Mirror.zip

This replay cannot be used because the map is unknown.

@xezon
Copy link

xezon commented Apr 26, 2025

I tested this updated change against https://github.com/roossienb/GeneralsWiki/blob/e52ca6618f90cef9ece00cbb7abacc293f580b12/SourceCode/Debug/files/Zerohour-skirmish.zip and it still mismatches.

Edit: The above replay also goes out of sync without Release logging. Perhaps that replay is bad?

How do you determine what logging would be mismatching?

@roossienb
Copy link

roossienb commented Apr 26, 2025

I tested this updated change against https://github.com/roossienb/GeneralsWiki/blob/e52ca6618f90cef9ece00cbb7abacc293f580b12/SourceCode/Debug/files/Zerohour-skirmish.zip and it still mismatches.

Edit: The above replay also goes out of sync without Release logging. Perhaps that replay is bad?

How do you determine what logging would be mismatching?

I just tested the replay against the latest MAIN build and have no mismatches.

@helmutbuhler
Copy link
Author

How do you determine what logging would be mismatching?

I basically do a binary search over the cpp files. I disable the DEBUG_LOG macro in individual files and pin point to the file, then to the log call.

@xezon
Copy link

xezon commented Apr 26, 2025

I basically do a binary search over the cpp files. I disable the DEBUG_LOG macro in individual files and pin point to the file, then to the log call.

Uff that is tough. I was hoping that the CRC logging would help pinpointing it.

@helmutbuhler
Copy link
Author

I basically do a binary search over the cpp files. I disable the DEBUG_LOG macro in individual files and pin point to the file, then to the log call.

Uff that is tough. I was hoping that the CRC logging would help pinpointing it.

CRC logging can also cause mismatches ;) The problem isn't that something is written to disk, but that the optimizer produces different machinecode.

@xezon
Copy link

xezon commented Apr 28, 2025

I was unable to confirm that this change fixes mismatching when logging is enabled. But we can go ahead and merge it I guess.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Debug Is mostly debug functionality Fix Is fixing something Minor Severity: Minor < Major < Critical < Blocker
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants