Fix module context object re-usage in scripting engines #2358

Open
wants to merge 3 commits into
base: unstable

rjd15372
Member

This commit refactors the scripting engine to support multiple cached module contexts per engine, rather than relying on a single cached `ValkeyModuleCtx` object.

Previously, having only one cached context object caused data races over the state stored in that object. A long-running script can yield, and the server event loop may then call `scriptingEngineCallGetMemoryInfo` to get the scripting engine's memory information, which re-uses the same cached context object. Another possible data race comes from the asynchronous script flush, which calls `scriptingEngineCallFreeFunction` in a background thread and also re-uses the cached context object.

To address this, a cache array of module contexts was introduced in the scripting engine structure, with each slot dedicated to a specific use case, such as script execution, memory info queries, or function freeing.
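
As a rough illustration of the idea (not the code in this PR), a per-use-case context cache could look like the sketch below. Every name in it (`ExampleModuleCtx`, `ExampleEngine`, the `CTX_SLOT_*` constants, and both helper functions) is hypothetical and only stands in for the real `ValkeyModuleCtx` and scripting engine structures. The `in_use` flag is a plain `int` kept for illustration; the point is that each call path owns its own slot, so two paths never share context state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ValkeyModuleCtx; the real struct holds far more state. */
typedef struct ExampleModuleCtx {
    int in_use;          /* set while a callback is using this context */
    const char *purpose; /* label used only for illustration */
} ExampleModuleCtx;

/* One slot per use case named in the description (names are assumptions). */
typedef enum {
    CTX_SLOT_SCRIPT_EXEC = 0, /* running a script or function */
    CTX_SLOT_MEMORY_INFO,     /* memory info query from the event loop */
    CTX_SLOT_FREE_FUNCTION,   /* freeing functions, possibly in a background thread */
    CTX_SLOT_COUNT
} ExampleCtxSlot;

/* Hypothetical engine structure holding the cache array of contexts. */
typedef struct ExampleEngine {
    ExampleModuleCtx ctx_cache[CTX_SLOT_COUNT];
} ExampleEngine;

static const char *const example_slot_names[CTX_SLOT_COUNT] = {
    "script-exec",
    "memory-info",
    "free-function",
};

/* Returns the context dedicated to `slot`. Because each use case has its own
 * slot, a yielded script and a concurrent memory info query never touch the
 * same context object. */
ExampleModuleCtx *exampleEngineAcquireCtx(ExampleEngine *engine, ExampleCtxSlot slot) {
    assert((int)slot >= 0 && (int)slot < CTX_SLOT_COUNT);
    ExampleModuleCtx *ctx = &engine->ctx_cache[slot];
    assert(!ctx->in_use); /* a slot is never entered twice at the same time */
    ctx->in_use = 1;
    ctx->purpose = example_slot_names[slot];
    return ctx;
}

/* Marks the context as free again once the callback has returned. */
void exampleEngineReleaseCtx(ExampleModuleCtx *ctx) {
    ctx->in_use = 0;
}
```

With a single cached context, the second acquire in the scenario from the description (a memory info query while a script has yielded) would hit the `in_use` assertion. With one slot per use case, both acquires succeed on distinct objects.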

Signed-off-by: Ricardo Dias <ricardo.dias@percona.com>
@rjd15372 rjd15372 requested a review from madolson July 15, 2025 16:11
@rjd15372 rjd15372 added the bug Something isn't working label Jul 15, 2025
codecov bot commented Jul 15, 2025

Codecov Report

❌ Patch coverage is 70.37037% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.34%. Comparing base (de76586) to head (8c3f028).
⚠️ Report is 10 commits behind head on unstable.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/scripting_engine.c | 70.37% | 8 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #2358      +/-   ##
============================================
- Coverage     71.41%   71.34%   -0.08%     
============================================
  Files           123      123              
  Lines         67132    67163      +31     
============================================
- Hits          47942    47915      -27     
- Misses        19190    19248      +58     

| Files with missing lines | Coverage Δ |
|---|---|
| src/scripting_engine.c | 74.79% <70.37%> (-0.21%) ⬇️ |

... and 23 files with indirect coverage changes

@rjd15372 rjd15372 moved this to In Progress in Valkey 9.0 Jul 21, 2025
@madolson madolson added the release-notes This issue should get a line item in the release notes label Jul 21, 2025
@madolson
Member

This theoretically should be backported, but we probably won't since the impact doesn't seem that high. This is not required for RC1 but should be merged before 9.0.

Member

@enjoy-binbin enjoy-binbin left a comment

I may not be familiar with the details, but overall LGTM.

@rjd15372
Member Author

@enjoy-binbin thanks for the review!

Member

@madolson madolson left a comment

I didn't originally approve it because I was confused about the code call path. It's odd to me that we call `engineSetupModuleCtx(engine, NULL);` even on the non-module call path. I sort of understand the fix, but it sort of feels like there might be a more structural issue.

@rjd15372
Member Author

> I didn't originally approve it because I was confused about the code call path. It's odd to me that we call `engineSetupModuleCtx(engine, NULL);` even on the non-module call path. I sort of understand the fix, but it sort of feels like there might be a more structural issue.

We set up a module context for all callback functions as a conservative approach, because a callback implementation might use the context object for some unforeseen reason.

Labels
bug Something isn't working release-notes This issue should get a line item in the release notes
Projects
Status: In Progress
3 participants