Skip to content

Conversation

Deepam02
Copy link

@Deepam02 Deepam02 commented Oct 2, 2025

Fixes #2873

What this PR does / why we need it:

Quick fix for the data race when calling RuntimeInfo concurrently.

The issue was a global syncPodSets variable getting hammered by multiple goroutines at once. Moved it to an instance field in the Info struct so each call gets its own copy - no more shared state, no more race conditions.

Pretty straightforward change but I did an intensive code review to make sure it actually solves the problem. Traced through all the call sites and execution flows to verify each RuntimeInfo() creates an isolated instance with no shared mutable state between concurrent calls.

Which issue(s) this PR fixes:
Fixes #2873

Checklist:

  • Docs included if any changes are user facing

@google-oss-prow
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign terrytangyuan for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Deepam02 Deepam02 changed the title fix(runtime): make RuntimeInfo thread-safe by using instance field fix(operator): make RuntimeInfo thread-safe by using instance field Oct 2, 2025
@Deepam02 Deepam02 force-pushed the fix/runtime-info-thread-safety branch from 5239894 to 4afa87b Compare October 2, 2025 13:25
@Deepam02
Copy link
Author

Deepam02 commented Oct 2, 2025

Test Failures

The unit tests are failing because cmp.Diff cannot compare the new unexported syncPodSets field.

I'm not really sure how to proceed - should I change the tests or export it as SyncPodSets so tests work without changes?

@andreyvelich
Copy link
Member

@Deepam02 You might need to explore why unit tests are failing

@tenzen-y
Copy link
Member

Test Failures

The unit tests are failing because cmp.Diff cannot compare the new unexported syncPodSets field.

I'm not really sure how to proceed - should I change the tests or export it as SyncPodSets so tests work without changes?

You can leverage the cmp ignore options to ignore syncPods functions in tests.

Fixes kubeflow#2873

Signed-off-by: Deepam02 <116721751+Deepam02@users.noreply.github.com>
@Deepam02 Deepam02 force-pushed the fix/runtime-info-thread-safety branch from 4afa87b to 5f9a7f5 Compare October 17, 2025 16:33
@google-oss-prow google-oss-prow bot added size/L and removed size/S labels Oct 17, 2025
Signed-off-by: Deepam02 <116721751+Deepam02@users.noreply.github.com>
@Deepam02 Deepam02 force-pushed the fix/runtime-info-thread-safety branch from 5f9a7f5 to ab4ef92 Compare October 17, 2025 16:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Make RuntimeInfo thread safe

3 participants