-
Notifications
You must be signed in to change notification settings - Fork 5.2k
Fix balancing groups incorrectly removed from negative lookarounds #120856
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/RegexNode.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes a bug where balancing groups inside negative lookarounds were incorrectly removed during regex optimization, causing patterns like ()(?'-1')(?!(?'-1'))
to fail to match when they should. The fix preserves balancing groups while still removing regular capture groups from negative lookarounds.
- Modified
RemoveCaptures
method to check if a capture node is a balancing group before removing it - Added test cases to verify correct matching behavior for balancing groups in negative lookarounds
- All existing tests continue to pass, confirming the optimization for regular captures still works
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
File | Description |
---|---|
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/RegexNode.cs |
Added check to preserve balancing groups (N != -1 ) while removing regular captures (N == -1 ) from negative lookarounds |
src/libraries/System.Text.RegularExpressions/tests/FunctionalTests/Regex.Match.Tests.cs |
Added test case verifying single match behavior with balancing groups in negative lookarounds |
src/libraries/System.Text.RegularExpressions/tests/FunctionalTests/Regex.Count.Tests.cs |
Added test case verifying correct match count for pattern with balancing groups in negative lookarounds |
Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions |
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
/backport to release/10.0-staging |
Started backporting to release/10.0-staging: https://github.com/dotnet/runtime/actions/runs/18609604713 |
@stephentoub an error occurred while backporting to "release/10.0-staging", please check the run log for details! Error: The specified backport target branch "release/10.0-staging" wasn't found in the repo. |
i believe there's no release/10.0-staging yet? |
That's what the message above says, yes. |
/backport to release/10.0 |
Started backporting to release/10.0: https://github.com/dotnet/runtime/actions/runs/18639594553 |
Summary
Fixed a bug where balancing groups were incorrectly being removed from negative lookarounds during regex tree reduction, causing patterns like
()(?'-1')(?!(?'-1'))
to incorrectly return 0 matches instead of matching at every position.Root Cause
The issue was introduced by an optimization in the
RemoveCaptures
method withinReduceLookaround
inRegexNode.cs
. This method removes capture groups from negative lookarounds since captures inside negative lookarounds are undone after the lookaround completes. However, it was incorrectly removing ALL capture groups, including balancing groups.Balancing groups (e.g.,
(?'-1')
) have semantic meaning that affects matching behavior - they require a specific capture group to have been captured before they can succeed. Removing them changes the match semantics.The Fix
Modified the
RemoveCaptures
method to preserve balancing groups by checking ifN != -1
(where N stores the uncapture group number). The code now uses pattern matching for cleaner syntax:if (node is { Kind: RegexNodeKind.Capture, N: -1 })
. Updated comments to clarify that captures that don't rely on or impact persisted state can be removed, which includes backreferences and balancing groups.Changes Made
RemoveCaptures
to checknode.N == -1
before removing captures, using pattern matchingTest Results
()(?'-1')(?!(?'-1'))
now correctly matches at every position in the input stringOriginal prompt
Fixes #120849
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.