
fix: group small C# nodes to meet minimum block size for indexing #6049


Open
wants to merge 1 commit into
base: main
from

Conversation

roomote[bot]

@roomote roomote bot commented Jul 22, 2025

Summary

This PR fixes an issue where C# code indexing was only capturing using directives and not the actual code content (classes, methods, properties, etc.).

Problem

The code parser was filtering out individual tree-sitter nodes that were smaller than the minimum block size (50 characters). This meant that:

  • Individual using directives (e.g., using System;) were too small to be indexed on their own
  • When no nodes met the size requirement, the parser would fall back to chunking the entire file content
  • This resulted in only using directives being indexed
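The size filter described above can be illustrated with a minimal sketch. It is not the code from parser.ts: the `SketchNode` shape, the `filterBySize` helper, and the `MIN_BLOCK_SIZE_CHARS` name are hypothetical, and only the 50-character threshold comes from the description.

```typescript
// Hypothetical node shape for illustration; the real parser works on tree-sitter nodes.
interface SketchNode {
	type: string
	text: string
}

// Threshold taken from the PR description (50 characters); the constant name is assumed.
const MIN_BLOCK_SIZE_CHARS = 50

// Keep only nodes whose own text meets the minimum size.
function filterBySize(nodes: SketchNode[]): SketchNode[] {
	return nodes.filter((node) => node.text.length >= MIN_BLOCK_SIZE_CHARS)
}

const usings: SketchNode[] = [
	{ type: "using_directive", text: "using System;" },
	{ type: "using_directive", text: "using System.Linq;" },
]

// Each directive is well under 50 characters, so none survives the filter on its own.
const kept = filterBySize(usings)
console.log(kept.length)
```

Under these assumptions, every short directive is discarded individually, which is the situation that triggers the fallback path mentioned in the list above.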

Solution

Added a _groupSmallNodes method that:

  • Groups consecutive nodes of the same type that are individually too small
  • Ensures semantically related constructs (like using directives) are indexed together
  • Only groups nodes that are close together (within 2 lines)
  • Preserves the original behavior for nodes that already meet the size requirement
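The grouping rules listed above might look roughly like the following sketch. It is an assumption-laden illustration, not the `_groupSmallNodes` implementation from this PR: the `SmallNode` shape, the constant names, the interpretation of "within 2 lines" as a gap between one node's end line and the next node's start line, and the choice to drop groups that are still too small are all hypothetical.

```typescript
// Hypothetical node shape for illustration; not the tree-sitter node type.
interface SmallNode {
	type: string
	text: string
	startLine: number
	endLine: number
}

const MIN_BLOCK_CHARS = 50 // minimum block size from the PR description
const MAX_LINE_GAP = 2 // "within 2 lines"; the exact comparison is an assumption

function groupSmallNodes(nodes: SmallNode[]): SmallNode[] {
	const result: SmallNode[] = []
	let run: SmallNode[] = []

	// Emit the current run as one combined node if it is now large enough.
	// Assumption: a run that is still too small is dropped.
	const flush = (): void => {
		const first = run[0]
		const last = run[run.length - 1]
		if (first === undefined || last === undefined) return
		const text = run.map((n) => n.text).join("\n")
		if (text.length >= MIN_BLOCK_CHARS) {
			result.push({ type: first.type, text, startLine: first.startLine, endLine: last.endLine })
		}
		run = []
	}

	for (const node of nodes) {
		// Nodes that already meet the size requirement pass through unchanged.
		if (node.text.length >= MIN_BLOCK_CHARS) {
			flush()
			result.push(node)
			continue
		}
		const prev = run[run.length - 1]
		const extendsRun =
			prev !== undefined && prev.type === node.type && node.startLine - prev.endLine <= MAX_LINE_GAP
		if (!extendsRun) flush()
		run.push(node)
	}
	flush()
	return result
}

const directives: SmallNode[] = [
	{ type: "using_directive", text: "using System;", startLine: 1, endLine: 1 },
	{ type: "using_directive", text: "using System.Collections.Generic;", startLine: 2, endLine: 2 },
	{ type: "using_directive", text: "using System.Linq;", startLine: 3, endLine: 3 },
]

// The three adjacent directives are combined into a single node spanning lines 1 to 3.
const grouped = groupSmallNodes(directives)
console.log(grouped)
```

Joining the texts with a newline keeps the combined block readable as source code; whether the actual method joins node text or re-slices the file content by line range is not stated in this description.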

Testing

  • Added comprehensive unit tests to verify the grouping behavior
  • Tests cover both successful grouping and cases where nodes shouldn't be grouped
  • All existing tests continue to pass

Related Issue

Fixes #6048


Important

Introduces _groupSmallNodes in parser.ts to group small C# nodes, ensuring they meet the minimum block size for indexing, with tests in parser-csharp-fix.spec.ts.

  • Behavior:
    • Adds _groupSmallNodes method in parser.ts to group small nodes, ensuring they meet the minimum block size.
    • Groups consecutive using_directive nodes within 2 lines in C# files.
    • Preserves original behavior for nodes meeting size requirements.
  • Testing:
    • Adds parser-csharp-fix.spec.ts with tests for grouping behavior.
    • Tests verify successful grouping and cases where nodes shouldn't be grouped.
    • All existing tests continue to pass.

This description was created by Ellipsis for 0df020f.

- Added _groupSmallNodes method to group consecutive small nodes of the same type
- This ensures using directives and other small constructs are properly indexed
- Fixes issue where only using directives were being indexed in C# files
- Added comprehensive tests to verify the fix

Fixes #6048
@roomote roomote bot requested review from mrubens, cte and jr as code owners July 22, 2025 05:08
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. bug Something isn't working labels Jul 22, 2025
@hannesrudolph hannesrudolph added the Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. label Jul 22, 2025
Labels
bug Something isn't working Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. size:L This PR changes 100-499 lines, ignoring generated files.
Projects
Status: Triage
Development

Successfully merging this pull request may close these issues.

Code base Indexing in C#/.Net Not Extracting Valuable Code Segments
2 participants