Skip to content

Emit R6Class methods as workspace symbols #861

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jul 21, 2025

Conversation

lionel-
Copy link
Contributor

@lionel- lionel- commented Jun 30, 2025

Addresses posit-dev/positron#6549

  • Add a bunch of tests for the workspace indexers which were not tested before.

  • Add a new indexed type Method. This is used to include R6Class methods in workspace symbols but exclusing them from completions.

  • Indexers now push symbols onto a list instead of returning a single symbol to their callers. This allows a single indexer to handle multiple symbols.

  • The assignment handler now pushes R6Class method symbols in addition to the assigned object.

QA Notes

Add the following to a file:

class <- R6::R6Class(
    'class',
    public = list(
      initialize = function() 'initialize',
      foo = function() {
        1
      }
    ),
    private = list(
      bar = function() {
        2
      },
      not_indexed1 = NULL
    ),
    other = list(
      not_indexed2 = function() {
        3
      }
    )
  )

These symbols should now be available as Workspace symbols (# prefix in command palette): initialize(), foo(), bar() and you should be able to jump straight to the definition of these methods from any file.

The symbols starting with not_indexed should not be available.

All this is tested on the backend side.

@@ -267,6 +267,7 @@ fn completions_from_workspace_arguments(
return Ok(None);
},
indexer::IndexEntryData::Variable { .. } => return Ok(None),
indexer::IndexEntryData::Method { .. } => return Ok(None),
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be worth trying to give these a unique name? Like, trying to sniff out the R6 class name to include as well?

Imagine if I implemented two of three of these similar interfaces in the same file, then I'd have no way of telling them apart

What does Rust do for something like this?

https://github.com/DavisVaughan/almanac/blob/main/R/cache-rrule.R
https://github.com/DavisVaughan/almanac/blob/main/R/cache-radjusted.R

Image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rust doesn't emit methods as Workspace symbols so this issue doesn't arise.

I think the file name is good enough to disambiguate most of the time? But we could probably make your suggestion work like this: MyClass$get_events()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would be kind of cool to also make it distinct as an R6 method? Maybe no parens, so MyClass$get_events

Comment on lines +634 to +659
for m in self.cursor.matches(&self.query, node, contents) {
for cap in m.captures.iter() {
let cap_name = &self.query.capture_names()[cap.index as usize];
if *cap_name == capture_name {
result.push(cap.node);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want cursor.matches() or the flat cursor.captures()?

If I remember right, captures() is a flat list, but matches() has some structure that it doesn't look like you use

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm that sounded plausible but it looks like I can't access the capture name with the captures() iterator

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No matter what you use, I think you still get the name via self.query.capture_names()[cap.index as usize], this just avoids the double loop?

if crate::treesitter::node_is_call(&rhs, "R6Class", contents) ||
crate::treesitter::node_is_namespaced_call(&rhs, "R6", "R6Class", contents)
{
index_r6_class(path, contents, &rhs, entries)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should probably early return after this right? If not, maybe a comment about why fallthrough is good?

Copy link
Contributor Author

@lionel- lionel- Jul 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've renamed to make it clear only methods are indexed, and added a comment explaining the fallthrough:

        index_r6_class_methods(path, contents, &rhs, entries)?;
        // Fallthrough to index the variable to which the R6 class is assigned

Comment on lines 317 to 334
// Tree-sitter query to match individual methods in R6Class public/private lists
let query_str = r#"
(argument
name: (identifier) @access
value: (call
function: (identifier) @_list_fn
arguments: (arguments
(argument
name: (identifier) @method_name
value: (function_definition) @method_fn
)
)
)
(#match? @access "public|private")
(#eq? @_list_fn "list")
)
"#;
let mut ts_query = crate::treesitter::TSQuery::new(query_str)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I will note is that tree-sitter recommends that you "compile" your static Query objects exactly once ahead of time, because they do take a non zero amount of overhead to parse and compile.

Would it be possible to rearrange things to end up with a Lazy<Query> that is initialized once and then fed to your TSQuery helper?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that we then need to split the query and the query cursor because the latter is stateful. I wasn't too worried about the overhead and thought I'd favour ergonomics.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we then need to split the query and the query cursor

tbh that sounds like a feature and is how tree-sitter is supposed to work

Comment on lines +336 to +342
// We'll switch from Rope to String in the near future so let's not
// worry about this conversion now
let contents_str = contents.to_string();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, should be nicer all around

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you wanted to be super fancy I believe you could implement TextProvider for Rope

https://github.com/tree-sitter/tree-sitter/blob/618b9dd66e1bd529d9757202cca5a99d333ce394/lib/binding_rust/lib.rs#L3092

https://github.com/tree-sitter/tree-sitter/blob/618b9dd66e1bd529d9757202cca5a99d333ce394/lib/binding_rust/lib.rs#L349-L355

It's probably not worth it but i think the idea is that you can somehow provide some iterator when you don't have a complete contiguous buffer available

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it might be easier than that? It looks like they already provide an implementation of TextProvider for FnMut(Node) -> <iterator over bytes>

https://github.com/tree-sitter/tree-sitter/blob/618b9dd66e1bd529d9757202cca5a99d333ce394/lib/binding_rust/lib.rs#L3561-L3580

So you could probably provide something like

mut |node: Node| {
  iter::once(contents.node_slice(&node).unwrap().to_string().as_bytes())
}

which would materialize a node worth of text in to_string() but not the whole document

It's not allowed to fail so I guess you have to unwrap()

May or may not be worth it, just a thought

Copy link
Contributor Author

@lionel- lionel- Jul 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I didn't look into these more complex setups since we're going to switch to a flat document string soon that's straightforward to iterate over.

@lionel- lionel- force-pushed the feature/workspace-symbols-r6class branch from ee3801d to 95bf05d Compare July 21, 2025 11:18
@lionel- lionel- merged commit 69f49c4 into main Jul 21, 2025
6 checks passed
@lionel- lionel- deleted the feature/workspace-symbols-r6class branch July 21, 2025 12:40
@github-actions github-actions bot locked and limited conversation to collaborators Jul 21, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants