data explorer: "Copy as Code" comms #8536

isabelizimm · 2025-07-15T20:39:23Z

related issue: #8531
related pr for Ark: posit-dev/ark#873

I'm not exactly sure of the flow of these PRs with Ark, but I believe we'll want to merge that PR, release Ark, and then pick up the new release in this PR before this PR is merged.

This PR implements makes 3 major changes to the comms:

add convert_to_code as a supported feature: there are other supported features, such as row filters and column filters. This structure includes if it is supported as well as supported syntaxes, as an array of values.
suggest_code_syntax message: for a particular data explorer instance, decide what options a user has for generating code. I would expect the return value of this to be base-r or something to that effect.
convert_to_code message: for a particular data explorer instance, this message sends over the filters and sorts applied to the data and the desired syntax. The expectation is that this will return a code snippet string. The front end will (eventually) apply code highlighting and formatting.

Below is a diagram of the general flow through comms. Note that these comm messages will not call each other; they will be triggered by a modal in Positron core.

flowchart TD
    A["Data explorer instance is created"] --> B["Get all potential syntaxes from language pack via the **supported_features**." ]
    B -->  C["Also set suggested syntax with **suggest_code_syntax**"]
    C --> D[Open modal]
    D --> E["Use **dataExplorerInstance.suggestedSyntax**"]
    E --> F["Translate data view (filters, sorts) to code with **_translate_to_code_**"]
    F --> G[Display code for user to copy]
    G -->|user chooses another syntax from modal dropdown| F

There is an open PR that builds off this branch to use these comms, so you can actually fire them off/interact with them #8537

Release Notes

New Features

N/A

Bug Fixes

N/A

QA Notes

It doesn't look like we have much testing for comms, but open to ideas! Currently I plan on adding unit tests on each language runtime and then e2e tests for the UI pieces.

Co-authored-by: Dhruvi Sompura <dhruvisompura@users.noreply.github.com>

github-actions · 2025-07-15T20:39:41Z

E2E Tests 🚀
This PR will run tests tagged with: @:critical

^readme ^{valid tags}

Copilot

Pull Request Overview

This PR adds support for exporting the current data explorer view as code by fetching available syntaxes and translating applied filters/sorts into user-copyable code.

Introduce new RPC methods GetCodeSyntaxes and TranslateToCode in the DuckDB backend, comm layer, and client instances.
Add TypeScript interfaces and Python stubs for the new code export functionality.
Extend the OpenRPC specification to include the new methods.

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
src/vs/workbench/services/positronDataExplorer/common/positronDataExplorerDuckDBBackend.ts	Added backend RPC methods for code syntax retrieval and code translation
src/vs/workbench/services/positronDataExplorer/browser/positronDataExplorerInstance.ts	Implemented instance methods and UI logic for code export
src/vs/workbench/services/positronDataExplorer/browser/interfaces/positronDataExplorerInstance.ts	Extended the public instance interface with code export methods
src/vs/workbench/services/languageRuntime/common/positronDataExplorerComm.ts	Defined new comm interfaces and request types for code export
src/vs/workbench/services/languageRuntime/common/languageRuntimeDataExplorerClient.ts	Added client‐side wrappers for the new RPCs
positron/comms/data_explorer-backend-openrpc.json	Extended the OpenRPC spec with `translate_to_code` and `get_code_syntaxes`
extensions/positron-python/python_files/posit/positron/data_explorer_comm.py	Added Python Pydantic models and request types for code export
extensions/positron-python/python_files/posit/positron/data_explorer.py	Added abstract stubs and example implementations for code export in Python

Comments suppressed due to low confidence (4)

src/vs/workbench/services/positronDataExplorer/browser/positronDataExplorerInstance.ts:434

[nitpick] The JSDoc summary for getCodeSyntaxes is incorrect. Update it to describe fetching available syntaxes rather than exporting filters to clipboard.

	 * Export filters as code.

src/vs/workbench/services/positronDataExplorer/browser/positronDataExplorerInstance.ts:447

[nitpick] The JSDoc for translateToCode refers to clipboard copying. It should describe returning generated code for a desired syntax.

	 * Export filters as code.

src/vs/workbench/services/languageRuntime/common/positronDataExplorerComm.ts:1402

The supported_features array should list the actual RPC method key (translate_to_code), not export_as_code, to correctly reflect available backend capabilities.

*/

positron/comms/data_explorer-backend-openrpc.json:262

The code_syntax parameter is defined as a string, so it should not include an items schema. Remove the items block here.

						"items": {

src/vs/workbench/services/positronDataExplorer/common/positronDataExplorerDuckDBBackend.ts

positron/comms/data_explorer-backend-openrpc.json

jennybc

I've got some early comments and questions to help me get better oriented to the surroundings. I am very new to the data explorer neighborhood.

Two issues I raised inline:

Should we return the code as a single string or an array of strings?
The type of the object being explored is not explicit over on this side. Which is perhaps to be expected (?), but I could use some help understanding this better.

A couple of other high-level thoughts:

It feels like the code syntax should be more of an enum than a free string. Maybe namespaced within language? Something like python:polars, r:tidyverse, etc.
The current PR treats available code syntaxes as a matter to be determined every time that a user asks for code. But this is really a static capability that the kernel has when it fires up and it never changes during the session. Then the syntaxes available to a specific data explorer instance are always a subset of that.
When I imagine the syntaxes being displayed in the modal as a dropdown or similar: it feels like the set of choices should be fixed (e.g. for R, "base", "tidyverse", "data.table") and appear in a fixed order. And then the default selection could be influenced by the type of data in the explorer (best match). And a syntax that is truly impossible/unsupported might be greyed out (*). Instead of the backend returning a set of possibilities (such as ['base-r', 'dplyr']), it feels more like we should pass back a TRUE/FALSE for each syntax (can we do it?) and a default (the syntax that seems like the best match).

(*) That would never really be the case for R, because we have as.whatever() functions to coerce things back and forth. I gather the situation for Python might be different?

jennybc · 2025-07-15T23:07:08Z

positron/comms/data_explorer-backend-openrpc.json

+					],
+					"properties": {
+						"code": {
+							"type": "string",


How sure are we that we want a string? As opposed to an array of strings, i.e. one per line. I can't say exactly if/why this matters, but somehow sending lines of code back to the front end feels more proper than a single collapsed string.

Oh, I like this idea! It does feel more proper, somehow. An array would probably be easier for formatting anyway; I'll move over to this structure. 👍

jennybc · 2025-07-15T23:15:42Z

positron/comms/data_explorer-backend-openrpc.json

+{
+			"name": "get_code_syntaxes",
+			"summary": "Get code syntaxes supported for code translation",
+			"description": "Get all available code syntaxes supported for translation for a data view",


I feel like there's some notion that the backend is expected to reply using knowledge of the type of data being viewed. In the R world, i.e. that the backend will know the class(es) of the object, in this sense:

class(palmerpenguins::penguins) #> [1] "tbl_df" "tbl" "data.frame"

which would effect which syntaxes are supported or which syntax is the default.

I don't see this information being sent over. Does the backend obtain that info another way? I do understand that the object in question must already exist in the runtime and, therefore, the backend already knows (or can know) this. And I even have some sense that all of this is unfolding inside a data explorer instance and therefore the target data object is implicitly available. I just don't understand the information flow yet.

The table lives on the RDataExplorer, which is created when the data explorer is opened. We should be able to use that table to get information like class without explicitly passing it over the comm.

isabelizimm · 2025-07-16T15:51:38Z

It feels like the code syntax should be more of an enum than a free string. Maybe namespaced within language? Something like python:polars, r:tidyverse, etc.

This could be done on the language pack side, but I do want to stay away from storing any knowledge on the language pack side. Theoretically, if another, non-R/Python language pack wanted to implement this, it would be ideal to not have to make any changes to Positron core.

The current PR treats available code syntaxes as a matter to be determined every time that a user asks for code. But this is really a static capability that the kernel has when it fires up and it never changes during the session. Then the syntaxes available to a specific data explorer instance are always a subset of that.

It's true that the full scope of options would be available when the kernel fires up. At the moment when a data explorer instance is created, we could get these options before the user asks for code. That would at least allow us to make one less async call in the modal, which might make it feel a bit snappier.

When I imagine the syntaxes being displayed in the modal as a dropdown or similar: it feels like the set of choices should be fixed (e.g. for R, "base", "tidyverse", "data.table") and appear in a fixed order. And then the default selection could be influenced by the type of data in the explorer (best match). And a syntax that is truly impossible/unsupported might be greyed out (*). Instead of the backend returning a set of possibilities (such as ['base-r', 'dplyr']), it feels more like we should pass back a TRUE/FALSE for each syntax (can we do it?) and a default (the syntax that seems like the best match).

With a dropdown, I think it makes the most sense to just omit the options that are not available, rather than greying them out and adding prose that these are the options available for this data type. I'm not sure I understand the benefit of a true/false vs listing what is possible in order. It would add another comm call with essentially the same information, since this would be created per data explorer instance?

jennybc

This feels like a big step forward. I'm definitely learning a lot about this neck of the woods just by reviewing this!

In my inline comments, there are 2 big themes:

Vocabulary questions. On one hand, this is an internal matter and is not a hill I will die on. And yet I do really think "translate" and "guess" don't quite have the right vibes.
Type choices. It feels like we still need to adjust some of the interfaces. Ideally we'd use an enum for the syntaxes, but I do understand why we can't, if we want that to be completely owned by the backends. The way we're handling "no syntax" doesn't feel quite right yet, i.e. that shouldn't just be another arbitrary syntax string. Hopefully my suggestions will get us moving in a good direction.

jennybc · 2025-07-17T17:49:33Z

positron/comms/data_explorer-backend-openrpc.json

+			"name": "translate_to_code",
+			"summary": "Translates the current data view into a code snippet.",
+			"description": "Translate filters and sort keys as code in different syntaxes like pandas, polars, data.table, dplyr",


I think "translate" is not a great verb here, because it could have so many meanings. Translate from English to another language? Translate from R to Python? I realize there's a lot of context to help someone clarify this, but I suspect it would be even better to use a different verb.

For example, I think "convert" or "generate" are better.

This choice obviously affects a lot of locations (every instance of "translate", "translation", "translated", etc.), so I won't repeat the comment elsewhere.

I had started with generate and moved away from it since I felt it was too AI-y, but I'm on board with convert instead 💯

I hear you re: "generate" and "AI-y" in terms of the UI. But internally, I think "generate" is a very viable option. I'm thinking about how nice "generated code" feels versus "converted code".

In any case, both "generate" and "convert" might be handy for actual names and then also for docstrings.

jennybc · 2025-07-17T17:53:56Z

positron/comms/data_explorer-backend-openrpc.json

+			}
+		},
+{
+			"name": "guess_code_syntax",


I think "suggest" fits the intent better. I feel like "guess" implies there is a right answer and we're trying our best to guess it. Whereas "suggest" seems to fit the situation better. When I see "guessing", I think about "type guessing". Whereas we're suggesting a syntax based on the type.

I feel less strongly about this than about "translate".

Oh! I like this idea. Will change from guess -> suggest

positron/comms/data_explorer-backend-openrpc.json

jennybc · 2025-07-17T19:20:26Z

src/vs/workbench/services/languageRuntime/common/languageRuntimeDataExplorerClient.ts

+		this.guessCodeSyntax().then(syntax => {
+			this.guessedSyntax = syntax;
+		});


Possibly paranoid, but should this check that the suggested (guessed, in the current vocabulary) syntax is among the ones that this data explorer instance claims to support? Given that syntax can't be an actual enum on this side, it feels like we should be a bit pedantic re: checking that the backend is internally consistent.

Added early erroring/returns if 1) the backend has stated it does not support any syntaxes to convert or 2) if the desired syntax is not one of the options available. This definitely feels a lot safer!

jennybc · 2025-07-17T19:25:48Z

src/vs/workbench/services/languageRuntime/common/languageRuntimeDataExplorerClient.ts

+	 * Guess the code syntax for export.
+	 * @returns A promise that resolves to the available code syntaxes.
+	 */
+	async guessCodeSyntax(): Promise<CodeSyntax> {


Perhaps this is a better place to check that the returned syntax is among those that the backend claims to support.

jennybc · 2025-07-17T19:43:37Z

src/vs/workbench/services/languageRuntime/common/positronDataExplorerComm.ts

+	/**
+	 * Best guess of syntax to use for code translation
+	 */
+	code_syntax?: string;


I feel like this being nullable creates problems elsewhere and is, I'm guessing, why we don't then use CodeSyntax in all of the other places I'd expect later in this file. Something feels slightly off about the current situation, where it feels like 'No available syntaxes' could be an actual syntax as far as the frontend knows. Whereas you'd really like to make undefined be the signal for that situation.

I chatted with copilot about my discomfort and this is an interesting suggestion. At the very least, it starts to get at my concern:

/** * Represents a code syntax name supported by the backend * This is a string at runtime, but using a distinct type for type safety */ export type CodeSyntaxName = string; /** * Best guess of syntax to use for code translation */ export interface CodeSyntax { /** * Best guess of syntax to use for code translation */ code_syntax?: CodeSyntaxName; } // Update the SupportedFeatures interface export interface SupportedFeatures { // ...other properties /** * List of supported code syntax names */ code_syntaxes: Array<CodeSyntaxName>; } // Update TranslateToCodeParams export interface TranslateToCodeParams { /** * Zero or more column filters to apply */ column_filters: Array<ColumnFilter>; /** * Zero or more row filters to apply */ row_filters: Array<RowFilter>; /** * Zero or more sort keys to apply */ sort_keys: Array<ColumnSortKey>; /** * The code syntax to use for translation */ code_syntax: CodeSyntaxName; } // Update the translateToCode method translateToCode(columnFilters: Array<ColumnFilter>, rowFilters: Array<RowFilter>, sortKeys: Array<ColumnSortKey>, codeSyntax: CodeSyntaxName): Promise<TranslatedCode> { return super.performRpc('translate_to_code', ['column_filters', 'row_filters', 'sort_keys', 'code_syntax'], [columnFilters, rowFilters, sortKeys, codeSyntax]); }

Something feels slightly off about the current situation, where it feels like 'No available syntaxes' could be an actual syntax as far as the frontend knows. Whereas you'd really like to make undefined be the signal for that situation.

Ah, yeah that is not ideal 😬 I think the only time we really want a string to designate that there are no syntaxes is in the actual modal itself, not anywhere in this PR. Updated to reflect this!

jennybc · 2025-07-17T19:48:16Z

src/vs/workbench/services/positronDataExplorer/browser/positronDataExplorerInstance.ts

+	 * Translates the data explorer view to code in the desired syntax.
+	 * @returns A Promise<string> that resolves when generated code is returned.
+	 */
+	async translateToCode(desiredSyntax: string): Promise<string> {


Placing this here for lack of another concrete location: it feels like somewhere there should be an early exit for this entire feature if the backend state has an empty array of supported syntaxes. AFAICT that would be the expected way for the frontend to learn that the backend does not support "export to code".

I added a isSyntaxSupported() function, as well as a feature support flag (much like the other supported_features on the comm) to give some flexibility as we build out different syntax support! We can now early exit when we're converting code, or perhaps even earlier when trying to get a suggestion.

jennybc · 2025-07-17T19:51:28Z

src/vs/workbench/services/languageRuntime/common/languageRuntimeDataExplorerClient.ts

+	/**
+	 * The guessed code syntax for the data explorer instance, if we generate code.
+	 */
+	public guessedSyntax: CodeSyntax = { code_syntax: 'No available syntaxes' };


I say more about this below, but having a special string indicate that this piece of information is unavailable doesn't feel right.

Co-authored-by: Jennifer (Jenny) Bryan <jenny.f.bryan@gmail.com> Signed-off-by: Isabel Zimmerman <54685329+isabelizimm@users.noreply.github.com>

jennybc

@isabelizimm and I just did another round of live review on zoom. I'm going to approve this now, with the expectation that some more tweaks are coming based on that conversation. It would be great to update the main PR description to reflect the current PR state.

This feels really close to an MVP we can merge and then start to refine. It's a good time to get a 3rd set of 👀 on this.

Signed-off-by: Isabel Zimmerman <54685329+isabelizimm@users.noreply.github.com>

softwarenerd

Jumping in a little late, and without a ton of "comms" definition experiences, it looks like a great start to me. I have some thoughts on how the UI of this could work, when that time comes.

isabelizimm and others added 12 commits July 2, 2025 15:03

make action and button

02a2c28

Merge branch 'main' into export-to-code

dc9ef57

plumb through comms

6f502e0

export -> copy

dcf380b

add modal on filter bar

32eb2c0

ui updates

f9a2c01

Co-authored-by: Dhruvi Sompura <dhruvisompura@users.noreply.github.com>

make calls from modal

2b9f962

clean up comms

5083889

clean up modal

5be588b

linting for python

d59fd66

nit to modal

cc2d069

Merge branch 'export-as-code-comms' into copy-as-code-comms

bae6ced

isabelizimm requested a review from Copilot July 15, 2025 20:48

Copilot AI reviewed Jul 15, 2025

View reviewed changes

src/vs/workbench/services/positronDataExplorer/common/positronDataExplorerDuckDBBackend.ts Outdated Show resolved Hide resolved

positron/comms/data_explorer-backend-openrpc.json Outdated Show resolved Hide resolved

updates from copilot

4d624bc

isabelizimm marked this pull request as ready for review July 15, 2025 21:33

This was referenced Jul 15, 2025

WIP: copy as code comms #8492

Closed

data explorer: create "Copy as Code" modal #8537

Open

isabelizimm added 2 commits July 15, 2025 18:22

nit

ac12c66

cleanup

0d6e7d9

isabelizimm requested review from softwarenerd and jennybc July 15, 2025 22:54

jennybc reviewed Jul 15, 2025

View reviewed changes

jennybc mentioned this pull request Jul 15, 2025

data explorer: update comms with convert to code messages posit-dev/ark#873

Merged

isabelizimm added 4 commits July 16, 2025 18:03

updates from review

c33df7c

guess syntax immediately

45cd402

reorder guessing

1adfc02

no undefined for guessing

7a0d40b

python lint

9cbe02a

jennybc reviewed Jul 17, 2025

View reviewed changes

isabelizimm and others added 5 commits July 17, 2025 16:56

Update positron/comms/data_explorer-backend-openrpc.json

51893f0

Co-authored-by: Jennifer (Jenny) Bryan <jenny.f.bryan@gmail.com> Signed-off-by: Isabel Zimmerman <54685329+isabelizimm@users.noreply.github.com>

rename, early exits

9517202

use supported feature flag for syntaxes

36bf5f5

refactor

da70e52

clean up comm names

b6bbe5c

jennybc previously approved these changes Jul 18, 2025

View reviewed changes

isabelizimm added 2 commits July 18, 2025 13:38

updates from review

ccd15c4

tweaks to runtime

0cc6f14

isabelizimm dismissed jennybc’s stale review via 0cc6f14 July 18, 2025 18:16

isabelizimm and others added 2 commits July 18, 2025 12:34

Merge branch 'main' into copy-as-code-comms

0c0ffac

Signed-off-by: Isabel Zimmerman <54685329+isabelizimm@users.noreply.github.com>

lint post merge from main

a85ef9f

softwarenerd approved these changes Jul 18, 2025

View reviewed changes

isabelizimm merged commit b60b0bd into main Jul 21, 2025
30 checks passed

isabelizimm deleted the copy-as-code-comms branch July 21, 2025 20:39

github-actions bot locked and limited conversation to collaborators Jul 21, 2025

data explorer: "Copy as Code" comms #8536

data explorer: "Copy as Code" comms #8536

Conversation

isabelizimm commented Jul 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Release Notes

New Features

Bug Fixes

QA Notes

Uh oh!

github-actions bot commented Jul 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Uh oh!

Uh oh!

jennybc left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jennybc Jul 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

isabelizimm commented Jul 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jennybc left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jennybc left a comment

Choose a reason for hiding this comment

Uh oh!

softwarenerd left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

isabelizimm commented Jul 15, 2025 •

edited

Loading

github-actions bot commented Jul 15, 2025 •

edited

Loading

jennybc left a comment •

edited

Loading

jennybc Jul 15, 2025 •

edited

Loading

isabelizimm commented Jul 16, 2025 •

edited

Loading