Assistant: Initial pass at implementing a data summary tool for Python #8208

Open
wants to merge 5 commits into
base: main
33 changes: 33 additions & 0 deletions extensions/positron-assistant/package.json
@@ -336,6 +336,39 @@
    "positron-assistant"
  ]
},
{
  "name": "getTableSummary",
  "displayName": "Get Table Summary",
  "modelDescription": "Get structured information about table variables in the current session.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "sessionIdentifier": {
        "type": "string",
        "description": "The identifier of the session that contains the tables."
      },
      "accessKeys": {
        "type": "array",
        "description": "An array of table variables to summarize.",
        "items": {
          "type": "array",
          "description": "A list of access keys that identify a variable by specifying its path.",
          "items": {
            "type": "string",
            "description": "An access key that uniquely identifies a variable among its siblings."
          }
        }
      }
    },
    "required": [
      "sessionIdentifier",
      "accessKeys"
    ]
  },
  "tags": [
    "positron-assistant"
  ]
},
{
  "name": "getProjectTree",
  "displayName": "Get Project Tree",
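
For reference, here is a minimal TypeScript sketch of an input that would satisfy the `inputSchema` above. The type guard `isGetTableSummaryInput`, the session identifier, and the access keys are illustrative assumptions and not part of this PR; real access keys would come from the user context or earlier tool results.

```typescript
// Shape of the tool input, mirroring the JSON schema in package.json.
interface GetTableSummaryInput {
    sessionIdentifier: string;
    // One entry per table; each entry is the path of access keys to that variable.
    accessKeys: string[][];
}

// Hypothetical runtime check (not in the PR): true only when the value has a
// string sessionIdentifier and an array of string arrays as accessKeys.
function isGetTableSummaryInput(value: unknown): value is GetTableSummaryInput {
    if (typeof value !== 'object' || value === null) {
        return false;
    }
    const candidate = value as { sessionIdentifier?: unknown; accessKeys?: unknown };
    if (typeof candidate.sessionIdentifier !== 'string') {
        return false;
    }
    if (!Array.isArray(candidate.accessKeys)) {
        return false;
    }
    return candidate.accessKeys.every(
        (path: unknown) =>
            Array.isArray(path) && path.every((key: unknown) => typeof key === 'string')
    );
}

// Example input: a top-level table and a table nested inside another variable.
// Both the identifier and the keys are made up for illustration.
const exampleInput: GetTableSummaryInput = {
    sessionIdentifier: 'python-example-session',
    accessKeys: [['df'], ['results', 'train']],
};
```

Note that `accessKeys` is an array of paths, so a single top-level variable is still written as `[['df']]` rather than `['df']`.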
17 changes: 17 additions & 0 deletions extensions/positron-assistant/src/md/prompts/chat/agent.md
@@ -23,6 +23,23 @@ results, generate the code and return it directly without trying to execute it.
<package-management>
You adhere to the following workflow when dealing with package management:

**Data Object Information Workflow:**

When the user asks questions that require detailed information about tabular
data objects (DataFrames, arrays, matrices, etc.), use the `getTableSummary`
tool to retrieve structured information such as data summaries and statistics.

To use the tool effectively:

1. First ensure you have the correct `sessionIdentifier` from the user context
2. Provide the `accessKeys` array with the path to the specific data objects
   - Each access key is an array of strings representing the path to the variable
   - If the user references a variable by name, determine the access key from context or previous tool results
3. Do not call this tool when:
   - The variables do not appear in the user context
   - There is no active session
   - The user only wants to see the structure/children of objects (use `inspectVariables` instead)

Comment on lines +26 to +42
Member

Is it intentional to keep this within the <package-management></package-management> section of the prompt? Should a separate section be used for this?

**Package Management Workflow:**

1. Before generating code that requires packages, you must first use the appropriate tool to check if each required package is installed. To do so, first determine the target language from the user's request or context
50 changes: 50 additions & 0 deletions extensions/positron-assistant/src/tools.ts
@@ -299,6 +299,56 @@ export function registerAssistantTools(

context.subscriptions.push(inspectVariablesTool);

const getTableSummaryTool = vscode.lm.registerTool<{ sessionIdentifier: string; accessKeys: Array<Array<string>> }>(PositronAssistantToolName.GetTableSummary, {
    /**
     * Called to get summary information for one or more tabular datasets in the current session.
     * @param options The options for the tool invocation.
     * @param token The cancellation token.
     * @returns A vscode.LanguageModelToolResult containing the data summary.
     */
    invoke: async (options, token) => {

        // If no session identifier is provided, return an empty array.
        if (!options.input.sessionIdentifier || options.input.sessionIdentifier === 'undefined') {
            return new vscode.LanguageModelToolResult([
                new vscode.LanguageModelTextPart('[[]]')
            ]);
Comment on lines +313 to +315
Member

Should we throw an error here?

        }

        // temporarily only enable for Python sessions
        let session: positron.LanguageRuntimeSession | undefined;
        const sessions = await positron.runtime.getActiveSessions();
        if (sessions && sessions.length > 0) {
            session = sessions.find(
                (session) => session.metadata.sessionId === options.input.sessionIdentifier,
            );
        }
        if (!session) {
            return new vscode.LanguageModelToolResult([
                new vscode.LanguageModelTextPart('[[]]')
            ]);
        }

        if (session.runtimeMetadata.languageId !== 'python') {
            return new vscode.LanguageModelToolResult([
                new vscode.LanguageModelTextPart('[[]]')
            ]);
        }
Comment on lines +318 to +336
Member

Thoughts on filtering out this tool when there's no session and (temporarily) when it's not a Python session? Maybe we can also temporarily add a note to the tool description that the tool is only available for Python. That way, the tool wouldn't be available or run at all in those cases.

We have some tool filtering logic here:

// List of tools for use by the language model.
const tools: vscode.LanguageModelChatTool[] = vscode.lm.tools.filter(
    tool => {
        // Don't allow any tools in the terminal.
        if (this.id === ParticipantID.Terminal) {
            return false;
        }
        // Define more readable variables for filtering.
        const inChatPane = request.location2 === undefined;
        const inEditor = request.location2 instanceof vscode.ChatRequestEditorData;
        const hasSelection = inEditor && request.location2.selection?.isEmpty === false;
        const isAgentMode = this.id === ParticipantID.Agent;
        // If streaming edits are enabled, don't allow any tools in inline editor chats.
        if (isStreamingEditsEnabled() && this.id === ParticipantID.Editor) {
            return false;
        }
        // If the tool requires a workspace, but no workspace is open, don't allow the tool.
        if (tool.tags.includes(TOOL_TAG_REQUIRES_WORKSPACE) && !isWorkspaceOpen()) {
            return false;
        }
        switch (tool.name) {
            // Only include the execute code tool in the Chat pane; the other
            // panes do not have an affordance for confirming executions.
            //
            // CONSIDER: It would be better for us to introspect the tool itself
            // to see if it requires confirmation, but that information isn't
            // currently exposed in `vscode.LanguageModelChatTool`.
            case PositronAssistantToolName.ExecuteCode:
                return inChatPane &&
                    // The execute code tool does not yet support notebook sessions.
                    positronContext.activeSession?.mode !== positron.LanguageRuntimeSessionMode.Notebook &&
                    isAgentMode;
            // Only include the documentEdit tool in an editor and if there is
            // no selection.
            case PositronAssistantToolName.DocumentEdit:
                return inEditor && !hasSelection;
            // Only include the selectionEdit tool in an editor and if there is
            // a selection.
            case PositronAssistantToolName.SelectionEdit:
                return inEditor && hasSelection;
            // Only include the edit file tool in edit or agent mode i.e. for the edit participant.
            case PositronAssistantToolName.EditFile:
                return this.id === ParticipantID.Edit || isAgentMode;
            // Only include the documentCreate tool in the chat pane and if the user is an agent.
            case PositronAssistantToolName.DocumentCreate:
                return inChatPane && isAgentMode;
            // Otherwise, include the tool if it is tagged for use with Positron Assistant.
            // Allow all tools in Agent mode.
            default:
                return isAgentMode ||
                    tool.tags.includes('positron-assistant');
        }
    }
);
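
One way the suggested filtering could look, sketched as a standalone predicate rather than as a change to the `switch` above. `ToolLike`, `ActiveSessionLike`, and `isTableSummaryToolAvailable` are hypothetical simplifications for illustration; the real code would read the language from the Positron session objects, and the exact property to use there is an assumption.

```typescript
// Hypothetical, simplified stand-ins for the VS Code / Positron types.
interface ToolLike {
    name: string;
    tags: string[];
}

interface ActiveSessionLike {
    languageId: string;
}

const GET_TABLE_SUMMARY_TOOL = 'getTableSummary';

// Returns false only for the table summary tool when there is no active
// session or the active session is not Python; other tools are unaffected.
function isTableSummaryToolAvailable(
    tool: ToolLike,
    activeSession: ActiveSessionLike | undefined
): boolean {
    if (tool.name !== GET_TABLE_SUMMARY_TOOL) {
        return true;
    }
    return activeSession !== undefined && activeSession.languageId === 'python';
}
```

With a check like this in the filter, the model would never be offered the tool outside a Python session, so the early returns of `'[[]]'` in `invoke` would act only as a fallback.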

Otherwise, we could throw an Error noting that this is only available for Python, or return an explanatory string instead of an empty text part, so it's clear to the user and the model why we were unable to get the table summary info?
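
A rough sketch of that second option: resolve the session first and produce an explanatory message for each failure case, which the tool could then throw or return as text. `SessionLike` and `resolveTableSummaryTarget` are hypothetical names, the flattened session shape is a simplification of the Positron session object, and the message wording is only a suggestion.

```typescript
// Hypothetical, flattened view of a runtime session (not the Positron type).
interface SessionLike {
    sessionId: string;
    languageId: string;
}

type TableSummaryTarget =
    | { status: 'ok'; session: SessionLike }
    | { status: 'unavailable'; reason: string };

// Mirrors the three early-return checks in `invoke`, but reports why the
// summary cannot be produced instead of returning an empty '[[]]' payload.
function resolveTableSummaryTarget(
    sessions: SessionLike[],
    sessionIdentifier: string | undefined
): TableSummaryTarget {
    if (!sessionIdentifier || sessionIdentifier === 'undefined') {
        return { status: 'unavailable', reason: 'No session identifier was provided.' };
    }
    let match: SessionLike | undefined;
    for (let i = 0; i < sessions.length; i++) {
        if (sessions[i].sessionId === sessionIdentifier) {
            match = sessions[i];
            break;
        }
    }
    if (match === undefined) {
        return {
            status: 'unavailable',
            reason: `No active session matches "${sessionIdentifier}".`,
        };
    }
    if (match.languageId !== 'python') {
        return {
            status: 'unavailable',
            reason: `Table summaries are currently only available for Python sessions (session language: ${match.languageId}).`,
        };
    }
    return { status: 'ok', session: match };
}
```

The caller could pass `reason` to `new Error(...)` or wrap it in a text part, depending on which behavior is preferred for the model.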


        // Call the Positron API to get the session variable data summaries
        const result = await positron.runtime.querySessionTables(
            options.input.sessionIdentifier,
            options.input.accessKeys,
            ['summary_stats']);

        // Return the result as a JSON string to the model
        return new vscode.LanguageModelToolResult([
            new vscode.LanguageModelTextPart(JSON.stringify(result))
        ]);
    }
});
context.subscriptions.push(getTableSummaryTool);

const installPythonPackageTool = vscode.lm.registerTool<{
    packages: string[];
}>(PositronAssistantToolName.InstallPythonPackage, {
1 change: 1 addition & 0 deletions extensions/positron-assistant/src/types.ts
@@ -7,6 +7,7 @@ export enum PositronAssistantToolName {
    DocumentEdit = 'documentEdit',
    EditFile = 'positron_editFile_internal',
    ExecuteCode = 'executeCode',
    GetTableSummary = 'getTableSummary',
    GetPlot = 'getPlot',
    InstallPythonPackage = 'installPythonPackage',
    InspectVariables = 'inspectVariables',