Assistant: Basic Anthropic prompt caching #8246
Conversation
I seem to have broken the context stuff a bit, will fix. EDIT: Fixed.
Force-pushed from 3699e0d to cf0548e
If I'm understanding the code right, I think that the cache point on the last user message will not actually be read from, for a couple of reasons:
I don't think it's necessary to cache the tools separately from the system prompt. I think it makes sense to just put a cache point at the end of the system prompt. I believe that the cached content always contains the entire request up to the breakpoint. Even if there are multiple breakpoints, each one stores in the cache everything from the beginning of the request to that breakpoint, not just the content between breakpoints. The Anthropic documentation isn't super clear about this, so I asked Claude and pointed it to the docs, and it says the same thing: https://claude.ai/share/a85ae7e7-c3df-46be-9308-9a35a4a96705

Finally, chat participants that follow the VS Code API can't change the real system prompt. Instead, they are supposed to insert a first user message with the content that would otherwise go in a system prompt. It would be nice if the extension author could designate that message as a cache breakpoint, without breaking compatibility with the VS Code API. I understand that this may be outside the scope of this PR, but I wanted to mention it.
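A minimal sketch of the breakpoint semantics described above, as an Anthropic Messages API request body (the prompt text is a placeholder). A `cache_control` marker on a content block caches the entire request prefix up to and including that block, which is why a breakpoint at the end of the system prompt already covers the tool definitions serialized before it:

```python
# Sketch of a Messages API request body with prompt-caching breakpoints.
# A cache_control marker caches the whole request prefix up to that
# block, not just the block itself.
request_body = {
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    # Breakpoint 1: end of the system prompt. Because caching is
    # prefix-based, this also covers the tools defined earlier in the
    # request, so a separate tools breakpoint is usually redundant.
    "system": [
        {
            "type": "text",
            "text": "You are a helpful data-science assistant...",
            "cache_control": {"type": "ephemeral"},
        }
    ],
    "messages": [
        {
            "role": "user",
            "content": [
                # Breakpoint 2: would cache everything from the start of
                # the request through this user message.
                {
                    "type": "text",
                    "text": "What does this plot show?",
                    "cache_control": {"type": "ephemeral"},
                }
            ],
        }
    ],
}
```

Note that each breakpoint stores a full prefix; the second one is a superset of the first, matching the behavior described in the comment above.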
Thanks for the detailed feedback @wch!
At this point, this PR only adds a cache control point after the system prompt, which should be a big win for Databot users, as you initially suggested. I'd like to get that merged ASAP in time for RC. Will sync up with you about further improvements.
LGTM!
2025-06-25 10:01:41.634 [debug] [anthropic] Adding cache control point to system prompt
2025-06-25 10:01:45.975 [debug] [anthropic] SEND messages.stream [req_011CQVJGyw7CudpFoPpSQXEg]: model: claude-sonnet-4-20250514; cache options: default; tools: executeCode, getAttachedPythonPackages, getAttachedRPackages, getInstalledPythonPackageVersion, getInstalledRPackageVersion, getPlot, getProjectTree, inspectVariables, notebook_install_packages, notebook_list_packages, positron_editFile_internal, positron_findTextInProject_internal, positron_getFileContents_internal, vscode_fetchWebPage_internal, vscode_searchExtensions_internal; tool choice: default; system chars: 26495; user messages: 2; user message characters: 6724; assistant messages: 0; assistant message characters: 2
This PR uses Anthropic's prompt caching API to reduce costs and alleviate organization rate limit pressure – particularly in Databot.
By default, we add a cache control point after the system prompt. Callers (e.g. Databot) can disable that if needed.
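A rough sketch of the default-on, caller-disableable behavior described above. The helper and option names here are hypothetical illustrations, not the extension's actual code:

```python
from typing import Any


def build_system_param(system_text: str, enable_caching: bool = True) -> list[dict[str, Any]]:
    """Hypothetical helper: wrap the system prompt for the Messages API,
    adding a cache control point by default unless the caller opts out."""
    block: dict[str, Any] = {"type": "text", "text": system_text}
    if enable_caching:
        # Default: mark the end of the system prompt as a cache breakpoint.
        block["cache_control"] = {"type": "ephemeral"}
    return [block]


# Default behavior: the cache point is added.
assert "cache_control" in build_system_param("You are an assistant.")[0]

# A caller (e.g. Databot) can opt out.
assert "cache_control" not in build_system_param("You are an assistant.", enable_caching=False)[0]
```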
This PR also adds more Assistant logging, illustrated by the debug output above.
Release Notes
New Features
Bug Fixes
QA Notes
Try out Assistant and Databot and check the cache read/write debug logs in the "Assistant" output channel.