Add prompt cache support and implement for Anthropic #716
Open
arunkumarry wants to merge 4 commits into crmne:main from
Conversation
Author
There are existing RuboCop offenses in spec/ruby_llm/generators/chat_ui_generator_spec.rb that are unrelated to this change.
arunkumarry force-pushed the branch from d474ab2 to 1779bf1, squashing the work into a single commit, "Add prompt caching support for Anthropic". The interim commits ("uncommit unnecessary file", "remove rubocop and flay fixes as they are unrelated to this issue", and "remove rubocop ignore for anthropic complete method") were dropped during the rebase.
What this does
Adds prompt caching support for Anthropic Claude models via a `cache_point:` keyword on `Chat#with_instructions` and `Chat#ask`. When a message is marked as a cache point, the gem injects Anthropic's `cache_control: { type: 'ephemeral' }` marker on the last content block of that message and automatically adds the required `anthropic-beta: prompt-caching-2024-07-31` request header. The static portion of the prompt is cached server-side by Anthropic for 5 minutes, reducing input token costs on repeated calls.

Fixes #706
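For illustration, here is a sketch of the request shape this produces for a cached system prompt. The block layout and `cache_control` field follow Anthropic's documented prompt-caching format; the model string and prompt text are placeholders, and the exact hash the gem builds may differ.

```ruby
# Request body (as a Ruby hash) for a system prompt marked as a cache point,
# sent alongside the anthropic-beta: prompt-caching-2024-07-31 header.
{
  model: 'claude-3-5-sonnet-20241022',
  system: [
    {
      type: 'text',
      text: '...large static system prompt...',  # static portion, cached for ~5 minutes
      cache_control: { type: 'ephemeral' }       # injected on the last content block
    }
  ],
  messages: [
    { role: 'user', content: 'First question against the cached context' }
  ]
}
```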
Usage
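A minimal usage sketch, reconstructed from the description above. `RubyLLM.chat` and `with_instructions` are the gem's existing entry points, `cache_point:` is the keyword added by this PR, and the model string and prompt file are placeholders:

```ruby
require 'ruby_llm'

chat = RubyLLM.chat(model: 'claude-3-5-sonnet-20241022')

# Mark the large, static instructions as a cache point so Anthropic caches them.
chat.with_instructions(File.read('docs/style_guide.md'), cache_point: true)

# Repeated asks reuse the cached prefix, reducing input token cost.
chat.ask('Summarise the style guide in three bullet points')
chat.ask('Which rules apply to provider modules?')
```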
Multiple cache points are supported (up to Anthropic's limit of 4 per request):
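A sketch with several cache points, assuming each instruction block is static and appended in order; the file names are placeholders:

```ruby
chat = RubyLLM.chat(model: 'claude-3-5-sonnet-20241022')

# Each static block gets its own cache point (Anthropic allows up to 4 per request).
chat.with_instructions(File.read('docs/coding_guidelines.md'), cache_point: true)
chat.with_instructions(File.read('docs/architecture.md'),      cache_point: true)
chat.with_instructions(File.read('docs/api_reference.md'),     cache_point: true)

chat.ask('How should the new provider module be structured?')
```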
Cache points on ask are also supported for caching user messages:
```ruby
chat.ask(large_static_user_context, cache_point: true)
```

Future extensibility
The `cache_point` attribute on `Message` is provider-agnostic. Adding support for other providers requires only provider-specific formatting logic:

For providers with inline cache markers (like Anthropic), see the sketch after this list:

- Override the `complete` method to add any required headers/beta flags when `messages.any?(&:cache_point?)`
- Add an `inject_cache_*` helper in the provider's `Chat` module that modifies content blocks
- Call the helper in the message formatting methods when `msg.cache_point?`
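A minimal sketch of the inline-marker approach, assuming messages are already formatted into Anthropic-style hashes with an array of content blocks. The module and method names follow the `inject_cache_*` convention described above but are illustrative, not the PR's actual code:

```ruby
module AnthropicCacheInjection
  module_function

  # Adds Anthropic's ephemeral cache marker to the last content block of a
  # formatted message hash; leaves plain-string content untouched.
  def inject_cache_control(formatted_message)
    blocks = formatted_message[:content]
    return formatted_message unless blocks.is_a?(Array) && blocks.any?

    blocks.last[:cache_control] = { type: 'ephemeral' }
    formatted_message
  end
end

message = { role: 'user', content: [{ type: 'text', text: 'large static context' }] }
AnthropicCacheInjection.inject_cache_control(message)
# message[:content].last now carries cache_control: { type: 'ephemeral' }
```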
For providers with separate cache APIs (like Gemini's Context Caching), see the lifecycle sketch after this list:

- Override `complete` to manage the cache lifecycle (create → reuse → retry on expiry)
- Modify `render_payload` to accept a `cached_content_name:` parameter
- When the name is present, split messages at the last cache point and only send the dynamic suffix inline
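A purely illustrative sketch of the create → reuse → retry-on-expiry lifecycle. The class, method, and error names are stand-ins, not APIs that exist in the gem; only the `cached_content_name:` keyword comes from the list above:

```ruby
class CachedContextProvider
  CacheExpiredError = Class.new(StandardError)

  def complete(static_prefix, dynamic_suffix)
    @cache_name ||= create_cached_content(static_prefix)        # create
    payload = render_payload(dynamic_suffix,
                             cached_content_name: @cache_name)  # reuse
    send_request(payload)
  rescue CacheExpiredError
    @cache_name = nil                                            # retry on expiry
    retry
  end

  private

  # Stand-ins for the real provider calls; an actual implementation would hit
  # the provider's cache-creation and generation endpoints here.
  def create_cached_content(text) = "cachedContents/#{text.hash.abs}"

  def render_payload(suffix, cached_content_name:)
    { contents: suffix, cachedContent: cached_content_name }
  end

  def send_request(payload) = payload
end
```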
For providers without caching support:

- No changes needed
- `cache_point` flags are silently ignored, preserving existing behavior

The core Message and Chat changes are already in place. Future PRs for Gemini, OpenAI (when they add caching), or other providers only need to touch their respective provider modules.
Type of change
Scope check
Required for new features
Quality check
`overcommit --install` and all hooks pass. There are existing RuboCop offenses in spec/ruby_llm/generators/chat_ui_generator_spec.rb.
There are existing Flay offenses in the following files: spec/ruby_llm/generators/chat_ui_generator_spec.rb and lib/ruby_llm/error.rb.
`bundle exec rake vcr:record[provider_name]`
`bundle exec rspec`
(`models.json`, `aliases.json`)
AI-generated code
API changes