Description of the feature request:
Currently, when using the google-gemini/gemini-fullstack-langgraph-quickstart project, there is no apparent way to directly access the number of input or output tokens consumed by the LLM for a given prompt/response. It would be highly beneficial to expose this information.
I propose adding a method or attribute to the LLM response object, or to the invoke result, that allows users to easily retrieve the input_token_count and output_token_count.
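For context, here is a minimal sketch of what this could look like for a single call, assuming the quickstart invokes Gemini through `ChatGoogleGenerativeAI` from `langchain-google-genai` (recent `langchain-core` versions attach a `usage_metadata` dict to the returned `AIMessage`; the model name below is just an example):

```python
# Minimal sketch, not the quickstart's actual code: read per-call token usage
# from the AIMessage returned by langchain-google-genai. Requires the
# GOOGLE_API_KEY environment variable to be set.
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash")  # example model name
response = llm.invoke("Explain LangGraph in one sentence.")

usage = response.usage_metadata  # may be None on older library versions
if usage:
    print(f"input tokens:  {usage['input_tokens']}")
    print(f"output tokens: {usage['output_tokens']}")
    print(f"total tokens:  {usage['total_tokens']}")
```

Exposing this in the quickstart would essentially mean summing these per-call dicts across the graph's nodes and surfacing the totals in the final state or invoke result.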
What problem are you trying to solve with this feature?
Without access to token counts, it's difficult to:
- Monitor API costs: Users cannot accurately track or estimate their Gemini API usage expenses (see the cost-estimation sketch after this list).
- Implement rate limiting/budget management: For applications requiring strict control over LLM consumption, token count is essential.
- Optimize prompt engineering: Understanding token usage helps in refining prompts for efficiency.
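To illustrate the cost-monitoring point, here is a hedged sketch of what becomes possible once token counts are exposed. The per-million-token rates are placeholders, not real Gemini pricing:

```python
# Hypothetical helper: estimate the USD cost of one call from its token
# counts. Both rates are ASSUMED placeholders; substitute the published
# pricing for whichever Gemini model you use.
INPUT_RATE_PER_M = 0.10   # placeholder: USD per 1M input tokens
OUTPUT_RATE_PER_M = 0.40  # placeholder: USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one call, given its token counts."""
    return (input_tokens * INPUT_RATE_PER_M
            + output_tokens * OUTPUT_RATE_PER_M) / 1_000_000

# Example: a call that consumed 1,200 input and 350 output tokens.
print(f"${estimate_cost(1200, 350):.6f}")  # -> $0.000260
```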
Any other information you'd like to share?
No response