fix: resolve 'TODO: last event is not final' error in streaming mode #638

mahendrarathore1742 wants to merge 2 commits into google:main
Conversation
When using streaming mode (e.g., with a2a remote agents), Flow.Run() would return an error 'TODO: last event is not final' if the last event from runOneStep had Partial=true. This is a legitimate state that occurs when:

- The model reaches its max token limit during streaming
- A sub-agent (e.g., an a2a remote agent) emits partial streaming events

The turn is complete in these cases, so we simply return instead of erroring.

Added regression test TestFlowRunPartialLastEvent with 3 sub-cases covering single partial, multiple partial, and mixed partial/final response scenarios.

Fixes google#600
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request enhances the robustness of the streaming functionality by addressing an issue where Flow.Run() returned an error when the last streaming event was partial.
Code Review
This pull request correctly addresses an error that occurred in streaming mode when the last event was a partial response. The fix involves simply returning instead of erroring out, which is the correct behavior for a completed turn. The addition of a comprehensive regression test with multiple sub-cases ensures this scenario is well-covered. My review includes a minor suggestion to improve the clarity and maintainability of the new test code.
```go
// partialOnlyModel is a mock LLM that always returns partial responses in streaming mode.
// This simulates the scenario where the model's last streaming chunk has Partial=true,
// for example when reaching a max token limit.
type partialOnlyModel struct {
	responses []*model.LLMResponse
}

func (m *partialOnlyModel) Name() string { return "partial-mock" }

func (m *partialOnlyModel) GenerateContent(_ context.Context, _ *model.LLMRequest, _ bool) iter.Seq2[*model.LLMResponse, error] {
	return func(yield func(*model.LLMResponse, error) bool) {
		for _, r := range m.responses {
			if !yield(r, nil) {
				return
			}
		}
	}
}
```
The name partialOnlyModel and its associated comment are slightly misleading. The comment states it "always returns partial responses", but it's used in a test case that includes a non-partial response. To improve clarity and future maintainability, I've suggested renaming it to mockLLM and updating the comment. You will also need to update its usage on line 666.
```diff
-// partialOnlyModel is a mock LLM that always returns partial responses in streaming mode.
-// This simulates the scenario where the model's last streaming chunk has Partial=true,
-// for example when reaching a max token limit.
-type partialOnlyModel struct {
-	responses []*model.LLMResponse
-}
-
-func (m *partialOnlyModel) Name() string { return "partial-mock" }
-
-func (m *partialOnlyModel) GenerateContent(_ context.Context, _ *model.LLMRequest, _ bool) iter.Seq2[*model.LLMResponse, error] {
-	return func(yield func(*model.LLMResponse, error) bool) {
-		for _, r := range m.responses {
-			if !yield(r, nil) {
-				return
-			}
-		}
-	}
-}
+// mockLLM is a mock LLM that returns a canned sequence of responses.
+// This simulates the scenario where the model's last streaming chunk has Partial=true,
+// for example when reaching a max token limit.
+type mockLLM struct {
+	responses []*model.LLMResponse
+}
+
+func (m *mockLLM) Name() string { return "mock-llm" }
+
+func (m *mockLLM) GenerateContent(_ context.Context, _ *model.LLMRequest, _ bool) iter.Seq2[*model.LLMResponse, error] {
+	return func(yield func(*model.LLMResponse, error) bool) {
+		for _, r := range m.responses {
+			if !yield(r, nil) {
+				return
+			}
+		}
+	}
+}
```
Address code review feedback: the mock type name was misleading since it's also used in a test case with non-partial responses. Renamed to mockStreamingLLM to avoid both the naming confusion and a conflict with the existing mockLLM in the same package.
Thanks for the review! Good catch: the name was indeed misleading. I renamed it to mockStreamingLLM instead of mockLLM because mockLLM already exists in outputschema_processor_test.go in the same package and would cause a compilation error. Updated and pushed.
…(#600)

When using streaming mode (e.g., with a2a remote agents), Flow.Run() would return an error 'TODO: last event is not final' if the last event from runOneStep had Partial=true. This is a legitimate state that occurs when:

- The model reaches its max token limit during streaming
- A sub-agent (e.g., an a2a remote agent) emits partial streaming events

The turn is complete in these cases, so we simply return instead of erroring.

Added regression test TestFlowRunPartialLastEvent with 3 sub-cases covering single partial, multiple partial, and mixed partial/final response scenarios.

Fixes #600