deep researcher example #530

margaretlange · 2025-04-18T23:24:03Z

In the project's discord server, I discussed the possibility of contributing an example I wrote as a PR with the maintainers, so I did.

Changes

I added an example to the folder of examples.

How I tested this

I installed from local requirements.txt and ran the application.py file in examples.

Notes

Checklist

[ x] PR has an informative and human-readable title (this will be pulled into the release notes)
[ x] Changes are limited to a single goal (no scope creep)
Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
Any change in functionality is tested
New functions are documented (with a description, list of inputs, and expected output)
Placeholder code is flagged / future TODOs are captured in comments
Project documentation has been updated if adding/changing functionality.

Important

Adds a deep-researcher example using the Burr framework, demonstrating a research assistant workflow with OpenAI and Tavily API integration.

New Example:
- Adds deep-researcher example in examples/.
- Implements a research assistant using Burr framework in application.py.
Functionality:
- query_openai() in application.py to interact with OpenAI API.
- generate_query(), web_research(), summarize_sources(), reflect_on_summary(), finalize_summary() actions in application.py for research workflow.
- Uses tavily_search() in utils.py to perform web searches.
Prompts and Utilities:
- prompts.py contains instructions for query generation, summarization, and reflection.
- utils.py provides functions for formatting and deduplicating search results.
Documentation:
- README.md provides setup and usage instructions.
- requirements.txt lists dependencies.

^{This description was created by}^{for 4f34744. You can customize this summary. It will automatically update as commits are pushed.}

ellipsis-dev

❌ Changes requested. Reviewed everything up to 4f34744 in 2 minutes and 27 seconds

More details

Looked at 442 lines of code in 6 files
Skipped 1 files when reviewing.
Skipped posting 12 drafted comments based on config settings.

1. examples/deep-researcher/application.py:23

Draft comment:
Using openai.Client() is nonstandard; official usage usually calls openai.ChatCompletion.create() directly.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50%
The comment is actually incorrect. The code is using the newer, recommended OpenAI API client style. As of OpenAI's Python client v1.0.0, the recommended way is indeed to use openai.Client(). The old style openai.ChatCompletion.create() is deprecated. The comment is suggesting to use an older, deprecated approach.
Could there be a specific reason the reviewer prefers the older style? Maybe there are compatibility or stability concerns with the newer API?
While there might be legacy reasons to use the old API, we should follow OpenAI's current best practices and documentation which clearly recommends the new Client() approach.
The comment should be deleted as it's suggesting a deprecated approach over the current recommended way of using the OpenAI API.

2. examples/deep-researcher/utils.py:43

Draft comment:
Optional: Remove unused loop variable 'i' in deduplicate_and_format_sources for clarity.
Reason this comment was not posted:
Confidence changes required: 30% <= threshold 50%
None

3. examples/deep-researcher/application.py:23

Draft comment:
openai.Client() may not be the intended usage. Consider using openai.ChatCompletion.create instead.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50%
The comment is incorrect. openai.Client() is actually the newer, preferred way to use the OpenAI API as of late 2023. The comment suggests using an older API style (openai.ChatCompletion.create) which is now deprecated. The code is using the API correctly.
Could there be a specific reason the reviewer suggested the older API style? Maybe they're working with an older version of the OpenAI library?
Even if using an older version, suggesting a deprecated API pattern would be unhelpful. The code is using the current best practice.
The comment should be deleted as it suggests using a deprecated API pattern instead of the current correct usage that's already in the code.

4. examples/deep-researcher/application.py:132

Draft comment:
In reflect_on_summary, the fallback_query is computed but then overwritten. The state is updated with 'query' regardless, so the fallback is never used.
Reason this comment was not posted:
Marked as duplicate.

5. examples/deep-researcher/application.py:183

Draft comment:
Unpacking the state with '*_, state = app.run(...)' is non-standard; consider using a more descriptive assignment for clarity.
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50%
None

6. examples/deep-researcher/prompts.py:47

Draft comment:
Consider removing extra spaces in tag names (e.g., '< /REQUIREMENTS >') for consistency.
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50%
None

7. examples/deep-researcher/utils.py:50

Draft comment:
Use logging instead of print for warnings to better integrate with production logging practices.
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50%
None

8. examples/deep-researcher/application.py:53

Draft comment:
Consider adding error handling around json.loads to guard against invalid JSON responses from the OpenAI API.
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50%
None

9. examples/deep-researcher/application.py:162

Draft comment:
Typographical error: Consider changing 'getting a job in datascience' to 'getting a job in data science' for better readability.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%
This is a purely cosmetic change to example code. While "data science" is technically the more common spelling, "datascience" is also used and the meaning is clear. This doesn't affect functionality. The comment doesn't point out a bug or suggest an important improvement. It's more of a style preference.
The spelling "datascience" could potentially cause issues if the research query is being matched against formal terms or if consistency with other parts of the codebase is important.
Even if there were matching issues, the OpenAI API and search functions would likely handle both spellings equivalently. Without seeing evidence of problems, this is too minor to warrant a comment.
Delete the comment as it suggests a purely cosmetic change to example code that doesn't meaningfully impact functionality.

10. examples/deep-researcher/prompts.py:47

Draft comment:
Typographical error: The closing tag "< /REQUIREMENTS >" on line 47 has extra spaces. It should be "" to match the proper format used in other sections.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%
While technically correct, this is a very minor formatting issue. The XML tags here are just part of a string template that will be shown to an LLM. The extra spaces won't affect functionality since this is just prompt text. The LLM will understand the tags either way. This feels like an unnecessarily pedantic comment that doesn't improve code quality in any meaningful way.
The inconsistency in formatting could potentially indicate a copy-paste error or lead to confusion for other developers reading the code. Consistent formatting is generally good practice.
However, in this specific case, the spaces in XML tags within a prompt string won't affect functionality and fixing it provides no real value. This is the kind of minor nitpick we want to avoid in code reviews.
Delete this comment as it's too minor and doesn't meaningfully improve code quality. It's the kind of nitpick that clutters code reviews without adding value.

11. examples/deep-researcher/prompts.py:51

Draft comment:
Typographical error: The closing tag "< /FORMATTING >" on line 51 has extra spaces. It should be "" to maintain consistency.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%
While the comment is technically correct about the inconsistency, this is a prompt template file where the XML-style tags are just for visual structure in the prompt text. The spaces don't affect functionality since these aren't real XML tags being parsed. The comment is more about style than substance. According to the rules, we shouldn't make purely informative comments or comments about obvious/unimportant issues.
The inconsistent formatting could make the code less readable and maintainable. Consistent style is generally good practice.
While consistency is good, this is a minor formatting issue in prompt text that doesn't affect functionality. The rules explicitly say not to make comments that are obvious or unimportant.
Delete this comment as it addresses a minor formatting inconsistency that doesn't impact functionality and falls under the category of "obvious or unimportant" issues that we should not comment on.

12. examples/deep-researcher/requirements.txt:3

Draft comment:
The entry 'burr[start]' on line 3 looks unusual. If you intended to specify an extra for the 'burr' package (e.g., 'burr[start]'), please confirm that this is correct. Otherwise, it might be a typographical error that should be corrected.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50%
This comment violates several rules. It's asking for confirmation ("please confirm that this is correct"), it's speculative ("might be a typographical error"), and it's about a dependency specification which we're explicitly told to ignore. The [extras] syntax is valid pip requirements.txt syntax.
The syntax could genuinely be incorrect, and catching typos early could prevent issues.
We're explicitly told not to comment on dependency changes or versions we don't recognize. Additionally, asking for confirmation is against the rules.
Delete the comment as it violates rules about commenting on dependencies and asking for confirmation.

Workflow ID: wflow_tX91jTFwZE69b5Gz

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can teach Ellipsis what to look for with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

examples/deep-researcher/application.py

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

…deep-researcher

skrawcz · 2025-04-28T18:33:20Z

@margaretlange sorry been a bit busy. Will try to take a look at this coming weekend.

margaretlange · 2025-05-01T01:34:15Z

@margaretlange sorry been a bit busy. Will try to take a look at this coming weekend.

thanks. I have a version now with the basic deep researcher added to the UI/examples. Is that something you might be interested in? (I was getting it set up as a preliminary to testing the planned streaming version since I think visualizing through react is helpful.)

skrawcz · 2025-05-01T15:51:09Z

nice, sure. Happy to take a look at that as well.

…

On Wed, Apr 30, 2025 at 6:34 PM Margaret Lange ***@***.***> wrote: *margaretlange* left a comment (asf-transfer/burr#530) <#530 (comment)> @margaretlange <https://github.com/margaretlange> sorry been a bit busy. Will try to take a look at this coming weekend. thanks. I have a version now with the basic deep researcher added to the UI/examples. Is that something you might be interested in? (I was getting it set up as a preliminary to testing the planned streaming version since I think visualizing through react is helpful.) — Reply to this email directly, view it on GitHub <#530 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AARYMB63O5X5PDQFRVIB54T24F223AVCNFSM6AAAAAB3N3YMOGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDQNBTHA4DAOBVGA> . You are receiving this because you commented.Message ID: ***@***.***>

margaretlange · 2025-05-01T20:28:39Z

OK, I added that change. If you aren't interested in it, but do want the original example, the last commit without the server/npm integration is 9df9c82.
By the way, when I recreated the react api code with scripts/client-gen.sh, which I needed to do to add my changes easily, I did run the linters and formatters afterwards. But I still am seeing a bunch of changes in react files that are just indentations. If you can tell me how to get rid of those I can clean up the commit.

skrawcz · 2025-05-05T15:23:42Z

I'm sorry a few things came up this weekend that I needed to handle and want able to get to this. This week 🤞.

…

On Thu, May 1, 2025, 1:29 PM Margaret Lange ***@***.***> wrote: *margaretlange* left a comment (asf-transfer/burr#530) <#530 (comment)> OK, I added that change. By the way, when I recreated the react api code with scripts/client-gen.sh, which I needed to do to add my changes easily, I did run the linters and formatters afterwards. But I still am seeing a bunch of changes in react files that are just indentations. If you can tell me how to get rid of those I can clean up the commit. — Reply to this email directly, view it on GitHub <#530 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AARYMBY7FQJET7YHVNDA3IL24J7Y3AVCNFSM6AAAAAB3N3YMOGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDQNBVG4YDGMZXGU> . You are receiving this because you commented.Message ID: ***@***.***>

margaretlange · 2025-05-05T19:45:37Z

thanks. let me know if there's anything you need from me.

margaretlange · 2025-05-12T01:30:02Z

would it help if I walked you through this? I could schedule a time. let me know.

skrawcz

This is amazing @margaretlange ! Apologies for not getting to this sooner.

It all seems to work great. I ran the notebook and server locally without issue:

Only a couple of minor things:

You can move the MIT license to the files that it pertains to.
What's the reason for the whitespace change in the tsx files? I'm fine just want to know whether we missed it, or something else changed?
Can you revert model_costs.json ? Unless you want to run the update script for it?

Otherwise this is a great example. Thanks for taking the time to do it.

skrawcz · 2025-05-12T04:15:16Z

examples/deep-researcher/README.md

+> MIT License
+>
+> Copyright (c) 2025 Lance Martin
+>
+> Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+>
+> The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+>
+> THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.


Thanks for calling this out. I think we can just mention it here, and then in the tops of those files have this license in them.

skrawcz · 2025-05-12T04:19:12Z

telemetry/ui/scripts/model_costs.json

did you mean to update this file?

skrawcz · 2025-05-12T04:34:10Z

telemetry/ui/src/api/core/ApiError.ts

@@ -6,20 +6,20 @@ import type { ApiRequestOptions } from './ApiRequestOptions';
 import type { ApiResult } from './ApiResult';

 export class ApiError extends Error {
-    public readonly url: string;


what's the reason for the whitespace changes?

skrawcz · 2025-05-12T04:36:46Z

examples/deep-researcher/server.py

+
+
+@router.get("/validate")
+def validate_environment() -> Optional[str]:


This didn't seem to trigger for me when I forgot to add the API keys to my env. It failed in the burr step for that. I didn't look into see what's missing here.

skrawcz · 2025-05-12T04:43:34Z

Oh and one more request -- could you please modify the description of this PR with an explanation of this PR and whether there were any "design decisions" you made with it? (i.e. what would you want to mention about this code to someone else coming after you?) I'll then use this as part of the commit message when I do a squash merge (right now there are too many commits so we'd squash them into one on merge).

margaretlange · 2025-05-12T04:54:44Z

Thanks for looking at this. I'll get back to you tomorrow.

skrawcz · 2025-05-12T04:58:17Z

no worries @margaretlange!

Also another thing to fix up for the pre-commit failure:

black....................................................................Failed
- hook id: black
- files were modified by this hook

reformatted examples/deep-researcher/application.py

So if you run black on that then the pre-commit hooks should pass too.

…ucts, with google and Open AI both offering one. This fact suggests the possibility of open source implementations. This example's particular research flow was adapted from the langchain Local Deep Researcher (https://github.com/langchain-ai/local-deep-researcher), replacing langgraph with Burr as the graph framework of choice. In the Burr implementation, calls to a local model are also replaced by api calls, for the convenience of those who don't currently have the hardware to host a local model. The patterns in the code are basic Burr graph patterns, as described in the online documentation (https://burr.dagworks.io/). The UI is written in React and based on the example email assistant UI and the example chatbot UI. Here I consulted the developer guide (https://burr.dagworks.io/contributing/) quite a bit. The React wrapper for the python API can be regenerated using the script telemetry/ui/scripts/client-gen.sh Future developments of the workflow and the UI could include adding concurrency and streaming to the model, both to increase its power and to improve the user experience.

margaretlange · 2025-05-13T20:38:35Z

Sorry for the delay. I tried to address all your comments.

model_costs.json has been reverted to the latest file on the main branch. I just downloaded it and overwrote my own.
the MIT license has been moved from the README to the two files I used from the other repository. I mention the other codebase in the README but don't include the license, as you suggest.
I did some additions to the react code for the deep researcher based on the email assistant and it should show an error message now when one or both of the keys are not present.
I ran black on application.py
I wrote a longer commit message for commit 053b87c. Feel free to use this as the message when you squash my commits. Or if it's not quite what you wanted, you can edit it or let me know.

re: the whitespace. When I generated the code using client-gen.sh, I got some changes based on there being four rather than two spaces in the generated API wrapper. So I ran npm run format:fix, which changed all the api files to have two spaces. However, some came in with four spaces, so there's a change. This is a minor inconsistency in the existing codebase, see for instance BackendSpec.ts (2 spaces) versus BeginEntryModel.ts (4 spaces) on the main branch. I wasn't quite sure how to resolve it. Let me know if you want me to try something else.

Thanks for all these comments and great documentation online that let me write this.

skrawcz

Thanks @margaretlange sorry for the delay!

margaretlange · 2025-05-27T03:59:41Z

thanks that's great!

deep researcher example

4f34744

ellipsis-dev bot reviewed Apr 18, 2025

View reviewed changes

examples/deep-researcher/application.py Outdated Show resolved Hide resolved

examples/deep-researcher/application.py Outdated Show resolved Hide resolved

examples/deep-researcher/application.py Outdated Show resolved Hide resolved

margaretlange and others added 8 commits April 18, 2025 16:28

Update examples/deep-researcher/application.py

c9fb9de

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

added a notebook to the deep researcher example

9154152

Merge branch 'deep-researcher' of github.com:margaretlange/burr into …

2a8d322

…deep-researcher

added fastapi server to deep researcher example

458a96a

starting a deep resarcher component

5ebb726

adding deep researcher to the tracking api

981d388

removed arguments from validation of environment

a5e261c

fixed updates to state variables to reflect original langgraph version

9df9c82

basic deep researcher now can be viewed in UI

0cb2ccb

margaretlange added 2 commits May 1, 2025 13:05

Merge branch 'deep-researcher-react' into deep-researcher

afb9df4

fixing format

b93f1ec

fixed docstrings in application.py

77d7216

Making sure standalone examples work as well as server

ea12145

skrawcz requested changes May 12, 2025

View reviewed changes

margaretlange added 3 commits May 13, 2025 12:41

responded to reviewer's comments

f991334

reverting model_costs.json

c3c7b92

skrawcz approved these changes May 27, 2025

View reviewed changes

skrawcz merged commit 448f380 into asf-transfer:main May 27, 2025
11 of 12 checks passed



		@router.get("/validate")
		def validate_environment() -> Optional[str]:

deep researcher example #530

deep researcher example #530

Uh oh!

Conversation

margaretlange commented Apr 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

How I tested this

Notes

Checklist

Uh oh!

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

skrawcz commented Apr 28, 2025

Uh oh!

margaretlange commented May 1, 2025

Uh oh!

skrawcz commented May 1, 2025 via email

Uh oh!

margaretlange commented May 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

skrawcz commented May 5, 2025 via email

Uh oh!

margaretlange commented May 5, 2025

Uh oh!

margaretlange commented May 12, 2025

Uh oh!

skrawcz left a comment

Choose a reason for hiding this comment

Uh oh!

skrawcz May 12, 2025

Choose a reason for hiding this comment

Uh oh!

skrawcz May 12, 2025

Choose a reason for hiding this comment

Uh oh!

skrawcz May 12, 2025

Choose a reason for hiding this comment

Uh oh!

skrawcz May 12, 2025

Choose a reason for hiding this comment

Uh oh!

skrawcz commented May 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

margaretlange commented May 12, 2025

Uh oh!

skrawcz commented May 12, 2025

Uh oh!

margaretlange commented May 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

skrawcz left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

margaretlange commented May 27, 2025

Uh oh!

Uh oh!

margaretlange commented Apr 18, 2025 •

edited

Loading

margaretlange commented May 1, 2025 •

edited

Loading

skrawcz commented May 12, 2025 •

edited

Loading

margaretlange commented May 13, 2025 •

edited

Loading