Eval driven system design cookbook #1875

Merged
merged 1 commit into from
Jun 2, 2025

shikhar-cyber
Contributor

Summary

Briefly describe the changes and the goal of this PR. Make sure the PR title summarizes the changes effectively.

Motivation

This cookbook provides a practical, end-to-end guide on how to effectively use evals as the core process in creating a production-grade autonomous system to replace a labor-intensive human workflow.

Making evals the core process prevents poke-and-hope guesswork and impressionistic judgments of accuracy, instead demanding engineering rigor. This means we can make principled decisions about cost trade-offs and investment.

Why are these changes necessary? How do they improve the cookbook?
Building and deploying an LLM application is just the beginning—the real value comes from ongoing improvement. Once your system is live, prioritize continuous monitoring: log traces, track outputs, and proactively sample real user interactions for human review using smart sampling techniques.
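As a rough illustration of the review-sampling step described above, the following is a minimal sketch, not the cookbook's actual implementation. The `Interaction` schema, the `confidence` field, and all thresholds and rates are hypothetical, chosen only to show one way of oversampling uncertain interactions for human review.

```python
import random
from dataclasses import dataclass


@dataclass
class Interaction:
    """One logged request/response pair (hypothetical schema)."""
    trace_id: str
    output: str
    confidence: float  # assumed model- or grader-reported score in [0, 1]


def sample_for_review(logs, low_conf_threshold=0.5, low_conf_rate=1.0,
                      base_rate=0.05, seed=0):
    """Pick logged interactions to send to human reviewers.

    Interactions below `low_conf_threshold` are sampled at `low_conf_rate`;
    all others are sampled at `base_rate`. The defaults are illustrative
    assumptions, not recommendations.
    """
    rng = random.Random(seed)  # seeded so a review batch is reproducible
    selected = []
    for item in logs:
        rate = low_conf_rate if item.confidence < low_conf_threshold else base_rate
        if rng.random() < rate:
            selected.append(item)
    return selected
```

With the default rates, every low-confidence interaction is queued for review while only a small random slice of the remaining traffic is, which keeps reviewer load bounded while still spot-checking the confident cases.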


For new content

When contributing new content, read through our contribution guidelines, and mark the following action items as completed:

  • I have added a new entry in registry.yaml (and, optionally, in authors.yaml) so that my content renders on the cookbook website.
  • I have conducted a self-review of my content based on the contribution guidelines:
    • Relevance: This content is related to building with OpenAI technologies and is useful to others.
    • Uniqueness: I have searched for related examples in the OpenAI Cookbook, and verified that my content offers new insights or unique information compared to existing documentation.
    • Spelling and Grammar: I have checked for spelling or grammatical mistakes.
    • Clarity: I have done a final read-through and verified that my submission is well-organized and easy to understand.
    • Correctness: The information I include is correct and all of my code executes successfully.
    • Completeness: I have explained everything fully, including all necessary references and citations.

We will rate each of these areas on a scale from 1 to 4, and will only accept contributions that score 3 or higher on all areas. Refer to our contribution guidelines for more details.

Contributor

@erikakettleson-openai erikakettleson-openai left a comment

Code samples all worked, typos updated, notes added on model selection & ZDR. Good to go!

@shikhar-cyber shikhar-cyber merged commit f92933b into main Jun 2, 2025
1 check passed
@shikhar-cyber shikhar-cyber deleted the skybranch branch June 2, 2025 23:07
@shikhar-cyber shikhar-cyber restored the skybranch branch June 2, 2025 23:07

2 participants