Eval driven system design cookbook #1875

shikhar-cyber · 2025-06-01T22:35:40Z

Summary

Briefly describe the changes and the goal of this PR. Make sure the PR title summarizes the changes effectively.

Motivation

This cookbook provides a practical, end-to-end guide on how to effectively use evals as the core process in creating a production-grade autonomous system to replace a labor-intensive human workflow.

Making evals the core process replaces poke-and-hope guesswork and impressionistic judgments of accuracy with engineering rigor. This means we can make principled decisions about cost trade-offs and investment.

Why are these changes necessary? How do they improve the cookbook?
Building and deploying an LLM application is just the beginning; the real value comes from ongoing improvement. Once your system is live, prioritize continuous monitoring: log traces, track outputs, and proactively sample real user interactions for human review using smart sampling techniques.
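
As one illustration of sampling logged interactions for human review, here is a minimal sketch. It is not taken from the cookbook itself: the `Trace` fields, the `sample_for_review` helper, the confidence score, and the 0.5 threshold and 70% share are all hypothetical choices, and a real system might stratify on entirely different signals.

```python
import random
from dataclasses import dataclass


@dataclass
class Trace:
    """One logged interaction. All field names here are hypothetical."""
    trace_id: str
    output: str
    confidence: float  # assumed score in [0, 1] from the model or a grader


def sample_for_review(traces, budget, low_conf_threshold=0.5,
                      low_conf_share=0.7, seed=0):
    """Pick up to `budget` traces, spending most of the budget on low-confidence ones."""
    rng = random.Random(seed)  # fixed seed so a review batch is reproducible
    low = [t for t in traces if t.confidence < low_conf_threshold]
    high = [t for t in traces if t.confidence >= low_conf_threshold]

    # Target split between the two buckets, capped by what is available.
    n_low = min(len(low), round(budget * low_conf_share))
    n_high = min(len(high), budget - n_low)
    # If the high-confidence bucket is short, hand the leftover budget back.
    n_low = min(len(low), budget - n_high)

    return rng.sample(low, n_low) + rng.sample(high, n_high)


# Example: 100 synthetic traces with confidence 0.00 .. 0.99.
traces = [Trace(f"t{i}", "example output", i / 100) for i in range(100)]
batch = sample_for_review(traces, budget=10)
```

Oversampling low-confidence traces is only one possible strategy; it concentrates reviewer time where errors are assumed to be more likely while still keeping a slice of high-confidence traces to catch confidently wrong outputs.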

For new content

When contributing new content, read through our contribution guidelines, and mark the following action items as completed:

I have added a new entry in registry.yaml (and, optionally, in authors.yaml) so that my content renders on the cookbook website.
I have conducted a self-review of my content based on the contribution guidelines:
- Relevance: This content is related to building with OpenAI technologies and is useful to others.
- Uniqueness: I have searched for related examples in the OpenAI Cookbook, and verified that my content offers new insights or unique information compared to existing documentation.
- Spelling and Grammar: I have checked for spelling or grammatical mistakes.
- Clarity: I have done a final read-through and verified that my submission is well-organized and easy to understand.
- Correctness: The information I include is correct and all of my code executes successfully.
- Completeness: I have explained everything fully, including all necessary references and citations.

We will rate each of these areas on a scale from 1 to 4, and will only accept contributions that score 3 or higher on all areas. Refer to our contribution guidelines for more details.

erikakettleson-openai

Code samples all worked, typos updated, notes added on model selection & ZDR! Good to go!

eval driven system design cookbook

c8926c2

shikhar-cyber requested a review from erikakettleson-openai June 1, 2025 22:36

erikakettleson-openai approved these changes Jun 2, 2025

shikhar-cyber merged commit f92933b into main Jun 2, 2025
1 check passed

shikhar-cyber deleted the skybranch branch June 2, 2025 23:07

shikhar-cyber restored the skybranch branch June 2, 2025 23:07
