From aafd5b05d5b3805725c8cf73705f97cdba1bb6ec Mon Sep 17 00:00:00 2001
From: Bill Chen
Date: Sun, 15 Jun 2025 16:53:57 -0700
Subject: [PATCH] added some notes on addressing lazy behavior

---
 examples/o-series/o3o4-mini_prompting_guide.ipynb | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/examples/o-series/o3o4-mini_prompting_guide.ipynb b/examples/o-series/o3o4-mini_prompting_guide.ipynb
index 991ad10882..5370726b55 100644
--- a/examples/o-series/o3o4-mini_prompting_guide.ipynb
+++ b/examples/o-series/o3o4-mini_prompting_guide.ipynb
@@ -163,6 +163,18 @@
     "Validate arguments against the format before sending the call; if you are unsure, ask for clarification instead of guessing.\n",
     "```\n",
     "\n",
+    "3. Another note on lazy behavior\n",
+    "We are aware of rare instances of lazy behavior from o3, such as stating it does not have enough time to complete a task, promising to follow up separately, or giving terse answers even when explicitly prompted to provide more detail. We have found that the following steps help mitigate this behavior:\n",
+    "\n",
+    "    a. Start a new conversation for unrelated topics:\n",
+    "    When switching to a new or unrelated topic, begin a fresh conversation thread rather than continuing in the same context. This keeps the model focused on the current subject and prevents earlier, irrelevant context from influencing the response, which can sometimes lead to incomplete or lazy answers. For example, if you were previously discussing code debugging and now want to ask about documentation best practices, a topic that does not require the earlier conversation context, start a new conversation for clarity and focus.\n",
+    "\n",
+    "    b. Discard irrelevant past tool calls/outputs when the list gets too long, and summarize them as context in the user message:\n",
+    "    If the conversation history contains a long list of previous tool calls or outputs that are no longer relevant, remove them from the context and instead provide a concise summary of the important information as part of the user message. This keeps the context manageable and ensures the model sees only the most pertinent information. For instance, if you have a lengthy sequence of tool outputs, summarize the key results and include only that summary in your next message.\n",
+    "\n",
+    "    c. We are constantly improving our models and expect to have this issue addressed in future versions.\n",
+    "\n",
+    "\n",
     "### Avoid Chain of Thought Prompting\n",
     "Since these models are reasoning models and produce an internal chain of thought, they do not have to be explicitly prompted to plan and reason between toolcalls. Therefore, a developer should not try to induce additional reasoning before each function call by asking the model to plan more extensively. Asking a reasoning model to reason more may actually hurt the performance. \n",
     "\n",
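For illustration, the context-pruning advice in point (b) of the added cell can be sketched in a few lines of Python. This is a minimal sketch and not code from the guide or the patch: the `MAX_TOOL_MESSAGES` threshold, the `summarize_tool_results` helper, and the `prune_history` function are assumptions made here, and the message format follows the Chat Completions-style `messages` list with `role: "tool"` entries.

```python
# Illustrative sketch: once stale tool-call traffic piles up, drop it from the
# history and replace it with a single summary message, as suggested in point (b).

MAX_TOOL_MESSAGES = 8  # assumed threshold; tune for your application


def summarize_tool_results(tool_messages: list[dict]) -> str:
    """Hypothetical helper: fold a run of old tool outputs into a short summary.

    A real implementation might call a cheaper model; here we simply keep the
    first line of each output.
    """
    bullets = []
    for m in tool_messages:
        text = str(m.get("content") or "")
        bullets.append(text.splitlines()[0] if text else "(no output)")
    return "Summary of earlier tool results:\n- " + "\n- ".join(bullets)


def prune_history(messages: list[dict]) -> list[dict]:
    """Drop old tool calls/outputs once they pile up, keeping a summary instead."""
    tool_traffic = [m for m in messages if m.get("role") == "tool"]
    if len(tool_traffic) <= MAX_TOOL_MESSAGES:
        return messages  # history is still short enough to send as-is

    # Keep ordinary system/user/assistant prose, drop tool outputs and the
    # assistant messages that only carried tool calls, then append one user
    # message summarizing what those tools returned.
    kept = [
        m for m in messages
        if m.get("role") != "tool" and not m.get("tool_calls")
    ]
    kept.append({"role": "user", "content": summarize_tool_results(tool_traffic)})
    return kept


# Usage sketch (assumes the official openai client; the model name is illustrative):
#
#   from openai import OpenAI
#   client = OpenAI()
#   response = client.chat.completions.create(
#       model="o3",
#       messages=prune_history(history) + [{"role": "user", "content": next_question}],
#   )
```

The point of the sketch is that the next request carries one compact summary message instead of the full run of stale tool outputs, which keeps the context manageable without losing results the model may still need.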