21 changes: 15 additions & 6 deletions docs/content/docs/(configuration)/config.mdx
@@ -463,18 +463,27 @@ At least one provider (legacy key or custom provider) must be configured.
| `compactor` | string | `anthropic/claude-haiku-4.5-20250514` | Model for summarization |
| `cortex` | string | `anthropic/claude-haiku-4.5-20250514` | Model for system observation |
| `rate_limit_cooldown_secs` | integer | 60 | How long to deprioritize a rate-limited model |
| `channel_thinking_effort` | string | `"auto"` | Internal reasoning effort hint for channel model (`"auto"`, `"max"`, `"high"`, `"medium"`, `"low"`) |
| `branch_thinking_effort` | string | `"auto"` | Internal reasoning effort hint for branch model (`"auto"`, `"max"`, `"high"`, `"medium"`, `"low"`) |
| `worker_thinking_effort` | string | `"auto"` | Internal reasoning effort hint for worker model (`"auto"`, `"max"`, `"high"`, `"medium"`, `"low"`) |
| `compactor_thinking_effort` | string | `"auto"` | Internal reasoning effort hint for compactor model (`"auto"`, `"max"`, `"high"`, `"medium"`, `"low"`) |
| `cortex_thinking_effort` | string | `"auto"` | Internal reasoning effort hint for cortex model (`"auto"`, `"max"`, `"high"`, `"medium"`, `"low"`) |

Routing selects providers by the prefix before the first `/` in the model name.

`*_thinking_effort` is provider-specific:
- Anthropic adaptive-thinking models use the value directly.
- `openai-chatgpt/*` models map this to Responses API `reasoning.effort` (`max|high -> high`, `medium -> medium`, `low -> low`, `auto -> omitted`).
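As an illustration only — `split_provider` and `effort_for_openai` are hypothetical names, not the project's actual code — the two rules above (prefix-based routing and the documented `openai-chatgpt/*` effort mapping) could be sketched as:

```rust
// Hypothetical sketch of the documented rules; names are illustrative only.

// Routing: the provider id is the prefix before the first '/' in the model name.
fn split_provider(model: &str) -> (&str, &str) {
    model.split_once('/').unwrap_or(("", model))
}

// Documented openai-chatgpt mapping to the Responses API `reasoning.effort` field:
// max|high -> high, medium -> medium, low -> low, auto -> field omitted.
fn effort_for_openai(config_effort: &str) -> Option<&'static str> {
    match config_effort {
        "max" | "high" => Some("high"),
        "medium" => Some("medium"),
        "low" => Some("low"),
        _ => None, // "auto" (and anything unrecognized): omit the field
    }
}

fn main() {
    let (provider, model) = split_provider("anthropic/claude-haiku-4.5-20250514");
    println!("provider={provider} model={model}");
    println!("effort={:?}", effort_for_openai("max"));
}
```

In this PR the real mapping lives in `map_openai_reasoning_effort` in src/llm/model.rs; the sketch only mirrors what this doc section states.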

Comment on lines +466 to +477
Contributor


⚠️ Potential issue | 🟡 Minor

Align the documented mapping with the runtime contract.

This section now documents max -> high for all openai-chatgpt/* models and omits minimal entirely. That will mislead users: the runtime already treats minimal as an alias, and GPT-5.4-family models need model-specific normalization rather than the generic max -> high rule. Please either document those exceptions here or tighten the parser to match the docs exactly.

Based on learnings: In src/llm/model.rs, GPT-5.4-family models intentionally support xhigh, and gpt-5.4-pro requires model-specific normalization rather than the generic low/medium/high mapping.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `docs/content/docs/(configuration)/config.mdx` around lines 466-477: update the docs to match the runtime behavior. Explicitly list "minimal" as an accepted alias for `*_thinking_effort`, and document that `openai-chatgpt/*` models map values to the Responses API `reasoning.effort` with the noted normalization (max|high -> high, medium -> medium, low -> low, auto -> omitted). Add exceptions for the GPT-5.4 family (which supports `xhigh`) and call out that `gpt-5.4-pro` uses model-specific normalization rather than the generic mapping. Reference the runtime normalization logic in src/llm/model.rs so readers can see the exact behavior, or, if you prefer code-first, tighten the parser in src/llm/model.rs to enforce the documented mapping instead of allowing the extra aliases.

```toml
[defaults.routing]
-channel = "my_openai/gpt-4o-mini"
-worker = "custom_anthropic/claude-3-5-sonnet"
+channel = "openai-chatgpt/gpt-5.3-codex"
+worker = "openai-chatgpt/gpt-5.3-codex"
+channel_thinking_effort = "high"
+worker_thinking_effort = "medium"

-[llm.provider.my_openai]
-api_type = "openai_completions"
-base_url = "https://api.openai.com"
-api_key = "env:OPENAI_API_KEY"
+[llm]
+openai_key = "env:OPENAI_API_KEY"

[llm.provider.custom_anthropic]
api_type = "anthropic"
53 changes: 48 additions & 5 deletions src/llm/model.rs
@@ -545,18 +545,33 @@ impl CompletionModel for SpacebotModel {
}

impl SpacebotModel {
    fn configured_thinking_effort(&self) -> &str {
        let Some(routing) = self.routing.as_ref() else {
            return "auto";
        };

        if let Some(process_type) = self.process_type.as_deref() {
            return match process_type {
                "channel" => routing.channel_thinking_effort.as_str(),
                "branch" => routing.branch_thinking_effort.as_str(),
                "worker" => routing.worker_thinking_effort.as_str(),
                "compactor" => routing.compactor_thinking_effort.as_str(),
                "cortex" => routing.cortex_thinking_effort.as_str(),
                _ => routing.thinking_effort_for_model(&self.model_name),
            };
        }

        routing.thinking_effort_for_model(&self.model_name)
    }
Comment on lines +558 to +570
Contributor


Looks like thinking_effort_for_model() is comparing against provider/model strings from routing, but this fallback is passing self.model_name (suffix only). Using self.full_model_name should make the fallback work when process_type is unknown/None.

Suggested change

         if let Some(process_type) = self.process_type.as_deref() {
             return match process_type {
                 "channel" => routing.channel_thinking_effort.as_str(),
                 "branch" => routing.branch_thinking_effort.as_str(),
                 "worker" => routing.worker_thinking_effort.as_str(),
                 "compactor" => routing.compactor_thinking_effort.as_str(),
                 "cortex" => routing.cortex_thinking_effort.as_str(),
-                _ => routing.thinking_effort_for_model(&self.model_name),
+                _ => routing.thinking_effort_for_model(&self.full_model_name),
             };
         }

-        routing.thinking_effort_for_model(&self.model_name)
+        routing.thinking_effort_for_model(&self.full_model_name)
     }


    async fn call_anthropic(
        &self,
        request: CompletionRequest,
        provider_config: &ProviderConfig,
    ) -> Result<completion::CompletionResponse<RawResponse>, CompletionError> {
        let api_key = provider_config.api_key.as_str();

-        let effort = self
-            .routing
-            .as_ref()
-            .map(|r| r.thinking_effort_for_model(&self.model_name))
-            .unwrap_or("auto");
+        let effort = self.configured_thinking_effort();
        let anthropic_request = crate::llm::anthropic::build_anthropic_request(
            self.llm_manager.http_client(),
            api_key,
@@ -766,6 +781,12 @@ impl SpacebotModel {
            body["stream"] = serde_json::json!(true);
        }

        if self.provider == "openai-chatgpt"
            && let Some(effort) = map_openai_reasoning_effort(self.configured_thinking_effort())
        {
            body["reasoning"] = serde_json::json!({ "effort": effort });
        }

        if !request.tools.is_empty() {
            let tools: Vec<serde_json::Value> = request
                .tools
@@ -2859,6 +2880,16 @@ fn provider_display_name(provider_id: &str) -> String {
}
}

fn map_openai_reasoning_effort(config_effort: &str) -> Option<&'static str> {
    match config_effort.trim().to_ascii_lowercase().as_str() {
        "auto" => None,
        "max" | "high" => Some("high"),
        "medium" => Some("medium"),
        "low" | "minimal" => Some("low"),
        _ => None,
    }
Comment on lines +2938 to +2945
Contributor


⚠️ Potential issue | 🟠 Major

Keep the OpenAI mapper model-aware.

This helper now collapses max to high for every openai-chatgpt model. That regresses the existing GPT-5.4 behavior: max should stay xhigh for the GPT-5.4 family, and gpt-5.4-pro also needs its inputs normalized to the subset it actually supports. Since Line 785 now funnels all ChatGPT reasoning through this helper, those configs will either undershoot or become invalid.

Proposed fix
-fn map_openai_reasoning_effort(config_effort: &str) -> Option<&'static str> {
-    match config_effort.trim().to_ascii_lowercase().as_str() {
+fn map_openai_reasoning_effort(model_name: &str, config_effort: &str) -> Option<&'static str> {
+    let normalized_model_name = model_name.trim().to_ascii_lowercase();
+    let is_gpt_5_4_pro = normalized_model_name.starts_with("gpt-5.4-pro");
+    let is_gpt_5_4_family = normalized_model_name.starts_with("gpt-5.4");
+
+    match config_effort.trim().to_ascii_lowercase().as_str() {
         "auto" => None,
-        "max" | "high" => Some("high"),
+        "max" if is_gpt_5_4_family => Some("xhigh"),
+        "high" | "max" => Some("high"),
         "medium" => Some("medium"),
-        "low" | "minimal" => Some("low"),
+        "low" | "minimal" if is_gpt_5_4_pro => Some("medium"),
+        "low" | "minimal" => Some("low"),
         _ => None,
     }
 }

Update the call site accordingly:

-        if self.provider == "openai-chatgpt"
-            && let Some(effort) = map_openai_reasoning_effort(self.configured_thinking_effort())
+        if self.provider == "openai-chatgpt"
+            && let Some(effort) =
+                map_openai_reasoning_effort(&self.model_name, self.configured_thinking_effort())
         {
             body["reasoning"] = serde_json::json!({ "effort": effort });
         }

Based on learnings: In src/llm/model.rs, the openai_reasoning_effort mapping must preserve GPT-5.4-family normalization, specifically max -> xhigh, and gpt-5.4-pro requires normalization to its supported subset.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `src/llm/model.rs` around lines 2883-2890: the helper map_openai_reasoning_effort currently collapses "max" to "high" for all chat models. Update it to be model-aware by adding a model parameter (e.g., model_id: &str), or a separate branch/variant, so the GPT-5.4 family preserves "max" -> "xhigh" while other chat models keep "max" -> "high". Also ensure gpt-5.4-pro inputs are normalized to the exact subset it supports (map unsupported strings to the closest supported values for "gpt-5.4-pro"). Change the call site that funnels ChatGPT requests through map_openai_reasoning_effort to pass the model identifier so the helper can apply the GPT-5.4-specific rules.

}

fn remap_model_name_for_api(provider: &str, model_name: &str) -> String {
if provider == "zai-coding-plan" {
// Coding Plan endpoint expects plain model ids (e.g. "glm-5").
@@ -2916,6 +2947,18 @@ mod tests {
            panic!("expected ToolCall");
        }
    }

    #[test]
    fn map_openai_reasoning_effort_maps_values() {
        assert_eq!(map_openai_reasoning_effort("auto"), None);
        assert_eq!(map_openai_reasoning_effort("max"), Some("high"));
        assert_eq!(map_openai_reasoning_effort("high"), Some("high"));
        assert_eq!(map_openai_reasoning_effort("medium"), Some("medium"));
        assert_eq!(map_openai_reasoning_effort("low"), Some("low"));
        assert_eq!(map_openai_reasoning_effort("minimal"), Some("low"));
        assert_eq!(map_openai_reasoning_effort("invalid"), None);
    }

    #[test]
    fn coding_plan_model_name_uses_plain_glm_id() {
        assert_eq!(