Commit 82733d3

🤖 refactor: remove MUX_PROMPT/MUX_OUTPUT in favor of stdout/stderr
Replace the special environment file mechanism with standard Unix conventions:

- stdout: sent to agent as tool result
- stderr: shown to user in UI only (not sent to agent)

This simplifies the implementation and makes scripts work identically whether run inside mux or directly from the command line.

_Generated with mux_

Change-Id: Idbb4b8006a81a92dd7676693002bd652ddd62e00
Signed-off-by: Thomas Kosiewski <[email protected]>
1 parent 73e55ca commit 82733d3

10 files changed, +166 -189 lines changed

docs/scripts.md

Lines changed: 23 additions & 38 deletions
@@ -61,7 +61,7 @@ To make your scripts effective AI tools:
 ```

 2. **Robustness**: Use `set -euo pipefail` to ensure the script fails loudly if something goes wrong, allowing the AI to catch the error.
-3. **Feedback**: Use `MUX_PROMPT` to guide the AI on what to do next if the script succeeds or fails (see below).
+3. **Clear Output**: Write structured output to stdout so the agent can understand results and take action.

 ## Usage

@@ -95,64 +95,49 @@ Scripts run with:
 - **Human**: Visible in the chat card.
 - **Agent**: Returned as the tool execution result.

-### Environment Variables
+### Standard Streams

-Scripts receive special environment variables for controlling cmux behavior and interacting with the agent:
+Scripts follow Unix conventions for output:

-#### `MUX_OUTPUT` (User Toasts)
+- **stdout**: Sent to the agent as the tool result. Use this for structured output the agent should act on.
+- **stderr**: Shown to the user in the UI but **not** sent to the agent. Use this for progress messages, logs, or debugging info that doesn't need AI attention.

-Path to a temporary file for custom toast display content. Write markdown here for rich formatting in the UI toast:
+This design means scripts work identically whether run inside mux or directly from the command line.

-```bash
-#!/usr/bin/env bash
-# Description: Deploy with custom output
-
-echo "Deploying..." # Logged to stdout
-
-# Write formatted output for toast display
-cat >> "$MUX_OUTPUT" << 'EOF'
-## 🚀 Deployment Complete
-
-✅ Successfully deployed to staging
-EOF
-```
-
-#### `MUX_PROMPT` (Agent Feedback)
-
-Path to a temporary file for **sending messages back to the agent**. This is powerful for "Human-in-the-loop" or "Chain-of-thought" workflows where a script performs an action and then asks the agent to analyze the result.
+#### Example: Test Runner

 ```bash
 #!/usr/bin/env bash
-# Description: Run tests and ask Agent to fix failures
+# Description: Run tests and report failures for the agent to fix
+
+set -euo pipefail

-if ! npm test > test.log 2>&1; then
-  echo "❌ Tests failed" >> "$MUX_OUTPUT"
+# Progress to stderr (user sees it, agent doesn't)
+echo "Running test suite..." >&2

-  # Feed the failure log back to the agent automatically
-  cat >> "$MUX_PROMPT" << EOF
-The test suite failed. Here is the log:
+if npm test > test.log 2>&1; then
+  # Success message to stdout (agent sees it)
+  echo "✅ All tests passed"
+else
+  # Structured failure info to stdout (agent sees and can act on it)
+  cat << EOF
+❌ Tests failed. Here is the log:

 \`\`\`
 $(cat test.log)
 \`\`\`

 Please analyze this error and propose a fix.
 EOF
+  exit 1
 fi
 ```

 **Result**:

-1. Script fails.
-2. Agent receives the tool output (stderr/stdout) **PLUS** the content of `MUX_PROMPT` as part of the tool result.
-3. Agent can immediately act on the instructions in `MUX_PROMPT`.
-
-**Note**: If a human ran the script, the content of `MUX_PROMPT` is sent as a **new user message** to the agent, triggering a conversation.
-
-### File Size Limits
-
-- **MUX_OUTPUT**: Maximum 10KB (truncated if exceeded)
-- **MUX_PROMPT**: Maximum 100KB (truncated if exceeded)
+1. User sees "Running test suite..." progress message.
+2. On failure, agent receives the structured error with test log and instructions.
+3. Agent can immediately analyze and propose fixes.

 ## Example Scripts

src/browser/components/Messages/ScriptExecutionMessage.tsx

Lines changed: 1 addition & 15 deletions
@@ -96,25 +96,11 @@ export const ScriptExecutionMessage: React.FC<ScriptExecutionMessageProps> = ({

 {!isPending && result.output && (
   <DetailSection>
-    <DetailLabel>Stdout / Stderr</DetailLabel>
+    <DetailLabel>Output (agent-visible)</DetailLabel>
     <DetailContent>{result.output}</DetailContent>
   </DetailSection>
 )}

-{!isPending && result.outputFile && (
-  <DetailSection>
-    <DetailLabel>MUX_OUTPUT</DetailLabel>
-    <DetailContent>{result.outputFile}</DetailContent>
-  </DetailSection>
-)}
-
-{!isPending && result.promptFile && (
-  <DetailSection>
-    <DetailLabel>MUX_PROMPT</DetailLabel>
-    <DetailContent>{result.promptFile}</DetailContent>
-  </DetailSection>
-)}
-
 {!isPending && result.truncated && (
   <DetailSection>
     <DetailLabel>Truncation</DetailLabel>

src/browser/utils/messages/modelMessageTransform.ts

Lines changed: 5 additions & 10 deletions
@@ -204,8 +204,7 @@ export function injectModeTransition(
  * Logic:
  * - Identifies messages with metadata.muxMetadata.type === "script-execution"
  * - Replaces them with a simple user text message
- * - Content format: "Script '<name>' executed (exit code <N>).\nStdout/Stderr:\n<output>"
- * - Explicitly EXCLUDES the full MUX_OUTPUT and MUX_PROMPT content to save tokens
+ * - Content format: "Script '<name>' executed (exit code <N>).\nOutput:\n<output>"
  * - Preserves the rest of the message structure (id, role, other metadata)
  */
 export function transformScriptMessagesForLLM(messages: MuxMessage[]): MuxMessage[] {
@@ -227,25 +226,21 @@ export function transformScriptMessagesForLLM(messages: MuxMessage[]): MuxMessag

     let llmContent = `Script '${scriptMeta.scriptName}' executed (exit code ${result.exitCode}).`;

-    // Include Stdout/Stderr if present
+    // Include output if present (this is stdout which is agent-visible)
     if (result.output) {
-      llmContent += `\nStdout/Stderr:\n${result.output}`;
+      llmContent += `\nOutput:\n${result.output}`;
     } else {
-      llmContent += `\nStdout/Stderr: (no output)`;
+      llmContent += `\nOutput: (no output)`;
     }

-    // Surface script errors for Codex/LLM reviewers even when no output exists.
+    // Surface script errors for LLM reviewers even when no output exists.
     if ("error" in result) {
       const trimmedError = result.error.trim();
       if (trimmedError.length > 0) {
         llmContent += `\nError:\n${trimmedError}`;
       }
     }

-    // EXCLUDE MUX_OUTPUT and MUX_PROMPT from the LLM context for the script message itself.
-    // MUX_PROMPT is sent as a separate user message by ChatInput, so including it here would be duplication.
-    // MUX_OUTPUT is intended for user toasts, not LLM context.
-
     return [
       {
         ...msg,
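
Reviewer note: the string-building logic in this hunk can be read as a standalone function. The following is a minimal sketch rather than the code in `modelMessageTransform.ts`. The `ScriptResultLike` type and the `buildScriptLlmContent` name are hypothetical, and only the fields the hunk visibly reads (`output`, `exitCode`, and an optional `error`) are modelled.

```typescript
// Hypothetical, trimmed-down stand-in for BashToolResult: only the fields
// read by the hunk above are modelled here.
type ScriptResultLike =
  | { success: true; exitCode: number; output: string }
  | { success: false; exitCode: number; output?: string; error: string };

// Builds the text that replaces a script-execution message for the LLM,
// following the "Content format" comment in the diff above.
function buildScriptLlmContent(scriptName: string, result: ScriptResultLike): string {
  let llmContent = `Script '${scriptName}' executed (exit code ${result.exitCode}).`;

  // Agent-visible output; an empty string falls through to the placeholder.
  if (result.output) {
    llmContent += `\nOutput:\n${result.output}`;
  } else {
    llmContent += `\nOutput: (no output)`;
  }

  // Errors are surfaced even when there is no output.
  if ("error" in result) {
    const trimmedError = result.error.trim();
    if (trimmedError.length > 0) {
      llmContent += `\nError:\n${trimmedError}`;
    }
  }

  return llmContent;
}
```

Under these assumptions, a successful run with empty output yields a message ending in `Output: (no output)`, which is the case the renamed test in `transformScriptMessagesForLLM.test.ts` checks.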

src/browser/utils/messages/transformScriptMessagesForLLM.test.ts

Lines changed: 11 additions & 16 deletions
@@ -4,7 +4,7 @@ import type { MuxMessage } from "@/common/types/message";
 import type { BashToolResult } from "@/common/types/tools";

 describe("transformScriptMessagesForLLM", () => {
-  it("should include stdout/stderr in script execution messages", () => {
+  it("should include output in script execution messages", () => {
     const scriptResult: BashToolResult = {
       success: true,
       output: "some stdout output",
@@ -38,34 +38,32 @@ describe("transformScriptMessagesForLLM", () => {
     expect(textPart.type).toBe("text");
     if (textPart.type === "text") {
       expect(textPart.text).toContain("Script 'test.sh' executed");
-      expect(textPart.text).toContain("Stdout/Stderr:");
+      expect(textPart.text).toContain("Output:");
       expect(textPart.text).toContain("some stdout output");
     }
   });

-  it("should exclude MUX_OUTPUT and MUX_PROMPT from script execution messages (avoid duplication)", () => {
+  it("should show (no output) when script has empty stdout", () => {
     const scriptResult: BashToolResult = {
       success: true,
-      output: "stdout stuff",
+      output: "",
       exitCode: 0,
       wall_duration_ms: 100,
-      outputFile: "User toast",
-      promptFile: "Model prompt",
     };

     const messages: MuxMessage[] = [
       {
-        id: "script-all",
+        id: "script-empty",
         role: "user",
-        parts: [{ type: "text", text: "Executed script: /script all" }],
+        parts: [{ type: "text", text: "Executed script: /script empty" }],
         metadata: {
           muxMetadata: {
             type: "script-execution",
-            id: "script-all",
+            id: "script-empty",
             historySequence: 0,
             timestamp: 123,
-            command: "/script all",
-            scriptName: "all.sh",
+            command: "/script empty",
+            scriptName: "empty.sh",
             args: [],
             result: scriptResult,
           },
@@ -78,10 +76,7 @@ describe("transformScriptMessagesForLLM", () => {
     const textPart = result[0].parts[0];
     expect(textPart.type).toBe("text");
     if (textPart.type === "text") {
-      expect(textPart.text).not.toContain("MUX_OUTPUT");
-      expect(textPart.text).not.toContain("User toast");
-      expect(textPart.text).not.toContain("MUX_PROMPT");
-      expect(textPart.text).not.toContain("Model prompt");
+      expect(textPart.text).toContain("Output: (no output)");
     }
   });

@@ -118,7 +113,7 @@ describe("transformScriptMessagesForLLM", () => {
     const textPart = result[0].parts[0];
     expect(textPart.type).toBe("text");
     if (textPart.type === "text") {
-      expect(textPart.text).toContain("Stdout/Stderr: (no output)");
+      expect(textPart.text).toContain("Output: (no output)");
       expect(textPart.text).toContain("Error:");
       expect(textPart.text).toContain("Permission denied");
     }

src/common/types/tools.ts

Lines changed: 0 additions & 4 deletions
@@ -25,8 +25,6 @@ export type BashToolResult =
         reason: string;
         totalLines: number;
       };
-      outputFile?: string; // Content from MUX_OUTPUT env file
-      promptFile?: string; // Content from MUX_PROMPT env file
     })
   | (CommonBashFields & {
       success: false;
@@ -38,8 +36,6 @@ export type BashToolResult =
         reason: string;
         totalLines: number;
       };
-      outputFile?: string; // Content from MUX_OUTPUT env file
-      promptFile?: string; // Content from MUX_PROMPT env file
     });

 // File Read Tool Types

src/common/utils/tools/tools.test.ts

Lines changed: 95 additions & 8 deletions
@@ -117,7 +117,7 @@ describe("getToolsForModel", () => {
     expect(demoTool).toBeDefined();
   });

-  it("should include MUX_PROMPT and MUX_OUTPUT in tool result", async () => {
+  it("should return stdout as agent-visible output", async () => {
     const mockScripts = [
       {
         name: "diagnose",
@@ -134,14 +134,12 @@ describe("getToolsForModel", () => {
       success: true,
       data: {
         exitCode: 0,
-        stdout: "Standard output",
+        stdout: "Standard output from script",
         stderr: "",
-        outputFileContent: "User notification",
-        promptFileContent: "Agent instruction",
         toolResult: {
           success: true,
           exitCode: 0,
-          output: "",
+          output: "Standard output from script",
           wall_duration_ms: 1000,
         },
       },
@@ -171,9 +169,98 @@ describe("getToolsForModel", () => {
       })
     );

-    expect(result).toContain("Standard output");
-    expect(result).toContain("--- MUX_OUTPUT ---\nUser notification");
-    expect(result).toContain("--- MUX_PROMPT ---\nAgent instruction");
+    expect(result).toContain("Standard output from script");
+    // stderr is frontend-only, should not appear in result on success
+    expect(result).not.toContain("Error:");
+  });
+
+  it("should return (no stdout) when script produces no output", async () => {
+    const mockScripts = [
+      {
+        name: "silent",
+        description: "Silent script",
+        isExecutable: true,
+      },
+    ];
+
+    const mockListScripts = listScripts as unknown as Mock<typeof listScripts>;
+    mockListScripts.mockResolvedValue(mockScripts);
+
+    const mockRunScript = runWorkspaceScript as unknown as Mock<typeof runWorkspaceScript>;
+    mockRunScript.mockResolvedValue({
+      success: true,
+      data: {
+        exitCode: 0,
+        stdout: "",
+        stderr: "",
+        toolResult: {
+          success: true,
+          exitCode: 0,
+          output: "",
+          wall_duration_ms: 100,
+        },
+      },
+    });
+
+    const tools = await getToolsForModel(
+      "anthropic:claude-3-5-sonnet",
+      config,
+      "workspace-id",
+      mockInitStateManager
+    );
+
+    const silentTool = tools.script_silent as unknown as {
+      execute: (args: { args: string[] }) => Promise<string>;
+    };
+    const result = await silentTool.execute({ args: [] });
+
+    expect(result).toBe("(no stdout)");
+  });
+
+  it("should include stderr in result only on non-zero exit", async () => {
+    const mockScripts = [
+      {
+        name: "failing",
+        description: "Failing script",
+        isExecutable: true,
+      },
+    ];
+
+    const mockListScripts = listScripts as unknown as Mock<typeof listScripts>;
+    mockListScripts.mockResolvedValue(mockScripts);
+
+    const mockRunScript = runWorkspaceScript as unknown as Mock<typeof runWorkspaceScript>;
+    mockRunScript.mockResolvedValue({
+      success: true,
+      data: {
+        exitCode: 1,
+        stdout: "",
+        stderr: "Something went wrong",
+        toolResult: {
+          success: false,
+          exitCode: 1,
+          output: "",
+          error: "Something went wrong",
+          wall_duration_ms: 100,
+        },
+      },
+    });
+
+    const tools = await getToolsForModel(
+      "anthropic:claude-3-5-sonnet",
+      config,
+      "workspace-id",
+      mockInitStateManager
+    );
+
+    const failingTool = tools.script_failing as unknown as {
+      execute: (args: { args: string[] }) => Promise<string>;
+    };
+    const result = await failingTool.execute({ args: [] });
+
+    expect(result).toContain("(no stdout)");
+    expect(result).toContain("Error: Something went wrong");
+    expect(result).toContain("(Exit Code: 1)");
   });

   it("should handle script discovery failure gracefully", async () => {
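
Reviewer note: the three new tests pin down the agent-visible tool result only through substrings (`(no stdout)`, `Error: ...`, `(Exit Code: N)`). The formatter itself is not part of this excerpt, so the following is a hedged sketch of one function that would satisfy those expectations. The `ScriptRunData` interface, the `formatScriptToolResult` name, and the line ordering are assumptions for illustration, not the repository's implementation.

```typescript
// Hypothetical shape mirroring the `data` fields used by the mocks above.
interface ScriptRunData {
  exitCode: number;
  stdout: string;
  stderr: string;
}

// Assumed formatting rule: stdout is always agent-visible; stderr is added
// only when the script exits non-zero, along with the exit code.
function formatScriptToolResult(data: ScriptRunData): string {
  const lines: string[] = [data.stdout.length > 0 ? data.stdout : "(no stdout)"];

  if (data.exitCode !== 0) {
    const trimmedError = data.stderr.trim();
    if (trimmedError.length > 0) {
      lines.push(`Error: ${trimmedError}`);
    }
    lines.push(`(Exit Code: ${data.exitCode})`);
  }

  return lines.join("\n");
}
```

With this rule, progress messages written to stderr by a successful script never reach the agent, which matches the "stderr is frontend-only" comment in the first test.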
