feat: upgrade the voice interaction experience #459
jamiepine wants to merge 40 commits into feat/worker-orchestration-screen from
Conversation
Implements a full Mattermost adapter using a custom HTTP + WebSocket
client (reqwest + tokio-tungstenite), following the existing adapter
architecture.
Features:
- WebSocket-based inbound messages with exponential backoff reconnection
- Text responses, streaming edits (ZWS placeholder + throttled PUT),
file uploads, reactions, typing indicators, thread replies
- History fetch via /channels/{id}/posts
- Named adapter instances with per-instance permissions
- Hot-reload of permissions via spawn_file_watcher
- Conversation IDs: mattermost:{team_id}:{channel_id} and
mattermost:{team_id}:dm:{user_id}
- Fail-closed permission filtering: messages without a team_id are
rejected when team/channel filters are configured
- URL validation at config load time
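The exponential-backoff reconnection named above can be sketched as a capped doubling delay. This is a minimal illustration of the technique, not the adapter's actual code; the base and cap constants are assumptions:

```rust
use std::time::Duration;

/// Delay before reconnect attempt `attempt` (0-based): doubles each try,
/// capped. BASE_MS and MAX_MS are illustrative, not the adapter's values.
fn reconnect_delay(attempt: u32) -> Duration {
    const BASE_MS: u64 = 500;
    const MAX_MS: u64 = 30_000;
    let factor = 1u64 << attempt.min(16); // clamp the shift to avoid overflow
    Duration::from_millis(BASE_MS.saturating_mul(factor).min(MAX_MS))
}
```

The cap keeps a long outage from producing hour-long waits while still backing off quickly under repeated failures.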
Config:
[messaging.mattermost]
enabled = true
base_url = "https://mattermost.example.com"
token = "env:MATTERMOST_TOKEN"
team_id = "optional-default-team-id"
Resolved conflicts in src/config.rs (refactored to submodules), src/main.rs, src/messaging.rs, and src/messaging/target.rs. Distributed Mattermost config additions across the new config submodule structure introduced upstream.
- Add Mattermost platform to ChannelSettingCard, ChannelEditModal, Settings, and platformIcons
- Server URL and access token credential inputs
- Backend: read mattermost config into instances list in messaging_status
- Backend: write base_url and token to TOML via create_messaging_instance
- Mattermost instances configured manually in config.toml appear in the UI
Empty dm_allowed_users blocks all DMs (fail-closed). Non-empty list allows only the specified Mattermost user IDs, consistent with Slack and Telegram adapter behaviour. All other layers (config, TOML schema, permissions merge from bindings, bindings API, UI TagInput) were already in place.
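The fail-closed rule described above reduces to a one-line check. A sketch under the stated semantics; the helper name is illustrative:

```rust
/// Fail-closed DM gate: an empty allowlist admits nobody; a non-empty
/// list admits only the listed Mattermost user IDs.
fn dm_allowed(dm_allowed_users: &[String], user_id: &str) -> bool {
    !dm_allowed_users.is_empty()
        && dm_allowed_users.iter().any(|allowed| allowed == user_id)
}
```

Failing closed on the empty list means a freshly created (unconfigured) instance cannot be DM'd until someone is explicitly allowed.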
SEC2: add sanitize_reaction_name() — resolves unicode emoji to shortcodes
via the emojis crate, strips colon wrappers, lowercases plain names.
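The shortcode-normalisation half of that sanitizer can be sketched without the emojis crate, whose unicode-to-shortcode lookup is omitted here; this is an illustration, not the actual function:

```rust
/// Strip a ":name:" colon wrapper and lowercase the plain name.
/// The unicode-emoji -> shortcode resolution step (done via the emojis
/// crate in the real fix) is deliberately left out of this sketch.
fn sanitize_reaction_name(raw: &str) -> String {
    raw.trim().trim_matches(':').to_lowercase()
}
```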
FN1: user identity resolution — add user_identity_cache, resolve
display name from /users/{id} API on first encounter, set
sender_display_name metadata and formatted_author.
FN2: channel name resolution — add channel_name_cache, resolve
display_name from /channels/{id} API on first encounter, set
mattermost_channel_name and CHANNEL_NAME metadata.
FN3: fetch_history now returns timestamps parsed from post.create_at
milliseconds instead of always None.
FN4: bot mention detection — check message text for @bot_username,
set mattermost_mentions_or_replies_to_bot metadata flag so
require_mention routing mode works correctly.
Also adds 12 new unit tests covering these fixes.
mattermost.rs:
- Validate base_url has no path/query/fragment in new()
- Check /users/me HTTP status before .json() to surface auth errors
- start_typing() aborts previous task before spawning to prevent leaks
- shutdown() drains typing_tasks with abort() instead of clear()
- StreamStart now falls back to REPLY_TO_MESSAGE_ID for thread replies
- StreamEnd applies MAX_MESSAGE_LENGTH chunking to avoid oversized edits
- Reaction handler fails fast if bot_user_id not initialized
- respond() file attachment post checks HTTP status
- broadcast() file upload checks HTTP status
- broadcast() resolves dm:{user_id} targets via /channels/direct API
- split_message uses floor_char_boundary before slicing to prevent UTF-8 panics
- Identity/channel lookups log errors via tracing::debug instead of silent None
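str::floor_char_boundary is still unstable on stable Rust, so a splitter like the one described typically carries its own boundary helper. A sketch of the technique under assumed chunk semantics, not the adapter's implementation:

```rust
/// Stable-Rust stand-in for the unstable str::floor_char_boundary:
/// walk back to the nearest UTF-8 boundary at or before `index`.
fn floor_char_boundary(s: &str, mut index: usize) -> usize {
    if index >= s.len() {
        return s.len();
    }
    while !s.is_char_boundary(index) {
        index -= 1;
    }
    index
}

/// Split into chunks of at most `max_len` bytes without slicing
/// mid-codepoint (which would panic).
fn split_message(text: &str, max_len: usize) -> Vec<&str> {
    let mut chunks = Vec::new();
    let mut rest = text;
    while rest.len() > max_len {
        let cut = floor_char_boundary(rest, max_len);
        // A zero cut would push empty chunks forever; fall back to the
        // next boundary (mirrors the break_point == 0 guard added later).
        let cut = if cut == 0 {
            rest.char_indices().nth(1).map(|(i, _)| i).unwrap_or(rest.len())
        } else {
            cut
        };
        chunks.push(&rest[..cut]);
        rest = &rest[cut..];
    }
    if !rest.is_empty() {
        chunks.push(rest);
    }
    chunks
}
```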
target.rs:
- resolve_broadcast_target extracts named adapter key from conversation ID
- normalize_mattermost_target strips team/instance prefixes down to bare channel_id or dm:{user_id}
api/messaging.rs:
- Validate mattermost base_url before persisting to config.toml
- Delete default Mattermost instance now clears all credential fields
config/load.rs:
- Named Mattermost instances with missing credentials are preserved but
disabled (map + enabled&&has_credentials) instead of silently dropped
channel_ids filter at line 1543 only checked Discord/Slack/Twitch metadata, causing it to return false for Mattermost messages before the Mattermost-specific block ran. Added mattermost_channel_id to the direct_match set.

require_mention routing had no mattermost arm in the mention_key switch, so all Mattermost messages were dropped when require_mention = true. Added mattermost -> mattermost_mentions_or_replies_to_bot mapping.
target.rs: Default DM conv IDs (mattermost:team:dm:user, 4 parts) were
matching the named-instance arm in extract_mattermost_adapter_from_channel_id
and returning mattermost:{team_id} as the adapter. Fixed match arm ordering
so the default-DM pattern is checked before the 4-part named-channel pattern.
target.rs: normalize_mattermost_target accepted bare 'dm' as a valid channel
ID. Added != 'dm' guard to the single-segment arm.
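The arm-ordering bug above is easy to reproduce in miniature: a slice match over the colon-separated segments resolves the ambiguity only if the default-DM shape is checked before the 4-part named-channel shape. A sketch with the shapes as described in these notes; the function and return values are illustrative, not the real target.rs API:

```rust
/// Resolve the adapter key from a conversation ID.
/// "mattermost:{team}:dm:{user}"            -> default adapter
/// "mattermost:{instance}:{team}:{channel}" -> named instance
/// "mattermost:{team}:{channel}"            -> default adapter
fn adapter_key(conversation_id: &str) -> Option<String> {
    let parts: Vec<&str> = conversation_id.split(':').collect();
    match parts.as_slice() {
        // The default-DM pattern must come before the generic 4-part arm,
        // otherwise "dm" is misread as a team ID.
        ["mattermost", _team, "dm", _user] => Some("mattermost".to_string()),
        ["mattermost", instance, _team, _channel] => {
            Some(format!("mattermost:{instance}"))
        }
        ["mattermost", _team, _channel] => Some("mattermost".to_string()),
        _ => None,
    }
}
```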
mattermost.rs: StreamEnd overflow chunks were posted with root_id = None,
escaping the original thread. Now resolves root_id from mattermost_root_id /
REPLY_TO_MESSAGE_ID metadata and passes it to all overflow create_post calls.
mattermost.rs: WS post JSON decode failures were silently dropped. Now logs
at DEBUG level via tracing::debug so malformed payloads are visible.
Use Cow<str> so the common case (text under MAX_MESSAGE_LENGTH) borrows accumulated_text directly instead of cloning it on every throttled edit.
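The borrow-in-the-common-case pattern looks like this. A minimal sketch of the technique with an assumed length check, not the adapter's actual edit path:

```rust
use std::borrow::Cow;

/// Borrow the accumulated text when it fits; allocate only on overflow.
/// The boundary walk keeps the truncation UTF-8 safe.
fn edit_payload(accumulated_text: &str, max_len: usize) -> Cow<'_, str> {
    if accumulated_text.len() <= max_len {
        // Common case: no clone per throttled edit.
        Cow::Borrowed(accumulated_text)
    } else {
        let mut cut = max_len;
        while !accumulated_text.is_char_boundary(cut) {
            cut -= 1;
        }
        Cow::Owned(accumulated_text[..cut].to_string())
    }
}
```

Since streamed edits fire on every throttle tick, skipping the clone in the under-limit case removes an allocation from the hot path.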
- validate_mattermost_url: fail-closed for non-localhost http (was warn-only)
- MattermostConfig: add SystemSecrets impl and register in system_secret_registry
- StreamChunk: release active_messages write lock before awaiting edit_post
- channel_filter: fail-closed when team has no allowlist entry (was fail-open)
- WS shutdown: log close frame send failure instead of discarding with let _ =

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- channel_filter: add test for fail-closed behavior when team is missing from filter
- WS event loop: log outer MattermostWsEvent parse failures (was silently dropped)
- broadcast: check HTTP status on /posts with file_ids (was unchecked)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- SystemSecrets: add InstancePattern to token and base_url fields so
MATTERMOST_{INSTANCE}_TOKEN/BASE_URL are recognized as system secrets
- start(): log warning when bot_user_id/bot_username OnceCell is already set
instead of silently discarding the error
- shutdown(): use tx.send().await.ok() pattern for intentional channel-send discard
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- validate_mattermost_url: reject URLs with credentials, non-root path, query string, or fragment (accept origin-only URLs only)
- start()/WS task: apply default_team_id fallback when WS event omits team_id
- matches_route: remove duplicate Mattermost-only channel_ids filter block (mattermost_channel_id is already covered by direct_match)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- bindings API: add team_id to BindingResponse, CreateBindingRequest, and UpdateBindingRequest so Mattermost team scoping is readable/writable via API
- validate_mattermost_url: reject URLs with credentials, non-root path, query string, or fragment; add tests covering all new rejection branches
- broadcast File: include root_id in /posts payload so file replies stay in thread
- StreamStart: stop typing task before propagating create_post error to prevent leak
- config/types.rs: rename |i| closure var to |instance| for readability
- mattermost tests: rename bmfp/bmfp_named helpers and p param to descriptive names

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace implementation plan with concise feature list, setup guide, named instance configuration, and notes on what is missing compared to Slack/Discord.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- create_binding: guard empty team_id to match update_binding behavior
- DeleteBindingRequest: add team_id to prevent wrong-binding deletion
- split_message: filter break_point == 0 to prevent empty chunk push
- mentions_bot: set true for DMs; detect thread replies to bot via async root-post fetch in WS handler (resolve_root_post_author)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- bindings: normalize team_id in update/delete matching — trim whitespace and treat whitespace-only values as absent for consistent comparisons
- bindings: trim team_id before persisting in update_binding write path
- bindings: rename abbreviated match locals (gid/wid/cid/tid) to full names
- ws_url: replace silent scheme passthrough with unreachable! for unsupported schemes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- bindings: hoist request_team_id normalization before the loop in both update_binding and delete_binding to avoid recomputing it per-iteration
- mattermost new(): remove redundant .into() on anyhow::anyhow! return
- ws loop: rename e to error in match arms and tracing macros

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix skill install API returning 500 for single-skill repos by always using 3-part spec format (owner/repo/skill-name) instead of bare owner/repo format that the backend rejects
- Fix recharts "width/height should be greater than 0" warnings by adding min-width/min-height constraints to ResponsiveContainer components and their parent divs

Fixes browser console errors:
- /api/agents/skills/install 500 errors when installing single-skill repos
- Chart dimension warnings on AgentDetail page
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add speech personality editing and persistent spoken-response history so voice replies stay varied across sessions. Refresh the overlay with the official bull orb treatment, Siri-style branded waves, and polished desktop window styling.
Walkthrough
Adds end-to-end voice features: desktop overlay window with global hotkeys, frontend recording/tts playback and overlay UI, OpenCode embed/orchestration, backend transcription and TTS proxying, spoken-response generation with message IDs and SSE events, and SPEECH.md identity support plus related plumbing.
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks: ✅ 3 passed
/// Generate a short, conversational spoken version of a full text response.
///
/// Uses a fast LLM call with a focused prompt that strips markdown, code
/// blocks, and technical detail — producing 1-3 natural sentences suitable
/// for TTS playback.
/// Generate a short, conversational spoken version of a full text response.
///
/// Uses a fast LLM call with a focused prompt that strips markdown, code
/// blocks, and technical detail — producing 1-3 natural sentences suitable
/// for TTS playback.
Doc comment looks duplicated here (same block twice); dropping the second copy keeps rustdoc clean.
Suggested change (keep a single copy):

/// Generate a short, conversational spoken version of a full text response.
///
/// Uses a fast LLM call with a focused prompt that strips markdown, code
/// blocks, and technical detail — producing 1-3 natural sentences suitable
/// for TTS playback.
src/tools/reply.rs (outdated)
if let Ok(mut slot) = self.replied_text.lock() {
    *slot = Some(converted_content.clone());
}
If this mutex ever gets poisoned, this silently skips setting replied_text (and voice gen won’t run). Seems safer to recover the guard.
Suggested change:

let mut slot = self
    .replied_text
    .lock()
    .unwrap_or_else(|error| error.into_inner());
*slot = Some(converted_content.clone());
setState("loading");

try {
    const audioBuffer = await api.ttsGenerate(text, {agentId, profileId});
Small edge case: if stop() is called while api.ttsGenerate(...) is still in-flight, this speak() continues and will start playback when the request resolves. Consider a simple request token (or AbortController) so stop() cancels pending loads too.
Array.from({length: SPECTRUM_BAR_COUNT}, () => 0),
);

const cleanupAudioGraph = useCallback(() => {
This hook doesn’t do any unmount cleanup; if the overlay UI unmounts while recording, the mic tracks / AudioContext can stick around. I’d add a useEffect cleanup that calls cleanupAudioGraph() (and optionally stops the recorder) on unmount.
Actionable comments posted: 14
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
🟡 Minor comments (9)
desktop/src-tauri/src/main.rs-133-137 (1)
133-137: ⚠️ Potential issue | 🟡 Minor

Log errors instead of silently discarding them.
Same issue here — window chrome operations can fail and should be logged.
🔧 Proposed fix
fn apply_overlay_window_chrome(window: &tauri::WebviewWindow) {
-    let _ = window.set_decorations(false);
-    let _ = window.set_shadow(false);
-    let _ = window.set_always_on_top(true);
+    if let Err(error) = window.set_decorations(false) {
+        tracing::warn!(%error, "failed to set overlay decorations");
+    }
+    if let Err(error) = window.set_shadow(false) {
+        tracing::warn!(%error, "failed to set overlay shadow");
+    }
+    if let Err(error) = window.set_always_on_top(true) {
+        tracing::warn!(%error, "failed to set overlay always-on-top");
+    }
}

As per coding guidelines: "Don't silently discard errors. No let _ = on Results."
let _ =on Results."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@desktop/src-tauri/src/main.rs` around lines 133 - 137, The apply_overlay_window_chrome function currently discards Result values from window.set_decorations, window.set_shadow, and window.set_always_on_top; update it to inspect each Result and log failures instead of using `let _ =`. For each call (set_decorations, set_shadow, set_always_on_top) match or use `if let Err(e)` and emit a clear error log (e.g., via the project's logger/tracing facility) that includes the operation name and the error value so failures are visible for debugging.desktop/src-tauri/src/main.rs-54-68 (1)
54-68: ⚠️ Potential issue | 🟡 Minor

Avoid silently discarding window operation errors.
Per coding guidelines, let _ = on Results should be avoided. These window operations can fail, and silently discarding errors makes debugging harder. Consider logging failures instead.

🔧 Proposed fix to log errors
fn toggle_overlay(app: &tauri::AppHandle) {
    if let Some(overlay) = app.get_webview_window("voice-overlay") {
        // Toggle visibility
        if overlay.is_visible().unwrap_or(false) {
-            let _ = overlay.hide();
+            if let Err(error) = overlay.hide() {
+                tracing::warn!(%error, "failed to hide voice overlay");
+            }
        } else {
            apply_overlay_window_chrome(&overlay);
-            let _ = overlay.show();
-            let _ = overlay.set_focus();
+            if let Err(error) = overlay.show() {
+                tracing::warn!(%error, "failed to show voice overlay");
+            }
+            if let Err(error) = overlay.set_focus() {
+                tracing::warn!(%error, "failed to focus voice overlay");
+            }
        }
    } else {
        // Create the overlay window on first toggle
        create_overlay_window(app);
    }
}

As per coding guidelines: "Don't silently discard errors. No let _ = on Results. Handle them, log them, or propagate them."
let _ =on Results. Handle them, log them, or propagate them."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@desktop/src-tauri/src/main.rs` around lines 54 - 68, The toggle_overlay function currently discards Results from window operations using `let _ =`, which hides failures; update toggle_overlay to handle and log errors instead: when calling `overlay.is_visible()`, `overlay.hide()`, `apply_overlay_window_chrome(&overlay)`, `overlay.show()`, and `overlay.set_focus()` return or propagate their Results (via match/if let Err(e) / ? where appropriate) and log any Err using the project's logging facility (or eprintln! if none) so failures are visible during debugging; keep the existing behavior of creating the window via `create_overlay_window` when the webview is absent.desktop/src-tauri/src/main.rs-152-161 (1)
152-161: ⚠️ Potential issue | 🟡 Minor

Avoid .unwrap() on shortcut registration — prefer graceful handling.

If shortcut registration fails (e.g., the shortcut is already taken by another application), .unwrap() will panic and crash the app. Consider using expect() with context or handling the error gracefully.

🔧 Proposed fix
.plugin(
    tauri_plugin_global_shortcut::Builder::new()
        .with_shortcut(shortcut)
-        .unwrap()
+        .expect("failed to register Alt+Space shortcut — may be in use by another app")
        .with_handler(move |app, _shortcut, event| {
            if event.state == tauri_plugin_global_shortcut::ShortcutState::Pressed {
                toggle_overlay(app);
            }
        })
        .build(),
)

src/config/watcher.rs-129-131 (1)
129-131: ⚠️ Potential issue | 🟡 Minor

Limit identity reloads to actual identity files under the agent root.
This basename-only check now treats any changed SPEECH.md as an identity edit, including files under the recursively watched skills/ trees. A skill/doc file with the same name can trigger unnecessary identity reloads for every agent.

src/api/webchat.rs-166-168 (1)
166-168: ⚠️ Potential issue | 🟡 Minor

Handle unexpected multipart fields explicitly.
Swallowing field.bytes().await here drops parse failures silently, so malformed trailing fields disappear without any trace. Either reject unknown fields or log/read failures before continuing. As per coding guidelines, "Don't silently discard errors. No let _ = on Results. Handle them, log them, or propagate them."

💡 Minimal fix
_ => {
-    // Skip unknown fields
-    let _ = field.bytes().await;
+    if let Err(error) = field.bytes().await {
+        tracing::warn!(%error, field = %name, "failed to drain unexpected multipart field");
+        return Err(StatusCode::BAD_REQUEST);
+    }
}

src/tools/reply.rs-467-470 (1)
467-470: ⚠️ Potential issue | 🟡 Minor

Don't silently drop replied_text lock failures.
As per coding guidelines, "Don't silently discard errors. No `let _ =` on Results. Handle them, log them, or propagate them."💡 Minimal fix
- if let Ok(mut slot) = self.replied_text.lock() { - *slot = Some(converted_content.clone()); - } + match self.replied_text.lock() { + Ok(mut replied_text) => *replied_text = Some(converted_content.clone()), + Err(poisoned_lock) => { + tracing::warn!( + conversation_id = %self.conversation_id, + "replied_text mutex was poisoned; recovering" + ); + *poisoned_lock.into_inner() = Some(converted_content.clone()); + } + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/tools/reply.rs` around lines 467 - 470, The code currently ignores errors from self.replied_text.lock(), which drops updates silently; change the lock handling to catch the Err case, log the poisoning error (using the existing logger/tracing infrastructure) and recover by calling err.into_inner() to obtain the guard so you can still set *slot = Some(converted_content.clone()); ensure you still handle the Ok(path) branch the same way and do not use let _ = on the Result — explicitly match on self.replied_text.lock() and log the error before recovering with into_inner() and performing the assignment.interface/src/routes/Orchestrate.tsx-142-160 (1)
142-160: ⚠️ Potential issue | 🟡 Minor

Group tracks by project_id, not the display name.

Two unrelated projects with the same project_name will be collapsed into one track here, and the render loop below will also reuse the same React key. Keep project_name as the label, but use a stable project identifier for grouping.

Also applies to: 247-248
src/agent/channel.rs-1537-1540 (1)
1537-1540: ⚠️ Potential issue | 🟡 Minor

Don't swallow poisoned replied_text locks.

If the reply tool panics while holding this mutex, .lock().ok() turns that into "no spoken response" with no signal. Please log or propagate the lock failure instead of silently discarding it. As per coding guidelines, "Don't silently discard errors. No let _ = on Results. Handle them, log them, or propagate them. The only exception is .ok() on channel sends where the receiver may be dropped".

Also applies to: 1916-1919
interface/src/hooks/useTtsPlayback.ts-49-58 (1)
49-58: ⚠️ Potential issue | 🟡 Minor

Add cleanup on unmount to stop playback and close AudioContext.
The hook has no cleanup function. If the component unmounts mid-playback, the audio continues running and the RAF-driven metering updates persist after unmount. Add a useEffect with cleanup that calls stop() and closes the audioContextRef.
🧹 Nitpick comments (4)
interface/src/components/OpenCodeEmbed.tsx (1)
124-127: Event listener never removed from controller.signal.

The onUnmount listener added to controller.signal is never removed. While this is minor since the controller is aborted on cleanup anyway, explicitly removing listeners is cleaner.

♻️ Proposed fix
  const probeController = new AbortController();
  const onUnmount = () => probeController.abort();
  controller.signal.addEventListener("abort", onUnmount);
+
+ const cleanup = () => {
+     controller.signal.removeEventListener("abort", onUnmount);
+ };

  const sseRes = await fetch(`${serverUrl}/global/event`, {

Then call cleanup() before returning from the async function (at lines 154 and 159).
src/factory/presets.rs (1)
79-86: Please add one test that a preset with SPEECH.md returns Some(speech).

Right now a missing or misnamed speech asset still passes every existing test because speech is optional. One main-agent assertion would catch silent regressions.

🧪 Example coverage
  #[test]
  fn load_returns_full_preset() {
      let preset = PresetRegistry::load("community-manager")
          .expect("community-manager preset should exist");
@@
      assert!(!preset.role.is_empty());
  }
+
+ #[test]
+ fn main_agent_loads_speech_guidance() {
+     let preset =
+         PresetRegistry::load("main-agent").expect("main-agent preset should exist");
+
+     assert!(
+         preset
+             .speech
+             .as_deref()
+             .is_some_and(|speech| !speech.trim().is_empty()),
+         "main-agent should ship with non-empty SPEECH.md guidance"
+     );
+ }

Also applies to: 111-116
src/api/tts.rs (2)
207-218: Consider logging when profile resolution fails.

The .ok()? pattern silently discards HTTP and JSON parsing errors. While the fallback behavior is correct (caller handles None), adding debug-level logging would aid troubleshooting when profile auto-detection fails.

🔧 Suggested improvement
  async fn resolve_default_profile_id(
      client: &reqwest::Client,
      voicebox_url: &str,
  ) -> Option<String> {
      let endpoint = format!("{}/profiles", voicebox_url.trim_end_matches('/'));
-     let response = client.get(endpoint).send().await.ok()?;
+     let response = client.get(&endpoint).send().await.map_err(|e| {
+         tracing::debug!(%e, %endpoint, "failed to fetch Voicebox profiles for default resolution");
+         e
+     }).ok()?;
      if !response.status().is_success() {
+         tracing::debug!(status = %response.status(), "Voicebox profiles returned non-success status");
          return None;
      }
-     let profiles = response.json::<Vec<VoiceboxProfile>>().await.ok()?;
+     let profiles = response.json::<Vec<VoiceboxProfile>>().await.map_err(|e| {
+         tracing::debug!(%e, "failed to parse Voicebox profiles response");
+         e
+     }).ok()?;
      profiles.first().map(|p| p.id.clone())
  }
62-68: Consider reusing the HTTP client across requests.

A new `reqwest::Client` is constructed on every TTS request. The client maintains connection pools and is designed to be reused. Consider storing a shared client in `ApiState` or using a `OnceCell`/`LazyLock` for better performance.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/api/tts.rs` around lines 62 - 68, The TTS handler currently builds a new reqwest::Client inside the function (the local variable `client` from `reqwest::Client::builder()`), which wastes connection pools; instead add a shared/reused client and use it for requests: e.g., add a field to `ApiState` (or a global `OnceCell`/`LazyLock`) to hold a pre-built `reqwest::Client` and replace the local builder call with retrieving that shared client (handle the Option/initialization or return the same `StatusCode::INTERNAL_SERVER_ERROR` on startup/build failure). Ensure you remove the per-request `Client::builder()` usage and reference the shared client where `client` is currently used.
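The shared-client pattern suggested above can be sketched with `std::sync::LazyLock`. The `Client` type below is a hypothetical dependency-free stand-in for `reqwest::Client`, and `BUILD_COUNT` exists only to make the single construction observable; with the real crate the static would simply hold a pre-built `reqwest::Client` whose connection pool is then reused by every request:

```rust
use std::sync::LazyLock;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts constructions so the "built exactly once" property is observable.
static BUILD_COUNT: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for `reqwest::Client`; with the real crate this
// would be `static HTTP_CLIENT: LazyLock<reqwest::Client> =
// LazyLock::new(|| reqwest::Client::new());`.
struct Client;

impl Client {
    fn new() -> Self {
        BUILD_COUNT.fetch_add(1, Ordering::SeqCst);
        Client
    }
}

// Initialized lazily on first access, then shared by all handlers.
static HTTP_CLIENT: LazyLock<Client> = LazyLock::new(Client::new);

fn client() -> &'static Client {
    // Every call returns the same client instead of rebuilding one.
    &HTTP_CLIENT
}
```

Storing the client in `ApiState` works the same way; the `LazyLock` variant just avoids threading it through constructors.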
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@interface/src/components/WebChatPanel.tsx`:
- Around line 193-210: The spoken-response matching currently deduplicates by
text in handleSpokenResponse, which causes collisions and leaks across sessions;
change the logic to use a stable per-message identifier from the event (e.g.,
event.message_id or event.turn_id) instead of full_text/spoken_text for both the
duplicate check and the created id (update the id generation and duplicate
predicate in handleSpokenResponse and the other spoken-block insertion logic
that mirrors it). If the event doesn't provide a stable id, fall back to a
combined stable key like `${event.channel_id}:${event.sequence}` or include a
monotonic counter scoped to sessionId so duplicates are resolved by stable
message id rather than text. Ensure you reference SpokenResponseEvent,
sessionId, setSpokenBlocks, and the created block id consistently.
In `@interface/src/hooks/useAudioRecorder.ts`:
- Around line 177-181: The catch in startRecording() swallows errors, preventing
callers from handling failures; modify startRecording() so the catch still
performs cleanupAudioGraph() and sets state back to "idle" but then rethrow the
caught error (or return a rejected Promise) so callers can react to
getUserMedia/recorder setup failures instead of assuming recording started;
ensure cleanupAudioGraph() and setState("idle") are called before rethrowing so
resources are freed and UI can roll back.
- Around line 64-78: Add a React useEffect cleanup that calls the existing
cleanupAudioGraph() when the hook unmounts and also stops any active
MediaRecorder by calling mediaRecorderRef.current?.stop(); specifically, add a
useEffect with an empty dependency array that returns a cleanup function which
first calls mediaRecorderRef.current?.stop() (if present) and then calls
cleanupAudioGraph() so the analyserRef, streamRef, audioContextRef, and other
resources are always released on unmount.
In `@interface/src/hooks/useTtsPlayback.ts`:
- Around line 83-154: Add a per-call cancellation token to speak(): introduce a
requestIdRef (number) that you increment at the top of speak() and also in
stop(), capture localRequestId at start of the async flow, and after every await
(after api.ttsGenerate, after decodeAudioData, and before source.start()) check
that localRequestId === requestIdRef.current and if not bail out and clean up
any created nodes/contexts (clear analyserRef/sourceRef, stopMetering,
setState("idle") and resolve/return). Use these symbols to find places to
change: speak(), stop(), requestIdRef, audioContextRef, sourceRef, analyserRef,
playbackDoneRef, smoothedLevelRef. Ensure playbackDoneRef and source.onended are
only assigned when the token is current so stale calls cannot call
source.start() or change playback state.
In `@interface/src/routes/Orchestrate.tsx`:
- Around line 354-356: The code currently falls back to worker.id when
opencode_session_id is null, which incorrectly mounts an OpenCode session that
doesn't exist; update the render logic in Orchestrate.tsx around the
OpenCodeEmbed usage so that OpenCodeEmbed receives sessionId only when
worker.opencode_session_id is non-null (keep port as worker.opencode_port), and
when opencode_session_id is null render a proper loading/unsupported placeholder
(or null) instead of using worker.id; reference the OpenCodeEmbed component and
the worker.opencode_session_id / worker.id props to locate the change.
- Around line 91-98: The code is overwriting worker.status with
liveWorker?.status, which breaks WorkerColumn's exact-enum checks; instead
preserve the normalized worker.status (keep using worker.status for the status
field) and move the SSE/descriptive string from liveWorker?.status into a new
separate field (e.g., live_status or liveStatusText) on the pushed object so UI
components can read normalized status from status and descriptive/live text from
the new property; update the object created in the result.push(...) near
activeWorkers[worker.id] to stop assigning liveWorker?.status to status and add
a distinct property for the live status text.
In `@interface/src/routes/Overlay.tsx`:
- Around line 81-98: The transcript gets duplicate assistant entries because
handleSpokenResponse always pushes a new assistant turn even if an
outbound_message already appended the pending assistant reply; modify
handleSpokenResponse (and the similar block for spoken_response at lines
109-118) to detect and update the existing pending assistant item instead of
appending: use sessionId and spokenReceivedRef/ttsStartedRef as currently used
to match the turn, then call setTranscript with an updater that finds the last
assistant entry (or a pending marker created by outbound_message) and replaces
its text with event.full_text (or event.spoken_text where appropriate); keep the
existing behavior of adding a new entry only if no pending assistant row exists.
In `@src/agent/channel_attachments.rs`:
- Around line 351-357: The byte-based transcribe_audio_bytes path currently
infers audio format only from mime_type, causing mismatches when clients send
generic MIME types; update transcribe_audio_bytes to call
audio_format_for_attachment (or replicate its logic) passing both mime_type and
filename as fallback so the filename extension is used when MIME is
non-specific, and apply the same filename-fallback logic in the other byte-path
locations around the other transcribe functions (the blocks referenced near the
transcribe_audio_bytes function and the later multipart/bytes handling) so all
byte-based inference mirrors the multipart-based audio_format_for_attachment
behavior.
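A minimal sketch of the mime-first, filename-fallback inference described above. The format table and the treatment of `application/octet-stream` are illustrative assumptions, not the actual `audio_format_for_attachment` logic:

```rust
// Prefer a specific MIME type; fall back to the filename extension when
// the MIME type is generic or absent.
fn infer_audio_format(mime_type: Option<&str>, filename: Option<&str>) -> Option<String> {
    if let Some(mime) = mime_type {
        match mime {
            "audio/ogg" => return Some("ogg".to_string()),
            "audio/mpeg" => return Some("mp3".to_string()),
            "audio/wav" | "audio/x-wav" => return Some("wav".to_string()),
            // Generic types carry no format information; fall through.
            _ => {}
        }
    }
    // Filename fallback: use the lowercased extension, if any.
    filename
        .and_then(|name| name.rsplit_once('.'))
        .map(|(_, ext)| ext.to_ascii_lowercase())
}
```

Applying the same helper on every byte-based path keeps the inference consistent with the multipart path.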
In `@src/api/webchat.rs`:
- Around line 155-164: The "audio" branch currently calls field.bytes().await
and assigns to audio_bytes (then later base64-encoded), allowing unbounded
in-memory copies; replace that with a bounded read: read the multipart Field in
chunks (e.g., field.chunk() or AsyncRead + tokio_util::io::StreamReader) and
accumulate into a buffer while enforcing a MAX_AUDIO_BYTES constant (reject with
StatusCode::PAYLOAD_TOO_LARGE or BAD_REQUEST if exceeded) or stream directly to
the transcription path without a full Vec<u8> copy. Target the "audio" match
arm, the use of field.bytes().await, and the audio_bytes variable/transcription
helper when making this change.
In `@src/api/workers.rs`:
- Around line 57-60: The worker detail payload is missing project_id and
project_name so the UI loses project context; update the data model and mapping
in src/api/workers.rs to include those fields in the detail response as well as
the list rows—add project_id: Option<String> and project_name: Option<String> to
the detail struct (the same structs used for list rows and detail), and ensure
worker_detail() populates them from the existing join logic (the same join that
feeds the list rows, referenced around lines 155-159) so the detail endpoint
returns the same project-linked values.
In `@src/config/runtime.rs`:
- Around line 93-97: The voicebox_url is currently treated as startup-only
(stored in voicebox_url: ArcSwap<String>) and never updated in reload_config(),
so per-agent or default config changes (including empty to disable) are ignored;
update the code so reload_config() reads the voice.voicebox_url from the
resolved agent/default config and calls
voicebox_url.store(Arc::new(new_value.clone())) (or clears it to empty/None
semantics) to apply changes at runtime; likewise apply the same fix to the other
startup-only fields referenced around the same area (the other ArcSwap<String>
fields at the 169-172 equivalent), ensuring initialization still respects
env/config but runtime reload uses the same reload_config() update path.
In `@src/conversation/history.rs`:
- Around line 837-843: The detail endpoint is missing project metadata: update
the SQL in get_worker_detail to include a LEFT JOIN projects p ON w.project_id =
p.id and select w.project_id and p.name AS project_name (matching
list_worker_runs), then add corresponding fields to the WorkerDetailRow struct
and populate them where the query rows are mapped into WorkerDetailRow so the
detail response returns project_id and project_name consistently with
list_worker_runs.
In `@src/lib.rs`:
- Around line 313-319: The SpokenResponse struct uses full_text as a join key
which is brittle; add a stable persisted message ID field (e.g., message_id:
MessageId or persisted_message_id: MessageId) to SpokenResponse and populate it
wherever SpokenResponse instances are constructed so downstream consumers
correlate on this message_id instead of full_text; update any
serialization/deserialization and references to use SpokenResponse::message_id
for spoken-response correlation (identify constructors/usages that create
SpokenResponse and the consumer code that matches events).
---
Minor comments:
In `@desktop/src-tauri/src/main.rs`:
- Around line 133-137: The apply_overlay_window_chrome function currently
discards Result values from window.set_decorations, window.set_shadow, and
window.set_always_on_top; update it to inspect each Result and log failures
instead of using `let _ =`. For each call (set_decorations, set_shadow,
set_always_on_top) match or use `if let Err(e)` and emit a clear error log
(e.g., via the project's logger/tracing facility) that includes the operation
name and the error value so failures are visible for debugging.
- Around line 54-68: The toggle_overlay function currently discards Results from
window operations using `let _ =`, which hides failures; update toggle_overlay
to handle and log errors instead: when calling `overlay.is_visible()`,
`overlay.hide()`, `apply_overlay_window_chrome(&overlay)`, `overlay.show()`, and
`overlay.set_focus()` return or propagate their Results (via match/if let Err(e)
/ ? where appropriate) and log any Err using the project's logging facility (or
eprintln! if none) so failures are visible during debugging; keep the existing
behavior of creating the window via `create_overlay_window` when the webview is
absent.
- Around line 152-161: The code currently calls .unwrap() on
tauri_plugin_global_shortcut::Builder::with_shortcut which will panic if
registration fails; change this to handle the Result instead (e.g., use
.map_err/.and_then or a match) so you can log the error and either skip
registering the shortcut plugin or return a controlled Err; specifically locate
the Builder chain where with_shortcut(shortcut).unwrap() is called and replace
unwrap with error handling that logs the failure (using your app logger) and
avoids panicking while keeping the .with_handler and toggle_overlay logic
intact.
In `@interface/src/hooks/useTtsPlayback.ts`:
- Around line 49-58: The hook useTtsPlayback is missing an unmount cleanup which
can leave audio playing and RAF metering running; add a useEffect that returns a
cleanup function which calls the hook's stop() to halt playback, cancels any
pending animation frame via animationFrameRef, disconnects/clears sourceRef and
analyserRef, sets playbackDoneRef to null, and closes audioContextRef (await
close if needed) and nulls it so the AudioContext is released when the component
unmounts. Ensure the cleanup references the existing stop() function and the
refs (audioContextRef, sourceRef, analyserRef, animationFrameRef,
playbackDoneRef) to locate the code to modify.
In `@interface/src/routes/Orchestrate.tsx`:
- Around line 142-160: Group by the stable project identifier instead of the
display name: change the grouping key in the byProject build to use
worker.project_id (e.g. const groupKey = worker.project_id ??
`ungrouped:${worker.project_name ?? 'Ungrouped'}`) while still storing the human
label as name = worker.project_name ?? 'Ungrouped'; when creating entries and
the mapped object (the block that returns { name, projectId: ... }), ensure
projectId is the actual numeric/string worker.project_id (or null) and use that
stable projectId as the React key in the render loop; apply the same change to
the other instance noted (lines ~247-248) so unrelated projects with identical
project_name are not collapsed.
In `@src/agent/channel.rs`:
- Around line 1537-1540: The code currently swallows poisoned mutex errors by
calling replied_text.lock().ok(), losing error context; change this to
explicitly handle PoisonError instead of .ok(): call replied_text.lock() and
match the Result, log the poison error (including context like replied_flag and
operation), and recover by using poison_err.into_inner() to obtain the guard (or
propagate the error via Result if the surrounding function returns a Result).
Replace the .lock().ok().and_then(|mut slot| slot.take()) pattern in the replied
handling (the block using replied_flag and replied_text) with this explicit
match/log/recover flow, and apply the same fix at the other occurrence mentioned
(around lines 1916-1919) so poisoned locks are not silently discarded.
In `@src/api/webchat.rs`:
- Around line 166-168: The match arm that currently does `let _ =
field.bytes().await;` silently discards any errors from parsing unknown
multipart fields; update the `_` arm in the multipart field handling (the match
that binds `field`) to explicitly handle the Result from `field.bytes().await` —
either propagate an error (return Err with a descriptive context) or log the
failure (using the existing logger) and continue, e.g. by matching on
`field.bytes().await` and calling `process_logger.error`/`tracing::error` with
the error if it fails or discarding the bytes only on Ok; ensure you reference
the same variable names (`field`, `field.bytes().await`) and keep behavior
consistent with other arms.
In `@src/config/watcher.rs`:
- Around line 129-131: The current identity_changed check only matches basenames
and wrongly treats files like skills/**/SPEECH.md as identity edits; change the
predicate used in the changed_paths.iter().any closure (the identity_changed
computation) to first verify the changed path is directly under the agent root
(use Path::strip_prefix(agent_root) and ensure the resulting relative path has
no parent, i.e. rel.parent().is_none()), then match the file_name against
"SOUL.md" | "IDENTITY.md" | "ROLE.md" | "SPEECH.md"; this ensures only top-level
identity files trigger reloads.
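The stricter predicate can be sketched as below. One subtlety the prompt glosses over: `Path::parent()` returns `Some("")` for a bare filename, so the "directly under the root" test compares against the empty path rather than using `is_none()`:

```rust
use std::path::Path;

// True only for identity files sitting directly under the agent root.
fn is_identity_edit(agent_root: &Path, changed: &Path) -> bool {
    let Ok(rel) = changed.strip_prefix(agent_root) else {
        return false;
    };
    // A non-empty parent means the file lives in a subdirectory such as
    // skills/**/SPEECH.md, which must not trigger an identity reload.
    if rel.parent().map(|p| p != Path::new("")).unwrap_or(false) {
        return false;
    }
    matches!(
        rel.file_name().and_then(|n| n.to_str()),
        Some("SOUL.md" | "IDENTITY.md" | "ROLE.md" | "SPEECH.md")
    )
}
```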
In `@src/tools/reply.rs`:
- Around line 467-470: The code currently ignores errors from
self.replied_text.lock(), which drops updates silently; change the lock handling
to catch the Err case, log the poisoning error (using the existing
logger/tracing infrastructure) and recover by calling err.into_inner() to obtain
the guard so you can still set *slot = Some(converted_content.clone()); ensure
you still handle the Ok(path) branch the same way and do not use let _ = on the
Result — explicitly match on self.replied_text.lock() and log the error before
recovering with into_inner() and performing the assignment.
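The recover-from-poison pattern both mutex comments ask for can be sketched as follows; `eprintln!` stands in for the project's tracing facility, and `store_replied_text` is an illustrative helper rather than the real call site:

```rust
use std::sync::{Arc, Mutex};

// Log the poisoning, then reclaim the guard via `PoisonError::into_inner`
// and perform the write anyway, instead of discarding it via `.lock().ok()`.
fn store_replied_text(slot: &Arc<Mutex<Option<String>>>, text: String) {
    let mut guard = slot.lock().unwrap_or_else(|poisoned| {
        eprintln!("replied_text mutex poisoned; recovering guard");
        poisoned.into_inner()
    });
    *guard = Some(text);
}
```

A poisoned `Mutex` only signals that a thread panicked while holding it; the data is still accessible, so recovering and logging is strictly better than silently dropping the update.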
---
Nitpick comments:
In `@interface/src/components/OpenCodeEmbed.tsx`:
- Around line 124-127: The added abort listener on controller.signal (via
controller.signal.addEventListener("abort", onUnmount)) is never removed; update
the async routine that creates const probeController = new AbortController() and
const onUnmount = () => probeController.abort() so that you remove the listener
before any early return and on normal completion — call
controller.signal.removeEventListener("abort", onUnmount) inside the existing
cleanup() (or invoke cleanup() before returning) and ensure both early-return
paths in the async function call cleanup() so the event listener is always
removed.
In `@src/api/tts.rs`:
- Around line 207-218: The function resolve_default_profile_id currently
swallows errors via .ok()? which makes failures silent; change the
implementation to explicitly handle the request and JSON parsing errors so you
can emit debug-level logs: perform the GET on the constructed endpoint and if
response.send() fails log the error with context (including
voicebox_url/endpoint), if response.status().is_success() is false log the
status and body or a short message, and when
response.json::<Vec<VoiceboxProfile>>() fails log the parse error; keep
returning None on failure but replace the chained .ok()? calls with match/if let
so you can call your preferred logger (e.g. debug! or tracing::debug!)
referencing resolve_default_profile_id, client, voicebox_url, endpoint,
response, and profiles to aid troubleshooting.
- Around line 62-68: The TTS handler currently builds a new reqwest::Client
inside the function (the local variable `client` from
`reqwest::Client::builder()`), which wastes connection pools; instead add a
shared/reused client and use it for requests: e.g., add a field to `ApiState`
(or a global `OnceCell`/`LazyLock`) to hold a pre-built `reqwest::Client` and
replace the local builder call with retrieving that shared client (handle the
Option/initialization or return the same `StatusCode::INTERNAL_SERVER_ERROR` on
startup/build failure). Ensure you remove the per-request `Client::builder()`
usage and reference the shared client where `client` is currently used.
In `@src/factory/presets.rs`:
- Around line 79-86: Add a unit test that verifies a preset with a SPEECH.md
file returns Some(speech): create a small fixture (or temp dir) containing a
preset id folder with SPEECH.md, load the preset via the existing Preset loading
path (use Preset::load or the codepath that calls Self::load_optional_file), and
assert preset.speech.is_some() and that the contents match the expected string
(e.g., contains "main-agent" or the known fixture text); place the test near
other Preset tests in presets.rs so it exercises the load_optional_file ->
Preset construction path referenced around the speech handling code.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 45e5566e-0044-4af1-90f3-bb6c5e5cef37
⛔ Files ignored due to path filters (7)
desktop/src-tauri/Cargo.lock is excluded by !**/*.lock
desktop/src-tauri/Cargo.toml is excluded by !**/*.toml
desktop/src-tauri/capabilities/default.json is excluded by !**/*.json
desktop/src-tauri/gen/schemas/acl-manifests.json is excluded by !**/gen/**, !**/*.json
desktop/src-tauri/gen/schemas/capabilities.json is excluded by !**/gen/**, !**/*.json
desktop/src-tauri/gen/schemas/desktop-schema.json is excluded by !**/gen/**, !**/*.json
desktop/src-tauri/gen/schemas/macOS-schema.json is excluded by !**/gen/**, !**/*.json
📒 Files selected for processing (43)
desktop/src-tauri/src/main.rs, interface/src/App.tsx, interface/src/api/client.ts, interface/src/components/OpenCodeEmbed.tsx, interface/src/components/Sidebar.tsx, interface/src/components/WebChatPanel.tsx, interface/src/hooks/useAudioRecorder.ts, interface/src/hooks/useTtsPlayback.ts, interface/src/router.tsx, interface/src/routes/AgentConfig.tsx, interface/src/routes/AgentDetail.tsx, interface/src/routes/AgentWorkers.tsx, interface/src/routes/Orchestrate.tsx, interface/src/routes/Overlay.tsx, interface/src/ui/style/style.scss, presets/main-agent/SPEECH.md, src/agent/channel.rs, src/agent/channel_attachments.rs, src/agent/channel_history.rs, src/agent/cortex.rs, src/api.rs, src/api/agents.rs, src/api/server.rs, src/api/state.rs, src/api/system.rs, src/api/tts.rs, src/api/webchat.rs, src/api/workers.rs, src/config/runtime.rs, src/config/types.rs, src/config/watcher.rs, src/conversation/history.rs, src/factory/presets.rs, src/identity.rs, src/identity/files.rs, src/lib.rs, src/settings/store.rs, src/tools.rs, src/tools/factory_create_agent.rs, src/tools/factory_load_preset.rs, src/tools/factory_update_identity.rs, src/tools/reply.rs, tests/context_dump.rs
```ts
} catch (error) {
  console.error("Failed to start recording:", error);
  cleanupAudioGraph();
  setState("idle");
}
```
Don't swallow startRecording() failures.
This catch block always resolves, so callers cannot roll back optimistic UI when getUserMedia is denied or recorder setup fails. The overlay ends up stuck in a recording state even though capture never started.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@interface/src/hooks/useAudioRecorder.ts` around lines 177 - 181, The catch in
startRecording() swallows errors, preventing callers from handling failures;
modify startRecording() so the catch still performs cleanupAudioGraph() and sets
state back to "idle" but then rethrow the caught error (or return a rejected
Promise) so callers can react to getUserMedia/recorder setup failures instead of
assuming recording started; ensure cleanupAudioGraph() and setState("idle") are
called before rethrowing so resources are freed and UI can roll back.
```ts
const liveWorker = activeWorkers[worker.id];
result.push({
  ...worker,
  agent_id: agent.id,
  agent_name: agentName,
  status: liveWorker?.status ?? worker.status,
  live_tool_calls: liveWorker?.toolCalls,
});
```
Keep worker.status normalized to "running"/"idle".
WorkerColumn keys its live UI off exact enum values. Overwriting status with liveWorker.status turns active workers into arbitrary strings, which hides cancel/live-duration behavior as soon as SSE publishes descriptive status text.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@interface/src/routes/Orchestrate.tsx` around lines 91 - 98, The code is
overwriting worker.status with liveWorker?.status, which breaks WorkerColumn's
exact-enum checks; instead preserve the normalized worker.status (keep using
worker.status for the status field) and move the SSE/descriptive string from
liveWorker?.status into a new separate field (e.g., live_status or
liveStatusText) on the pushed object so UI components can read normalized status
from status and descriptive/live text from the new property; update the object
created in the result.push(...) near activeWorkers[worker.id] to stop assigning
liveWorker?.status to status and add a distinct property for the live status
text.
```rust
"audio" => {
    audio_mime = field.content_type().map(|s| s.to_string());
    audio_filename = field.file_name().map(|s| s.to_string());
    audio_bytes = Some(
        field
            .bytes()
            .await
            .map_err(|_| StatusCode::BAD_REQUEST)?
            .to_vec(),
    );
```
Cap audio uploads before buffering them fully.
This copies the entire audio part into a Vec<u8>, and the transcription helper then base64-encodes it again. Accepted uploads therefore cost multiple full in-memory copies on the request path, which makes large audio payloads an easy memory spike. Please put a tight per-upload limit in front of this branch or switch to chunked reads.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/api/webchat.rs` around lines 155 - 164, The "audio" branch currently
calls field.bytes().await and assigns to audio_bytes (then later
base64-encoded), allowing unbounded in-memory copies; replace that with a
bounded read: read the multipart Field in chunks (e.g., field.chunk() or
AsyncRead + tokio_util::io::StreamReader) and accumulate into a buffer while
enforcing a MAX_AUDIO_BYTES constant (reject with StatusCode::PAYLOAD_TOO_LARGE
or BAD_REQUEST if exceeded) or stream directly to the transcription path without
a full Vec<u8> copy. Target the "audio" match arm, the use of
field.bytes().await, and the audio_bytes variable/transcription helper when
making this change.
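The bounded accumulation the comment asks for can be sketched as a pure function over byte chunks; in the real handler the chunks would come from `field.chunk().await`, and the 10 MiB `MAX_AUDIO_BYTES` cap is an assumed value, not one from the codebase:

```rust
// Assumed per-upload limit for illustration.
const MAX_AUDIO_BYTES: usize = 10 * 1024 * 1024;

// Accumulate chunks, rejecting the upload before the buffer ever exceeds
// `limit` (the error maps to StatusCode::PAYLOAD_TOO_LARGE in the handler).
fn read_bounded<I>(chunks: I, limit: usize) -> Result<Vec<u8>, &'static str>
where
    I: IntoIterator<Item = Vec<u8>>,
{
    let mut buf = Vec::new();
    for chunk in chunks {
        // Check before extending so memory use stays capped at `limit`.
        if buf.len() + chunk.len() > limit {
            return Err("payload too large");
        }
        buf.extend_from_slice(&chunk);
    }
    Ok(buf)
}
```

Streaming straight into the transcription path would avoid even this single buffered copy, but a capped buffer already removes the unbounded-memory risk.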
```rust
/// Project ID this worker is linked to.
project_id: Option<String>,
/// Project name (resolved via join).
project_name: Option<String>,
```
project_id and project_name need to be in the detail payload too.
These fields are now on list rows only. As soon as the UI loads a worker through worker_detail(), the new project context disappears, so the detail view cannot render the same project-linked state. If the detail query row does not already carry them, that needs plumbing too.
Also applies to: 155-159
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/api/workers.rs` around lines 57 - 60, The worker detail payload is
missing project_id and project_name so the UI loses project context; update the
data model and mapping in src/api/workers.rs to include those fields in the
detail response as well as the list rows—add project_id: Option<String> and
project_name: Option<String> to the detail struct (the same structs used for
list rows and detail), and ensure worker_detail() populates them from the
existing join logic (the same join that feeds the list rows, referenced around
lines 155-159) so the detail endpoint returns the same project-linked values.
```rust
/// Voicebox server URL for TTS output. When non-empty, the channel
/// generates a spoken version of replies and broadcasts it as a
/// `SpokenResponse` event. Configurable via `SPACEBOT_VOICEBOX_URL` env
/// or `[voice] voicebox_url` in the agent config.
pub voicebox_url: ArcSwap<String>,
```
voicebox_url is wired as startup-only env state.
This bypasses the resolved agent/default config entirely, and reload_config() never refreshes it, so per-agent voicebox_url changes — including an explicit empty value to disable spoken replies — will not take effect after startup.
Also applies to: 169-172
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/config/runtime.rs` around lines 93 - 97, The voicebox_url is currently
treated as startup-only (stored in voicebox_url: ArcSwap<String>) and never
updated in reload_config(), so per-agent or default config changes (including
empty to disable) are ignored; update the code so reload_config() reads the
voice.voicebox_url from the resolved agent/default config and calls
voicebox_url.store(Arc::new(new_value.clone())) (or clears it to empty/None
semantics) to apply changes at runtime; likewise apply the same fix to the other
startup-only fields referenced around the same area (the other ArcSwap<String>
fields at the 169-172 equivalent), ensuring initialization still respects
env/config but runtime reload uses the same reload_config() update path.
```rust
w.tool_calls, w.opencode_port, w.opencode_session_id, w.directory, \
w.interactive, \
c.display_name as channel_name, \
w.project_id, p.name as project_name \
FROM worker_runs w \
LEFT JOIN channels c ON w.channel_id = c.id \
LEFT JOIN projects p ON w.project_id = p.id \
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
set -euo pipefail
rg -n -C3 'list_worker_runs|get_worker_detail|WorkerRunRow|WorkerDetailRow|project_id|project_name' \
  src/conversation/history.rs src/api/workers.rs interface/src/api/client.ts interface/src/routes interface/src/components
```

Repository: spacedriveapp/spacebot
Length of output: 12403

🏁 Script executed:

```bash
sed -n '906,950p' src/conversation/history.rs
```

Repository: spacedriveapp/spacebot
Length of output: 2166

🏁 Script executed:

```bash
sed -n '990,1020p' src/conversation/history.rs
```

Repository: spacedriveapp/spacebot
Length of output: 997
Add project_id and project_name to the detail response path.
get_worker_detail() does not select or return project_id / project_name, even though list_worker_runs() does. The detail SQL query (lines 911–923) is missing the LEFT JOIN projects and the project field selections, and WorkerDetailRow (lines 993–1010) lacks those fields entirely. This creates a schema mismatch where the detail endpoint omits project metadata present in the list.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/conversation/history.rs` around lines 837 - 843, The detail endpoint is
missing project metadata: update the SQL in get_worker_detail to include a LEFT
JOIN projects p ON w.project_id = p.id and select w.project_id and p.name AS
project_name (matching list_worker_runs), then add corresponding fields to the
WorkerDetailRow struct and populate them where the query rows are mapped into
WorkerDetailRow so the detail response returns project_id and project_name
consistently with list_worker_runs.
```rust
SpokenResponse {
    agent_id: AgentId,
    channel_id: ChannelId,
    /// The short, conversational spoken text (1-3 sentences).
    spoken_text: String,
    /// The full response text this was derived from.
    full_text: String,
```
Use a stable message ID for spoken-response correlation.
full_text is an ambiguous join key. Repeated assistant replies or small formatting differences can attach the spoken-response event to the wrong message card. Include the persisted conversation message ID here and correlate on that downstream.
Suggested event shape
```diff
 SpokenResponse {
     agent_id: AgentId,
     channel_id: ChannelId,
+    message_id: String,
     /// The short, conversational spoken text (1-3 sentences).
     spoken_text: String,
     /// The full response text this was derived from.
     full_text: String,
 },
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```rust
SpokenResponse {
    agent_id: AgentId,
    channel_id: ChannelId,
    message_id: String,
    /// The short, conversational spoken text (1-3 sentences).
    spoken_text: String,
    /// The full response text this was derived from.
    full_text: String,
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lib.rs` around lines 313 - 319, The SpokenResponse struct uses full_text
as a join key which is brittle; add a stable persisted message ID field (e.g.,
message_id: MessageId or persisted_message_id: MessageId) to SpokenResponse and
populate it wherever SpokenResponse instances are constructed so downstream
consumers correlate on this message_id instead of full_text; update any
serialization/deserialization and references to use SpokenResponse::message_id
for spoken-response correlation (identify constructors/usages that create
SpokenResponse and the consumer code that matches events).
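The dedup-by-id behavior this change enables can be sketched with a map keyed by the persisted id; `message_id` is the proposed field, and `attach_spoken` is an illustrative consumer, not code from the repository:

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
struct SpokenBlock {
    message_id: String,
    spoken_text: String,
}

// Returns true when this is a new block; repeated events for the same
// message update in place, even when the text is identical to another
// message's text (the collision that full_text keying cannot avoid).
fn attach_spoken(
    blocks: &mut HashMap<String, SpokenBlock>,
    message_id: &str,
    spoken_text: &str,
) -> bool {
    blocks
        .insert(
            message_id.to_string(),
            SpokenBlock {
                message_id: message_id.to_string(),
                spoken_text: spoken_text.to_string(),
            },
        )
        .is_none()
}
```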
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
interface/src/hooks/useChannelLiveState.ts (1)
251-268: ⚠️ Potential issue | 🟠 Major

Replace the provisional stream id with `event.message_id`.

When a reply starts as `outbound_message_delta`, this branch keeps the temporary `stream-*` id because it only updates `content`. `WebChatPanel` now correlates spoken replies by `message_id`, so streamed replies never pick up their spoken block, and they can still diverge from history keyed by the persisted id.

💡 Suggested fix
  if (streamIndex >= 0) {
    const streamItem = timeline[streamIndex];
    if (streamItem.type === "message") {
-     timeline[streamIndex] = { ...streamItem, content: event.text };
+     timeline[streamIndex] = {
+       ...streamItem,
+       id: event.message_id ?? streamItem.id,
+       content: event.text,
+     };
    }
  } else {

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interface/src/hooks/useChannelLiveState.ts` around lines 251 - 268, In useChannelLiveState, when handling streaming events you must ensure the provisional stream id is replaced with the persisted event.message_id: when pushing a new message via assistantMessageItem, use event.message_id if present (fallback to `out-${generateId()}`), and when updating an existing streaming timeline entry (streamingMessageId branch where you set timeline[streamIndex] = { ...streamItem, content: event.text }), also set its id to event.message_id when event.message_id is available so the streamed reply keys match persisted history and WebChatPanel's spoken-block correlation.

src/agent/channel.rs (1)
851-872: ⚠️ Potential issue | 🔴 Critical

Thread outbound `message_id`s through the fallback send path.

`RoutedResponse` now requires `message_id`, but `send_routed()` still constructs it without one and `send_outbound_text()` only returns `bool`. That leaves this file non-compiling at lines 856 and 865, and it also means the three fallback `maybe_generate_spoken_response(...)` calls below have no way to produce the required `RepliedMessage { text, message_id }`. Please carry the outbound ID through the helper and return it from fallback sends instead of collapsing everything to a success flag.

Also applies to: 2665-2691, 2768-2777, 2838-2847, 2899-2905
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/channel.rs` around lines 851 - 872, send_routed currently builds RoutedResponse without supplying the required message_id; change the fallback send path to thread the outbound message id through the helpers: update send_outbound_text (and the other fallback helpers that call maybe_generate_spoken_response) to return the RepliedMessage (or at least Option<MessageId>) instead of bool, have maybe_generate_spoken_response produce RepliedMessage { text, message_id }, and then modify send_routed to use that returned message_id when constructing RoutedResponse; update all call sites (including the three fallback calls around send_routed and the other locations mentioned) to propagate the returned message_id into the RoutedResponse construction so compilation succeeds and outbound IDs are preserved.
♻️ Duplicate comments (1)
interface/src/hooks/useAudioRecorder.ts (1)
196-200: ⚠️ Potential issue | 🟠 Major

Rethrow setup failures after cleanup.

Permission denials and recorder setup errors still resolve `startRecording()` successfully here. Keep the cleanup/reset, but rethrow so callers can surface the failure and unwind any optimistic UI.

🔧 Minimal fix
  } catch (error) {
    console.error("Failed to start recording:", error);
    cleanupAudioGraph();
    setState("idle");
+   throw error;
  }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interface/src/hooks/useAudioRecorder.ts` around lines 196 - 200, The catch block in startRecording currently logs the error, calls cleanupAudioGraph(), and setState("idle") but swallows the exception; modify the catch to perform the same cleanup/reset (call cleanupAudioGraph() and setState("idle")) and then rethrow the caught error so callers of startRecording can observe failures (i.e. preserve existing calls to cleanupAudioGraph and setState inside the catch, then throw the original error variable).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a7971fb5-694d-4922-aae1-de1d14295716
📒 Files selected for processing (12)
interface/src/api/client.ts
interface/src/components/WebChatPanel.tsx
interface/src/hooks/useAudioRecorder.ts
interface/src/hooks/useChannelLiveState.ts
interface/src/hooks/useTtsPlayback.ts
src/agent/channel.rs
src/api/state.rs
src/conversation/history.rs
src/lib.rs
src/main.rs
src/messaging/webchat.rs
src/tools/reply.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- src/api/state.rs
- src/lib.rs
const startRecording = useCallback(async () => {
  if (state !== "idle") return;

  try {
    const stream = await navigator.mediaDevices.getUserMedia({
      audio: {
        echoCancellation: true,
        noiseSuppression: true,
        autoGainControl: true,
      },
    });
    streamRef.current = stream;

    // Set up audio analyser for level visualization
    const audioContext = new AudioContext();
    audioContextRef.current = audioContext;
    if (audioContext.state === "suspended") {
      await audioContext.resume();
    }
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify the hold-to-talk flow and inspect whether stop can run before
# mediaRecorderRef.current exists.
sed -n '98,214p' interface/src/hooks/useAudioRecorder.ts
sed -n '1,260p' interface/src/routes/Overlay.tsx | rg -n -C2 '\b(startRecording|stopRecording)\b|key(up|down)|pointer(up|down)|mouse(up|down)'

Repository: spacedriveapp/spacebot
Length of output: 4691
Guard pending starts against release and duplication during async mic/audio setup.
startRecording() has a substantial async gap—between the state check and the mediaRecorderRef.current assignment—where getUserMedia() and audioContext.resume() are awaited. If the user releases the hotkey during this window, stopRecording() finds no recorder and resolves null, but the original startRecording() continuation proceeds and eventually starts capture. The same gap allows duplicate starts if the key is pressed twice before the first call settles.
Add refs to track pending starts and allow stopRecording() to cancel them. Check the cancellation flag after each await in startRecording() before proceeding to recorder creation and state transitions.
Suggested approach
+ const startPendingRef = useRef(false);
+ const cancelPendingStartRef = useRef(false);
const startRecording = useCallback(async () => {
- if (state !== "idle") return;
+ if (state !== "idle" || startPendingRef.current) return;
+ startPendingRef.current = true;
+ cancelPendingStartRef.current = false;
try {
const stream = await navigator.mediaDevices.getUserMedia({
audio: {
echoCancellation: true,
noiseSuppression: true,
autoGainControl: true,
},
});
+ if (cancelPendingStartRef.current) {
+ stream.getTracks().forEach((track) => track.stop());
+ return;
+ }
streamRef.current = stream;
// ... audio graph setup ...
const recorder = new MediaRecorder(stream, {mimeType});
+ if (cancelPendingStartRef.current) {
+ cleanupAudioGraph();
+ return;
+ }
mediaRecorderRef.current = recorder;
recorder.start(100);
setState("recording");
} catch (error) {
console.error("Failed to start recording:", error);
cleanupAudioGraph();
setState("idle");
+ } finally {
+ startPendingRef.current = false;
}
}, [cleanupAudioGraph, state]);
const stopRecording = useCallback((): Promise<Blob | null> => {
return new Promise((resolve) => {
+ if (startPendingRef.current) {
+ cancelPendingStartRef.current = true;
+ resolve(null);
+ return;
+ }
const recorder = mediaRecorderRef.current;
if (!recorder || recorder.state !== "recording") {
resolve(null);
return;
}

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@interface/src/hooks/useAudioRecorder.ts` around lines 98 - 116, Introduce a
pending-start cancellation ref (e.g., pendingStartRef) and set it true at the
start of startRecording() (return early if already true or state !== "idle"),
then after each await (after navigator.mediaDevices.getUserMedia(...) and after
audioContext.resume()) check pendingStartRef and abort/cleanup if it was
cleared; have stopRecording() set pendingStartRef to false and perform the same
cleanup of streamRef/mediaRecorderRef/audioContextRef so an in-flight start is
cancelled, and only assign mediaRecorderRef.current and set state to "recording"
if pendingStartRef is still true.
const abandonIfStale = (source?: AudioBufferSourceNode, analyser?: AnalyserNode) => {
  if (isCurrentRequest()) {
    return false;
  }
  if (source) {
    try {
      source.stop();
    } catch {
      // no-op: source was never started or already stopped
    }
  }
  if (sourceRef.current === source) {
    sourceRef.current = null;
  }
  if (analyserRef.current === analyser) {
    analyserRef.current = null;
  }
  playbackDoneRef.current = null;
  stopMetering();
  setState("idle");
  return true;
Don't let stale requests clear the active playback state.
abandonIfStale() always nulls playbackDoneRef, calls stopMetering(), and sets idle. If request A resolves after request B has already started, A will blank B's visualizer, clear B's completion resolver, and leave the UI idle while B's audio keeps playing. The stale path should only tear down nodes it owns, or just return without touching shared refs/state.
Possible fix
const abandonIfStale = (source?: AudioBufferSourceNode, analyser?: AnalyserNode) => {
if (isCurrentRequest()) {
return false;
}
if (source) {
try {
source.stop();
} catch {
// no-op: source was never started or already stopped
}
}
- if (sourceRef.current === source) {
+ if (source && sourceRef.current === source) {
sourceRef.current = null;
}
- if (analyserRef.current === analyser) {
+ if (analyser && analyserRef.current === analyser) {
analyserRef.current = null;
}
- playbackDoneRef.current = null;
- stopMetering();
- setState("idle");
return true;
};

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@interface/src/hooks/useTtsPlayback.ts` around lines 93 - 113, abandonIfStale
currently tears down shared state (playbackDoneRef, stopMetering, setState) for
any non-current request; change it so stale requests only stop their own nodes
and do not mutate shared refs/state. Specifically, in abandonIfStale (use
isCurrentRequest(), sourceRef.current, analyserRef.current) still stop() the
passed source/analyser if provided, and clear
sourceRef.current/analyserRef.current only if they strictly equal the passed
nodes, but remove the lines that set playbackDoneRef.current = null, call
stopMetering(), or setState("idle") when the request is stale.
let model = crate::llm::model::SpacebotModel::make(llm_manager, model_name);
let agent = AgentBuilder::new(model).preamble(system).build();
let prompt = build_spoken_generation_prompt(input, speech_guidance, recent_spoken_responses);
🧩 Analysis chain
🏁 Script executed:
# First, let's look at the specific lines and surrounding context in the file
wc -l src/agent/channel.rs

Repository: spacedriveapp/spacebot
Length of output: 91
🏁 Script executed:
# Read the section around lines 3729-3731
sed -n '3720,3750p' src/agent/channel.rs

Repository: spacedriveapp/spacebot
Length of output: 1464
🏁 Script executed:
# Search for the build_spoken_generation_prompt function and related code to understand the agent's purpose
rg -B 20 -A 10 "build_spoken_generation_prompt" src/agent/channel.rs

Repository: spacedriveapp/spacebot
Length of output: 4226
🏁 Script executed:
# Search for SpacebotHook usage to understand the pattern
rg "SpacebotHook" -B 3 -A 3 src/

Repository: spacedriveapp/spacebot
Length of output: 43246
🏁 Script executed:
# Find other agent constructions in channel.rs to see the pattern
rg "AgentBuilder::new" src/agent/channel.rs -B 2 -A 5

Repository: spacedriveapp/spacebot
Length of output: 686
🏁 Script executed:
# Check what SpacebotHook provides
rg -A 20 "impl.*SpacebotHook" src/

Repository: spacedriveapp/spacebot
Length of output: 3381
Use the standard Spacebot agent pattern with hook and explicit max_turns.
This agent bypasses SpacebotHook and leaves max_turns implicit. For a channel-context agent, construct it as: AgentBuilder::new(model).preamble(system).hook(SpacebotHook::new(...)).max_turns(5).build(). The hook is required for all LLM processes per coding guidelines, and channel agents must have max_turns(5) explicitly set.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/agent/channel.rs` around lines 3729 - 3731, The agent is built without
the required SpacebotHook and without explicit max_turns, bypassing the standard
Spacebot agent pattern; update the creation that currently uses
SpacebotModel::make(...) and AgentBuilder::new(model).preamble(system).build()
to include .hook(SpacebotHook::new(...)) and .max_turns(5) before .build(),
ensuring the hook is configured with the same channel/context parameters and
leaving build_spoken_generation_prompt(...) usage unchanged.
Document the new voice interface, speech personality system, working memory layers, and desktop voice hotkeys across the README and docs. Add a dedicated overlay voice shortcut so hold-to-talk works independently from the window toggle.
Actionable comments posted: 6
♻️ Duplicate comments (1)
interface/src/routes/Overlay.tsx (1)
81-99: ⚠️ Potential issue | 🟠 Major

One assistant reply still becomes two transcript entries.

This is the same duplicate-row problem already called out earlier: `outbound_message` appends one assistant row and `spoken_response` appends the same reply again via `full_text`. Update the pending row instead of pushing a second assistant entry, ideally keyed by `message_id`.

Also applies to: 109-118
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interface/src/routes/Overlay.tsx` around lines 81 - 99, handleSpokenResponse is currently appending a second assistant transcript entry (using event.full_text) which duplicates the row added by outbound_message; instead of pushing a new entry in setTranscript inside handleSpokenResponse, find the pending assistant message by its message_id (from the SpokenResponseEvent) and update that existing transcript entry (replace its text or merge fields) — modify the setTranscript updater used in handleSpokenResponse (and the similar block around lines 109-118) to map over previous entries, match on entry.message_id === event.message_id (or another unique key added by outbound_message), and return an updated entry rather than appending a new one; keep the other state changes (setSpokenText, setVoiceState, setStatusText) intact.
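The update-in-place approach the comment describes can be sketched as a small pure reducer. This is a hypothetical illustration, not the actual Overlay.tsx code; `TranscriptEntry` and `applySpokenResponse` are invented names for the sketch.

```typescript
// Hypothetical sketch of the fix: update the pending assistant row keyed by
// message_id instead of appending a second transcript entry.
type TranscriptEntry = {
  role: "user" | "assistant";
  message_id: string;
  text: string;
};

function applySpokenResponse(
  entries: TranscriptEntry[],
  event: { message_id: string; full_text: string },
): TranscriptEntry[] {
  let matched = false;
  const next = entries.map((entry) => {
    if (entry.role === "assistant" && entry.message_id === event.message_id) {
      matched = true;
      // Replace the pending row's text rather than pushing a duplicate.
      return { ...entry, text: event.full_text };
    }
    return entry;
  });
  // Append only when no row exists yet for this message_id.
  return matched
    ? next
    : [
        ...next,
        { role: "assistant", message_id: event.message_id, text: event.full_text },
      ];
}
```

Inside `handleSpokenResponse` this would run in the `setTranscript` updater, so one reply always maps to exactly one row.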
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@desktop/src-tauri/src/main.rs`:
- Around line 50-52: toggle_voice_overlay currently always returns Ok(()) and
calls toggle_overlay; it and the overlay helper functions
(create_overlay_window, show/hide/set_focus/emit) swallow or only log errors;
change the API to propagate failures instead of hiding them: update
toggle_voice_overlay's signature to return Result<(), String> (or a suitable
Error) and have it return Err(...) when toggle_overlay or any downstream call
fails, and modify toggle_overlay and create_overlay_window to return Results and
propagate errors (using ?), removing any let _ = or ignored Result uses in
show/hide/set_focus/emit paths so errors are either logged with context or
propagated upward; ensure each referenced function name (toggle_voice_overlay,
toggle_overlay, create_overlay_window, and the overlay show/hide/set_focus/emit
calls) surfaces failures rather than silently discarding them.
In `@docs/content/docs/`(core)/cortex.mdx:
- Around line 275-276: Update the readiness documentation to explicitly state
whether the API field name for the warmup/readiness signal remains
`bulletin_age_secs` for backwards compatibility (or if it was renamed), and make
the warmup payload example and the surrounding text consistent: if unchanged,
call out “the field remains `bulletin_age_secs` (seconds since last bulletin)”;
if renamed, show the new field name and note the old `bulletin_age_secs` is
deprecated/removed and map how to translate it. Also search within this document
for the warmup payload example and the phrase “knowledge synthesis” to ensure
both description and example reference the same field name.
In `@docs/content/docs/`(core)/memory.mdx:
- Around line 23-26: Replace the phrase "recall worker" in the
docs/content/docs/(core)/memory.mdx with the project-standard terms "branch
recall" or "recall tool" and update the explanatory sentence to match the
existing model flow "branch -> memory_recall tool"; also ensure any references
implying workers have channel/branch context are removed (workers are
context-free). Look for related mismatches in src/agent/channel.rs,
src/agent/branch.rs, and src/agent/worker.rs and align the documentation wording
to reflect that workers receive only a fresh prompt and task description, not
channel/branch context.
- Around line 207-209: Summary: The docs state SPEECH.md is always injected into
channel system prompts, but SPEECH.md is actually only used for spoken-reply
generation per the updated prompt layout in cortex.mdx; update memory.mdx to
avoid implying it's always-on. Fix: edit the paragraph that lists SPEECH.md in
memory.mdx to remove "every time" / "always-injected" language and explicitly
state that SPEECH.md is loaded only when spoken-reply/voice output is enabled
(per cortex.mdx), and clarify that the other identity files remain
always-injected; reference SPEECH.md and the prompt layout described in
cortex.mdx so readers know the conditional behavior.
In `@interface/src/routes/Overlay.tsx`:
- Line 28: The component currently hardcodes agentId = "main", which breaks
session creation, TTS profile lookup, and audio upload when the deployed default
agent ID differs; update the Overlay component to derive agentId from a prop or
shared state/context (e.g., a passed-in defaultAgentId or selectedAgentId from
app context) instead of the literal "main", and use that derived id wherever
session creation, TTS profile lookup, and audio upload occur (references: the
agentId useState declaration, any createSession/createVoiceSession calls, TTS
profile lookup logic, and audio upload handlers) so the component routes voice
traffic to the correct agent in all environments.
- Around line 100-106: The playback cleanup currently lives only in the
speak(...).then(...) success path so if speak() rejects ttsStartedRef.current
stays true and the overlay remains in speaking state; update the speak
invocation in Overlay.tsx to attach a .finally() callback (or use
async/try/finally) that always resets ttsStartedRef.current = false and calls
setVoiceState("idle") and setStatusText("Press Option+Shift+Space to talk"), and
optionally add a .catch(...) to log the error; reference the existing speak(...)
call, ttsStartedRef, setVoiceState, and setStatusText to locate and modify the
code.
---
Duplicate comments:
In `@interface/src/routes/Overlay.tsx`:
- Around line 81-99: handleSpokenResponse is currently appending a second
assistant transcript entry (using event.full_text) which duplicates the row
added by outbound_message; instead of pushing a new entry in setTranscript
inside handleSpokenResponse, find the pending assistant message by its
message_id (from the SpokenResponseEvent) and update that existing transcript
entry (replace its text or merge fields) — modify the setTranscript updater used
in handleSpokenResponse (and the similar block around lines 109-118) to map over
previous entries, match on entry.message_id === event.message_id (or another
unique key added by outbound_message), and return an updated entry rather than
appending a new one; keep the other state changes (setSpokenText, setVoiceState,
setStatusText) intact.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d191b8ba-7ca7-4734-ab13-74cf85be5434
📒 Files selected for processing (10)
README.md
desktop/src-tauri/src/main.rs
docs/content/docs/(configuration)/config.mdx
docs/content/docs/(core)/agents.mdx
docs/content/docs/(core)/architecture.mdx
docs/content/docs/(core)/cortex.mdx
docs/content/docs/(core)/memory.mdx
docs/content/docs/(getting-started)/desktop.mdx
docs/content/docs/(getting-started)/quickstart.mdx
interface/src/routes/Overlay.tsx
✅ Files skipped from review due to trivial changes (2)
- README.md
- docs/content/docs/(core)/agents.mdx
fn toggle_voice_overlay(app: tauri::AppHandle) -> Result<(), String> {
    toggle_overlay(&app);
    Ok(())
Surface overlay failures instead of always succeeding.
toggle_voice_overlay always returns Ok(()), create_overlay_window() only logs build failures, and the rest of the helpers ignore show / hide / set_focus / emit errors. If the overlay path breaks, the command still looks successful and you lose the only useful diagnostics for the hotkey flow. As per coding guidelines, "Don't silently discard errors. No let _ = on Results. Handle them, log them, or propagate them. The only exception is .ok() on channel sends where the receiver may be dropped".
Also applies to: 55-63, 67-75, 83-144, 146-149, 183-188
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@desktop/src-tauri/src/main.rs` around lines 50 - 52, toggle_voice_overlay
currently always returns Ok(()) and calls toggle_overlay; it and the overlay
helper functions (create_overlay_window, show/hide/set_focus/emit) swallow or
only log errors; change the API to propagate failures instead of hiding them:
update toggle_voice_overlay's signature to return Result<(), String> (or a
suitable Error) and have it return Err(...) when toggle_overlay or any
downstream call fails, and modify toggle_overlay and create_overlay_window to
return Results and propagate errors (using ?), removing any let _ = or ignored
Result uses in show/hide/set_focus/emit paths so errors are either logged with
context or propagated upward; ensure each referenced function name
(toggle_voice_overlay, toggle_overlay, create_overlay_window, and the overlay
show/hide/set_focus/emit calls) surfaces failures rather than silently
discarding them.
- knowledge synthesis is available and fresh enough for degraded-safe dispatch
Clarify the backwards-compatible readiness field name.
This section now describes readiness in terms of knowledge-synthesis freshness, but the warmup payload example later on the page still exposes bulletin_age_secs. Please call out whether the API field name is still bulletin_age_secs for compatibility, otherwise operators will look for the wrong signal when debugging warmup/readiness.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/content/docs/`(core)/cortex.mdx around lines 275 - 276, Update the
readiness documentation to explicitly state whether the API field name for the
warmup/readiness signal remains `bulletin_age_secs` for backwards compatibility
(or if it was renamed), and make the warmup payload example and the surrounding
text consistent: if unchanged, call out “the field remains `bulletin_age_secs`
(seconds since last bulletin)”; if renamed, show the new field name and note the
old `bulletin_age_secs` is deprecated/removed and map how to translate it. Also
search within this document for the warmup payload example and the phrase
“knowledge synthesis” to ensure both description and example reference the same
field name.
  - **SQLite** -- the memory graph **and** working-memory event log. Graph rows with content, type, importance, timestamps, source. Association edges with weights and relation types. Working-memory events, synthesis blocks, and channel activity rendering also live here.
  - **LanceDB** -- embeddings and search. Vector storage in Lance columnar format with HNSW indexing. Built-in full-text search via Tantivy. Hybrid search (vector + FTS) in one system.

- The two are joined on memory ID. A recall worker queries LanceDB for semantic/keyword matches, then hits SQLite for graph traversal and metadata. No server processes -- both are embedded, everything is files in a data directory.
+ Graph-memory recall joins the two on memory ID. A recall worker queries LanceDB for semantic/keyword matches, then hits SQLite for graph traversal and metadata. Working memory does not use embeddings at all — it is SQL-first, temporal, and renderable even before an embedding model is warm.
Use “branch recall” or “recall tool” here, not “recall worker.”
The rest of this page already documents recall as branch -> memory_recall tool, and workers are intentionally context-free. Calling this a “recall worker” muddies the execution boundary and points readers at the wrong process. Based on learnings: Applies to src/agent/{channel,branch,worker}.rs : Do not give workers channel context; workers receive only a fresh prompt and task description.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/content/docs/`(core)/memory.mdx around lines 23 - 26, Replace the phrase
"recall worker" in the docs/content/docs/(core)/memory.mdx with the
project-standard terms "branch recall" or "recall tool" and update the
explanatory sentence to match the existing model flow "branch -> memory_recall
tool"; also ensure any references implying workers have channel/branch context
are removed (workers are context-free). Look for related mismatches in
src/agent/channel.rs, src/agent/branch.rs, and src/agent/worker.rs and align the
documentation wording to reflect that workers receive only a fresh prompt and
task description, not channel/branch context.
  - **SPEECH.md** -- spoken cadence, delivery style, warmth, and anti-repetition rules for voice output

- These are loaded into channel system prompts every time. They're files in the agent root directory (not the workspace), not database rows, because they change rarely and users should be able to edit them in a text editor.
+ These are loaded into channel system prompts every time. `SPEECH.md` is also used during spoken-reply generation when Voicebox output is enabled. They are files in the agent root directory (not the workspace), not database rows, because they change rarely and users should be able to edit them in a text editor or through the config UI.
SPEECH.md reads as always-injected here.
This paragraph says the listed identity files are loaded into channel system prompts every time, but the updated prompt layout in docs/content/docs/(core)/cortex.mdx makes speech style conditional to spoken-reply generation. Separating SPEECH.md from the always-on list would keep text-only turns unambiguous.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/content/docs/`(core)/memory.mdx around lines 207 - 209, Summary: The
docs state SPEECH.md is always injected into channel system prompts, but
SPEECH.md is actually only used for spoken-reply generation per the updated
prompt layout in cortex.mdx; update memory.mdx to avoid implying it's always-on.
Fix: edit the paragraph that lists SPEECH.md in memory.mdx to remove "every
time" / "always-injected" language and explicitly state that SPEECH.md is loaded
only when spoken-reply/voice output is enabled (per cortex.mdx), and clarify
that the other identity files remain always-injected; reference SPEECH.md and
the prompt layout described in cortex.mdx so readers know the conditional
behavior.
export function Overlay() {
  const [expanded, setExpanded] = useState(false);
  const [voiceState, setVoiceState] = useState<VoiceState>("idle");
  const [agentId] = useState("main"); // default agent
Don't pin voice overlay traffic to agent main.
This component uses the literal "main" for session creation, TTS profile lookup, and audio upload. Any deployment whose default/selected agent has a different ID will route voice turns to the wrong agent or fail outright.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `interface/src/routes/Overlay.tsx` at line 28, The component currently
hardcodes agentId = "main", which breaks session creation, TTS profile lookup,
and audio upload when the deployed default agent ID differs; update the Overlay
component to derive agentId from a prop or shared state/context (e.g., a
passed-in defaultAgentId or selectedAgentId from app context) instead of the
literal "main", and use that derived id wherever session creation, TTS profile
lookup, and audio upload occur (references: the agentId useState declaration,
any createSession/createVoiceSession calls, TTS profile lookup logic, and audio
upload handlers) so the component routes voice traffic to the correct agent in
all environments.
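The derivation the prompt describes can be sketched as a small resolver. `AgentSource`, `selectedAgentId`, and `defaultAgentId` are hypothetical names standing in for whatever prop or app context the deployment actually exposes; only the `"main"` fallback comes from the original code:

```typescript
// Hypothetical resolver: prefer an explicitly selected agent, then a
// configured default, and only fall back to the literal "main" last.
// AgentSource and its fields are illustrative, not the real app context.
interface AgentSource {
  selectedAgentId?: string;
  defaultAgentId?: string;
}

function resolveAgentId(source: AgentSource): string {
  return source.selectedAgentId ?? source.defaultAgentId ?? "main";
}
```

With this shape, the `useState("main")` literal becomes a last-resort default rather than the value every session, TTS lookup, and upload is pinned to.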
// Play TTS
ttsStartedRef.current = true;
speak(event.spoken_text, agentId, profileId).then(() => {
  ttsStartedRef.current = false;
  setVoiceState("idle");
  setStatusText("Press Option+Shift+Space to talk");
});
Use finally for playback cleanup.
If speak() rejects, ttsStartedRef.current never resets and the overlay stays in speaking, so later spoken responses are treated as already-playing forever.
🛠️ Suggested fix
- ttsStartedRef.current = true;
- speak(event.spoken_text, agentId, profileId).then(() => {
-   ttsStartedRef.current = false;
-   setVoiceState("idle");
-   setStatusText("Press Option+Shift+Space to talk");
- });
+ ttsStartedRef.current = true;
+ void speak(event.spoken_text, agentId, profileId)
+   .catch((error) => {
+     console.error("Failed to play spoken reply:", error);
+     setStatusText("Couldn't play reply");
+   })
+   .finally(() => {
+     ttsStartedRef.current = false;
+     setVoiceState("idle");
+     setStatusText("Press Option+Shift+Space to talk");
+   });

📝 Committable suggestion
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `interface/src/routes/Overlay.tsx` around lines 100 - 106, The playback
cleanup currently lives only in the speak(...).then(...) success path so if
speak() rejects ttsStartedRef.current stays true and the overlay remains in
speaking state; update the speak invocation in Overlay.tsx to attach a
.finally() callback (or use async/try/finally) that always resets
ttsStartedRef.current = false and calls setVoiceState("idle") and
setStatusText("Press Option+Shift+Space to talk"), and optionally add a
.catch(...) to log the error; reference the existing speak(...) call,
ttsStartedRef, setVoiceState, and setStatusText to locate and modify the code.
feat: worker orchestration screen
fix: add provider prefix to zai-coding-plan default model
fix: skill install 500 error and chart dimension warnings
feat: Add Mattermost channel support
- Add "error sending request" to retriable error patterns to handle transient network failures (DNS, connection refused, etc.)
- Remove broken keepalive code that tried to clone the non-Clone Browser
- Implement graceful browser shutdown in Drop to prevent Chrome core dumps
- Add comprehensive test suite (7 tests) for error classification
- Fix clippy warning about collapsible if statements

Fixes WebSocket connection errors when workers fail due to transient API errors, preventing Chrome from being force-killed and dumping core.
fix: graceful browser shutdown and network error retries
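The retriable-pattern change described in that commit amounts to substring classification of error messages. A minimal sketch, noting that only "error sending request" is confirmed by the commit message, the other patterns are assumed examples of the same category, and the real implementation lives in the Rust worker code:

```typescript
// Illustrative transient-error classifier: case-insensitive substring
// match against known network-failure patterns. Only "error sending
// request" comes from the commit; the rest are hypothetical additions.
const RETRIABLE_PATTERNS = [
  "error sending request",
  "connection refused",
  "dns error",
  "timed out",
];

function isRetriable(message: string): boolean {
  const lower = message.toLowerCase();
  return RETRIABLE_PATTERNS.some((p) => lower.includes(p));
}
```

Anything not matching a pattern fails fast instead of retrying, which keeps genuine API errors (auth failures, bad requests) from looping.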
Add collapsed/expanded overlay window states with Tauri resize command so the pill stays bottom-pinned. Start overlay collapsed by default. Swap sidebar orchestrate icon to HugeiconsIcon and reformat AgentTasks.
Summary
`SPEECH.md` support across presets, identity APIs, config UI, and runtime loading so spoken replies can be tuned separately from text identity

Highlights
- Overlay redesign in `interface/src/routes/Overlay.tsx` and `desktop/src-tauri/src/main.rs` so the overlay feels native and clean
- `SPEECH.md` added to presets and identity loading so spoken behavior has a first-class config surface
- `SPEECH.md` exposed in the agent settings UI and agent detail view for editing alongside `SOUL.md`, `IDENTITY.md`, and `ROLE.md`
- Speech-prompt changes in `src/agent/channel.rs`
- `interface/src/hooks/useAudioRecorder.ts` and `interface/src/hooks/useTtsPlayback.ts`
- `tests/context_dump.rs` helper update

Verification
- `just preflight`
- `just gate-pr`
- `cargo test --lib spoken_generation_prompt_includes_recent_history`
- `cargo test --lib spoken_generation_history_loads_from_settings_store`
- `cargo test --lib spoken_history_round_trips_per_channel`
- `bun run build` (from `interface/`)
This is a major upgrade to the voice interface experience. The overlay now looks and feels like a real product surface instead of a utilitarian debug panel, while the new speech-personality and anti-repetition pipeline makes spoken responses sound more intentional, branded, and less repetitive over time.
Note
Technical Summary
50 files changed across 4 main areas. Desktop/Tauri layer gains global-hotkey and rustix dependencies for enhanced overlay control. Frontend (Overlay.tsx, hooks) implements FFT-driven visualizations (+606 lines) and integrates WebGL orb rendering with real-time spectral analysis on recording/playback. Core voice API (src/api/tts.rs +234, src/api/webchat.rs +132) wires up spoken response generation and visualization events. Agent channel (src/agent/channel.rs +319) now constructs speech prompts with personality injection and historical anti-repetition context from settings store (+72 lines in settings/store.rs). Identity system loads SPEECH.md as first-class personality config alongside existing Soul/Identity/Role files. Test coverage added for prompt construction and history persistence round-tripping.
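As a rough sketch of the FFT-driven visualization idea mentioned above: an analyser frame of 0-255 byte magnitudes (the format `AnalyserNode.getByteFrequencyData` fills) can be collapsed into a single 0-1 level to drive the orb. This helper is hypothetical, not the code in Overlay.tsx:

```typescript
// Hypothetical reduction of one FFT frame into a 0-1 energy level.
function frameLevel(frame: Uint8Array): number {
  if (frame.length === 0) return 0; // no data yet: stay silent
  let sum = 0;
  for (const v of frame) sum += v;
  return sum / (frame.length * 255); // normalize by max possible energy
}
```

In a real render loop this would run once per animation frame, with the result smoothed before being fed to the WebGL orb so the visualization doesn't flicker.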
Written by Tembo for commit 68b3f01