
feat: upgrade the voice interaction experience #459

Open
jamiepine wants to merge 40 commits into feat/worker-orchestration-screen from feat/voice-interaction

Conversation

@jamiepine (Member) commented Mar 20, 2026

Summary

  • turn the voice overlay into a polished branded experience with the official bull + WebGL orb treatment, transparent desktop overlay chrome, and a new Siri-style multi-layer wave visualization that reacts to live mic and playback FFT data
  • add a dedicated speech personality system with SPEECH.md support across presets, identity APIs, config UI, and runtime loading so spoken replies can be tuned separately from text identity
  • persist the last 15 spoken rewrites per channel and inject that history into speech synthesis prompts so voice replies avoid repetitive openings and stay varied across sessions
  • round out the voice pipeline with Voicebox webchat/audio plumbing, spoken-response events, playback visualization hooks, and related UI polish for a much stronger end-to-end voice interface

Highlights

  • Voice overlay redesign
    • replaces the simple meter with a layered Siri-style wave in interface/src/routes/Overlay.tsx
    • aligns recording and playback states to Spacebot blue/purple branding instead of the previous green accent
    • integrates the official bull logo and orb treatment used by the other Spacebot surfaces
    • improves the desktop overlay window styling in desktop/src-tauri/src/main.rs so the overlay feels native and clean
  • Speech personality support
    • adds SPEECH.md to presets and identity loading so spoken behavior has a first-class config surface
    • exposes SPEECH.md in the agent settings UI and agent detail view for editing alongside SOUL.md, IDENTITY.md, and ROLE.md
    • updates agent create/load/update flows and API responses so speech personality content stays in sync everywhere
  • Speech synthesis improvements
    • persists recent spoken-response history in settings and reloads it when channels resume
    • injects speech guidance plus the last 15 spoken responses into the speech-generation prompt in src/agent/channel.rs
    • adds targeted tests covering prompt construction and spoken-history persistence
  • Voice pipeline and UI polish
    • adds live spectral analysis for both recording and playback in interface/src/hooks/useAudioRecorder.ts and interface/src/hooks/useTtsPlayback.ts
    • expands TTS and webchat plumbing so spoken responses can be generated, surfaced, and visualized reliably
    • includes supporting fixes to keep repo gates green, including the tests/context_dump.rs helper update
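The spoken-history mechanism described above can be sketched as a small per-channel ring buffer that feeds a prompt section. This is a minimal illustration only: the struct, method names, and prompt wording are assumptions, not the actual settings-store API in src/settings/store.rs.

```rust
use std::collections::{HashMap, VecDeque};

const SPOKEN_HISTORY_LIMIT: usize = 15;

/// Hypothetical per-channel store for recent spoken rewrites.
#[derive(Default)]
struct SpokenHistoryStore {
    by_channel: HashMap<String, VecDeque<String>>,
}

impl SpokenHistoryStore {
    /// Record a spoken response, keeping only the most recent 15 per channel.
    fn push(&mut self, channel_id: &str, spoken: String) {
        let history = self.by_channel.entry(channel_id.to_string()).or_default();
        history.push_back(spoken);
        while history.len() > SPOKEN_HISTORY_LIMIT {
            history.pop_front();
        }
    }

    /// Render the history as a prompt section so the model can avoid
    /// repeating recent openings. Empty when there is no history.
    fn prompt_section(&self, channel_id: &str) -> String {
        match self.by_channel.get(channel_id) {
            Some(history) if !history.is_empty() => {
                let mut section =
                    String::from("Recent spoken replies (avoid repeating these openings):\n");
                for line in history {
                    section.push_str("- ");
                    section.push_str(line);
                    section.push('\n');
                }
                section
            }
            _ => String::new(),
        }
    }
}

fn main() {
    let mut store = SpokenHistoryStore::default();
    for i in 0..20 {
        store.push("webchat:main", format!("Sure thing, attempt {i}."));
    }
    // Only the last 15 entries survive; the oldest five were evicted.
    assert_eq!(store.by_channel["webchat:main"].len(), 15);
    println!("{}", store.prompt_section("webchat:main"));
}
```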

Verification

  • just preflight
  • just gate-pr
  • cargo test --lib spoken_generation_prompt_includes_recent_history
  • cargo test --lib spoken_generation_history_loads_from_settings_store
  • cargo test --lib spoken_history_round_trips_per_channel
  • bun run build (from interface/)

Why this matters

This is a major upgrade to the voice interface experience. The overlay now looks and feels like a real product surface instead of a utilitarian debug panel, while the new speech-personality and anti-repetition pipeline makes spoken responses sound more intentional, branded, and less repetitive over time.

Note

Technical Summary

50 files changed across 4 main areas. Desktop/Tauri layer gains global-hotkey and rustix dependencies for enhanced overlay control. Frontend (Overlay.tsx, hooks) implements FFT-driven visualizations (+606 lines) and integrates WebGL orb rendering with real-time spectral analysis on recording/playback. Core voice API (src/api/tts.rs +234, src/api/webchat.rs +132) wires up spoken response generation and visualization events. Agent channel (src/agent/channel.rs +319) now constructs speech prompts with personality injection and historical anti-repetition context from settings store (+72 lines in settings/store.rs). Identity system loads SPEECH.md as first-class personality config alongside existing Soul/Identity/Role files. Test coverage added for prompt construction and history persistence round-tripping.

Written by Tembo for commit 68b3f01

unverbraucht and others added 26 commits March 11, 2026 19:41
Implements a full Mattermost adapter using a custom HTTP + WebSocket
client (reqwest + tokio-tungstenite), following the existing adapter
architecture.

Features:
- WebSocket-based inbound messages with exponential backoff reconnection
- Text responses, streaming edits (ZWS placeholder + throttled PUT),
  file uploads, reactions, typing indicators, thread replies
- History fetch via /channels/{id}/posts
- Named adapter instances with per-instance permissions
- Hot-reload of permissions via spawn_file_watcher
- Conversation IDs: mattermost:{team_id}:{channel_id} and
  mattermost:{team_id}:dm:{user_id}
- Fail-closed permission filtering: messages without a team_id are
  rejected when team/channel filters are configured
- URL validation at config load time

Config:
  [messaging.mattermost]
  enabled = true
  base_url = "https://mattermost.example.com"
  token = "env:MATTERMOST_TOKEN"
  team_id = "optional-default-team-id"
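The conversation-ID scheme and fail-closed filtering described above can be sketched as follows; the function names and signatures here are illustrative assumptions, not the adapter's real API.

```rust
/// Conversation IDs as described: mattermost:{team_id}:{channel_id}
fn channel_conversation_id(team_id: &str, channel_id: &str) -> String {
    format!("mattermost:{team_id}:{channel_id}")
}

/// DM variant: mattermost:{team_id}:dm:{user_id}
fn dm_conversation_id(team_id: &str, user_id: &str) -> String {
    format!("mattermost:{team_id}:dm:{user_id}")
}

/// Fail closed: when a team filter is configured, a message that arrives
/// without a team_id is rejected rather than waved through.
fn message_allowed(team_id: Option<&str>, team_filter: &[&str]) -> bool {
    match team_id {
        Some(team) => team_filter.is_empty() || team_filter.contains(&team),
        None => team_filter.is_empty(),
    }
}

fn main() {
    assert_eq!(channel_conversation_id("t1", "c9"), "mattermost:t1:c9");
    assert_eq!(dm_conversation_id("t1", "u7"), "mattermost:t1:dm:u7");
    assert!(message_allowed(None, &[]));
    assert!(!message_allowed(None, &["t1"])); // fail closed: no team_id, filter set
}
```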
Resolved conflicts in src/config.rs (refactored to submodules), src/main.rs,
src/messaging.rs, and src/messaging/target.rs. Distributed Mattermost config
additions across the new config submodule structure introduced upstream.
- Add Mattermost platform to ChannelSettingCard, ChannelEditModal,
  Settings, and platformIcons
- Server URL and access token credential inputs
- Backend: read mattermost config into instances list in messaging_status
- Backend: write base_url and token to TOML via create_messaging_instance
- Mattermost instances configured manually in config.toml appear in the UI
Empty dm_allowed_users blocks all DMs (fail-closed). Non-empty list
allows only the specified Mattermost user IDs, consistent with Slack
and Telegram adapter behaviour.

All other layers (config, TOML schema, permissions merge from bindings,
bindings API, UI TagInput) were already in place.
SEC2: add sanitize_reaction_name() — resolves unicode emoji to shortcodes
via the emojis crate, strips colon wrappers, lowercases plain names.
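A minimal sketch of that sanitization flow is below. The real code resolves unicode emoji through the `emojis` crate; to stay self-contained, this sketch substitutes a tiny hardcoded lookup table, which is purely illustrative.

```rust
/// Stand-in for the `emojis` crate lookup used by the real code.
fn emoji_shortcode(s: &str) -> Option<&'static str> {
    match s {
        "\u{1F44D}" => Some("thumbsup"), // 👍
        "\u{1F680}" => Some("rocket"),   // 🚀
        _ => None,
    }
}

/// Sketch of sanitize_reaction_name(): strip :wrapper: colons, resolve
/// unicode emoji to a shortcode, otherwise lowercase the plain name.
fn sanitize_reaction_name(raw: &str) -> String {
    let trimmed = raw.trim().trim_matches(':');
    if let Some(code) = emoji_shortcode(trimmed) {
        return code.to_string();
    }
    trimmed.to_lowercase()
}

fn main() {
    assert_eq!(sanitize_reaction_name(":ThumbsUp:"), "thumbsup");
    assert_eq!(sanitize_reaction_name("\u{1F680}"), "rocket");
}
```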

FN1: user identity resolution — add user_identity_cache, resolve
display name from /users/{id} API on first encounter, set
sender_display_name metadata and formatted_author.

FN2: channel name resolution — add channel_name_cache, resolve
display_name from /channels/{id} API on first encounter, set
mattermost_channel_name and CHANNEL_NAME metadata.

FN3: fetch_history now returns timestamps parsed from post.create_at
milliseconds instead of always None.

FN4: bot mention detection — check message text for @bot_username,
set mattermost_mentions_or_replies_to_bot metadata flag so
require_mention routing mode works correctly.

Also adds 12 new unit tests covering these fixes.
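The FN3 and FN4 fixes can be sketched as two small helpers. The names and the substring-based mention check are simplifications of the adapter's actual logic, shown only to make the behavior concrete.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// FN3 sketch: Mattermost posts carry `create_at` as unix milliseconds;
/// convert that to a SystemTime instead of returning None.
fn timestamp_from_create_at(create_at_ms: i64) -> Option<SystemTime> {
    u64::try_from(create_at_ms)
        .ok()
        .map(|ms| UNIX_EPOCH + Duration::from_millis(ms))
}

/// FN4 sketch: flag messages that mention the bot by @username so
/// require_mention routing works. (A naive substring check; the real
/// detection is more careful about word boundaries.)
fn mentions_bot(text: &str, bot_username: &str) -> bool {
    text.contains(&format!("@{bot_username}"))
}

fn main() {
    assert_eq!(timestamp_from_create_at(0), Some(UNIX_EPOCH));
    assert!(mentions_bot("hey @spacebot, status?", "spacebot"));
}
```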
mattermost.rs:
- Validate base_url has no path/query/fragment in new()
- Check /users/me HTTP status before .json() to surface auth errors
- start_typing() aborts previous task before spawning to prevent leaks
- shutdown() drains typing_tasks with abort() instead of clear()
- StreamStart now falls back to REPLY_TO_MESSAGE_ID for thread replies
- StreamEnd applies MAX_MESSAGE_LENGTH chunking to avoid oversized edits
- Reaction handler fails fast if bot_user_id not initialized
- respond() file attachment post checks HTTP status
- broadcast() file upload checks HTTP status
- broadcast() resolves dm:{user_id} targets via /channels/direct API
- split_message uses floor_char_boundary before slicing to prevent UTF-8 panics
- Identity/channel lookups log errors via tracing::debug instead of silent None

target.rs:
- resolve_broadcast_target extracts named adapter key from conversation ID
- normalize_mattermost_target strips team/instance prefixes down to bare channel_id or dm:{user_id}

api/messaging.rs:
- Validate mattermost base_url before persisting to config.toml
- Delete default Mattermost instance now clears all credential fields

config/load.rs:
- Named Mattermost instances with missing credentials are preserved but
  disabled (map + enabled&&has_credentials) instead of silently dropped
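The UTF-8-safe chunking fix above can be sketched like this. Note that `str::floor_char_boundary` is still unstable in the standard library, so this sketch hand-rolls the boundary search; the chunk layout is illustrative, not the adapter's exact splitting strategy.

```rust
/// Largest index <= max that falls on a UTF-8 char boundary.
/// (Hand-rolled stand-in for the unstable `str::floor_char_boundary`.)
fn floor_char_boundary(s: &str, max: usize) -> usize {
    if max >= s.len() {
        return s.len();
    }
    (0..=max).rev().find(|&i| s.is_char_boundary(i)).unwrap_or(0)
}

/// Split a message into chunks of at most `limit` bytes without ever
/// slicing through a multi-byte character (which would panic).
fn split_message(text: &str, limit: usize) -> Vec<&str> {
    let mut chunks = Vec::new();
    let mut rest = text;
    while rest.len() > limit {
        let break_point = floor_char_boundary(rest, limit);
        if break_point == 0 {
            break; // limit smaller than one char; never push empty chunks
        }
        chunks.push(&rest[..break_point]);
        rest = &rest[break_point..];
    }
    if !rest.is_empty() {
        chunks.push(rest);
    }
    chunks
}

fn main() {
    let chunks = split_message("héllo wörld", 4);
    assert!(chunks.iter().all(|c| c.len() <= 4));
    assert_eq!(chunks.concat(), "héllo wörld");
}
```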
channel_ids filter at line 1543 only checked Discord/Slack/Twitch metadata,
causing it to return false for Mattermost messages before the Mattermost-
specific block ran. Added mattermost_channel_id to the direct_match set.

require_mention routing had no mattermost arm in the mention_key switch,
so all Mattermost messages were dropped when require_mention = true. Added
mattermost -> mattermost_mentions_or_replies_to_bot mapping.
target.rs: Default DM conv IDs (mattermost:team:dm:user, 4 parts) were
matching the named-instance arm in extract_mattermost_adapter_from_channel_id
and returning mattermost:{team_id} as the adapter. Fixed match arm ordering
so the default-DM pattern is checked before the 4-part named-channel pattern.

target.rs: normalize_mattermost_target accepted bare 'dm' as a valid channel
ID. Added != 'dm' guard to the single-segment arm.

mattermost.rs: StreamEnd overflow chunks were posted with root_id = None,
escaping the original thread. Now resolves root_id from mattermost_root_id /
REPLY_TO_MESSAGE_ID metadata and passes it to all overflow create_post calls.

mattermost.rs: WS post JSON decode failures were silently dropped. Now logs
at DEBUG level via tracing::debug so malformed payloads are visible.
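The match-arm ordering bug above comes from two four-segment ID shapes colliding, and can be sketched as follows; the function name and return shape are assumptions for illustration.

```rust
/// Sketch of the ordering fix: a default-instance DM ID
/// (mattermost:{team}:dm:{user}) has four segments, exactly like a
/// named-instance channel ID (mattermost:{instance}:{team}:{channel}),
/// so the DM pattern must be matched first.
fn extract_adapter(conversation_id: &str) -> Option<String> {
    let parts: Vec<&str> = conversation_id.split(':').collect();
    match parts.as_slice() {
        // Default-instance DM: checked before the generic 4-part arm.
        ["mattermost", _team, "dm", _user] => Some("mattermost".to_string()),
        // Named-instance channel.
        ["mattermost", instance, _team, _channel] => {
            Some(format!("mattermost:{instance}"))
        }
        // Default-instance channel: mattermost:{team}:{channel}
        ["mattermost", _team, _channel] => Some("mattermost".to_string()),
        _ => None,
    }
}

fn main() {
    // Before the fix, the DM ID fell into the named-instance arm and
    // yielded "mattermost:t1" as the adapter key.
    assert_eq!(extract_adapter("mattermost:t1:dm:u1").as_deref(), Some("mattermost"));
    assert_eq!(extract_adapter("mattermost:work:t1:c1").as_deref(), Some("mattermost:work"));
}
```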
Use Cow<str> so the common case (text under MAX_MESSAGE_LENGTH) borrows
accumulated_text directly instead of cloning it on every throttled edit.
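The shape of that Cow<str> optimization can be sketched like this; the function and the ZWS-placeholder trigger are illustrative stand-ins for the adapter's real streaming-edit path.

```rust
use std::borrow::Cow;

/// Build the body for a throttled streaming edit. The common case
/// (finished text) borrows accumulated_text with zero allocation; only
/// the in-progress case, which appends the zero-width-space placeholder,
/// pays for an owned String.
fn edit_payload(accumulated_text: &str, still_streaming: bool) -> Cow<'_, str> {
    if still_streaming {
        Cow::Owned(format!("{accumulated_text}\u{200B}"))
    } else {
        Cow::Borrowed(accumulated_text)
    }
}

fn main() {
    assert!(matches!(edit_payload("hi", false), Cow::Borrowed(_)));
    assert_eq!(edit_payload("hi", true).as_ref(), "hi\u{200B}");
}
```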
- validate_mattermost_url: fail-closed for non-localhost http (was warn-only)
- MattermostConfig: add SystemSecrets impl and register in system_secret_registry
- StreamChunk: release active_messages write lock before awaiting edit_post
- channel_filter: fail-closed when team has no allowlist entry (was fail-open)
- WS shutdown: log close frame send failure instead of discarding with let _ =

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- channel_filter: add test for fail-closed behavior when team is missing from filter
- WS event loop: log outer MattermostWsEvent parse failures (was silently dropped)
- broadcast: check HTTP status on /posts with file_ids (was unchecked)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- SystemSecrets: add InstancePattern to token and base_url fields so
  MATTERMOST_{INSTANCE}_TOKEN/BASE_URL are recognized as system secrets
- start(): log warning when bot_user_id/bot_username OnceCell is already set
  instead of silently discarding the error
- shutdown(): use tx.send().await.ok() pattern for intentional channel-send discard

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- validate_mattermost_url: reject URLs with credentials, non-root path,
  query string, or fragment (accept origin-only URLs only)
- start()/WS task: apply default_team_id fallback when WS event omits team_id
- matches_route: remove duplicate Mattermost-only channel_ids filter block
  (mattermost_channel_id is already covered by direct_match)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- bindings API: add team_id to BindingResponse, CreateBindingRequest, and
  UpdateBindingRequest so Mattermost team scoping is readable/writable via API
- validate_mattermost_url: reject URLs with credentials, non-root path,
  query string, or fragment; add tests covering all new rejection branches
- broadcast File: include root_id in /posts payload so file replies stay in thread
- StreamStart: stop typing task before propagating create_post error to prevent leak
- config/types.rs: rename |i| closure var to |instance| for readability
- mattermost tests: rename bmfp/bmfp_named helpers and p param to descriptive names

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
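The origin-only URL validation described in the last two commits can be sketched with plain string checks. This is a dependency-free approximation of validate_mattermost_url; the real code may use a proper URL parser, and the localhost check here is deliberately naive.

```rust
/// Accept only origin-form URLs; fail closed on anything else.
/// Plain http is allowed only for localhost-style hosts.
fn validate_mattermost_url(url: &str) -> Result<(), String> {
    let rest = if let Some(r) = url.strip_prefix("https://") {
        r
    } else if let Some(r) = url.strip_prefix("http://") {
        if !(r.starts_with("localhost") || r.starts_with("127.0.0.1")) {
            return Err("plain http is only allowed for localhost".into());
        }
        r
    } else {
        return Err("URL must use http or https".into());
    };
    if rest.contains('@') {
        return Err("URL must not contain credentials".into());
    }
    if rest.contains('?') || rest.contains('#') {
        return Err("URL must not contain a query or fragment".into());
    }
    // Allow at most a bare trailing slash after the host.
    match rest.split_once('/') {
        None | Some((_, "")) => Ok(()),
        Some(_) => Err("URL must be origin-only (no path)".into()),
    }
}

fn main() {
    assert!(validate_mattermost_url("https://mattermost.example.com").is_ok());
    assert!(validate_mattermost_url("https://mattermost.example.com/team").is_err());
}
```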
Replace implementation plan with concise feature list, setup guide,
named instance configuration, and notes on what is missing compared
to Slack/Discord.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- create_binding: guard empty team_id to match update_binding behavior
- DeleteBindingRequest: add team_id to prevent wrong-binding deletion
- split_message: filter break_point == 0 to prevent empty chunk push
- mentions_bot: set true for DMs; detect thread replies to bot via
  async root-post fetch in WS handler (resolve_root_post_author)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- bindings: normalize team_id in update/delete matching — trim whitespace
  and treat whitespace-only values as absent for consistent comparisons
- bindings: trim team_id before persisting in update_binding write path
- bindings: rename abbreviated match locals (gid/wid/cid/tid) to full names
- ws_url: replace silent scheme passthrough with unreachable! for unsupported schemes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- bindings: hoist request_team_id normalization before the loop in both
  update_binding and delete_binding to avoid recomputing it per-iteration
- mattermost new(): remove redundant .into() on anyhow::anyhow! return
- ws loop: rename e to error in match arms and tracing macros

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix skill install API returning 500 for single-skill repos by always
  using 3-part spec format (owner/repo/skill-name) instead of bare
  owner/repo format that the backend rejects

- Fix recharts "width/height should be greater than 0" warnings by
  adding min-width/min-height constraints to ResponsiveContainer
  components and their parent divs

Fixes browser console errors:
- /api/agents/skills/install 500 errors when installing single-skill repos
- Chart dimension warnings on AgentDetail page
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add speech personality editing and persistent spoken-response history so voice replies stay varied across sessions. Refresh the overlay with the official bull orb treatment, Siri-style branded waves, and polished desktop window styling.
@coderabbitai coderabbitai bot (Contributor) commented Mar 20, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b373d1ff-4463-4ba8-93d7-13a6af93490e


Walkthrough

Adds end-to-end voice features: desktop overlay window with global hotkeys, frontend recording/tts playback and overlay UI, OpenCode embed/orchestration, backend transcription and TTS proxying, spoken-response generation with message IDs and SSE events, and SPEECH.md identity support plus related plumbing.

Changes

  • Desktop Tauri overlay & shortcuts (desktop/src-tauri/src/main.rs): new Tauri invoke toggle_voice_overlay, overlay lifecycle/create logic, macOS theme lock, and global shortcuts (Alt+Space, Alt+Shift+Space) wired to overlay toggle/recording events.
  • Overlay route & UI (interface/src/routes/Overlay.tsx, interface/src/router.tsx, interface/src/App.tsx): new /overlay route and Overlay component; RootLayout suppresses chrome for the overlay path; Devtools hidden on /overlay.
  • Audio capture & TTS hooks (interface/src/hooks/useAudioRecorder.ts, interface/src/hooks/useTtsPlayback.ts): new hooks for microphone recording (visualization, WebM/Opus) and TTS playback (api.ttsGenerate, visualization, lifecycle).
  • Web chat + spoken UI changes (interface/src/components/WebChatPanel.tsx, interface/src/hooks/useChannelLiveState.ts): handle spoken_response SSE events, dedupe by message_id, and render “Spoken reply” blocks; prefer server message_id for assistant timeline entries.
  • OpenCode embed & orchestrate UI (interface/src/components/OpenCodeEmbed.tsx, interface/src/routes/Orchestrate.tsx, interface/src/routes/AgentWorkers.tsx): new shared OpenCode embed (Shadow DOM, asset manifest, SSE probe); Orchestrate route lists/embeds workers; AgentWorkers refactored to import the shared embed.
  • Frontend API types & client (interface/src/api/client.ts): added SpokenResponseEvent, TtsProfile, and optional message_id on outbound events; added client methods webChatSendAudio, ttsGenerate, ttsProfiles; expanded worker/identity types.
  • Overlay UI styles (interface/src/ui/style/style.scss): added overlay-window transparency rules, pill/halo styles, and breathe keyframes for recording/speaking states.
  • New OpenCode asset/component (interface/src/components/OpenCodeEmbed.tsx): large new component covering embed asset loading, Shadow DOM mounting, SSE probe, portal CSS ref-counting, and a navigation API.
  • Orchestrate & agent UIs (interface/src/routes/Orchestrate.tsx, interface/src/routes/AgentConfig.tsx, interface/src/routes/AgentDetail.tsx): new Orchestrate route/UI with OpenCode embeds; added speech identity section and AgentDetail speech rendering.
  • New webchat audio endpoint (src/api/webchat.rs, src/api/server.rs, src/api.rs): added POST /webchat/send-audio handler and route wiring; added internal mod tts.
  • TTS proxy & profiles (src/api/tts.rs, src/api/server.rs): new TTS module that proxies Voicebox streaming /generate/stream and GET /profiles, resolves profile/engine, and forwards the audio stream.
  • SSE event surface & message ID plumbing (src/lib.rs, src/api/state.rs, src/api/system.rs, src/main.rs, src/messaging/webchat.rs): added ProcessEvent::SpokenResponse and ApiEvent::SpokenResponse; included message_id on outbound events and forwarded it into SSE.
  • Channel spoken-response flow & reply tooling (src/agent/channel.rs, src/tools/reply.rs, src/tools.rs): channel spawns spoken-response generation and tracks recent spoken rewrites; ReplyTool writes replies with message IDs and exposes shared RepliedText; add_channel_tools signature updated.
  • Audio transcription helper (src/agent/channel_attachments.rs): added transcribe_audio_bytes helper (byte-based transcription via the LLM provider API).
  • Identity / SPEECH.md support (src/identity/files.rs, src/api/agents.rs, src/factory/presets.rs, presets/main-agent/SPEECH.md): Identity gained speech: Option<String>, rendering, and template support; identity API accepts/returns speech; preset includes SPEECH.md.
  • Factory create/update tools (src/tools/factory_create_agent.rs, src/tools/factory_update_identity.rs, src/tools/factory_load_preset.rs): create/update preset tools accept speech_content; load-preset output includes speech.
  • Conversation & worker run updates (src/conversation/history.rs, src/api/workers.rs): persist bot messages with provided message IDs; worker run/list rows include OpenCode session, directory, and project metadata.
  • Settings, config, watcher (src/settings/store.rs, src/config/runtime.rs, src/config/watcher.rs): per-channel spoken-history APIs; added hot-reloadable voicebox_url to RuntimeConfig; file watcher now watches SPEECH.md.
  • Frontend small changes & tests (interface/src/components/OpenCodeEmbed.tsx, interface/src/components/Sidebar.tsx, interface/src/routes/AgentWorkers.tsx, tests/context_dump.rs): new OpenCode embed component used by AgentWorkers; Sidebar adds an Orchestrate link; tests updated to create replied_text.
  • Docs & README (README.md, docs/...): extensive docs updates describing voice interaction, SPEECH.md, working memory, and the desktop overlay; new SPEECH.md preset.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks: ✅ 3 passed
  • Title check: ✅ Passed. The title accurately summarizes the main change: a major upgrade to the voice interaction experience with overlay redesign, speech personality support, and synthesis improvements.
  • Description check: ✅ Passed. The description is comprehensive and directly related to the changeset, clearly explaining the voice interaction system redesign, speech personality features, visualization improvements, and API plumbing additions.
  • Docstring coverage: ✅ Passed. Docstring coverage is 100.00%, above the required 80.00% threshold.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment on lines +3687 to +3696
/// Generate a short, conversational spoken version of a full text response.
///
/// Uses a fast LLM call with a focused prompt that strips markdown, code
/// blocks, and technical detail — producing 1-3 natural sentences suitable
/// for TTS playback.
/// Generate a short, conversational spoken version of a full text response.
///
/// Uses a fast LLM call with a focused prompt that strips markdown, code
/// blocks, and technical detail — producing 1-3 natural sentences suitable
/// for TTS playback.

Doc comment looks duplicated here (same block twice); dropping the second copy keeps rustdoc clean.

Suggested change
 /// Generate a short, conversational spoken version of a full text response.
 ///
 /// Uses a fast LLM call with a focused prompt that strips markdown, code
 /// blocks, and technical detail — producing 1-3 natural sentences suitable
 /// for TTS playback.
-/// Generate a short, conversational spoken version of a full text response.
-///
-/// Uses a fast LLM call with a focused prompt that strips markdown, code
-/// blocks, and technical detail — producing 1-3 natural sentences suitable
-/// for TTS playback.

Comment on lines +468 to +470
if let Ok(mut slot) = self.replied_text.lock() {
    *slot = Some(converted_content.clone());
}

If this mutex ever gets poisoned, this silently skips setting replied_text (and voice gen won’t run). Seems safer to recover the guard.

Suggested change
-if let Ok(mut slot) = self.replied_text.lock() {
-    *slot = Some(converted_content.clone());
-}
+let mut slot = self
+    .replied_text
+    .lock()
+    .unwrap_or_else(|error| error.into_inner());
+*slot = Some(converted_content.clone());

setState("loading");

try {
const audioBuffer = await api.ttsGenerate(text, {agentId, profileId});

Small edge case: if stop() is called while api.ttsGenerate(...) is still in-flight, this speak() continues and will start playback when the request resolves. Consider a simple request token (or AbortController) so stop() cancels pending loads too.

Array.from({length: SPECTRUM_BAR_COUNT}, () => 0),
);

const cleanupAudioGraph = useCallback(() => {

This hook doesn’t do any unmount cleanup; if the overlay UI unmounts while recording, the mic tracks / AudioContext can stick around. I’d add a useEffect cleanup that calls cleanupAudioGraph() (and optionally stops the recorder) on unmount.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 14

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

🟡 Minor comments (9)
desktop/src-tauri/src/main.rs-133-137 (1)

133-137: ⚠️ Potential issue | 🟡 Minor

Log errors instead of silently discarding them.

Same issue here — window chrome operations can fail and should be logged.

🔧 Proposed fix
 fn apply_overlay_window_chrome(window: &tauri::WebviewWindow) {
-    let _ = window.set_decorations(false);
-    let _ = window.set_shadow(false);
-    let _ = window.set_always_on_top(true);
+    if let Err(error) = window.set_decorations(false) {
+        tracing::warn!(%error, "failed to set overlay decorations");
+    }
+    if let Err(error) = window.set_shadow(false) {
+        tracing::warn!(%error, "failed to set overlay shadow");
+    }
+    if let Err(error) = window.set_always_on_top(true) {
+        tracing::warn!(%error, "failed to set overlay always-on-top");
+    }
 }

As per coding guidelines: "Don't silently discard errors. No let _ = on Results."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@desktop/src-tauri/src/main.rs` around lines 133 - 137, The
apply_overlay_window_chrome function currently discards Result values from
window.set_decorations, window.set_shadow, and window.set_always_on_top; update
it to inspect each Result and log failures instead of using `let _ =`. For each
call (set_decorations, set_shadow, set_always_on_top) match or use `if let
Err(e)` and emit a clear error log (e.g., via the project's logger/tracing
facility) that includes the operation name and the error value so failures are
visible for debugging.
desktop/src-tauri/src/main.rs-54-68 (1)

54-68: ⚠️ Potential issue | 🟡 Minor

Avoid silently discarding window operation errors.

Per coding guidelines, let _ = on Results should be avoided. These window operations can fail, and silently discarding errors makes debugging harder. Consider logging failures instead.

🔧 Proposed fix to log errors
 fn toggle_overlay(app: &tauri::AppHandle) {
     if let Some(overlay) = app.get_webview_window("voice-overlay") {
         // Toggle visibility
         if overlay.is_visible().unwrap_or(false) {
-            let _ = overlay.hide();
+            if let Err(error) = overlay.hide() {
+                tracing::warn!(%error, "failed to hide voice overlay");
+            }
         } else {
             apply_overlay_window_chrome(&overlay);
-            let _ = overlay.show();
-            let _ = overlay.set_focus();
+            if let Err(error) = overlay.show() {
+                tracing::warn!(%error, "failed to show voice overlay");
+            }
+            if let Err(error) = overlay.set_focus() {
+                tracing::warn!(%error, "failed to focus voice overlay");
+            }
         }
     } else {
         // Create the overlay window on first toggle
         create_overlay_window(app);
     }
 }

As per coding guidelines: "Don't silently discard errors. No let _ = on Results. Handle them, log them, or propagate them."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@desktop/src-tauri/src/main.rs` around lines 54 - 68, The toggle_overlay
function currently discards Results from window operations using `let _ =`,
which hides failures; update toggle_overlay to handle and log errors instead:
when calling `overlay.is_visible()`, `overlay.hide()`,
`apply_overlay_window_chrome(&overlay)`, `overlay.show()`, and
`overlay.set_focus()` return or propagate their Results (via match/if let Err(e)
/ ? where appropriate) and log any Err using the project's logging facility (or
eprintln! if none) so failures are visible during debugging; keep the existing
behavior of creating the window via `create_overlay_window` when the webview is
absent.
desktop/src-tauri/src/main.rs-152-161 (1)

152-161: ⚠️ Potential issue | 🟡 Minor

Avoid .unwrap() on shortcut registration — prefer graceful handling.

If shortcut registration fails (e.g., the shortcut is already taken by another application), .unwrap() will panic and crash the app. Consider using expect() with context or handling the error gracefully.

🔧 Proposed fix
         .plugin(
             tauri_plugin_global_shortcut::Builder::new()
                 .with_shortcut(shortcut)
-                .unwrap()
+                .expect("failed to register Alt+Space shortcut — may be in use by another app")
                 .with_handler(move |app, _shortcut, event| {
                     if event.state == tauri_plugin_global_shortcut::ShortcutState::Pressed {
                         toggle_overlay(app);
                     }
                 })
                 .build(),
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@desktop/src-tauri/src/main.rs` around lines 152 - 161, The code currently
calls .unwrap() on tauri_plugin_global_shortcut::Builder::with_shortcut which
will panic if registration fails; change this to handle the Result instead
(e.g., use .map_err/.and_then or a match) so you can log the error and either
skip registering the shortcut plugin or return a controlled Err; specifically
locate the Builder chain where with_shortcut(shortcut).unwrap() is called and
replace unwrap with error handling that logs the failure (using your app logger)
and avoids panicking while keeping the .with_handler and toggle_overlay logic
intact.
src/config/watcher.rs-129-131 (1)

129-131: ⚠️ Potential issue | 🟡 Minor

Limit identity reloads to actual identity files under the agent root.

This basename-only check now treats any changed SPEECH.md as an identity edit, including files under the recursively watched skills/ trees. A skill/doc file with the same name can trigger unnecessary identity reloads for every agent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config/watcher.rs` around lines 129 - 131, The current identity_changed
check only matches basenames and wrongly treats files like skills/**/SPEECH.md
as identity edits; change the predicate used in the changed_paths.iter().any
closure (the identity_changed computation) to first verify the changed path is
directly under the agent root (use Path::strip_prefix(agent_root) and ensure the
resulting relative path has no parent, i.e. rel.parent().is_none()), then match
the file_name against "SOUL.md" | "IDENTITY.md" | "ROLE.md" | "SPEECH.md"; this
ensures only top-level identity files trigger reloads.
src/api/webchat.rs-166-168 (1)

166-168: ⚠️ Potential issue | 🟡 Minor

Handle unexpected multipart fields explicitly.

Swallowing field.bytes().await here drops parse failures silently, so malformed trailing fields disappear without any trace. Either reject unknown fields or log/read failures before continuing.

💡 Minimal fix
             _ => {
-                // Skip unknown fields
-                let _ = field.bytes().await;
+                if let Err(error) = field.bytes().await {
+                    tracing::warn!(%error, field = %name, "failed to drain unexpected multipart field");
+                    return Err(StatusCode::BAD_REQUEST);
+                }
             }
As per coding guidelines, "Don't silently discard errors. No `let _ =` on Results. Handle them, log them, or propagate them."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/webchat.rs` around lines 166 - 168, The match arm that currently does
`let _ = field.bytes().await;` silently discards any errors from parsing unknown
multipart fields; update the `_` arm in the multipart field handling (the match
that binds `field`) to explicitly handle the Result from `field.bytes().await` —
either propagate an error (return Err with a descriptive context) or log the
failure (using the existing logger) and continue, e.g. by matching on
`field.bytes().await` and calling `process_logger.error`/`tracing::error` with
the error if it fails or discarding the bytes only on Ok; ensure you reference
the same variable names (`field`, `field.bytes().await`) and keep behavior
consistent with other arms.
src/tools/reply.rs-467-470 (1)

467-470: ⚠️ Potential issue | 🟡 Minor

Don't silently drop replied_text lock failures.

A poisoned mutex here disables spoken-response capture with no signal. Log and recover from the lock error instead of skipping the update.

💡 Minimal fix
-        if let Ok(mut slot) = self.replied_text.lock() {
-            *slot = Some(converted_content.clone());
-        }
+        match self.replied_text.lock() {
+            Ok(mut replied_text) => *replied_text = Some(converted_content.clone()),
+            Err(poisoned_lock) => {
+                tracing::warn!(
+                    conversation_id = %self.conversation_id,
+                    "replied_text mutex was poisoned; recovering"
+                );
+                *poisoned_lock.into_inner() = Some(converted_content.clone());
+            }
+        }
As per coding guidelines, "Don't silently discard errors. No `let _ =` on Results. Handle them, log them, or propagate them."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/reply.rs` around lines 467 - 470, The code currently ignores errors
from self.replied_text.lock(), which drops updates silently; change the lock
handling to catch the Err case, log the poisoning error (using the existing
logger/tracing infrastructure) and recover by calling err.into_inner() to obtain
the guard so you can still set *slot = Some(converted_content.clone()); ensure
you still handle the Ok(guard) branch the same way and do not use let _ = on the
Result; explicitly match on self.replied_text.lock() and log the error before
recovering with into_inner() and performing the assignment.
interface/src/routes/Orchestrate.tsx-142-160 (1)

142-160: ⚠️ Potential issue | 🟡 Minor

Group tracks by project_id, not the display name.

Two unrelated projects with the same project_name will be collapsed into one track here, and the render loop below will also reuse the same React key. Keep project_name as the label, but use a stable project identifier for grouping.

Also applies to: 247-248

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/routes/Orchestrate.tsx` around lines 142 - 160, Group by the
stable project identifier instead of the display name: change the grouping key
in the byProject build to use worker.project_id (e.g. const groupKey =
worker.project_id ?? `ungrouped:${worker.project_name ?? 'Ungrouped'}`) while
still storing the human label as name = worker.project_name ?? 'Ungrouped'; when
creating entries and the mapped object (the block that returns { name,
projectId: ... }), ensure projectId is the actual numeric/string
worker.project_id (or null) and use that stable projectId as the React key in
the render loop; apply the same change to the other instance noted (lines
~247-248) so unrelated projects with identical project_name are not collapsed.
src/agent/channel.rs-1537-1540 (1)

1537-1540: ⚠️ Potential issue | 🟡 Minor

Don't swallow poisoned replied_text locks.

If the reply tool panics while holding this mutex, .lock().ok() turns that into “no spoken response” with no signal. Please log or propagate the lock failure instead of silently discarding it. As per coding guidelines, "Don't silently discard errors. No let _ = on Results. Handle them, log them, or propagate them. The only exception is .ok() on channel sends where the receiver may be dropped".

Also applies to: 1916-1919

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/channel.rs` around lines 1537 - 1540, The code currently swallows
poisoned mutex errors by calling replied_text.lock().ok(), losing error context;
change this to explicitly handle PoisonError instead of .ok(): call
replied_text.lock() and match the Result, log the poison error (including
context like replied_flag and operation), and recover by using
poison_err.into_inner() to obtain the guard (or propagate the error via Result
if the surrounding function returns a Result). Replace the
.lock().ok().and_then(|mut slot| slot.take()) pattern in the replied handling
(the block using replied_flag and replied_text) with this explicit
match/log/recover flow, and apply the same fix at the other occurrence mentioned
(around lines 1916-1919) so poisoned locks are not silently discarded.
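That match/log/recover flow can be sketched with std primitives alone (take_replied_text and the demo harness are illustrative names, not the real channel.rs code, and eprintln! stands in for tracing):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Illustrative recovery: take the stored value even when a previous holder
// panicked and poisoned the mutex, instead of discarding it via .ok().
fn take_replied_text(slot: &Mutex<Option<String>>) -> Option<String> {
    match slot.lock() {
        Ok(mut guard) => guard.take(),
        Err(poisoned) => {
            eprintln!("replied_text mutex poisoned; recovering");
            poisoned.into_inner().take()
        }
    }
}

fn demo() -> Option<String> {
    let slot = Arc::new(Mutex::new(Some("spoken reply".to_string())));
    let poisoner = Arc::clone(&slot);
    // Panic while holding the lock so the mutex ends up poisoned.
    let join_result = thread::spawn(move || {
        let _guard = poisoner.lock().unwrap();
        panic!("simulated reply-tool panic");
    })
    .join();
    assert!(join_result.is_err()); // the panic is intentional
    take_replied_text(&slot)
}
```

With .lock().ok() the stored reply would be lost at this point; the recovery path still returns it.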
interface/src/hooks/useTtsPlayback.ts-49-58 (1)

49-58: ⚠️ Potential issue | 🟡 Minor

Add cleanup on unmount to stop playback and close AudioContext.

The hook has no cleanup function. If the component unmounts mid-playback, the audio continues running and the RAF-driven metering updates persist after unmount. Add a useEffect with cleanup that calls stop() and closes the audioContextRef.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/hooks/useTtsPlayback.ts` around lines 49 - 58, The hook
useTtsPlayback is missing an unmount cleanup which can leave audio playing and
RAF metering running; add a useEffect that returns a cleanup function which
calls the hook's stop() to halt playback, cancels any pending animation frame
via animationFrameRef, disconnects/clears sourceRef and analyserRef, sets
playbackDoneRef to null, and closes audioContextRef (await close if needed) and
nulls it so the AudioContext is released when the component unmounts. Ensure the
cleanup references the existing stop() function and the refs (audioContextRef,
sourceRef, analyserRef, animationFrameRef, playbackDoneRef) to locate the code
to modify.
🧹 Nitpick comments (4)
interface/src/components/OpenCodeEmbed.tsx (1)

124-127: Event listener never removed from controller.signal.

The onUnmount listener added to controller.signal is never removed. While this is minor since the controller is aborted on cleanup anyway, explicitly removing listeners is cleaner.

♻️ Proposed fix
                 const probeController = new AbortController();
                 const onUnmount = () => probeController.abort();
                 controller.signal.addEventListener("abort", onUnmount);
+                
+                const cleanup = () => {
+                    controller.signal.removeEventListener("abort", onUnmount);
+                };
 
                 const sseRes = await fetch(`${serverUrl}/global/event`, {

Then call cleanup() before returning from the async function (at lines 154 and 159).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/components/OpenCodeEmbed.tsx` around lines 124 - 127, The added
abort listener on controller.signal (via
controller.signal.addEventListener("abort", onUnmount)) is never removed; update
the async routine that creates const probeController = new AbortController() and
const onUnmount = () => probeController.abort() so that you remove the listener
before any early return and on normal completion — call
controller.signal.removeEventListener("abort", onUnmount) inside the existing
cleanup() (or invoke cleanup() before returning) and ensure both early-return
paths in the async function call cleanup() so the event listener is always
removed.
src/factory/presets.rs (1)

79-86: Please add one test that a preset with SPEECH.md returns Some(speech).

Right now a missing or misnamed speech asset still passes every existing test because speech is optional. One main-agent assertion would catch silent regressions.

🧪 Example coverage
     #[test]
     fn load_returns_full_preset() {
         let preset = PresetRegistry::load("community-manager")
             .expect("community-manager preset should exist");
@@
         assert!(!preset.role.is_empty());
     }
+
+    #[test]
+    fn main_agent_loads_speech_guidance() {
+        let preset =
+            PresetRegistry::load("main-agent").expect("main-agent preset should exist");
+
+        assert!(
+            preset
+                .speech
+                .as_deref()
+                .is_some_and(|speech| !speech.trim().is_empty()),
+            "main-agent should ship with non-empty SPEECH.md guidance"
+        );
+    }

Also applies to: 111-116

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/factory/presets.rs` around lines 79 - 86, Add a unit test that verifies a
preset with a SPEECH.md file returns Some(speech): create a small fixture (or
temp dir) containing a preset id folder with SPEECH.md, load the preset via the
existing Preset loading path (use Preset::load or the codepath that calls
Self::load_optional_file), and assert preset.speech.is_some() and that the
contents match the expected string (e.g., contains "main-agent" or the known
fixture text); place the test near other Preset tests in presets.rs so it
exercises the load_optional_file -> Preset construction path referenced around
the speech handling code.
src/api/tts.rs (2)

207-218: Consider logging when profile resolution fails.

The .ok()? pattern silently discards HTTP and JSON parsing errors. While the fallback behavior is correct (caller handles None), adding debug-level logging would aid troubleshooting when profile auto-detection fails.

🔧 Suggested improvement
 async fn resolve_default_profile_id(
     client: &reqwest::Client,
     voicebox_url: &str,
 ) -> Option<String> {
     let endpoint = format!("{}/profiles", voicebox_url.trim_end_matches('/'));
-    let response = client.get(endpoint).send().await.ok()?;
+    let response = client.get(&endpoint).send().await.map_err(|e| {
+        tracing::debug!(%e, %endpoint, "failed to fetch Voicebox profiles for default resolution");
+        e
+    }).ok()?;
     if !response.status().is_success() {
+        tracing::debug!(status = %response.status(), "Voicebox profiles returned non-success status");
         return None;
     }
-    let profiles = response.json::<Vec<VoiceboxProfile>>().await.ok()?;
+    let profiles = response.json::<Vec<VoiceboxProfile>>().await.map_err(|e| {
+        tracing::debug!(%e, "failed to parse Voicebox profiles response");
+        e
+    }).ok()?;
     profiles.first().map(|p| p.id.clone())
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/tts.rs` around lines 207 - 218, The function
resolve_default_profile_id currently swallows errors via .ok()? which makes
failures silent; change the implementation to explicitly handle the request and
JSON parsing errors so you can emit debug-level logs: perform the GET on the
constructed endpoint and if response.send() fails log the error with context
(including voicebox_url/endpoint), if response.status().is_success() is false
log the status and body or a short message, and when
response.json::<Vec<VoiceboxProfile>>() fails log the parse error; keep
returning None on failure but replace the chained .ok()? calls with match/if let
so you can call your preferred logger (e.g. debug! or tracing::debug!)
referencing resolve_default_profile_id, client, voicebox_url, endpoint,
response, and profiles to aid troubleshooting.

62-68: Consider reusing the HTTP client across requests.

A new reqwest::Client is constructed on every TTS request. The client maintains connection pools and is designed to be reused. Consider storing a shared client in ApiState or using a OnceCell/LazyLock for better performance.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/tts.rs` around lines 62 - 68, The TTS handler currently builds a new
reqwest::Client inside the function (the local variable `client` from
`reqwest::Client::builder()`), which wastes connection pools; instead add a
shared/reused client and use it for requests: e.g., add a field to `ApiState`
(or a global `OnceCell`/`LazyLock`) to hold a pre-built `reqwest::Client` and
replace the local builder call with retrieving that shared client (handle the
Option/initialization or return the same `StatusCode::INTERNAL_SERVER_ERROR` on
startup/build failure). Ensure you remove the per-request `Client::builder()`
usage and reference the shared client where `client` is currently used.
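As a dependency-free illustration of that pattern, here is a std::sync::OnceLock sketch where HttpClient is a stand-in for reqwest::Client (the real fix would store the actual client in ApiState or a static):

```rust
use std::sync::OnceLock;

// Stand-in for reqwest::Client so this sketch needs no external crates;
// the real code would build and store an actual reqwest::Client.
#[derive(Debug)]
struct HttpClient {
    user_agent: String,
}

impl HttpClient {
    fn build() -> Result<Self, String> {
        Ok(Self { user_agent: "spacebot-tts".to_string() })
    }
}

// One client per process: built on first use, then reused, so connection
// pools survive across TTS requests instead of being rebuilt every call.
fn shared_client() -> Option<&'static HttpClient> {
    static CLIENT: OnceLock<Option<HttpClient>> = OnceLock::new();
    CLIENT.get_or_init(|| HttpClient::build().ok()).as_ref()
}
```

Callers that get None can map it to the same INTERNAL_SERVER_ERROR path the handler uses today when the builder fails.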
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@interface/src/components/WebChatPanel.tsx`:
- Around line 193-210: The spoken-response matching currently deduplicates by
text in handleSpokenResponse, which causes collisions and leaks across sessions;
change the logic to use a stable per-message identifier from the event (e.g.,
event.message_id or event.turn_id) instead of full_text/spoken_text for both the
duplicate check and the created id (update the id generation and duplicate
predicate in handleSpokenResponse and the other spoken-block insertion logic
that mirrors it). If the event doesn't provide a stable id, fall back to a
combined stable key like `${event.channel_id}:${event.sequence}` or include a
monotonic counter scoped to sessionId so duplicates are resolved by stable
message id rather than text. Ensure you reference SpokenResponseEvent,
sessionId, setSpokenBlocks, and the created block id consistently.

In `@interface/src/hooks/useAudioRecorder.ts`:
- Around line 177-181: The catch in startRecording() swallows errors, preventing
callers from handling failures; modify startRecording() so the catch still
performs cleanupAudioGraph() and sets state back to "idle" but then rethrow the
caught error (or return a rejected Promise) so callers can react to
getUserMedia/recorder setup failures instead of assuming recording started;
ensure cleanupAudioGraph() and setState("idle") are called before rethrowing so
resources are freed and UI can roll back.
- Around line 64-78: Add a React useEffect cleanup that calls the existing
cleanupAudioGraph() when the hook unmounts and also stops any active
MediaRecorder by calling mediaRecorderRef.current?.stop(); specifically, add a
useEffect with an empty dependency array that returns a cleanup function which
first calls mediaRecorderRef.current?.stop() (if present) and then calls
cleanupAudioGraph() so the analyserRef, streamRef, audioContextRef, and other
resources are always released on unmount.

In `@interface/src/hooks/useTtsPlayback.ts`:
- Around line 83-154: Add a per-call cancellation token to speak(): introduce a
requestIdRef (number) that you increment at the top of speak() and also in
stop(), capture localRequestId at start of the async flow, and after every await
(after api.ttsGenerate, after decodeAudioData, and before source.start()) check
that localRequestId === requestIdRef.current and if not bail out and clean up
any created nodes/contexts (clear analyserRef/sourceRef, stopMetering,
setState("idle") and resolve/return). Use these symbols to find places to
change: speak(), stop(), requestIdRef, audioContextRef, sourceRef, analyserRef,
playbackDoneRef, smoothedLevelRef. Ensure playbackDoneRef and source.onended are
only assigned when the token is current so stale calls cannot call
source.start() or change playback state.

In `@interface/src/routes/Orchestrate.tsx`:
- Around line 354-356: The code currently falls back to worker.id when
opencode_session_id is null, which incorrectly mounts an OpenCode session that
doesn't exist; update the render logic in Orchestrate.tsx around the
OpenCodeEmbed usage so that OpenCodeEmbed receives sessionId only when
worker.opencode_session_id is non-null (keep port as worker.opencode_port), and
when opencode_session_id is null render a proper loading/unsupported placeholder
(or null) instead of using worker.id; reference the OpenCodeEmbed component and
the worker.opencode_session_id / worker.id props to locate the change.
- Around line 91-98: The code is overwriting worker.status with
liveWorker?.status, which breaks WorkerColumn's exact-enum checks; instead
preserve the normalized worker.status (keep using worker.status for the status
field) and move the SSE/descriptive string from liveWorker?.status into a new
separate field (e.g., live_status or liveStatusText) on the pushed object so UI
components can read normalized status from status and descriptive/live text from
the new property; update the object created in the result.push(...) near
activeWorkers[worker.id] to stop assigning liveWorker?.status to status and add
a distinct property for the live status text.

In `@interface/src/routes/Overlay.tsx`:
- Around line 81-98: The transcript gets duplicate assistant entries because
handleSpokenResponse always pushes a new assistant turn even if an
outbound_message already appended the pending assistant reply; modify
handleSpokenResponse (and the similar block for spoken_response at lines
109-118) to detect and update the existing pending assistant item instead of
appending: use sessionId and spokenReceivedRef/ttsStartedRef as currently used
to match the turn, then call setTranscript with an updater that finds the last
assistant entry (or a pending marker created by outbound_message) and replaces
its text with event.full_text (or event.spoken_text where appropriate); keep the
existing behavior of adding a new entry only if no pending assistant row exists.

In `@src/agent/channel_attachments.rs`:
- Around line 351-357: The byte-based transcribe_audio_bytes path currently
infers audio format only from mime_type, causing mismatches when clients send
generic MIME types; update transcribe_audio_bytes to call
audio_format_for_attachment (or replicate its logic) passing both mime_type and
filename as fallback so the filename extension is used when MIME is
non-specific, and apply the same filename-fallback logic in the other byte-path
locations around the other transcribe functions (the blocks referenced near the
transcribe_audio_bytes function and the later multipart/bytes handling) so all
byte-based inference mirrors the multipart-based audio_format_for_attachment
behavior.
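A minimal sketch of that filename fallback, with an invented audio_format helper standing in for audio_format_for_attachment and a deliberately small format table:

```rust
// Hypothetical fallback: trust a specific MIME type first, and only consult
// the filename extension when the MIME type is generic or unrecognized.
fn audio_format(mime_type: &str, filename: Option<&str>) -> Option<&'static str> {
    match mime_type {
        "audio/ogg" => return Some("ogg"),
        "audio/wav" | "audio/x-wav" => return Some("wav"),
        "audio/mpeg" => return Some("mp3"),
        _ => {} // generic types like application/octet-stream fall through
    }
    // Fall back to the extension, case-insensitively.
    match filename?.rsplit_once('.')?.1.to_ascii_lowercase().as_str() {
        "ogg" => Some("ogg"),
        "wav" => Some("wav"),
        "mp3" => Some("mp3"),
        _ => None,
    }
}
```

Threading the filename through the byte-based path gives it the same recovery behavior the multipart path already has.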

In `@src/api/webchat.rs`:
- Around line 155-164: The "audio" branch currently calls field.bytes().await
and assigns to audio_bytes (then later base64-encoded), allowing unbounded
in-memory copies; replace that with a bounded read: read the multipart Field in
chunks (e.g., field.chunk() or AsyncRead + tokio_util::io::StreamReader) and
accumulate into a buffer while enforcing a MAX_AUDIO_BYTES constant (reject with
StatusCode::PAYLOAD_TOO_LARGE or BAD_REQUEST if exceeded) or stream directly to
the transcription path without a full Vec<u8> copy. Target the "audio" match
arm, the use of field.bytes().await, and the audio_bytes variable/transcription
helper when making this change.
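The bounded accumulation can be sketched synchronously; in the real handler the loop would await field.chunk(), and MAX_AUDIO_BYTES here is an arbitrary illustrative cap:

```rust
// Illustrative cap; the real handler should pick a deliberate limit.
const MAX_AUDIO_BYTES: usize = 10 * 1024 * 1024;

#[derive(Debug, PartialEq)]
enum AudioReadError {
    PayloadTooLarge,
}

// Accumulate chunks into one buffer, rejecting as soon as the running
// total would exceed the cap instead of buffering an unbounded upload.
fn read_bounded<I>(chunks: I, max_bytes: usize) -> Result<Vec<u8>, AudioReadError>
where
    I: IntoIterator<Item = Vec<u8>>,
{
    let mut audio_bytes = Vec::new();
    for chunk in chunks {
        if audio_bytes.len() + chunk.len() > max_bytes {
            return Err(AudioReadError::PayloadTooLarge);
        }
        audio_bytes.extend_from_slice(&chunk);
    }
    Ok(audio_bytes)
}
```

In the axum handler, PayloadTooLarge would map to StatusCode::PAYLOAD_TOO_LARGE before any base64 encoding happens.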

In `@src/api/workers.rs`:
- Around line 57-60: The worker detail payload is missing project_id and
project_name so the UI loses project context; update the data model and mapping
in src/api/workers.rs to include those fields in the detail response as well as
the list rows—add project_id: Option<String> and project_name: Option<String> to
the detail struct (the same structs used for list rows and detail), and ensure
worker_detail() populates them from the existing join logic (the same join that
feeds the list rows, referenced around lines 155-159) so the detail endpoint
returns the same project-linked values.

In `@src/config/runtime.rs`:
- Around line 93-97: The voicebox_url is currently treated as startup-only
(stored in voicebox_url: ArcSwap<String>) and never updated in reload_config(),
so per-agent or default config changes (including empty to disable) are ignored;
update the code so reload_config() reads the voice.voicebox_url from the
resolved agent/default config and calls
voicebox_url.store(Arc::new(new_value.clone())) (or clears it to empty/None
semantics) to apply changes at runtime; likewise apply the same fix to the other
startup-only fields referenced around the same area (the other ArcSwap<String>
fields at the 169-172 equivalent), ensuring initialization still respects
env/config but runtime reload uses the same reload_config() update path.

In `@src/conversation/history.rs`:
- Around line 837-843: The detail endpoint is missing project metadata: update
the SQL in get_worker_detail to include a LEFT JOIN projects p ON w.project_id =
p.id and select w.project_id and p.name AS project_name (matching
list_worker_runs), then add corresponding fields to the WorkerDetailRow struct
and populate them where the query rows are mapped into WorkerDetailRow so the
detail response returns project_id and project_name consistently with
list_worker_runs.

In `@src/lib.rs`:
- Around line 313-319: The SpokenResponse struct uses full_text as a join key
which is brittle; add a stable persisted message ID field (e.g., message_id:
MessageId or persisted_message_id: MessageId) to SpokenResponse and populate it
wherever SpokenResponse instances are constructed so downstream consumers
correlate on this message_id instead of full_text; update any
serialization/deserialization and references to use SpokenResponse::message_id
for spoken-response correlation (identify constructors/usages that create
SpokenResponse and the consumer code that matches events).

---

Minor comments:
In `@desktop/src-tauri/src/main.rs`:
- Around line 133-137: The apply_overlay_window_chrome function currently
discards Result values from window.set_decorations, window.set_shadow, and
window.set_always_on_top; update it to inspect each Result and log failures
instead of using `let _ =`. For each call (set_decorations, set_shadow,
set_always_on_top) match or use `if let Err(e)` and emit a clear error log
(e.g., via the project's logger/tracing facility) that includes the operation
name and the error value so failures are visible for debugging.
- Around line 54-68: The toggle_overlay function currently discards Results from
window operations using `let _ =`, which hides failures; update toggle_overlay
to handle and log errors instead: when calling `overlay.is_visible()`,
`overlay.hide()`, `apply_overlay_window_chrome(&overlay)`, `overlay.show()`, and
`overlay.set_focus()` return or propagate their Results (via match/if let Err(e)
/ ? where appropriate) and log any Err using the project's logging facility (or
eprintln! if none) so failures are visible during debugging; keep the existing
behavior of creating the window via `create_overlay_window` when the webview is
absent.
- Around line 152-161: The code currently calls .unwrap() on
tauri_plugin_global_shortcut::Builder::with_shortcut which will panic if
registration fails; change this to handle the Result instead (e.g., use
.map_err/.and_then or a match) so you can log the error and either skip
registering the shortcut plugin or return a controlled Err; specifically locate
the Builder chain where with_shortcut(shortcut).unwrap() is called and replace
unwrap with error handling that logs the failure (using your app logger) and
avoids panicking while keeping the .with_handler and toggle_overlay logic
intact.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 45e5566e-0044-4af1-90f3-bb6c5e5cef37

📥 Commits

Reviewing files that changed from the base of the PR and between b195e7d and 68b3f01.

⛔ Files ignored due to path filters (7)
  • desktop/src-tauri/Cargo.lock is excluded by !**/*.lock, !**/*.lock
  • desktop/src-tauri/Cargo.toml is excluded by !**/*.toml
  • desktop/src-tauri/capabilities/default.json is excluded by !**/*.json
  • desktop/src-tauri/gen/schemas/acl-manifests.json is excluded by !**/gen/**, !**/*.json, !**/gen/**
  • desktop/src-tauri/gen/schemas/capabilities.json is excluded by !**/gen/**, !**/*.json, !**/gen/**
  • desktop/src-tauri/gen/schemas/desktop-schema.json is excluded by !**/gen/**, !**/*.json, !**/gen/**
  • desktop/src-tauri/gen/schemas/macOS-schema.json is excluded by !**/gen/**, !**/*.json, !**/gen/**
📒 Files selected for processing (43)
  • desktop/src-tauri/src/main.rs
  • interface/src/App.tsx
  • interface/src/api/client.ts
  • interface/src/components/OpenCodeEmbed.tsx
  • interface/src/components/Sidebar.tsx
  • interface/src/components/WebChatPanel.tsx
  • interface/src/hooks/useAudioRecorder.ts
  • interface/src/hooks/useTtsPlayback.ts
  • interface/src/router.tsx
  • interface/src/routes/AgentConfig.tsx
  • interface/src/routes/AgentDetail.tsx
  • interface/src/routes/AgentWorkers.tsx
  • interface/src/routes/Orchestrate.tsx
  • interface/src/routes/Overlay.tsx
  • interface/src/ui/style/style.scss
  • presets/main-agent/SPEECH.md
  • src/agent/channel.rs
  • src/agent/channel_attachments.rs
  • src/agent/channel_history.rs
  • src/agent/cortex.rs
  • src/api.rs
  • src/api/agents.rs
  • src/api/server.rs
  • src/api/state.rs
  • src/api/system.rs
  • src/api/tts.rs
  • src/api/webchat.rs
  • src/api/workers.rs
  • src/config/runtime.rs
  • src/config/types.rs
  • src/config/watcher.rs
  • src/conversation/history.rs
  • src/factory/presets.rs
  • src/identity.rs
  • src/identity/files.rs
  • src/lib.rs
  • src/settings/store.rs
  • src/tools.rs
  • src/tools/factory_create_agent.rs
  • src/tools/factory_load_preset.rs
  • src/tools/factory_update_identity.rs
  • src/tools/reply.rs
  • tests/context_dump.rs

Comment on lines +177 to +181
} catch (error) {
console.error("Failed to start recording:", error);
cleanupAudioGraph();
setState("idle");
}

⚠️ Potential issue | 🟠 Major

Don't swallow startRecording() failures.

This catch block always resolves, so callers cannot roll back optimistic UI when getUserMedia is denied or recorder setup fails. The overlay ends up stuck in a recording state even though capture never started.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/hooks/useAudioRecorder.ts` around lines 177 - 181, The catch in
startRecording() swallows errors, preventing callers from handling failures;
modify startRecording() so the catch still performs cleanupAudioGraph() and sets
state back to "idle" but then rethrow the caught error (or return a rejected
Promise) so callers can react to getUserMedia/recorder setup failures instead of
assuming recording started; ensure cleanupAudioGraph() and setState("idle") are
called before rethrowing so resources are freed and UI can roll back.

Comment on lines +91 to +98
const liveWorker = activeWorkers[worker.id];
result.push({
...worker,
agent_id: agent.id,
agent_name: agentName,
status: liveWorker?.status ?? worker.status,
live_tool_calls: liveWorker?.toolCalls,
});

⚠️ Potential issue | 🟠 Major

Keep worker.status normalized to "running"/"idle".

WorkerColumn keys its live UI off exact enum values. Overwriting status with liveWorker.status turns active workers into arbitrary strings, which hides cancel/live-duration behavior as soon as SSE publishes descriptive status text.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/routes/Orchestrate.tsx` around lines 91 - 98, The code is
overwriting worker.status with liveWorker?.status, which breaks WorkerColumn's
exact-enum checks; instead preserve the normalized worker.status (keep using
worker.status for the status field) and move the SSE/descriptive string from
liveWorker?.status into a new separate field (e.g., live_status or
liveStatusText) on the pushed object so UI components can read normalized status
from status and descriptive/live text from the new property; update the object
created in the result.push(...) near activeWorkers[worker.id] to stop assigning
liveWorker?.status to status and add a distinct property for the live status
text.
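A minimal sketch of the split this prompt suggests — the shapes below are hypothetical, and `live_status` is the proposed new field name rather than an existing one:

```typescript
// Hypothetical row shapes mirroring the review's suggestion.
interface WorkerRow { id: string; status: "running" | "idle"; }
interface LiveWorker { status?: string; toolCalls?: string[]; }

function mergeLive(worker: WorkerRow, live?: LiveWorker) {
	return {
		...worker,
		status: worker.status,       // keep the normalized enum the UI keys off
		live_status: live?.status,   // descriptive SSE text gets its own field
		live_tool_calls: live?.toolCalls,
	};
}
```

WorkerColumn can keep matching on the `"running"`/`"idle"` enum while rendering `live_status` as free-form text.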

Comment on lines +155 to +164
"audio" => {
audio_mime = field.content_type().map(|s| s.to_string());
audio_filename = field.file_name().map(|s| s.to_string());
audio_bytes = Some(
field
.bytes()
.await
.map_err(|_| StatusCode::BAD_REQUEST)?
.to_vec(),
);

⚠️ Potential issue | 🟠 Major

Cap audio uploads before buffering them fully.

This copies the entire audio part into a Vec<u8>, and the transcription helper then base64-encodes it again, so each accepted upload costs multiple full in-memory copies on the request path and a large audio payload can easily spike memory. Please add a tight per-upload size limit in front of this branch or switch to chunked reads.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/webchat.rs` around lines 155 - 164, The "audio" branch currently
calls field.bytes().await and assigns to audio_bytes (then later
base64-encoded), allowing unbounded in-memory copies; replace that with a
bounded read: read the multipart Field in chunks (e.g., field.chunk() or
AsyncRead + tokio_util::io::StreamReader) and accumulate into a buffer while
enforcing a MAX_AUDIO_BYTES constant (reject with StatusCode::PAYLOAD_TOO_LARGE
or BAD_REQUEST if exceeded) or stream directly to the transcription path without
a full Vec<u8> copy. Target the "audio" match arm, the use of
field.bytes().await, and the audio_bytes variable/transcription helper when
making this change.
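The bounded-accumulation idea can be sketched independently of the multipart API; the cap value and the error mapping here are assumptions, not the actual handler code:

```rust
/// Hypothetical per-upload cap; the real limit is a policy decision.
const MAX_AUDIO_BYTES: usize = 10 * 1024 * 1024;

/// Accumulate chunks while enforcing a byte cap, so an oversized upload is
/// rejected before it is ever fully buffered.
fn collect_bounded(
    chunks: impl IntoIterator<Item = Vec<u8>>,
    max_bytes: usize,
) -> Result<Vec<u8>, &'static str> {
    let mut buf = Vec::new();
    for chunk in chunks {
        if buf.len() + chunk.len() > max_bytes {
            // In the handler this would map to StatusCode::PAYLOAD_TOO_LARGE.
            return Err("payload too large");
        }
        buf.extend_from_slice(&chunk);
    }
    Ok(buf)
}
```

In the real branch the chunk source would be the multipart field's streaming reads rather than an in-memory iterator.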

Comment on lines +57 to +60
/// Project ID this worker is linked to.
project_id: Option<String>,
/// Project name (resolved via join).
project_name: Option<String>,

⚠️ Potential issue | 🟠 Major

project_id and project_name need to be in the detail payload too.

These fields are now on list rows only. As soon as the UI loads a worker through worker_detail(), the new project context disappears, so the detail view cannot render the same project-linked state. If the detail query row does not already carry them, that needs plumbing too.

Also applies to: 155-159

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/workers.rs` around lines 57 - 60, The worker detail payload is
missing project_id and project_name so the UI loses project context; update the
data model and mapping in src/api/workers.rs to include those fields in the
detail response as well as the list rows—add project_id: Option<String> and
project_name: Option<String> to the detail struct (the same structs used for
list rows and detail), and ensure worker_detail() populates them from the
existing join logic (the same join that feeds the list rows, referenced around
lines 155-159) so the detail endpoint returns the same project-linked values.

Comment on lines +93 to +97
/// Voicebox server URL for TTS output. When non-empty, the channel
/// generates a spoken version of replies and broadcasts it as a
/// `SpokenResponse` event. Configurable via `SPACEBOT_VOICEBOX_URL` env
/// or `[voice] voicebox_url` in the agent config.
pub voicebox_url: ArcSwap<String>,

⚠️ Potential issue | 🟠 Major

voicebox_url is wired as startup-only env state.

This bypasses the resolved agent/default config entirely, and reload_config() never refreshes it, so per-agent voicebox_url changes — including an explicit empty value to disable spoken replies — will not take effect after startup.

Also applies to: 169-172

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config/runtime.rs` around lines 93 - 97, The voicebox_url is currently
treated as startup-only (stored in voicebox_url: ArcSwap<String>) and never
updated in reload_config(), so per-agent or default config changes (including
empty to disable) are ignored; update the code so reload_config() reads the
voice.voicebox_url from the resolved agent/default config and calls
voicebox_url.store(Arc::new(new_value.clone())) (or clears it to empty/None
semantics) to apply changes at runtime; likewise apply the same fix to the other
startup-only fields referenced around the same area (the other ArcSwap<String>
fields at the 169-172 equivalent), ensuring initialization still respects
env/config but runtime reload uses the same reload_config() update path.
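A dependency-free sketch of the reload behavior, with `RwLock` standing in for the real `ArcSwap<String>` (production code would call `voicebox_url.store(Arc::new(...))` instead):

```rust
use std::sync::RwLock;

// RwLock<String> stands in for ArcSwap<String> to keep the sketch
// free of external crates; the semantics shown are the same.
struct Runtime {
    voicebox_url: RwLock<String>,
}

impl Runtime {
    fn reload_config(&self, resolved_voicebox_url: &str) {
        // Apply the resolved agent/default value on every reload, including
        // an explicit empty string that disables spoken replies.
        *self.voicebox_url.write().unwrap() = resolved_voicebox_url.to_string();
    }
}
```

The key point is that `reload_config()` writes through to the same cell the channel reads, so a config edit (or clearing the URL) takes effect without a restart.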

Comment on lines +837 to +843
w.tool_calls, w.opencode_port, w.opencode_session_id, w.directory, \
w.interactive, \
c.display_name as channel_name, \
w.project_id, p.name as project_name \
FROM worker_runs w \
LEFT JOIN channels c ON w.channel_id = c.id \
LEFT JOIN projects p ON w.project_id = p.id \

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n -C3 'list_worker_runs|get_worker_detail|WorkerRunRow|WorkerDetailRow|project_id|project_name' \
  src/conversation/history.rs src/api/workers.rs interface/src/api/client.ts interface/src/routes interface/src/components

Repository: spacedriveapp/spacebot

Length of output: 12403


🏁 Script executed:

sed -n '906,950p' src/conversation/history.rs

Repository: spacedriveapp/spacebot

Length of output: 2166


🏁 Script executed:

sed -n '990,1020p' src/conversation/history.rs

Repository: spacedriveapp/spacebot

Length of output: 997


Add project_id and project_name to the detail response path.

get_worker_detail() does not select or return project_id / project_name, even though list_worker_runs() does. The detail SQL query (lines 911–923) is missing the LEFT JOIN projects and the project field selections, and WorkerDetailRow (lines 993–1010) lacks those fields entirely. This creates a schema mismatch where the detail endpoint omits project metadata present in the list.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/conversation/history.rs` around lines 837 - 843, The detail endpoint is
missing project metadata: update the SQL in get_worker_detail to include a LEFT
JOIN projects p ON w.project_id = p.id and select w.project_id and p.name AS
project_name (matching list_worker_runs), then add corresponding fields to the
WorkerDetailRow struct and populate them where the query rows are mapped into
WorkerDetailRow so the detail response returns project_id and project_name
consistently with list_worker_runs.

Comment on lines +313 to +319
SpokenResponse {
agent_id: AgentId,
channel_id: ChannelId,
/// The short, conversational spoken text (1-3 sentences).
spoken_text: String,
/// The full response text this was derived from.
full_text: String,

⚠️ Potential issue | 🟠 Major

Use a stable message ID for spoken-response correlation.

full_text is an ambiguous join key. Repeated assistant replies or small formatting differences can attach the spoken-response event to the wrong message card. Include the persisted conversation message ID here and correlate on that downstream.

Suggested event shape
     SpokenResponse {
         agent_id: AgentId,
         channel_id: ChannelId,
+        message_id: String,
         /// The short, conversational spoken text (1-3 sentences).
         spoken_text: String,
         /// The full response text this was derived from.
         full_text: String,
     },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib.rs` around lines 313 - 319, The SpokenResponse struct uses full_text
as a join key which is brittle; add a stable persisted message ID field (e.g.,
message_id: MessageId or persisted_message_id: MessageId) to SpokenResponse and
populate it wherever SpokenResponse instances are constructed so downstream
consumers correlate on this message_id instead of full_text; update any
serialization/deserialization and references to use SpokenResponse::message_id
for spoken-response correlation (identify constructors/usages that create
SpokenResponse and the consumer code that matches events).

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
interface/src/hooks/useChannelLiveState.ts (1)

251-268: ⚠️ Potential issue | 🟠 Major

Replace the provisional stream id with event.message_id.

When a reply starts as outbound_message_delta, this branch keeps the temporary stream-* id because it only updates content. WebChatPanel now correlates spoken replies by message_id, so streamed replies never pick up their spoken block, and they can still diverge from history keyed by the persisted id.

💡 Suggested fix
 				if (streamIndex >= 0) {
 					const streamItem = timeline[streamIndex];
 					if (streamItem.type === "message") {
-						timeline[streamIndex] = { ...streamItem, content: event.text };
+						timeline[streamIndex] = {
+							...streamItem,
+							id: event.message_id ?? streamItem.id,
+							content: event.text,
+						};
 					}
 				} else {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/hooks/useChannelLiveState.ts` around lines 251 - 268, In
useChannelLiveState, when handling streaming events you must ensure the
provisional stream id is replaced with the persisted event.message_id: when
pushing a new message via assistantMessageItem, use event.message_id if present
(fallback to `out-${generateId()}`), and when updating an existing streaming
timeline entry (streamingMessageId branch where you set timeline[streamIndex] =
{ ...streamItem, content: event.text }), also set its id to event.message_id
when event.message_id is available so the streamed reply keys match persisted
history and WebChatPanel’s spoken-block correlation.
src/agent/channel.rs (1)

851-872: ⚠️ Potential issue | 🔴 Critical

Thread outbound message_ids through the fallback send path.

RoutedResponse now requires message_id, but send_routed() still constructs it without one and send_outbound_text() only returns bool. That leaves this file failing to compile at lines 856 and 865, and it also means the three fallback maybe_generate_spoken_response(...) calls below have no way to produce the required RepliedMessage { text, message_id }. Please carry the outbound ID through the helper and return it from fallback sends instead of collapsing everything to a success flag.

Also applies to: 2665-2691, 2768-2777, 2838-2847, 2899-2905

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/channel.rs` around lines 851 - 872, send_routed currently builds
RoutedResponse without supplying the required message_id; change the fallback
send path to thread the outbound message id through the helpers: update
send_outbound_text (and the other fallback helpers that call
maybe_generate_spoken_response) to return the RepliedMessage (or at least
Option<MessageId>) instead of bool, have maybe_generate_spoken_response produce
RepliedMessage { text, message_id }, and then modify send_routed to use that
returned message_id when constructing RoutedResponse; update all call sites
(including the three fallback calls around send_routed and the other locations
mentioned) to propagate the returned message_id into the RoutedResponse
construction so compilation succeeds and outbound IDs are preserved.
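The return-type change can be sketched with simplified stand-in types — the id scheme and helper bodies below are placeholders, not the real transport code:

```rust
// Stand-in for the RepliedMessage the review describes: fallback sends
// return the persisted outbound id instead of collapsing to bool.
struct RepliedMessage {
    text: String,
    message_id: String,
}

fn send_outbound_text(text: &str) -> Option<RepliedMessage> {
    // Placeholder id generation; the real code would surface the id the
    // transport actually persisted for this outbound message.
    let message_id = format!("msg-{:08x}", text.len());
    Some(RepliedMessage {
        text: text.to_string(),
        message_id,
    })
}

fn maybe_generate_spoken_response(replied: &RepliedMessage) -> String {
    // Downstream correlation now keys off the stable message_id.
    format!("spoken for {}", replied.message_id)
}
```

With this shape, send_routed() can build RoutedResponse from the returned `message_id` instead of a success flag, and the fallback spoken-response calls get the `RepliedMessage` they need.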
♻️ Duplicate comments (1)
interface/src/hooks/useAudioRecorder.ts (1)

196-200: ⚠️ Potential issue | 🟠 Major

Rethrow setup failures after cleanup.

Permission denials and recorder setup errors still resolve startRecording() successfully here. Keep the cleanup/reset, but rethrow so callers can surface the failure and unwind any optimistic UI.

🔧 Minimal fix
 		} catch (error) {
 			console.error("Failed to start recording:", error);
 			cleanupAudioGraph();
 			setState("idle");
+			throw error;
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/hooks/useAudioRecorder.ts` around lines 196 - 200, The catch
block in startRecording currently logs the error, calls cleanupAudioGraph(), and
setState("idle") but swallows the exception; modify the catch to perform the
same cleanup/reset (call cleanupAudioGraph() and setState("idle")) and then
rethrow the caught error so callers of startRecording can observe failures (i.e.
preserve existing calls to cleanupAudioGraph and setState inside the catch, then
throw the original error variable).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@interface/src/hooks/useAudioRecorder.ts`:
- Around line 98-116: Introduce a pending-start cancellation ref (e.g.,
pendingStartRef) and set it true at the start of startRecording() (return early
if already true or state !== "idle"), then after each await (after
navigator.mediaDevices.getUserMedia(...) and after audioContext.resume()) check
pendingStartRef and abort/cleanup if it was cleared; have stopRecording() set
pendingStartRef to false and perform the same cleanup of
streamRef/mediaRecorderRef/audioContextRef so an in-flight start is cancelled,
and only assign mediaRecorderRef.current and set state to "recording" if
pendingStartRef is still true.

In `@interface/src/hooks/useTtsPlayback.ts`:
- Around line 93-113: abandonIfStale currently tears down shared state
(playbackDoneRef, stopMetering, setState) for any non-current request; change it
so stale requests only stop their own nodes and do not mutate shared refs/state.
Specifically, in abandonIfStale (use isCurrentRequest(), sourceRef.current,
analyserRef.current) still stop() the passed source/analyser if provided, and
clear sourceRef.current/analyserRef.current only if they strictly equal the
passed nodes, but remove the lines that set playbackDoneRef.current = null, call
stopMetering(), or setState("idle") when the request is stale.

In `@src/agent/channel.rs`:
- Around line 3729-3731: The agent is built without the required SpacebotHook
and without explicit max_turns, bypassing the standard Spacebot agent pattern;
update the creation that currently uses SpacebotModel::make(...) and
AgentBuilder::new(model).preamble(system).build() to include
.hook(SpacebotHook::new(...)) and .max_turns(5) before .build(), ensuring the
hook is configured with the same channel/context parameters and leaving
build_spoken_generation_prompt(...) usage unchanged.

---

Outside diff comments:
In `@interface/src/hooks/useChannelLiveState.ts`:
- Around line 251-268: In useChannelLiveState, when handling streaming events
you must ensure the provisional stream id is replaced with the persisted
event.message_id: when pushing a new message via assistantMessageItem, use
event.message_id if present (fallback to `out-${generateId()}`), and when
updating an existing streaming timeline entry (streamingMessageId branch where
you set timeline[streamIndex] = { ...streamItem, content: event.text }), also
set its id to event.message_id when event.message_id is available so the
streamed reply keys match persisted history and WebChatPanel’s spoken-block
correlation.

In `@src/agent/channel.rs`:
- Around line 851-872: send_routed currently builds RoutedResponse without
supplying the required message_id; change the fallback send path to thread the
outbound message id through the helpers: update send_outbound_text (and the
other fallback helpers that call maybe_generate_spoken_response) to return the
RepliedMessage (or at least Option<MessageId>) instead of bool, have
maybe_generate_spoken_response produce RepliedMessage { text, message_id }, and
then modify send_routed to use that returned message_id when constructing
RoutedResponse; update all call sites (including the three fallback calls around
send_routed and the other locations mentioned) to propagate the returned
message_id into the RoutedResponse construction so compilation succeeds and
outbound IDs are preserved.

---

Duplicate comments:
In `@interface/src/hooks/useAudioRecorder.ts`:
- Around line 196-200: The catch block in startRecording currently logs the
error, calls cleanupAudioGraph(), and setState("idle") but swallows the
exception; modify the catch to perform the same cleanup/reset (call
cleanupAudioGraph() and setState("idle")) and then rethrow the caught error so
callers of startRecording can observe failures (i.e. preserve existing calls to
cleanupAudioGraph and setState inside the catch, then throw the original error
variable).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a7971fb5-694d-4922-aae1-de1d14295716

📥 Commits

Reviewing files that changed from the base of the PR and between 68b3f01 and 7e9da98.

📒 Files selected for processing (12)
  • interface/src/api/client.ts
  • interface/src/components/WebChatPanel.tsx
  • interface/src/hooks/useAudioRecorder.ts
  • interface/src/hooks/useChannelLiveState.ts
  • interface/src/hooks/useTtsPlayback.ts
  • src/agent/channel.rs
  • src/api/state.rs
  • src/conversation/history.rs
  • src/lib.rs
  • src/main.rs
  • src/messaging/webchat.rs
  • src/tools/reply.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/api/state.rs
  • src/lib.rs

Comment on lines +98 to +116
const startRecording = useCallback(async () => {
if (state !== "idle") return;

try {
const stream = await navigator.mediaDevices.getUserMedia({
audio: {
echoCancellation: true,
noiseSuppression: true,
autoGainControl: true,
},
});
streamRef.current = stream;

// Set up audio analyser for level visualization
const audioContext = new AudioContext();
audioContextRef.current = audioContext;
if (audioContext.state === "suspended") {
await audioContext.resume();
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the hold-to-talk flow and inspect whether stop can run before
# mediaRecorderRef.current exists.
sed -n '98,214p' interface/src/hooks/useAudioRecorder.ts
sed -n '1,260p' interface/src/routes/Overlay.tsx | rg -n -C2 '\b(startRecording|stopRecording)\b|key(up|down)|pointer(up|down)|mouse(up|down)'

Repository: spacedriveapp/spacebot

Length of output: 4691


Guard pending starts against release and duplication during async mic/audio setup.

startRecording() has a substantial async gap—between the state check and the mediaRecorderRef.current assignment—where getUserMedia() and audioContext.resume() are awaited. If the user releases the hotkey during this window, stopRecording() finds no recorder and resolves null, but the original startRecording() continuation proceeds and eventually starts capture. The same gap allows duplicate starts if the key is pressed twice before the first call settles.

Add refs to track pending starts and allow stopRecording() to cancel them. Check the cancellation flag after each await in startRecording() before proceeding to recorder creation and state transitions.

Suggested approach
+ const startPendingRef = useRef(false);
+ const cancelPendingStartRef = useRef(false);

 const startRecording = useCallback(async () => {
-	if (state !== "idle") return;
+	if (state !== "idle" || startPendingRef.current) return;
+	startPendingRef.current = true;
+	cancelPendingStartRef.current = false;

 	try {
 		const stream = await navigator.mediaDevices.getUserMedia({
 			audio: {
 				echoCancellation: true,
 				noiseSuppression: true,
 				autoGainControl: true,
 			},
 		});
+		if (cancelPendingStartRef.current) {
+			stream.getTracks().forEach((track) => track.stop());
+			return;
+		}
 		streamRef.current = stream;

 		// ... audio graph setup ...

 		const recorder = new MediaRecorder(stream, {mimeType});

+		if (cancelPendingStartRef.current) {
+			cleanupAudioGraph();
+			return;
+		}

 		mediaRecorderRef.current = recorder;
 		recorder.start(100);
 		setState("recording");
 	} catch (error) {
 		console.error("Failed to start recording:", error);
 		cleanupAudioGraph();
 		setState("idle");
+	} finally {
+		startPendingRef.current = false;
 	}
 }, [cleanupAudioGraph, state]);

 const stopRecording = useCallback((): Promise<Blob | null> => {
 	return new Promise((resolve) => {
+		if (startPendingRef.current) {
+			cancelPendingStartRef.current = true;
+			resolve(null);
+			return;
+		}

 		const recorder = mediaRecorderRef.current;
 		if (!recorder || recorder.state !== "recording") {
 			resolve(null);
 			return;
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/hooks/useAudioRecorder.ts` around lines 98 - 116, Introduce a
pending-start cancellation ref (e.g., pendingStartRef) and set it true at the
start of startRecording() (return early if already true or state !== "idle"),
then after each await (after navigator.mediaDevices.getUserMedia(...) and after
audioContext.resume()) check pendingStartRef and abort/cleanup if it was
cleared; have stopRecording() set pendingStartRef to false and perform the same
cleanup of streamRef/mediaRecorderRef/audioContextRef so an in-flight start is
cancelled, and only assign mediaRecorderRef.current and set state to "recording"
if pendingStartRef is still true.

Comment on lines +93 to +113
const abandonIfStale = (source?: AudioBufferSourceNode, analyser?: AnalyserNode) => {
if (isCurrentRequest()) {
return false;
}
if (source) {
try {
source.stop();
} catch {
// no-op: source was never started or already stopped
}
}
if (sourceRef.current === source) {
sourceRef.current = null;
}
if (analyserRef.current === analyser) {
analyserRef.current = null;
}
playbackDoneRef.current = null;
stopMetering();
setState("idle");
return true;

⚠️ Potential issue | 🟠 Major

Don't let stale requests clear the active playback state.

abandonIfStale() always nulls playbackDoneRef, calls stopMetering(), and sets idle. If request A resolves after request B has already started, A will blank B's visualizer, clear B's completion resolver, and leave the UI idle while B's audio keeps playing. The stale path should only tear down nodes it owns, or just return without touching shared refs/state.

Possible fix
 const abandonIfStale = (source?: AudioBufferSourceNode, analyser?: AnalyserNode) => {
 	if (isCurrentRequest()) {
 		return false;
 	}
 	if (source) {
 		try {
 			source.stop();
 		} catch {
 			// no-op: source was never started or already stopped
 		}
 	}
-	if (sourceRef.current === source) {
+	if (source && sourceRef.current === source) {
 		sourceRef.current = null;
 	}
-	if (analyserRef.current === analyser) {
+	if (analyser && analyserRef.current === analyser) {
 		analyserRef.current = null;
 	}
-	playbackDoneRef.current = null;
-	stopMetering();
-	setState("idle");
 	return true;
 };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/hooks/useTtsPlayback.ts` around lines 93 - 113, abandonIfStale
currently tears down shared state (playbackDoneRef, stopMetering, setState) for
any non-current request; change it so stale requests only stop their own nodes
and do not mutate shared refs/state. Specifically, in abandonIfStale (use
isCurrentRequest(), sourceRef.current, analyserRef.current) still stop() the
passed source/analyser if provided, and clear
sourceRef.current/analyserRef.current only if they strictly equal the passed
nodes, but remove the lines that set playbackDoneRef.current = null, call
stopMetering(), or setState("idle") when the request is stale.

Comment on lines +3729 to +3731
let model = crate::llm::model::SpacebotModel::make(llm_manager, model_name);
let agent = AgentBuilder::new(model).preamble(system).build();
let prompt = build_spoken_generation_prompt(input, speech_guidance, recent_spoken_responses);

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the specific lines and surrounding context in the file
wc -l src/agent/channel.rs

Repository: spacedriveapp/spacebot

Length of output: 91


🏁 Script executed:

# Read the section around lines 3729-3731
sed -n '3720,3750p' src/agent/channel.rs

Repository: spacedriveapp/spacebot

Length of output: 1464


🏁 Script executed:

# Search for the build_spoken_generation_prompt function and related code to understand the agent's purpose
rg -B 20 -A 10 "build_spoken_generation_prompt" src/agent/channel.rs

Repository: spacedriveapp/spacebot

Length of output: 4226


🏁 Script executed:

# Search for SpacebotHook usage to understand the pattern
rg "SpacebotHook" -B 3 -A 3 src/

Repository: spacedriveapp/spacebot

Length of output: 43246


🏁 Script executed:

# Find other agent constructions in channel.rs to see the pattern
rg "AgentBuilder::new" src/agent/channel.rs -B 2 -A 5

Repository: spacedriveapp/spacebot

Length of output: 686


🏁 Script executed:

# Check what SpacebotHook provides
rg -A 20 "impl.*SpacebotHook" src/

Repository: spacedriveapp/spacebot

Length of output: 3381


Use the standard Spacebot agent pattern with hook and explicit max_turns.

This agent bypasses SpacebotHook and leaves max_turns implicit. For a channel-context agent, construct it as: AgentBuilder::new(model).preamble(system).hook(SpacebotHook::new(...)).max_turns(5).build(). The hook is required for all LLM processes per coding guidelines, and channel agents must have max_turns(5) explicitly set.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/channel.rs` around lines 3729 - 3731, The agent is built without
the required SpacebotHook and without explicit max_turns, bypassing the standard
Spacebot agent pattern; update the creation that currently uses
SpacebotModel::make(...) and AgentBuilder::new(model).preamble(system).build()
to include .hook(SpacebotHook::new(...)) and .max_turns(5) before .build(),
ensuring the hook is configured with the same channel/context parameters and
leaving build_spoken_generation_prompt(...) usage unchanged.

Document the new voice interface, speech personality system, working memory layers, and desktop voice hotkeys across the README and docs. Add a dedicated overlay voice shortcut so hold-to-talk works independently from the window toggle.
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

♻️ Duplicate comments (1)
interface/src/routes/Overlay.tsx (1)

81-99: ⚠️ Potential issue | 🟠 Major

One assistant reply still becomes two transcript entries.

This is the same duplicate-row problem already called out earlier: outbound_message appends one assistant row and spoken_response appends the same reply again via full_text. Update the pending row instead of pushing a second assistant entry, ideally keyed by message_id.

Also applies to: 109-118

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/routes/Overlay.tsx` around lines 81 - 99, handleSpokenResponse
is currently appending a second assistant transcript entry (using
event.full_text) which duplicates the row added by outbound_message; instead of
pushing a new entry in setTranscript inside handleSpokenResponse, find the
pending assistant message by its message_id (from the SpokenResponseEvent) and
update that existing transcript entry (replace its text or merge fields) —
modify the setTranscript updater used in handleSpokenResponse (and the similar
block around lines 109-118) to map over previous entries, match on
entry.message_id === event.message_id (or another unique key added by
outbound_message), and return an updated entry rather than appending a new one;
keep the other state changes (setSpokenText, setVoiceState, setStatusText)
intact.
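The upsert-by-message_id approach described in the prompt above can be sketched as a pure updater. This is a hedged sketch only: the `Entry` shape, field names, and `upsertAssistantEntry` helper are assumptions for illustration, not the actual Overlay.tsx types.

```typescript
// Hypothetical transcript entry shape; the real Overlay.tsx types may differ.
type Entry = { messageId: string; role: "user" | "assistant"; text: string };

// Update the pending assistant row if one already exists for this message_id,
// otherwise append — so outbound_message and spoken_response can never
// produce two rows for the same reply.
function upsertAssistantEntry(
  entries: Entry[],
  messageId: string,
  fullText: string,
): Entry[] {
  const idx = entries.findIndex(
    (e) => e.role === "assistant" && e.messageId === messageId,
  );
  if (idx === -1) {
    return [...entries, { messageId, role: "assistant", text: fullText }];
  }
  return entries.map((e, i) => (i === idx ? { ...e, text: fullText } : e));
}
```

Inside `handleSpokenResponse` this would then be called as `setTranscript((prev) => upsertAssistantEntry(prev, event.message_id, event.full_text))`, leaving the other state changes untouched.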
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@desktop/src-tauri/src/main.rs`:
- Around line 50-52: toggle_voice_overlay currently always returns Ok(()) and
calls toggle_overlay, while the overlay helper functions
(create_overlay_window, show/hide/set_focus/emit) swallow or only log errors;
change the API to propagate failures instead of hiding them: update
toggle_voice_overlay's signature to return Result<(), String> (or a suitable
Error) and have it return Err(...) when toggle_overlay or any downstream call
fails, and modify toggle_overlay and create_overlay_window to return Results and
propagate errors (using ?), removing any let _ = or ignored Result uses in
show/hide/set_focus/emit paths so errors are either logged with context or
propagated upward; ensure each referenced function name (toggle_voice_overlay,
toggle_overlay, create_overlay_window, and the overlay show/hide/set_focus/emit
calls) surfaces failures rather than silently discarding them.

In `@docs/content/docs/(core)/cortex.mdx`:
- Around line 275-276: Update the readiness documentation to explicitly state
whether the API field name for the warmup/readiness signal remains
`bulletin_age_secs` for backwards compatibility (or if it was renamed), and make
the warmup payload example and the surrounding text consistent: if unchanged,
call out “the field remains `bulletin_age_secs` (seconds since last bulletin)”;
if renamed, show the new field name and note the old `bulletin_age_secs` is
deprecated/removed and map how to translate it. Also search within this document
for the warmup payload example and the phrase “knowledge synthesis” to ensure
both description and example reference the same field name.

In `@docs/content/docs/(core)/memory.mdx`:
- Around line 23-26: Replace the phrase "recall worker" in the
docs/content/docs/(core)/memory.mdx with the project-standard terms "branch
recall" or "recall tool" and update the explanatory sentence to match the
existing model flow "branch -> memory_recall tool"; also ensure any references
implying workers have channel/branch context are removed (workers are
context-free). Look for related mismatches in src/agent/channel.rs,
src/agent/branch.rs, and src/agent/worker.rs and align the documentation wording
to reflect that workers receive only a fresh prompt and task description, not
channel/branch context.
- Around line 207-209: Summary: The docs state SPEECH.md is always injected into
channel system prompts, but SPEECH.md is actually only used for spoken-reply
generation per the updated prompt layout in cortex.mdx; update memory.mdx to
avoid implying it's always-on. Fix: edit the paragraph that lists SPEECH.md in
memory.mdx to remove "every time" / "always-injected" language and explicitly
state that SPEECH.md is loaded only when spoken-reply/voice output is enabled
(per cortex.mdx), and clarify that the other identity files remain
always-injected; reference SPEECH.md and the prompt layout described in
cortex.mdx so readers know the conditional behavior.

In `@interface/src/routes/Overlay.tsx`:
- Line 28: The component currently hardcodes agentId = "main", which breaks
session creation, TTS profile lookup, and audio upload when the deployed default
agent ID differs; update the Overlay component to derive agentId from a prop or
shared state/context (e.g., a passed-in defaultAgentId or selectedAgentId from
app context) instead of the literal "main", and use that derived id wherever
session creation, TTS profile lookup, and audio upload occur (references: the
agentId useState declaration, any createSession/createVoiceSession calls, TTS
profile lookup logic, and audio upload handlers) so the component routes voice
traffic to the correct agent in all environments.
- Around line 100-106: The playback cleanup currently lives only in the
speak(...).then(...) success path so if speak() rejects ttsStartedRef.current
stays true and the overlay remains in speaking state; update the speak
invocation in Overlay.tsx to attach a .finally() callback (or use
async/try/finally) that always resets ttsStartedRef.current = false and calls
setVoiceState("idle") and setStatusText("Press Option+Shift+Space to talk"), and
optionally add a .catch(...) to log the error; reference the existing speak(...)
call, ttsStartedRef, setVoiceState, and setStatusText to locate and modify the
code.

---

Duplicate comments:
In `@interface/src/routes/Overlay.tsx`:
- Around line 81-99: handleSpokenResponse is currently appending a second
assistant transcript entry (using event.full_text) which duplicates the row
added by outbound_message; instead of pushing a new entry in setTranscript
inside handleSpokenResponse, find the pending assistant message by its
message_id (from the SpokenResponseEvent) and update that existing transcript
entry (replace its text or merge fields) — modify the setTranscript updater used
in handleSpokenResponse (and the similar block around lines 109-118) to map over
previous entries, match on entry.message_id === event.message_id (or another
unique key added by outbound_message), and return an updated entry rather than
appending a new one; keep the other state changes (setSpokenText, setVoiceState,
setStatusText) intact.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d191b8ba-7ca7-4734-ab13-74cf85be5434

📥 Commits

Reviewing files that changed from the base of the PR and between 7e9da98 and 241655e.

📒 Files selected for processing (10)
  • README.md
  • desktop/src-tauri/src/main.rs
  • docs/content/docs/(configuration)/config.mdx
  • docs/content/docs/(core)/agents.mdx
  • docs/content/docs/(core)/architecture.mdx
  • docs/content/docs/(core)/cortex.mdx
  • docs/content/docs/(core)/memory.mdx
  • docs/content/docs/(getting-started)/desktop.mdx
  • docs/content/docs/(getting-started)/quickstart.mdx
  • interface/src/routes/Overlay.tsx
✅ Files skipped from review due to trivial changes (2)
  • README.md
  • docs/content/docs/(core)/agents.mdx

Comment on lines +50 to +52
fn toggle_voice_overlay(app: tauri::AppHandle) -> Result<(), String> {
toggle_overlay(&app);
Ok(())

⚠️ Potential issue | 🟠 Major

Surface overlay failures instead of always succeeding.

toggle_voice_overlay always returns Ok(()), create_overlay_window() only logs build failures, and the rest of the helpers ignore show / hide / set_focus / emit errors. If the overlay path breaks, the command still looks successful and you lose the only useful diagnostics for the hotkey flow. As per coding guidelines, "Don't silently discard errors. No let _ = on Results. Handle them, log them, or propagate them. The only exception is .ok() on channel sends where the receiver may be dropped".

Also applies to: 55-63, 67-75, 83-144, 146-149, 183-188

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@desktop/src-tauri/src/main.rs` around lines 50 - 52, toggle_voice_overlay
currently always returns Ok(()) and calls toggle_overlay, while the overlay
helper functions (create_overlay_window, show/hide/set_focus/emit) swallow or
only log errors; change the API to propagate failures instead of hiding them:
update toggle_voice_overlay's signature to return Result<(), String> (or a
suitable Error) and have it return Err(...) when toggle_overlay or any
downstream call fails, and modify toggle_overlay and create_overlay_window to
return Results and propagate errors (using ?), removing any let _ = or ignored
Result uses in show/hide/set_focus/emit paths so errors are either logged with
context or propagated upward; ensure each referenced function name
(toggle_voice_overlay, toggle_overlay, create_overlay_window, and the overlay
show/hide/set_focus/emit calls) surfaces failures rather than silently
discarding them.

Comment on lines +275 to 276
- knowledge synthesis is available and fresh enough for degraded-safe dispatch


⚠️ Potential issue | 🟡 Minor

Clarify the backwards-compatible readiness field name.

This section now describes readiness in terms of knowledge-synthesis freshness, but the warmup payload example later on the page still exposes bulletin_age_secs. Please call out whether the API field name is still bulletin_age_secs for compatibility, otherwise operators will look for the wrong signal when debugging warmup/readiness.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/content/docs/(core)/cortex.mdx` around lines 275 - 276, Update the
readiness documentation to explicitly state whether the API field name for the
warmup/readiness signal remains `bulletin_age_secs` for backwards compatibility
(or if it was renamed), and make the warmup payload example and the surrounding
text consistent: if unchanged, call out “the field remains `bulletin_age_secs`
(seconds since last bulletin)”; if renamed, show the new field name and note the
old `bulletin_age_secs` is deprecated/removed and map how to translate it. Also
search within this document for the warmup payload example and the phrase
“knowledge synthesis” to ensure both description and example reference the same
field name.

Comment on lines +23 to +26
- **SQLite** -- the memory graph **and** working-memory event log. Graph rows with content, type, importance, timestamps, source. Association edges with weights and relation types. Working-memory events, synthesis blocks, and channel activity rendering also live here.
- **LanceDB** -- embeddings and search. Vector storage in Lance columnar format with HNSW indexing. Built-in full-text search via Tantivy. Hybrid search (vector + FTS) in one system.

The two are joined on memory ID. A recall worker queries LanceDB for semantic/keyword matches, then hits SQLite for graph traversal and metadata. No server processes -- both are embedded, everything is files in a data directory.
Graph-memory recall joins the two on memory ID. A recall worker queries LanceDB for semantic/keyword matches, then hits SQLite for graph traversal and metadata. Working memory does not use embeddings at all — it is SQL-first, temporal, and renderable even before an embedding model is warm.

⚠️ Potential issue | 🟡 Minor

Use “branch recall” or “recall tool” here, not “recall worker.”

The rest of this page already documents recall as branch -> memory_recall tool, and workers are intentionally context-free. Calling this a “recall worker” muddies the execution boundary and points readers at the wrong process. Based on learnings: Applies to src/agent/{channel,branch,worker}.rs : Do not give workers channel context; workers receive only a fresh prompt and task description.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/content/docs/(core)/memory.mdx` around lines 23 - 26, Replace the phrase
"recall worker" in the docs/content/docs/(core)/memory.mdx with the
project-standard terms "branch recall" or "recall tool" and update the
explanatory sentence to match the existing model flow "branch -> memory_recall
tool"; also ensure any references implying workers have channel/branch context
are removed (workers are context-free). Look for related mismatches in
src/agent/channel.rs, src/agent/branch.rs, and src/agent/worker.rs and align the
documentation wording to reflect that workers receive only a fresh prompt and
task description, not channel/branch context.

Comment on lines +207 to +209
- **SPEECH.md** -- spoken cadence, delivery style, warmth, and anti-repetition rules for voice output

These are loaded into channel system prompts every time. They're files in the agent root directory (not the workspace), not database rows, because they change rarely and users should be able to edit them in a text editor.
These are loaded into channel system prompts every time. `SPEECH.md` is also used during spoken-reply generation when Voicebox output is enabled. They are files in the agent root directory (not the workspace), not database rows, because they change rarely and users should be able to edit them in a text editor or through the config UI.

⚠️ Potential issue | 🟡 Minor

SPEECH.md reads as always-injected here.

This paragraph says the listed identity files are loaded into channel system prompts every time, but the updated prompt layout in docs/content/docs/(core)/cortex.mdx makes speech style conditional to spoken-reply generation. Separating SPEECH.md from the always-on list would keep text-only turns unambiguous.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/content/docs/(core)/memory.mdx` around lines 207 - 209, Summary: The
docs state SPEECH.md is always injected into channel system prompts, but
SPEECH.md is actually only used for spoken-reply generation per the updated
prompt layout in cortex.mdx; update memory.mdx to avoid implying it's always-on.
Fix: edit the paragraph that lists SPEECH.md in memory.mdx to remove "every
time" / "always-injected" language and explicitly state that SPEECH.md is loaded
only when spoken-reply/voice output is enabled (per cortex.mdx), and clarify
that the other identity files remain always-injected; reference SPEECH.md and
the prompt layout described in cortex.mdx so readers know the conditional
behavior.

export function Overlay() {
const [expanded, setExpanded] = useState(false);
const [voiceState, setVoiceState] = useState<VoiceState>("idle");
const [agentId] = useState("main"); // default agent

⚠️ Potential issue | 🟠 Major

Don't pin voice overlay traffic to agent main.

This component uses the literal "main" for session creation, TTS profile lookup, and audio upload. Any deployment whose default/selected agent has a different ID will route voice turns to the wrong agent or fail outright.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/routes/Overlay.tsx` at line 28, The component currently
hardcodes agentId = "main", which breaks session creation, TTS profile lookup,
and audio upload when the deployed default agent ID differs; update the Overlay
component to derive agentId from a prop or shared state/context (e.g., a
passed-in defaultAgentId or selectedAgentId from app context) instead of the
literal "main", and use that derived id wherever session creation, TTS profile
lookup, and audio upload occur (references: the agentId useState declaration,
any createSession/createVoiceSession calls, TTS profile lookup logic, and audio
upload handlers) so the component routes voice traffic to the correct agent in
all environments.
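One way to avoid the hardcoded ID is a small resolver with an explicit fallback chain. This is a sketch under stated assumptions: `resolveAgentId`, the `OverlaySettings` shape, and its field names are hypothetical, not the actual app context API.

```typescript
// Hypothetical settings shape; the real app context may expose this differently.
interface OverlaySettings {
  selectedAgentId?: string;
  defaultAgentId?: string;
}

// Resolve the agent ID from the user's explicit selection first, then the
// deployment default, then a last-resort fallback — instead of pinning "main".
function resolveAgentId(settings: OverlaySettings, fallback = "main"): string {
  return settings.selectedAgentId ?? settings.defaultAgentId ?? fallback;
}
```

The Overlay component could then compute this once and pass the result to session creation, TTS profile lookup, and audio upload, so all three paths stay consistent per deployment.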

Comment on lines +100 to +106
// Play TTS
ttsStartedRef.current = true;
speak(event.spoken_text, agentId, profileId).then(() => {
ttsStartedRef.current = false;
setVoiceState("idle");
setStatusText("Press Option+Shift+Space to talk");
});

⚠️ Potential issue | 🟠 Major

Use finally for playback cleanup.

If speak() rejects, ttsStartedRef.current never resets and the overlay stays in speaking, so later spoken responses are treated as already-playing forever.

🛠️ Suggested fix
-		ttsStartedRef.current = true;
-			speak(event.spoken_text, agentId, profileId).then(() => {
-				ttsStartedRef.current = false;
-				setVoiceState("idle");
-				setStatusText("Press Option+Shift+Space to talk");
-			});
+		ttsStartedRef.current = true;
+		void speak(event.spoken_text, agentId, profileId)
+			.catch((error) => {
+				console.error("Failed to play spoken reply:", error);
+				setStatusText("Couldn't play reply");
+			})
+			.finally(() => {
+				ttsStartedRef.current = false;
+				setVoiceState("idle");
+				setStatusText("Press Option+Shift+Space to talk");
+			});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/routes/Overlay.tsx` around lines 100 - 106, The playback
cleanup currently lives only in the speak(...).then(...) success path so if
speak() rejects ttsStartedRef.current stays true and the overlay remains in
speaking state; update the speak invocation in Overlay.tsx to attach a
.finally() callback (or use async/try/finally) that always resets
ttsStartedRef.current = false and calls setVoiceState("idle") and
setStatusText("Press Option+Shift+Space to talk"), and optionally add a
.catch(...) to log the error; reference the existing speak(...) call,
ttsStartedRef, setVoiceState, and setStatusText to locate and modify the code.
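The async/try/finally variant mentioned in the prompt can be sketched as below. This is a hedged illustration: `playSpoken`, `SpeakFn`, and the mutable `state` object are stand-ins for the real `speak(...)` call and React state setters in Overlay.tsx.

```typescript
// Stand-in for the real speak() call; resolves or rejects like TTS playback.
type SpeakFn = (text: string) => Promise<void>;

// Cleanup runs in finally, so a rejected speak() can never leave the
// overlay stuck in the "speaking" state.
async function playSpoken(
  speak: SpeakFn,
  text: string,
  state: { ttsStarted: boolean; voiceState: string },
): Promise<void> {
  state.ttsStarted = true;
  state.voiceState = "speaking";
  try {
    await speak(text);
  } catch (error) {
    console.error("Failed to play spoken reply:", error);
  } finally {
    state.ttsStarted = false;
    state.voiceState = "idle";
  }
}
```

In the component itself, the `finally` body would instead reset `ttsStartedRef.current` and call `setVoiceState("idle")` / `setStatusText(...)`, matching the suggested fix above.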

@jamiepine jamiepine changed the base branch from main to feat/worker-orchestration-screen March 20, 2026 04:22
jamiepine and others added 11 commits March 20, 2026 05:45
…screen

feat: worker orchestration screen
fix: add provider prefix to zai-coding-plan default model
…ings

fix: skill install 500 error and chart dimension warnings
feat: Add Mattermost channel support
- Add "error sending request" to retriable error patterns to handle
  transient network failures (DNS, connection refused, etc.)
- Remove broken keepalive code that tried to clone non-Clone Browser
- Implement graceful browser shutdown in Drop to prevent Chrome core dumps
- Add comprehensive test suite (7 tests) for error classification
- Fix clippy warning about collapsible if statements

Fixes WebSocket connection errors when workers fail due to transient
API errors, preventing Chrome from being force-killed and dumping core.
fix: graceful browser shutdown and network error retries
Add collapsed/expanded overlay window states with Tauri resize
command so the pill stays bottom-pinned. Start overlay collapsed
by default. Swap sidebar orchestrate icon to HugeiconsIcon and
reformat AgentTasks.