Commit 273ec40

Add semantic embeddings query support
1 parent 834f311 commit 273ec40

31 files changed

Lines changed: 2915 additions & 606 deletions

AGENTS.md

Lines changed: 24 additions & 0 deletions
@@ -247,6 +247,9 @@ The most important implementation seams are:
   control-plane behavior.
 - Top-level filesystem shortcut help must say shortcuts use the "default"
   workspace and that explicit targeting requires `afs fs <workspace> <command>`.
+- CLI error rendering must preserve help and usage blocks literally. Do not
+  sentence-format, title-case, or append punctuation to command names, flags,
+  examples, or subcommand lists.
 - Sync mount cold-start in Cloud-managed mode must hydrate from the workspace
   session's storage key/head checkpoint. Do not require direct Redis workspace
   metadata lookup by display name; cloud session Redis may expose the live root
@@ -256,6 +259,27 @@ The most important implementation seams are:
   supports `lex:`, `vec:`, `hyde:`, and `intent:` documents, and uses
   `--keyword` / `--semantic` for narrower modes. Do not reintroduce public
   `search` or `vsearch` commands unless the product direction changes again.
+- Semantic query embeddings must use a real provider. QMD uses a local GGUF
+  embedding model with explicit query/document formatting; deterministic hash
+  embeddings are acceptable only as test doubles, never as product behavior.
+- Semantic query provider/model settings are global runtime settings, not
+  workspace config. Embeddings should be treated as on by default; if provider
+  credentials are missing, return/report unavailable status without hard-failing
+  normal query flows.
+- `AFS_EMBED_MODEL`, `AFS_EMBED_PROVIDER`, `AFS_EMBED_DIMENSIONS`, and
+  `OPENAI_API_KEY` are read by the control-plane process. CLI help and
+  troubleshooting copy must not imply that setting them only on an `afs query`
+  invocation changes an already-running control plane.
+- Semantic embedding backfill must batch provider requests. Query chunks are
+  individually capped, but a large workspace can still exceed OpenAI's total
+  tokens-per-request limit if every pending chunk is embedded at once.
+- Semantic query can take longer than normal control-plane calls on first
+  provider backfill. Keep query HTTP client timeouts separate from quick
+  metadata/status calls so first-run embedding work does not fail at 30s.
+- Semantic query must not backfill embeddings as a side effect. Imports should
+  start embedding creation in the control plane when the global provider is
+  available, and existing workspaces should use an explicit query index create
+  path for embedding backfill.
 - Workspace file/query CLI calls use resolved workspace routes under
   `/v1/workspaces/<id>/...`; when adding a scoped database route, add the
   matching resolved route and a regression test for workspace IDs.
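
The batching rule in the new AGENTS.md bullets can be illustrated with a small grouping helper. This is only a sketch under stated assumptions: `pendingChunk`, `batchByTokenBudget`, and the budget values are hypothetical names and numbers chosen for illustration. They are not taken from the commit and do not reflect any provider's real limits.

```go
package main

import "fmt"

// pendingChunk is a hypothetical stand-in for a query chunk awaiting embedding.
type pendingChunk struct {
	ID     string
	Tokens int // estimated token count; each chunk is assumed to be individually capped
}

// batchByTokenBudget groups chunks in order so that each batch stays within
// maxTokens total and maxItems entries. A chunk that alone exceeds maxTokens
// is placed in a batch of its own rather than dropped.
func batchByTokenBudget(chunks []pendingChunk, maxTokens, maxItems int) [][]pendingChunk {
	var batches [][]pendingChunk
	var current []pendingChunk
	used := 0
	for _, c := range chunks {
		full := len(current) > 0 && (used+c.Tokens > maxTokens || len(current) >= maxItems)
		if full {
			batches = append(batches, current)
			current = nil
			used = 0
		}
		current = append(current, c)
		used += c.Tokens
	}
	if len(current) > 0 {
		batches = append(batches, current)
	}
	return batches
}

func main() {
	// Illustrative budget only: 1000 tokens and 16 items per request.
	chunks := []pendingChunk{{"a", 400}, {"b", 500}, {"c", 300}, {"d", 900}}
	for i, batch := range batchByTokenBudget(chunks, 1000, 16) {
		fmt.Printf("batch %d: %d chunk(s)\n", i, len(batch))
	}
}
```

Each batch would then become one provider request, so a large workspace is embedded over several calls instead of one oversized call.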

README.md

Lines changed: 10 additions & 3 deletions
@@ -213,9 +213,16 @@ If you want to search workspace contents:
 ```

 `grep` is for exact text evidence. `query` is for ranked conceptual search and
-falls back to keyword-ranked results when embeddings are disabled or
-unavailable. Keyword ranking uses RedisSearch BM25 query chunks when available,
-then falls back to direct content ranking.
+currently falls back to keyword-ranked results until hybrid vector/rerank is
+complete. Keyword ranking uses RedisSearch BM25 query chunks when available,
+then falls back to direct content ranking. Use `query --semantic` for
+vector-only retrieval. Semantic embeddings are globally enabled and use OpenAI
+when `OPENAI_API_KEY` is available in the control-plane environment. Override
+the default `openai:text-embedding-3-small` model with `AFS_EMBED_MODEL` in the
+same environment, then restart the control plane. Semantic queries read
+existing embedding indexes; imports start embedding creation in the background,
+and existing workspaces can be prepared with
+`afs fs <workspace> query index create --embeddings --wait`.

 If you want commands with an optional workspace argument to use `my-repo` by
 default:

cmd/afs/afs_mcp_test.go

Lines changed: 3 additions & 2 deletions
@@ -384,6 +384,7 @@ func TestAFSMCPFileQueryRanksWorkspaceContent(t *testing.T) {

 func TestAFSMCPFileQueryTypedClausesAndSemanticUnavailable(t *testing.T) {
 	t.Helper()
+	t.Setenv("OPENAI_API_KEY", "")

 	server, closeFn := setupAFSMCPTestServer(t)
 	defer closeFn()
@@ -415,8 +416,8 @@ func TestAFSMCPFileQueryTypedClausesAndSemanticUnavailable(t *testing.T) {
 	if len(response.Results) != 1 || response.Results[0].Path != "/docs/checkpoints.md" {
 		t.Fatalf("results = %#v, want scoped docs result", response.Results)
 	}
-	if len(response.Warnings) != 1 || !strings.Contains(response.Warnings[0], "Embeddings are disabled") {
-		t.Fatalf("warnings = %#v, want embeddings disabled fallback", response.Warnings)
+	if len(response.Warnings) != 1 || !strings.Contains(response.Warnings[0], "semantic clauses were keyword-ranked") {
+		t.Fatalf("warnings = %#v, want semantic-clause fallback", response.Warnings)
 	}

 	semantic := server.callTool(context.Background(), "file_query", map[string]any{

cmd/afs/afs_query_commands.go

Lines changed: 47 additions & 69 deletions
@@ -80,19 +80,8 @@ func runWorkspaceQuery(mode, workspace string, args []string) error {
 	}
 	defer remote.close()

-	cfg, err := workspaceQueryConfig(ctx, remote.controlPlane, remote.selection)
-	if err != nil {
-		return err
-	}
 	request := workspaceQueryRequest(remote.selection, opts)
-	if opts.mode == mcptools.FileQueryModeKeyword || opts.mode == mcptools.FileQueryModeHybrid {
-		return runWorkspaceKeywordQuery(ctx, remote, opts, request, cfg)
-	}
-	message := workspaceQueryUnavailableMessage(opts, remote.selection.Name, cfg)
-	if opts.jsonOut {
-		return encodeWorkspaceQueryUnavailable(request, message)
-	}
-	return errors.New(message)
+	return runWorkspaceQueryRequest(ctx, remote, opts, request)
 }

 func isWorkspaceQueryIndexInvocation(args []string) bool {
@@ -103,7 +92,7 @@ func isWorkspaceQueryIndexInvocation(args []string) bool {
 		return true
 	}
 	switch strings.TrimSpace(args[1]) {
-	case "status", "rebuild", "clean":
+	case "status", "create", "rebuild", "clean":
 		return true
 	default:
 		return false
@@ -293,49 +282,30 @@ func workspaceQueryRequest(selection workspaceSelection, opts workspaceQueryOpti
 	return request
 }

-func runWorkspaceKeywordQuery(ctx context.Context, remote *fsRemoteWorkspace, opts workspaceQueryOptions, request mcptools.FileQueryRequest, cfg controlplane.WorkspaceConfig) error {
+func runWorkspaceQueryRequest(ctx context.Context, remote *fsRemoteWorkspace, opts workspaceQueryOptions, request mcptools.FileQueryRequest) error {
 	response, err := remote.controlPlane.QueryWorkspace(ctx, remote.selection.ID, request)
 	if err != nil {
 		return err
 	}
-	if !opts.jsonOut && opts.mode == mcptools.FileQueryModeHybrid && !cfg.Query.Embeddings.Enabled && !workspaceQueryHasSemanticClauses(request.Searches) {
-		fmt.Fprintln(os.Stderr, "Warning: Embeddings disabled; using keyword retrieval.")
-	}
 	return writeWorkspaceQueryResponse(response, opts)
 }

-func workspaceQueryHasSemanticClauses(searches []mcptools.FileQuerySearch) bool {
-	for _, search := range searches {
-		switch search.Type {
-		case mcptools.FileQuerySearchVec, mcptools.FileQuerySearchHyde:
-			return true
-		}
-	}
-	return false
-}
-
-func encodeWorkspaceQueryUnavailable(request mcptools.FileQueryRequest, message string) error {
-	response := mcptools.FileQueryResponse{
-		Status: mcptools.FileQueryStatusUnavailable,
-		Workspace: request.Workspace,
-		Path: request.Path,
-		Query: request.Query,
-		Searches: request.Searches,
-		Intent: request.Intent,
-		Results: []mcptools.FileQueryResult{},
-		Warnings: []string{message},
-	}
-	enc := json.NewEncoder(os.Stdout)
-	enc.SetIndent("", " ")
-	return enc.Encode(response)
-}
-
 func writeWorkspaceQueryResponse(response mcptools.FileQueryResponse, opts workspaceQueryOptions) error {
 	if opts.jsonOut {
 		enc := json.NewEncoder(os.Stdout)
 		enc.SetIndent("", " ")
 		return enc.Encode(response)
 	}
+	if response.Status == mcptools.FileQueryStatusUnavailable {
+		message := "query is unavailable"
+		for _, warning := range response.Warnings {
+			if strings.TrimSpace(warning) != "" {
+				message = strings.TrimSpace(warning)
+				break
+			}
+		}
+		return errors.New(message)
+	}
 	for _, warning := range response.Warnings {
 		if strings.TrimSpace(warning) != "" {
 			fmt.Fprintf(os.Stderr, "Warning: %s\n", warning)
@@ -403,13 +373,30 @@ func writeWorkspaceQueryResponse(response mcptools.FileQueryResponse, opts works
 }

 func writeWorkspaceQueryFiles(response mcptools.FileQueryResponse, results []mcptools.FileQueryResult) error {
-	for _, result := range results {
+	for _, result := range workspaceQueryFileResults(results) {
 		docID := workspaceQueryResultDocID(result)
-		fmt.Fprintf(os.Stdout, "#%s,%s,%s\n", docID, workspaceQueryScoreNumber(result.Score), workspaceQueryResultURI(response.Workspace, result))
+		fmt.Fprintf(os.Stdout, "#%s,%s,%s\n", docID, workspaceQueryScoreNumber(result.Score), workspaceQueryResultFileURI(response.Workspace, result))
 	}
 	return nil
 }

+func workspaceQueryFileResults(results []mcptools.FileQueryResult) []mcptools.FileQueryResult {
+	out := make([]mcptools.FileQueryResult, 0, len(results))
+	positions := make(map[string]int, len(results))
+	for _, result := range results {
+		pathKey := normalizeFSRemotePath(result.Path)
+		if index, ok := positions[pathKey]; ok {
+			if result.Score > out[index].Score {
+				out[index] = result
+			}
+			continue
+		}
+		positions[pathKey] = len(out)
+		out = append(out, result)
+	}
+	return out
+}
+
 func writeWorkspaceQueryCSV(response mcptools.FileQueryResponse, results []mcptools.FileQueryResult, opts workspaceQueryOptions) error {
 	writer := csv.NewWriter(os.Stdout)
 	if err := writer.Write([]string{"docid", "score", "file", "title", "context", "line", "snippet"}); err != nil {
@@ -611,12 +598,7 @@ func workspaceQueryResultSource(result mcptools.FileQueryResult) string {
 }

 func workspaceQueryResultURI(workspace string, result mcptools.FileQueryResult) string {
-	workspace = strings.Trim(strings.TrimSpace(workspace), "/")
-	if workspace == "" {
-		workspace = "workspace"
-	}
-	remotePath := normalizeFSRemotePath(result.Path)
-	uri := "afs://" + workspace + remotePath
+	uri := workspaceQueryResultFileURI(workspace, result)
 	if result.StartLine > 0 {
 		if result.EndLine > result.StartLine {
 			uri += fmt.Sprintf(":%d-%d", result.StartLine, result.EndLine)
@@ -627,6 +609,16 @@ func workspaceQueryResultURI(workspace string, result mcptools.FileQueryResult)
 	return uri
 }

+func workspaceQueryResultFileURI(workspace string, result mcptools.FileQueryResult) string {
+	workspace = strings.Trim(strings.TrimSpace(workspace), "/")
+	if workspace == "" {
+		workspace = "workspace"
+	}
+	remotePath := normalizeFSRemotePath(result.Path)
+	uri := "afs://" + workspace + remotePath
+	return uri
+}
+
 func workspaceQueryResultDocID(result mcptools.FileQueryResult) string {
 	if id := strings.TrimSpace(result.ChunkID); id != "" {
 		return shortHash(id)
@@ -726,23 +718,6 @@ func workspaceQueryMetadataInt(metadata map[string]any, key string) (int, bool)
 	}
 }

-func workspaceQueryUnavailableMessage(opts workspaceQueryOptions, workspace string, cfg controlplane.WorkspaceConfig) string {
-	switch opts.mode {
-	case mcptools.FileQueryModeKeyword:
-		return fmt.Sprintf("keyword query is not ready yet for workspace %q\nIt will use BM25 ranking through RedisSearch. Until then, use '%s grep <pattern>' for exact line matches.", workspace, filepath.Base(os.Args[0]))
-	case mcptools.FileQueryModeSemantic:
-		if !cfg.Query.Embeddings.Enabled {
-			return fmt.Sprintf("semantic query is disabled for workspace %q\nEnable it with: %s ws config %s set query.embeddings.enabled true", workspace, filepath.Base(os.Args[0]), workspace)
-		}
-		return fmt.Sprintf("semantic query is not ready yet for workspace %q\nThe vector backend will land in the next retrieval slice.", workspace)
-	default:
-		if !cfg.Query.Embeddings.Enabled {
-			return fmt.Sprintf("workspace query is not ready yet for workspace %q\nPlain query will use hybrid ranking and fall back to BM25 keywords when embeddings are disabled.\nUntil then, use '%s grep <pattern>' for exact line matches. Use '%s query --semantic <query>' when you specifically want vector-only search.", workspace, filepath.Base(os.Args[0]), filepath.Base(os.Args[0]))
-		}
-		return fmt.Sprintf("workspace query is not ready yet for workspace %q\nThe QMD-style hybrid query backend will land in the next retrieval slice.", workspace)
-	}
-}
-
 func workspaceQueryIsExplicitExpand(query string) bool {
 	return strings.HasPrefix(strings.ToLower(strings.TrimSpace(query)), "expand:")
 }
@@ -757,7 +732,10 @@ QMD-style hybrid + rerank workspace query.
 Plain text runs hybrid retrieval by default. Use --keyword for keyword-ranked
 retrieval only, or --semantic for vector-only semantic search.

-If embeddings are disabled, default query falls back to keyword ranked results.
+Default query currently falls back to keyword ranked results until hybrid
+vector/rerank is complete. Use --semantic for vector-only retrieval. Semantic
+embeddings are globally enabled and use OpenAI when OPENAI_API_KEY is set in
+the control-plane environment.
 Use grep when you know the exact text.

 Typed query documents:
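
The per-file deduplication that `workspaceQueryFileResults` adds in this file can be tried in isolation. The sketch below is a simplified approximation, not the commit's code: `result` and `bestPerPath` are hypothetical names, `Score` is assumed to be a `float64`, and the path normalization that the real helper does through `normalizeFSRemotePath` is left out.

```go
package main

import "fmt"

// result is a simplified, hypothetical stand-in for mcptools.FileQueryResult.
type result struct {
	Path  string
	Score float64 // assumed numeric type; the real field type is not shown in the diff
}

// bestPerPath keeps one entry per path. Output order follows the first time
// each path is seen, and a later chunk replaces the stored entry only when
// its score is strictly higher.
func bestPerPath(results []result) []result {
	out := make([]result, 0, len(results))
	positions := make(map[string]int, len(results))
	for _, r := range results {
		if i, ok := positions[r.Path]; ok {
			if r.Score > out[i].Score {
				out[i] = r
			}
			continue
		}
		positions[r.Path] = len(out)
		out = append(out, r)
	}
	return out
}

func main() {
	hits := []result{{"/a.md", 0.4}, {"/b.md", 0.9}, {"/a.md", 0.7}}
	for _, r := range bestPerPath(hits) {
		fmt.Printf("%s %.1f\n", r.Path, r.Score)
	}
}
```

Because the slot position is fixed at first sight, a file with several matching chunks keeps its original rank position in the listing while reporting its best chunk score.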
