
Commit 6a6fbaf

valkirilov and claude committed
feat(redis-vector-search): add spec-compliant skill for embeddings and RAG
Introduces skills/redis-vector-search/ covering the four vector-* rules from
skills/redis-development/rules/ in agentskills.io spec layout:

  vector-algorithm-choice → references/algorithm-choice.md (HNSW vs FLAT)
  vector-index-creation   → references/index-creation.md (DIM, metric, etc.)
  vector-hybrid-search    → references/hybrid-search.md (filtered vector)
  vector-rag-pattern      → references/rag-pattern.md (full RAG pipeline)

SKILL.md summarizes index configuration, algorithm choice, hybrid filtering,
and the RAG pattern; the four reference files carry the full code samples.
The skill positions itself as a layer on top of redis-query-engine, since
vector fields live inside RQE indexes.

Additive only: the source vector-*.md rules under skills/redis-development/
remain in place so the legacy compiled AGENTS.md continues to serve existing
plugin consumers unchanged. They are removed in the final cleanup PR alongside
the rest of the rules/ tree.

Validation:
- skill-validator check skills/redis-vector-search → passed (0 warnings)
- npm run validate → rules + plugin validators green

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 51308c4 commit 6a6fbaf

5 files changed

Lines changed: 354 additions & 0 deletions


Lines changed: 140 additions & 0 deletions
@@ -0,0 +1,140 @@
---
name: redis-vector-search
description: Redis vector search guidance covering HNSW vs FLAT algorithm choice, vector index configuration (dims, distance metric, datatype), filtered hybrid search combining vector similarity with TAG or NUMERIC filters, and the RAG retrieval pattern with RedisVL. Use when defining a VECTOR field in FT.CREATE, integrating embeddings (OpenAI, Cohere, sentence-transformers), tuning HNSW parameters (M, EF_CONSTRUCTION, EF_RUNTIME), building a retrieval-augmented generation pipeline, or filtering vector results by attribute.
license: MIT
metadata:
  author: Redis, Inc.
  version: "0.1.0"
---

# Redis Vector Search

Guidance for storing and searching embeddings in Redis. Covers index configuration, algorithm selection, hybrid filtering, and the RAG retrieval pattern with RedisVL.

## When to apply

- Defining a `VECTOR` field in `FT.CREATE` (raw RQE) or a RedisVL `IndexSchema`.
- Choosing HNSW vs FLAT and tuning HNSW parameters.
- Adding category, date, or tenant filters to a vector query.
- Building a retrieval-augmented generation (RAG) pipeline on top of Redis.

This skill builds on the `redis-query-engine` skill: vector fields live inside RQE indexes and share the same `FT.CREATE` / `FT.SEARCH` machinery.

## 1. Configure the vector index properly

Three settings must match the embedding model:

- **`DIM`**: the model's output dimensionality (e.g. 1536 for OpenAI `text-embedding-3-small`). A mismatch is easy to miss: the write succeeds, but the document is typically left out of the index (see `hash_indexing_failures` in `FT.INFO`), and a query vector of the wrong size is rejected with an error.
- **`DISTANCE_METRIC`**: `COSINE` for normalized text embeddings (the common case), `IP` for unnormalized inner product, `L2` for raw Euclidean distance.
- **`TYPE` / `datatype`**: usually `FLOAT32`. Use `FLOAT16` or quantized variants only when memory cost is a hard constraint.

Raw RQE:

```
FT.CREATE idx:docs ON HASH PREFIX 1 doc:
  SCHEMA
    content TEXT
    embedding VECTOR HNSW 6
      TYPE FLOAT32
      DIM 1536
      DISTANCE_METRIC COSINE
```

RedisVL:

```python
from redisvl.schema import IndexSchema

schema = IndexSchema.from_dict({
    "index": {"name": "idx:docs", "prefix": "doc:"},
    "fields": [
        {"name": "content", "type": "text"},
        {"name": "embedding", "type": "vector", "attrs": {
            "dims": 1536, "algorithm": "HNSW",
            "datatype": "FLOAT32", "distance_metric": "COSINE",
        }},
    ]
})
```

See [references/index-creation.md](references/index-creation.md) for redis-py and RedisVL variants.

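Because a size mismatch is easy to miss, one option is to check dimensions client-side before writing. The following is a minimal sketch using only the standard library; the helper name `to_float32_blob` and the constant `EXPECTED_DIM` are illustrative assumptions, not part of redis-py or RedisVL.

```python
import struct
from typing import Sequence

# Assumption: this must equal the DIM declared in the index schema.
EXPECTED_DIM = 1536


def to_float32_blob(vector: Sequence[float], dim: int = EXPECTED_DIM) -> bytes:
    """Pack a vector as little-endian FLOAT32 bytes, rejecting wrong sizes."""
    if len(vector) != dim:
        raise ValueError(f"expected {dim} dimensions, got {len(vector)}")
    return struct.pack(f"<{dim}f", *vector)


blob = to_float32_blob([0.0] * EXPECTED_DIM)
print(len(blob))  # 6144: 4 bytes per FLOAT32 component
```

Failing fast in the application keeps a misconfigured embedding model from quietly producing documents that never become searchable.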
## 2. HNSW vs FLAT

| Algorithm | Speed | Accuracy | Memory | Best for |
|---|---|---|---|---|
| **HNSW** | Fast (approximate) | ~95%+ recall (tunable) | Higher | Large datasets (>10k vectors), latency-sensitive |
| **FLAT** | Slow (exact) | 100% | Lower | Small datasets (<10k), accuracy-critical |

Default to **HNSW** for any production-scale workload. Tuning levers:

- `M`: connections per node (16–64). Higher = better recall, more memory.
- `EF_CONSTRUCTION`: build-time graph quality (100–500). Higher = better index, slower build.
- `EF_RUNTIME`: query-time candidate-list size. Higher = better recall, slower queries.

Use **FLAT** when the corpus is small and you need exact results (e.g. semantic dedup over a few thousand items).

See [references/algorithm-choice.md](references/algorithm-choice.md).

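`EF_RUNTIME` can also be overridden per query in the raw KNN clause rather than fixed at index creation. The sketch below only builds the query string; the helper `knn_query` and the parameter name `vec` are assumptions for illustration.

```python
from typing import Optional


def knn_query(field: str, k: int, ef_runtime: Optional[int] = None,
              vec_param: str = "vec") -> str:
    """Build a KNN query string, optionally overriding EF_RUNTIME."""
    clause = f"KNN {k} @{field} ${vec_param}"
    if ef_runtime is not None:
        # Runtime attribute: applies to this query only.
        clause += f" EF_RUNTIME {ef_runtime}"
    return f"*=>[{clause}]"


print(knn_query("embedding", 10))
print(knn_query("embedding", 10, ef_runtime=150))
```

The resulting string would be passed to `FT.SEARCH` with the vector bytes bound through `PARAMS 2 vec <blob>` and `DIALECT 2`.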
## 3. Hybrid search: filter before vector

Apply attribute filters (TAG / NUMERIC) so the engine narrows the search space *before* the vector comparison. Don't fetch a wide result set and then filter client-side; that is slower and less accurate.

```python
from redisvl.query import VectorQuery
from redisvl.query.filter import Num, Tag

filters = (Tag("category") == "technology") & (Num("date") >= 2024)

query = VectorQuery(
    vector=query_embedding,
    vector_field_name="embedding",
    return_fields=["content", "category", "date"],
    num_results=10,
    filter_expression=filters,
)
results = index.query(query)
```

For **text + vector fusion** (BM25-weighted text scoring combined with vector similarity), use `HybridQuery` on Redis ≥ 8.4 with redis-py ≥ 7.1, or `AggregateHybridQuery` on older Redis. This is a different sense of "hybrid" from the filtered vector search above.

See [references/hybrid-search.md](references/hybrid-search.md).

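For readers working with raw `FT.SEARCH`, a filtered KNN query places the filter expression in front of the `=>` arrow. The sketch below builds a string of that shape; `filtered_knn_query` is a hypothetical helper, the field names follow the example above, and the exact string RedisVL generates may differ.

```python
def filtered_knn_query(category: str, min_date: int, k: int = 10) -> str:
    """Build a pre-filtered KNN query: TAG and NUMERIC filters, then KNN."""
    # Tag values containing spaces or punctuation would need escaping first.
    filter_expr = f"@category:{{{category}}} @date:[{min_date} +inf]"
    return f"({filter_expr})=>[KNN {k} @embedding $vec AS vector_distance]"


print(filtered_knn_query("technology", 2024))
# (@category:{technology} @date:[2024 +inf])=>[KNN 10 @embedding $vec AS vector_distance]
```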
## 4. RAG pattern

Standard pipeline: embed the user query → vector search Redis → pass top-K context to the LLM.

```python
import numpy as np
from redisvl.query import VectorQuery

# Index documents with embeddings.
# HASH storage expects raw FLOAT32 bytes; use a plain list only with JSON storage.
records = [{"content": doc.content,
            "embedding": np.asarray(embed_model.encode(doc.content),
                                    dtype=np.float32).tobytes(),
            "source": doc.source}
           for doc in documents]
index.load(records)

# Retrieve relevant context for a user question
q_emb = np.asarray(embed_model.encode(user_question), dtype=np.float32).tolist()
results = index.query(VectorQuery(
    vector=q_emb,
    vector_field_name="embedding",
    return_fields=["content", "source"],
    num_results=5,
))

# Generate with retrieved context
context = "\n".join(r["content"] for r in results)
response = llm.generate(f"Context: {context}\n\nQuestion: {user_question}")
```

Practical tips:

- **Match metric to model.** Most modern text embedding models pair best with `COSINE`.
- **Chunk long documents** before indexing: retrieval over 200–500-token chunks usually beats indexing whole pages.
- **Batch inserts** with `index.load([...])` instead of one call per record.
- **Pre-filter with attributes** (tenant, recency, document type) before the vector search.

See [references/rag-pattern.md](references/rag-pattern.md).

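The chunking tip can be sketched with a simple overlapping window. This version counts whitespace-separated words as a rough stand-in for tokens, which is an assumption; a real pipeline would more likely count tokens with the embedding model's tokenizer. The function name and defaults are illustrative.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 300, overlap: int = 50) -> List[str]:
    """Split text into overlapping chunks of at most chunk_size words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(700))
print(len(chunk_words(sample)))  # 3
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of some duplicated storage.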
## References

- [Redis: Vectors](https://redis.io/docs/latest/develop/ai/search-and-query/vectors/)
- [Redis: RAG quickstart](https://redis.io/docs/latest/develop/get-started/rag/)
- [RedisVL documentation](https://docs.redisvl.com/en/latest/)
Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
# Choose HNSW vs FLAT Based on Requirements

Select the right algorithm based on your accuracy requirements and dataset size.

| Algorithm | Speed | Accuracy | Memory | Best For |
|-----------|-------|----------|--------|----------|
| HNSW | Fast (approximate) | ~95%+ recall (tunable) | Higher | Large datasets (>10k vectors) |
| FLAT | Slower (exact) | 100% (exact) | Lower | Small datasets, accuracy-critical |

**Correct:** Use HNSW for large-scale production workloads.

```python
from redisvl.schema import IndexSchema

# HNSW - fast approximate search, tunable accuracy
schema = IndexSchema.from_dict({
    "index": {"name": "idx:docs", "prefix": "doc:"},
    "fields": [
        {"name": "embedding", "type": "vector", "attrs": {
            "dims": 1536,
            "algorithm": "HNSW",
            "distance_metric": "COSINE",
            "datatype": "FLOAT32",
            "m": 16,  # Higher = more accurate, more memory
            "ef_construction": 200  # Higher = better index quality, slower build
        }}
    ]
})
```

**Correct:** Use FLAT when exact results are required.

```python
# FLAT - exact brute-force search, guaranteed accuracy
schema = IndexSchema.from_dict({
    "index": {"name": "idx:small", "prefix": "small:"},
    "fields": [
        {"name": "embedding", "type": "vector", "attrs": {
            "dims": 1536,
            "algorithm": "FLAT",
            "distance_metric": "COSINE"
        }}
    ]
})
```

**Tuning HNSW accuracy vs speed:**
- `M`: Connections per node (16-64). Higher = better recall, more memory.
- `EF_CONSTRUCTION`: Build-time parameter (100-500). Higher = better graph quality.
- `EF_RUNTIME`: Query-time parameter. Higher = better recall, slower queries.

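The "~95%+ recall" figure refers to how many of the exact nearest neighbours an approximate search returns. One way to measure it while tuning is to run the same query against a FLAT index and an HNSW index and compare the returned keys. The helper below is a minimal sketch of that comparison; the document keys are made up for illustration.

```python
from typing import Sequence


def recall_at_k(approx_ids: Sequence[str], exact_ids: Sequence[str]) -> float:
    """Fraction of the exact top-K that the approximate search also returned."""
    if not exact_ids:
        raise ValueError("exact_ids must not be empty")
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


exact = ["doc:1", "doc:2", "doc:3", "doc:4"]   # hypothetical FLAT results
approx = ["doc:1", "doc:2", "doc:4", "doc:9"]  # hypothetical HNSW results
print(recall_at_k(approx, exact))  # 0.75
```

Averaging this over a sample of representative queries gives a number to watch while adjusting `M`, `EF_CONSTRUCTION`, and `EF_RUNTIME`.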

Reference: [Redis Vector Search](https://redis.io/docs/latest/develop/ai/search-and-query/vectors/)
Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
# Use Hybrid Search for Better Results

Combine vector similarity with attribute filtering for more relevant results. In this rule, "hybrid" means filtered vector search. Redis and RedisVL also use "hybrid search" for text + vector fusion via `FT.HYBRID` / `HybridQuery`.

**Correct:** Apply filters to reduce search space.

```python
from redisvl.query import VectorQuery
from redisvl.query.filter import Num, Tag

filters = (Tag("category") == "technology") & (Num("date") >= 2024) & (Num("date") <= 2025)

query = VectorQuery(
    vector=query_embedding,
    vector_field_name="embedding",
    return_fields=["content", "category", "date"],
    num_results=10,
    filter_expression=filters
)

results = index.query(query)
```

**Incorrect:** Searching entire vector space when filters apply.

```python
# Bad: no filter, so the search covers all vectors and filtering happens client-side
results = index.query(VectorQuery(
    vector=query_embedding,
    vector_field_name="embedding",
    return_fields=["content", "category"],
    num_results=1000
))
# Client-side filtering - wasteful
filtered = [r for r in results if r["category"] == "technology"]
```

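Besides the wasted transfer, client-side filtering can return fewer matches than requested, because the nearest neighbours overall may mostly belong to other categories. The toy example below illustrates this with a made-up in-memory corpus; it does not talk to Redis, and all names and distances are hypothetical.

```python
from typing import Dict, List

# Hypothetical corpus: smaller distance means closer to the query.
CORPUS: List[Dict] = [
    {"id": "a", "category": "sports", "distance": 0.10},
    {"id": "b", "category": "sports", "distance": 0.12},
    {"id": "c", "category": "technology", "distance": 0.20},
    {"id": "d", "category": "sports", "distance": 0.25},
    {"id": "e", "category": "technology", "distance": 0.30},
    {"id": "f", "category": "technology", "distance": 0.40},
]


def top_k(items: List[Dict], k: int) -> List[Dict]:
    return sorted(items, key=lambda d: d["distance"])[:k]


def post_filter(k: int, category: str) -> List[str]:
    """Take the top-K overall, then filter client-side."""
    return [d["id"] for d in top_k(CORPUS, k) if d["category"] == category]


def pre_filter(k: int, category: str) -> List[str]:
    """Filter first, then take the top-K among the matches."""
    matches = [d for d in CORPUS if d["category"] == category]
    return [d["id"] for d in top_k(matches, k)]


print(post_filter(3, "technology"))  # ['c']
print(pre_filter(3, "technology"))   # ['c', 'e', 'f']
```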
**Tips:**
- Use TAG fields for category filters.
- Use NUMERIC fields for date/price ranges.
- Redis auto-selects the filtered vector execution strategy; tune `hybrid_policy` only when needed.
- For true text + vector fusion, use `HybridQuery` on Redis >= 8.4.0 with redis-py >= 7.1.0; use `AggregateHybridQuery` on earlier Redis versions.

Reference: [Redis Vector Search](https://redis.io/docs/latest/develop/ai/search-and-query/vectors/)
Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
# Configure Vector Indexes Properly

Set the correct dimensions, algorithm, and distance metric for your embeddings. Vector indexes can be created via CLI, Redis Insight, or any client library.

**Correct:** Create index via Redis CLI or Insight.

```
FT.CREATE idx:docs ON HASH PREFIX 1 doc:
  SCHEMA
    content TEXT
    embedding VECTOR HNSW 6
      TYPE FLOAT32
      DIM 1536
      DISTANCE_METRIC COSINE
```

**Correct:** Create index via Python (redis-py).

```python
from redis import Redis
from redis.commands.search.field import TextField, VectorField
from redis.commands.search.index_definition import IndexDefinition, IndexType

r = Redis()

# Define schema with vector field
schema = [
    TextField("content"),
    VectorField(
        "embedding",
        algorithm="HNSW",
        attributes={
            "TYPE": "FLOAT32",
            "DIM": 1536,  # Must match your embedding model
            "DISTANCE_METRIC": "COSINE"
        }
    )
]

# Index HASH keys under the doc: prefix, matching the CLI example
r.ft("idx:docs").create_index(
    schema,
    definition=IndexDefinition(prefix=["doc:"], index_type=IndexType.HASH),
)
```

**Correct:** Create index via RedisVL.

```python
from redisvl.index import SearchIndex
from redisvl.schema import IndexSchema

schema = IndexSchema.from_dict({
    "index": {"name": "idx:docs", "prefix": "doc:"},
    "fields": [
        {"name": "content", "type": "text"},
        {"name": "embedding", "type": "vector", "attrs": {
            "dims": 1536,
            "algorithm": "HNSW",
            "datatype": "FLOAT32",
            "distance_metric": "COSINE"
        }}
    ]
})

index = SearchIndex(schema)
index.create(overwrite=True)
```

**Incorrect:** Mismatched dimensions or wrong distance metric.

```python
# Bad: Wrong dimensions for your model
{"dims": 768}  # But your selected embedding model outputs a different size

# Bad: Wrong metric for normalized embeddings
{"distance_metric": "L2"}  # When embeddings are normalized for COSINE
```

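To see why the metric should follow the embedding model, it can help to compare rankings on a tiny example. For unit-length vectors, cosine and Euclidean distance order neighbours the same way, but once vector norms vary the two can disagree. The sketch below uses made-up 2-D vectors and plain Python; it illustrates the general math rather than Redis's exact distance values.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine_distance(u: Vector, v: Vector) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    return 1.0 - dot / (math.hypot(*u) * math.hypot(*v))


def squared_l2(u: Vector, v: Vector) -> float:
    return sum((a - b) ** 2 for a, b in zip(u, v))


# Hypothetical, deliberately unnormalized vectors.
query = (1.0, 0.0)
docs = {"a": (10.0, 1.0), "b": (1.0, 0.5)}


def rank(metric: Callable[[Vector, Vector], float]) -> List[str]:
    """Order document names from closest to farthest under the given metric."""
    return sorted(docs, key=lambda name: metric(query, docs[name]))


print(rank(cosine_distance))  # ['a', 'b']: "a" points in nearly the same direction
print(rank(squared_l2))       # ['b', 'a']: "b" is closer in absolute position
```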

Reference: [Redis Vector Search](https://redis.io/docs/latest/develop/ai/search-and-query/vectors/)
Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
# Implement RAG Pattern Correctly

Store documents with embeddings, retrieve relevant context, and pass to LLM.

**Correct:** Full RAG pipeline with RedisVL.

```python
import numpy as np
from redisvl.index import SearchIndex
from redisvl.query import VectorQuery

# Assumes `index` is a SearchIndex created as in index-creation.md, and that
# `documents`, `embed_model`, and `llm` are supplied by the application.

# 1. Store documents with embeddings
#    (HASH storage expects raw FLOAT32 bytes; use lists only with JSON storage)
records = []
for doc in documents:
    vector = np.asarray(embed_model.encode(doc["content"]), dtype=np.float32)
    records.append({
        "content": doc["content"],
        "embedding": vector.tobytes(),
        "source": doc["source"]
    })

index.load(records)

# 2. Query with vector similarity
query_embedding = np.asarray(embed_model.encode(user_question), dtype=np.float32).tolist()
results = index.query(VectorQuery(
    vector=query_embedding,
    vector_field_name="embedding",
    return_fields=["content", "source"],
    num_results=5
))

# 3. Pass context to LLM
context = "\n".join([r["content"] for r in results])
response = llm.generate(f"Context: {context}\n\nQuestion: {user_question}")
```

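Step 3 can also carry the `source` field through to the prompt so the model is able to cite where each passage came from. The following is a minimal sketch under the assumption that each result is a dict with `content` and `source` keys, as requested via `return_fields` above; `build_prompt` and the sample data are hypothetical.

```python
from typing import Dict, List


def build_prompt(question: str, results: List[Dict[str, str]]) -> str:
    """Number each retrieved passage and tag it with its source."""
    context = "\n".join(
        f"[{i}] ({r['source']}) {r['content']}"
        for i, r in enumerate(results, start=1)
    )
    return f"Context:\n{context}\n\nQuestion: {question}"


sample_results = [
    {"content": "Redis supports HNSW and FLAT indexes.", "source": "docs/vectors"},
    {"content": "RedisVL wraps index creation and queries.", "source": "docs/redisvl"},
]
print(build_prompt("Which index types exist?", sample_results))
```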
**Best practices:**
- Match your distance metric to your embedding model; many modern text embeddings already work well with COSINE.
- Batch inserts using `index.load()` with lists.
- Set appropriate M and EF_CONSTRUCTION for HNSW based on dataset size.
- Use filters to reduce the search space before vector comparison.
- Consider chunking long documents for better retrieval.

Reference: [Redis RAG Quickstart](https://redis.io/docs/latest/develop/get-started/rag/)
