- result = query_pipeline.run({"sparse_text_embedder": {"text": query}})
+ result = retrieval_pipeline.run({"sparse_text_embedder": {"text": query}})

  print(result["sparse_retriever"]["documents"][0])

- # Document(id=..., content: 'fastembed is supported by and maintained by Milvus.', sparse_embedding: vector with 48 non-zero elements)
+ # Document(id=..., content: 'full text search is supported by Milvus.', sparse_embedding: vector with 48 non-zero elements)
```
### Sparse retrieval with Milvus built-in BM25 function

Milvus provides a built-in BM25 function that generates sparse vectors directly from text fields. This simplifies pipeline construction compared to using Haystack's sparse embedders. The main differences are:

1. We need to specify a `BM25BuiltInFunction` in the document store, along with some field specification parameters.
2. We don't need an explicit embedder, since Milvus generates the sparse embeddings on the server side.
3. The pipeline is simpler, with fewer components and connections.
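To make the idea of "sparse vectors generated from text" more concrete, here is a minimal pure-Python sketch of BM25 term weighting. It is an illustration only and not Milvus's implementation: the whitespace tokenizer, the parameter defaults `k1=1.2` and `b=0.75`, and every function name (`tokenize`, `bm25_sparse_vectors`, `query_vector`, `score`) are assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase/whitespace tokenizer (assumption; real analyzers do more).
    return text.lower().split()


def bm25_sparse_vectors(docs, k1=1.2, b=0.75):
    """Return (vocab, vectors); each vector maps term index -> BM25 weight."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    vocab = {t: i for i, t in enumerate(sorted({t for doc in tokenized for t in doc}))}
    doc_freq = Counter(t for doc in tokenized for t in set(doc))

    vectors = []
    for doc in tokenized:
        vec = {}
        for term, freq in Counter(doc).items():
            # Inverse document frequency: rarer terms get larger weights.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Saturating, length-normalized term frequency.
            tf = freq * (k1 + 1) / (freq + k1 * (1 - b + b * len(doc) / avg_len))
            vec[vocab[term]] = idf * tf  # only non-zero entries are stored
        vectors.append(vec)
    return vocab, vectors


def query_vector(query, vocab):
    # The query is represented as an indicator vector over known terms.
    return {vocab[t]: 1.0 for t in set(tokenize(query)) if t in vocab}


def score(qvec, dvec):
    # Sparse inner product: only indices present in both vectors contribute.
    return sum(w * dvec.get(i, 0.0) for i, w in qvec.items())


docs = [
    "full text search is supported by Milvus.",
    "fastembed is supported by and maintained by Milvus.",
]
vocab, vectors = bm25_sparse_vectors(docs)
qvec = query_vector("full text search", vocab)
best = max(range(len(docs)), key=lambda i: score(qvec, vectors[i]))
print(docs[best])
```

With the built-in function, this kind of computation happens inside Milvus, which is why the pipeline no longer needs a separate sparse embedder component.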

Below is a complete example using Milvus's built-in BM25 function. Lines marked `+` show the simplified approach using the built-in function, while lines marked `-` show the original approach that requires explicit sparse embedding:
```diff
+ from milvus_haystack.function import BM25BuiltInFunction
- # Document(id=..., content: 'fastembed is supported by and maintained by Milvus.', embedding: vector of size 1536, sparse_embedding: vector with 48 non-zero elements)
+ # Document(id=..., content: 'full text search is supported by Milvus.', embedding: vector of size 1536, sparse_embedding: vector with 48 non-zero elements)
```
### Hybrid retrieval with Milvus built-in BM25 function

Milvus provides a built-in BM25 function that generates sparse vectors directly from text fields. This simplifies pipeline construction compared to using Haystack's sparse embedders and makes BM25 a convenient complement to semantic search. The main differences are:

1. We need to specify a `BM25BuiltInFunction` in the document store, along with some field specification parameters.
2. We don't need an explicit sparse embedder, since Milvus generates the sparse embeddings on the server side.
3. The pipeline is simpler, with fewer components and connections, which is especially beneficial in hybrid retrieval setups.
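Hybrid retrieval has to merge a dense (semantic) ranking with a sparse (BM25) ranking into one result list. The sketch below shows reciprocal rank fusion (RRF), one common way to do that. It is an illustration under stated assumptions, not a description of what Milvus or the Haystack integration does internally: the function name `reciprocal_rank_fusion`, the constant `k=60`, and the document ids are all hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    so documents ranked highly by several retrievers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending fused score; break ties by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a dense retriever and a BM25 retriever.
dense_ranking = ["doc_a", "doc_b", "doc_c"]
sparse_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
print(fused)
```

Because RRF uses only rank positions, it needs no score calibration between the dense and sparse retrievers, whose raw scores live on different scales.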

Below is a complete example using Milvus's built-in BM25 function for hybrid retrieval. Lines marked `+` show the simplified approach using the built-in function, while lines marked `-` show the original approach that requires explicit sparse embedding:
```diff
+ from milvus_haystack.function import BM25BuiltInFunction