Release v0.11.0 · deepset-ai/haystack-experimental

🧪 New Experiments

Query Expander component

We are introducing a component that generates a list of semantically similar queries to improve retrieval recall in RAG systems.

from haystack.components.generators.chat.openai import OpenAIChatGenerator
from haystack_experimental.components.query import QueryExpander

expander = QueryExpander(
    chat_generator=OpenAIChatGenerator(model="gpt-4.1-mini"),
    n_expansions=3
)

result = expander.run(query="green energy sources")
print(result["queries"])
# Output: ['alternative query 1', 'alternative query 2', 'alternative query 3', 'green energy sources']
# Note: Up to 3 additional queries + 1 original query (if include_original_query=True)

# To control total number of queries:
expander = QueryExpander(n_expansions=2, include_original_query=True)  # Up to 3 total
# or
expander = QueryExpander(n_expansions=3, include_original_query=False)  # Up to 3 total

feat: add QueryExpander component by @mpangrazzi in #331
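
To illustrate why expansion can help recall, here is a minimal, library-free sketch of one way the expanded queries might be consumed downstream: each query is sent to a retriever and the hits are merged without duplicates. The toy corpus, `keyword_retriever`, and `retrieve_with_expansion` are hypothetical names made up for this illustration and are not part of haystack-experimental; the hard-coded query list stands in for `result["queries"]`.

```python
from typing import Callable, Dict, List

# Toy corpus standing in for a document store (hypothetical data).
CORPUS: Dict[str, str] = {
    "d1": "solar power is a renewable energy source",
    "d2": "wind turbines generate clean electricity",
    "d3": "coal plants emit carbon dioxide",
}


def keyword_retriever(query: str, top_k: int = 2) -> List[str]:
    """Hypothetical retriever: rank document ids by the number of shared words."""
    words = set(query.lower().split())
    scored = [(len(words & set(text.split())), doc_id) for doc_id, text in CORPUS.items()]
    # Drop documents with no overlap, then sort by score (descending) and id.
    hits = sorted((pair for pair in scored if pair[0] > 0), key=lambda p: (-p[0], p[1]))
    return [doc_id for _, doc_id in hits[:top_k]]


def retrieve_with_expansion(
    queries: List[str], retriever: Callable[[str], List[str]]
) -> List[str]:
    """Run every query and merge the hits, keeping first-seen order without duplicates."""
    merged: List[str] = []
    seen = set()
    for query in queries:
        for doc_id in retriever(query):
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged


# Stand-in for result["queries"] from the expander (hypothetical values).
queries = ["renewable power", "clean electricity", "green energy sources"]
merged_ids = retrieve_with_expansion(queries, keyword_retriever)
print(merged_ids)
```

In this toy setup the original query alone matches only `d1`, while the merged result also surfaces `d2`. A real pipeline would more likely use embedding or BM25 retrievers and a dedicated ranking or joining step.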

🔀 New Document Routers

We're introducing two new Routers: DocumentTypeRouter and DocumentLengthRouter.
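
As a rough illustration of the length-based routing idea only, the sketch below splits documents into a short and a long group in plain Python. It does not use the component's actual API: the `Doc` class, the `route_by_length` function, the default threshold, and the output keys are assumptions chosen for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Doc:
    """Minimal stand-in for a document (hypothetical, not Haystack's Document)."""

    content: Optional[str]


def route_by_length(documents: List[Doc], threshold: int = 10) -> Dict[str, List[Doc]]:
    """Split documents into short and long groups by content length."""
    routed: Dict[str, List[Doc]] = {"short_documents": [], "long_documents": []}
    for doc in documents:
        # Treat missing content as length 0 so empty documents count as short.
        length = len(doc.content) if doc.content is not None else 0
        key = "short_documents" if length <= threshold else "long_documents"
        routed[key].append(doc)
    return routed


docs = [Doc(content=None), Doc(content="tiny"), Doc(content="a much longer piece of text")]
routed = route_by_length(docs)
print(len(routed["short_documents"]), len(routed["long_documents"]))
```

A split like this could, for example, send empty or near-empty documents (such as scanned pages) to an image-based branch and the rest to a text branch.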

🖼️ New Multimodal Features

We introduced several new multimodal features, mostly focused on indexing and retrieval.
A notebook will be published soon to show practical usage examples.

multimodal support in AmazonBedrockChatGenerator
new image Converters
SentenceTransformersDocumentImageEmbedder: a component to compute embeddings for image-based documents
LLMDocumentContentExtractor: a component to extract textual content from image-based documents using a vision-enabled LLM

Related PRs

refactor: adopt pypdfium2 for PDF to image conversion by @anakin87 in #308
feat: multimodal support in AmazonBedrockChatGenerator by @anakin87 in #307
test: Fix mypy typing by @sjrl in #309
feat: Add DocumentToImageContent component to help enable RAG with image Documents by @sjrl in #311
chore: fix format for DocumentToImageContent by @anakin87 in #318
chore: ignore type errors in Bedrock monkey patches by @anakin87 in #322
feat: add SentenceTransformersDocumentImageEmbedder by @anakin87 in #319
feat: Add DocumentTypeRouter by @sjrl in #321
refactor: refactor multimodal components and utility functions by @anakin87 in #324
fix: Fix storage of file path in ImageContent by @sjrl in #325
refactor: Refactor converters to follow embedders directory structure by @sjrl in #333
feat: Add normalize_embeddings to SentenceTransformersDocumentImageEmbedder to match signature of other embedders by @sjrl in #335
feat: add DocumentLengthRouter component by @anakin87 in #334
feat: Add ImageFileToDocument converter by @sjrl in #336
feat: Add LLMDocumentContentExtractor to enable Vision-based LLMs to describe/convert an image into text by @sjrl in #338
docs: add usage examples to docstrings of multimodal components by @anakin87 in #340

Other Updates

refactor: synchronising/merging all pipeline related code with haystack main repository by @davidsbatista in #312
chore: align Haystack experimental Hatch scripts by @anakin87 in #315
chore: align experimental type checking with Haystack by @anakin87 in #320
refactor: Refactor experimental Pipeline to use inheritance by @sjrl in #323
fix: refactor code and update init_params in debug_state by @Amnah199 in #317
chore: fix ruff linting error by @Amnah199 in #329
fix: Fix logger message for pipeline breakpoints by @sjrl in #327
fix: Fix validate_input becoming public method by @sjrl in #337
Refactor serialization of breakpoints by @Amnah199 in #332

New Contributors

@mpangrazzi made their first contribution in #331

Full Changelog: v0.10.0...v0.11
