v0.11.0

@anakin87 released this 02 Jul 10:36
· 40 commits to main since this release
8f13872

🧪 New Experiments

Query Expander component

We are introducing a component that generates a list of semantically similar queries to improve retrieval recall in RAG systems.

from haystack.components.generators.chat.openai import OpenAIChatGenerator
from haystack_experimental.components.query import QueryExpander

expander = QueryExpander(
    chat_generator=OpenAIChatGenerator(model="gpt-4.1-mini"),
    n_expansions=3
)

result = expander.run(query="green energy sources")
print(result["queries"])
# Output: ['alternative query 1', 'alternative query 2', 'alternative query 3', 'green energy sources']
# Note: Up to 3 additional queries + 1 original query (if include_original_query=True)

# To control total number of queries:
expander = QueryExpander(n_expansions=2, include_original_query=True)  # Up to 3 total
# or
expander = QueryExpander(n_expansions=3, include_original_query=False)  # Up to 3 total
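
To illustrate why expanded queries can improve recall, here is a minimal sketch that runs a retriever once per query and merges the deduplicated results. The toy corpus, `keyword_retrieve`, and `retrieve_with_expansion` are hypothetical helpers written for this example and are not part of Haystack; the query list is hard-coded where `result["queries"]` would normally be used.

```python
from typing import Callable, Dict, List

# Toy corpus standing in for a document store (hypothetical data for illustration).
CORPUS: Dict[str, str] = {
    "doc1": "solar panels convert sunlight into green energy",
    "doc2": "wind turbines are a renewable power source",
    "doc3": "coal plants emit large amounts of carbon dioxide",
}


def keyword_retrieve(query: str, top_k: int = 2) -> List[str]:
    """Hypothetical retriever: rank documents by the number of words shared with the query."""
    query_words = set(query.lower().split())
    scored = []
    for doc_id, text in CORPUS.items():
        overlap = len(query_words & set(text.split()))
        if overlap > 0:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties are broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


def retrieve_with_expansion(
    queries: List[str], retrieve: Callable[[str], List[str]]
) -> List[str]:
    """Run the retriever once per query and merge the results in first-seen order."""
    seen = set()
    merged = []
    for query in queries:
        for doc_id in retrieve(query):
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged


# In practice this list would come from result["queries"]; it is hard-coded here.
queries = ["green energy sources", "renewable power", "solar and wind electricity"]

single = keyword_retrieve(queries[0])
merged = retrieve_with_expansion(queries, keyword_retrieve)
print(single)  # documents found by the original query alone
print(merged)  # documents found by the original query plus its expansions
```

In this toy setup the original query misses the wind turbine document because it shares no keywords with it, while one of the expansions retrieves it.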

🔀 New Document Routers

We're introducing two new Routers: DocumentTypeRouter and DocumentLengthRouter.
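
Assuming, from their names, that these routers split incoming documents by MIME type and by content length, the idea can be sketched in plain Python. The `Doc` class, the function names, the output keys, and the threshold below are hypothetical and do not reflect the components' actual API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Doc:
    """Minimal stand-in for a document (hypothetical; not the Haystack Document class)."""
    content: Optional[str] = None
    mime_type: Optional[str] = None


def route_by_length(docs: List[Doc], threshold: int = 10) -> Dict[str, List[Doc]]:
    """Split documents into 'short' and 'long' groups by content length.

    Documents without content are treated as having length 0.
    """
    routed: Dict[str, List[Doc]] = {"short": [], "long": []}
    for doc in docs:
        length = len(doc.content) if doc.content else 0
        routed["short" if length <= threshold else "long"].append(doc)
    return routed


def route_by_type(docs: List[Doc], known_types: List[str]) -> Dict[str, List[Doc]]:
    """Group documents by MIME type; anything not listed goes to 'unclassified'."""
    routed: Dict[str, List[Doc]] = {mime: [] for mime in known_types}
    routed["unclassified"] = []
    for doc in docs:
        key = doc.mime_type if doc.mime_type in known_types else "unclassified"
        routed[key].append(doc)
    return routed


docs = [
    Doc(content="short", mime_type="text/plain"),
    Doc(content="a considerably longer piece of text", mime_type="text/plain"),
    Doc(content=None, mime_type="image/png"),
    Doc(content="no type"),
]

by_length = route_by_length(docs, threshold=10)
by_type = route_by_type(docs, known_types=["text/plain", "image/png"])
print({name: len(group) for name, group in by_length.items()})
print({name: len(group) for name, group in by_type.items()})
```

A split like this lets each branch of an indexing pipeline receive only the documents it can handle, for example sending image documents to an image embedder and text documents to a text embedder.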

🖼️ New Multimodal Features

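We introduced several new multimodal features, mostly focused on indexing and retrieval. These include the DocumentToImageContent, SentenceTransformersDocumentImageEmbedder, ImageFileToDocument, and LLMDocumentContentExtractor components, as well as multimodal support in AmazonBedrockChatGenerator.
A notebook with practical usage examples will be published soon.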

Related PRs
  • refactor: adopt pypdfium2 for PDF to image conversion by @anakin87 in #308
  • feat: multimodal support in AmazonBedrockChatGenerator by @anakin87 in #307
  • test: Fix mypy typing by @sjrl in #309
  • feat: Add DocumentToImageContent component to help enable RAG with image Documents by @sjrl in #311
  • chore: fix format for DocumentToImageContent by @anakin87 in #318
  • chore: ignore type errors in Bedrock monkey patches by @anakin87 in #322
  • feat: add SentenceTransformersDocumentImageEmbedder by @anakin87 in #319
  • feat: Add DocumentTypeRouter by @sjrl in #321
  • refactor: refactor multimodal components and utility functions by @anakin87 in #324
  • fix: Fix storage of file path in ImageContent by @sjrl in #325
  • refactor: Refactor converters to follow embedders directory structure by @sjrl in #333
  • feat: Add normalize_embeddings to SentenceTransformersDocumentImageEmbedder to match signature of other embedders by @sjrl in #335
  • feat: add DocumentLengthRouter component by @anakin87 in #334
  • feat: Add ImageFileToDocument converter by @sjrl in #336
  • feat: Add LLMDocumentContentExtractor to enable Vision-based LLMs to describe/convert an image into text by @sjrl in #338
  • docs: add usage examples to docstrings of multimodal components by @anakin87 in #340

Other Updates

  • refactor: synchronising/merging all pipeline related code with haystack main repository by @davidsbatista in #312
  • chore: align Haystack experimental Hatch scripts by @anakin87 in #315
  • chore: align experimental type checking with Haystack by @anakin87 in #320
  • refactor: Refactor experimental Pipeline to use inheritance by @sjrl in #323
  • fix: refactor code and update init_params in debug_state by @Amnah199 in #317
  • chore: fix ruff linting error by @Amnah199 in #329
  • fix: Fix logger message for pipeline breakpoints by @sjrl in #327
  • fix: Fix validate_input becoming public method by @sjrl in #337
  • Refactor serialization of breakpoints by @Amnah199 in #332

Full Changelog: v0.10.0...v0.11