[SDK][Python] Add Weaviate retrieval integration#5424

Draft
vincentkoc wants to merge 6 commits into main from vincentkoc-code/integration-weaviate

vincentkoc (Member) commented Feb 26, 2026

Details

Adds first-pass Python retrieval integration wrapper for Weaviate.

This PR includes:

  • tracker function: track_weaviate
  • provider module export (__init__.py)
  • shared retrieval tracking helper used by provider wrappers
  • Weaviate integration documentation page
  • navigation update for the integration docs
  • provider-specific unit tests for wrapper behavior and metadata contract

Change checklist

  • User facing
  • Documentation update

Issues

  • Resolves #
  • OPIK-

Testing

  • cd sdks/python && PYTHONPATH=src pytest tests/unit/integrations/test_weaviate_tracker.py
  • Result: 2 passed

Documentation

  • Added docs/tracing/integrations/weaviate.mdx
  • Updated docs navigation in docs.yml

github-actions bot added the python and Python SDK labels Feb 26, 2026

github-actions bot (Contributor) commented Feb 26, 2026

📋 PR Linter Failed

Invalid Title Format. Your PR title must include a ticket/issue number and may optionally include component tags ([FE], [BE], etc.).

  • Internal contributors: Open a JIRA ticket and link to it: [OPIK-xxxx] or [CUST-xxxx] or [DND-xxxx] or [DEV-xxxx] [COMPONENT] Your change
  • External contributors: Open a GitHub Issue and link to it via its number: [issue-xxxx] [COMPONENT] Your change
  • No ticket: Use [NA] [COMPONENT] Your change (Issues section not required)

Example: [issue-3108] [BE] [FE] Fix authentication bug or [OPIK-1234] Fix bug or [NA] Update README


Incomplete Issues Section. You must reference at least one GitHub issue (#xxxx), Jira ticket (OPIK-xxxx), CUST ticket (CUST-xxxx), DEV ticket (DEV-xxxx), or DND ticket (DND-xxxx) under the ## Issues section.

Comment on lines +5 to +9
from opik.integrations._retrieval_tracker import (
    RetrievalTrackingConfig,
    as_tuple,
    patch_retrieval_client,
)

Importing symbols directly from _retrieval_tracker violates the Python SDK guideline (sdks/python/AGENTS.md) that new code should prefer module-style imports. Can we import the module instead (e.g. import opik.integrations._retrieval_tracker as retrieval_tracker) and reference retrieval_tracker.RetrievalTrackingConfig, retrieval_tracker.as_tuple, and retrieval_tracker.patch_retrieval_client?

Finding type: AI Coding Guidelines


Prompt for AI Agents:

In sdks/python/src/opik/integrations/weaviate/opik_tracker.py around lines 5 to 9, the
file uses direct symbol imports from opik.integrations._retrieval_tracker. Refactor the
imports to use a module-style import (for example: import
opik.integrations._retrieval_tracker as retrieval_tracker) and update the function body
in track_weaviate to reference retrieval_tracker.RetrievalTrackingConfig,
retrieval_tracker.as_tuple, and retrieval_tracker.patch_retrieval_client instead of the
directly imported names. Ensure no other behavior changes and run tests/linters to
confirm imports follow the SDK guideline.
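
A minimal sketch of what the requested module-style refactor could look like. Because opik.integrations._retrieval_tracker exists only on this PR branch, the block substitutes a stand-in namespace with simplified placeholder helpers (an assumption made purely so the sketch runs on its own); in the SDK the stand-in would be replaced by the real module import shown in the comment.

```python
import types
from typing import Any, Optional

# Stand-in for the private helper module so the sketch runs without the PR
# branch installed. In the SDK this would instead be:
#     import opik.integrations._retrieval_tracker as retrieval_tracker
# The three attributes below are simplified placeholders, not the real helpers.
retrieval_tracker = types.SimpleNamespace(
    RetrievalTrackingConfig=lambda **fields: dict(fields),
    as_tuple=tuple,
    patch_retrieval_client=lambda target, config: (target, config),
)


def track_weaviate(
    weaviate_client_or_collection: Any, project_name: Optional[str] = None
) -> Any:
    """Adds Opik tracking wrappers to a Weaviate client or collection/query object."""
    # Every helper is reached through the module name, never imported directly.
    config = retrieval_tracker.RetrievalTrackingConfig(
        provider="weaviate",
        operation_paths=retrieval_tracker.as_tuple(
            [
                "query.get",
                "query.raw",
                "query.hybrid",
                "query.near_text",
                "query.near_vector",
                "query.fetch_objects",
                "query.bm25",
            ]
        ),
        project_name=project_name,
    )
    return retrieval_tracker.patch_retrieval_client(
        weaviate_client_or_collection, config
    )
```

Only the import line and the three qualified references differ from the hunk under review; the function's behavior is meant to stay the same.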

Comment on lines +12 to +29
def track_weaviate(weaviate_client_or_collection: Any, project_name: Optional[str] = None) -> Any:
    """Adds Opik tracking wrappers to a Weaviate client or collection/query object."""
    config = RetrievalTrackingConfig(
        provider="weaviate",
        operation_paths=as_tuple(
            [
                "query.get",
                "query.raw",
                "query.hybrid",
                "query.near_text",
                "query.near_vector",
                "query.fetch_objects",
                "query.bm25",
            ]
        ),
        project_name=project_name,
    )
    return patch_retrieval_client(weaviate_client_or_collection, config)

sdks/python/AGENTS.md mandates that behavior changes get tests. The new Weaviate integration wraps client methods via patch_retrieval_client, but no new test under sdks/python/tests/ exercises track_weaviate or the tracking patch/idempotency, so this new behavior is unverified. Can we add unit/integration coverage that exercises track_weaviate (and the shared patcher) to prove it patches the expected methods and stays idempotent?

Finding type: AI Coding Guidelines


Prompt for AI Agents:

In sdks/python/src/opik/integrations/weaviate/opik_tracker.py around lines 12-29, the
function `track_weaviate` adds a new integration but there are no tests verifying it
patches the expected methods or is idempotent. Add unit tests under sdks/python/tests/
that: create a minimal fake Weaviate client/collection object with the method names
listed in operation_paths (e.g., query.get, query.raw, etc.), monkeypatch or mock
`opik.integrations._retrieval_tracker.patch_retrieval_client` to capture the passed
client and config, call `track_weaviate` and assert patch_retrieval_client was called
once with provider == 'weaviate' and the exact operation_paths; also add a test that
calling `track_weaviate` twice does not double-wrap (idempotency) by asserting
patch_retrieval_client handles already-patched clients or that methods remain callable
only once-wrapped. Include cleanup and clear comments specifying which behavior each
test asserts.
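
The two checks described in this prompt (expected methods get wrapped, and a second call does not double-wrap) can be sketched without the SDK. The block below is an illustration under stated assumptions: fake_patch, _wrap, and the _sketch_is_tracked marker are hypothetical stand-ins rather than the internals of patch_retrieval_client, and the "weaviate.<method>" span naming is a guess at the naming scheme. A plain list stands in for span recording.

```python
import functools
from types import SimpleNamespace
from typing import Any, Callable, List, Sequence

# Hypothetical flag; the SDK may mark wrapped methods differently.
_MARKER = "_sketch_is_tracked"


def _wrap(method: Callable[..., Any], span_name: str, log: List[str]) -> Callable[..., Any]:
    @functools.wraps(method)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        log.append(span_name)  # stands in for emitting a tracking span
        return method(*args, **kwargs)

    setattr(wrapper, _MARKER, True)
    return wrapper


def fake_patch(target: Any, paths: Sequence[str], provider: str, log: List[str]) -> Any:
    """Stand-in patcher: wrap each dotted path once and skip missing paths."""
    for path in paths:
        *parents, name = path.split(".")
        owner = target
        for part in parents:
            owner = getattr(owner, part, None)
        method = getattr(owner, name, None)
        if callable(method) and not getattr(method, _MARKER, False):
            setattr(owner, name, _wrap(method, f"{provider}.{name}", log))
    return target


def make_fake_collection() -> SimpleNamespace:
    """Fake collection exposing only two of the query.* operations."""
    query = SimpleNamespace(
        hybrid=lambda query, limit=5: [f"doc-{i}" for i in range(limit)],
        bm25=lambda query, limit=5: [],
    )
    return SimpleNamespace(query=query)


PATHS = ("query.hybrid", "query.bm25", "query.near_text")


def test_wraps_nested_query_methods() -> None:
    log: List[str] = []
    collection = fake_patch(make_fake_collection(), PATHS, "weaviate", log)
    assert collection.query.hybrid("What is Opik?", limit=2) == ["doc-0", "doc-1"]
    assert log == ["weaviate.hybrid"]
    # A path the fake object lacks is left alone rather than created.
    assert not hasattr(collection.query, "near_text")


def test_second_call_does_not_double_wrap() -> None:
    log: List[str] = []
    collection = fake_patch(make_fake_collection(), PATHS, "weaviate", log)
    wrapped_once = collection.query.hybrid
    fake_patch(collection, PATHS, "weaviate", log)
    assert collection.query.hybrid is wrapped_once
    collection.query.hybrid("again")
    assert log == ["weaviate.hybrid"]  # one entry per call, not two
```

In the real suite the stand-in patcher would be replaced by track_weaviate itself, with the assertions adapted to whatever marker and span names the shared helper actually uses.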


Commit 1e24723 addressed this comment by adding new unit tests in sdks/python/tests/unit/integrations/test_weaviate_tracker.py that exercise track_weaviate and verify it wraps nested collection.query.* methods (asserting opik.track is applied and produces the expected weaviate.<operation> names/metadata). It also adds an explicit idempotency test that calls track_weaviate twice on the same object and asserts no additional tracking wrappers are applied on the second call.

github-actions bot added the documentation and tests labels Feb 26, 2026
@github-actions

Comment on lines +17 to +29
```python
import weaviate
from opik.integrations.weaviate import track_weaviate

client = weaviate.connect_to_local()
collection = client.collections.get("docs")
collection = track_weaviate(collection, project_name="retrieval-demo")

result = collection.query.hybrid(
    query="What is Opik?",
    limit=5,
)
```

The hand-pasted track_weaviate snippet has no intent/trigger sentence, no required-vs-optional parameter notes, and no link to a canonical/generated example, so readers can't tell when to use it or how to keep it in sync. Can we add a one-line intent note, mark which parameters are required or optional, and link to the autogenerated source with a maintenance note?

Finding types: Keep docs accurate


Prompt for AI Agents:

In apps/opik-documentation/documentation/fern/docs/tracing/integrations/weaviate.mdx
around lines 17-29, the usage snippet for track_weaviate lacks intent/context,
required-vs-optional parameter notes, and a link/maintenance pointer to the canonical
autogenerated SDK example. Edit the doc to: 1) insert a one-line intent/trigger sentence
immediately above the code block explaining when to use track_weaviate (e.g., "Use
track_weaviate to wrap a Weaviate collection so Opik records retrieval tool spans for
query operations"); 2) add a short inline comment or one-sentence note after the import
or before calling track_weaviate that lists which parameters are required vs optional
(for example: "project_name is optional and used for tagging; other client config is
passed through to the Weaviate client"); and 3) add a final sentence after the snippet
linking to the autogenerated SDK example (or noting its path) with a maintenance note
stating the canonical source file/path, who is responsible for refreshing it, and the
update cadence (for example: "This snippet is derived from <path/to/source> and should
be refreshed by the docs team when API changes occur or every 3 months"). Make these
changes concise and in-place so readers know why to use the snippet, what params they
must supply, and where to sync updates.
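
One way the snippet could carry the requested intent and parameter notes is sketched below. It assumes the track_weaviate signature shown earlier in this review (one required positional argument, project_name optional with a default of None) and a locally running Weaviate instance; run_tracked_hybrid_query and its parameters are hypothetical names introduced only for illustration. The maintenance pointer is left out because the source does not name a canonical path.

```python
from typing import Any, Optional


def run_tracked_hybrid_query(
    question: str,
    collection_name: str = "docs",
    project_name: Optional[str] = None,
    limit: int = 5,
) -> Any:
    """Wrap a Weaviate collection with track_weaviate, then run one hybrid query."""
    # Imports are deferred so this module loads even where weaviate-client or
    # the PR branch of opik is not installed.
    import weaviate
    from opik.integrations.weaviate import track_weaviate  # added by this PR

    client = weaviate.connect_to_local()
    try:
        collection = client.collections.get(collection_name)
        # Intent: wrap the collection so its query operations are tracked.
        # Required: the client or collection object to wrap.
        # Optional: project_name (defaults to None in the signature under review).
        collection = track_weaviate(collection, project_name=project_name)
        return collection.query.hybrid(query=question, limit=limit)
    finally:
        # Release the connection even if the query raises.
        client.close()
```

Calling run_tracked_hybrid_query("What is Opik?", project_name="retrieval-demo") would mirror the original snippet, with the connection closed afterwards.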
