Add TwelveLabs video integration (Marengo embeddings + Pegasus analysis) #175

Open
mohit-twelvelabs wants to merge 1 commit into gomate-community:main from mohit-twelvelabs:feat/twelvelabs-integration
@mohit-twelvelabs

Hi! I'm Mohit, I work at TwelveLabs (@mohit-twelvelabs).

This PR adds an opt-in video modality to TrustRAG via TwelveLabs, so videos become first-class citizens in a RAG pipeline.

What it adds

  • TwelveLabsEmbedding (trustrag/modules/vector/embedding.py) — Marengo multimodal embeddings. Marengo embeds text, image, audio and video into one shared 512-dim space, so a plain-text query can be matched directly against video clips with the existing EmbeddingGenerator.cosine_similarity. It implements the standard generate_embeddings text interface (drop-in EmbeddingGenerator) plus embed_image/embed_audio.
  • TwelveLabsVideoAnalyzer — Pegasus video understanding (video → text), useful for turning videos into searchable/grounded passages (summaries, Q&A over a clip).
  • Registered as 'twelvelabs' in EmbeddingFactory alongside the existing providers.
  • Example (examples/vectors/twelvelabs_embedding_example.py), API-key-gated unit tests, twelvelabs>=1.2.8 in requirements.txt, and a README update-log entry.
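To illustrate why a shared embedding space matters for retrieval, here is a minimal sketch of ranking video clips against a text query by cosine similarity. It does not call the TwelveLabs SDK: the vectors are random stand-ins for Marengo embeddings, and `cosine_similarity` and `rank_clips` are hypothetical helpers written for this example, not the `EmbeddingGenerator.cosine_similarity` method from the repo.

```python
import math
import random

EMBED_DIM = 512  # dimensionality stated in this PR description


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (illustrative helper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_clips(query_vec, clip_vecs):
    """Return (clip_index, score) pairs sorted by descending similarity."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(clip_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Stand-in data: random vectors in place of real video-clip embeddings.
rng = random.Random(0)
clip_vecs = [[rng.gauss(0, 1) for _ in range(EMBED_DIM)] for _ in range(3)]

# Stand-in "text query": a slightly noisy copy of clip 1, so clip 1 should rank first.
query_vec = [x + 0.01 * rng.gauss(0, 1) for x in clip_vecs[1]]

ranking = rank_clips(query_vec, clip_vecs)
best_index, best_score = ranking[0]
```

With real embeddings, the query vector would come from the text interface and the clip vectors from video embedding; because both live in the same space, the ranking step itself does not change.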

Why it helps

TrustRAG already supports text/image embeddings and multimodal Q&A; this extends retrieval and grounding to video, a modality not currently covered, using the same factory/registry pattern as the other embedding providers.

Opt-in & non-breaking

No existing defaults or behavior change. The provider is only used when explicitly selected (EmbeddingFactory.create_embedding_generator('twelvelabs') or constructing the class directly), and the SDK is imported lazily inside the classes.
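The opt-in behavior described above can be sketched roughly as follows. This is an illustration of the lazy-import-plus-registry pattern under stated assumptions, not the code in this PR: `LazySdkEmbedding` and `MiniEmbeddingFactory` are hypothetical names, and the stdlib module `colorsys` stands in for an optional third-party SDK.

```python
import importlib


class LazySdkEmbedding:
    """Hypothetical provider that imports its SDK only when constructed."""

    sdk_module = "colorsys"  # stand-in for an optional third-party SDK

    def __init__(self):
        # Deferred import: code paths that never select this provider
        # never import the SDK, so it does not need to be installed for them.
        self.sdk = importlib.import_module(self.sdk_module)


class MiniEmbeddingFactory:
    """Hypothetical registry; not TrustRAG's actual EmbeddingFactory."""

    _registry = {"lazy-sdk": LazySdkEmbedding}

    @classmethod
    def create(cls, name):
        try:
            provider_cls = cls._registry[name]
        except KeyError:
            raise ValueError(f"Unknown embedding provider: {name!r}") from None
        return provider_cls()


generator = MiniEmbeddingFactory.create("lazy-sdk")
```

Because the import lives inside `__init__`, importing the module that defines the provider stays cheap, and a missing SDK only surfaces as an error when the provider is explicitly requested.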

How it was tested

  • Live smoke test against the Marengo API: generate_embeddings([...]) returns shape (n, 512).
  • No-network unit tests: provider is registered in the factory; analyze() raises ValueError when no video source is given.
  • flake8 clean on all changed files (against the repo's .flake8).
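For readers unfamiliar with key-gated tests, the split between no-network and live tests could look roughly like the sketch below. Everything here is an assumption made for illustration: the environment variable name `TWELVELABS_API_KEY`, the `analyze` stand-in and its `video_url`/`video_path` parameters, and the test class names are hypothetical and may not match the PR's test file.

```python
import os
import unittest

# Assumed variable name; the PR's tests may read a different one.
API_KEY = os.environ.get("TWELVELABS_API_KEY")


def analyze(video_url=None, video_path=None):
    """Hypothetical stand-in for the analyzer's input validation."""
    if not video_url and not video_path:
        raise ValueError("Provide a video_url or a video_path")
    return {"source": video_url or video_path}


class NoNetworkTests(unittest.TestCase):
    """Always runs: checks behavior that needs no API access."""

    def test_analyze_requires_video_source(self):
        with self.assertRaises(ValueError):
            analyze()


@unittest.skipUnless(API_KEY, "TWELVELABS_API_KEY not set; skipping live test")
class LiveTests(unittest.TestCase):
    """Runs only when an API key is present in the environment."""

    def test_key_is_available(self):
        # A real live test would call the API here and check the result shape.
        self.assertTrue(API_KEY)


def run_suite():
    """Load both test classes and run them, returning the TestResult."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(NoNetworkTests))
    suite.addTests(loader.loadTestsFromTestCase(LiveTests))
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Gating at the class level keeps CI green for contributors without a key while still letting the live checks run wherever a key is configured.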

You can grab a free API key at https://twelvelabs.io — there's a generous free tier.

- TwelveLabsEmbedding: Marengo multimodal embeddings (512-dim shared
  text/image/audio/video space), registered as 'twelvelabs' in EmbeddingFactory
- TwelveLabsVideoAnalyzer: Pegasus video-to-text understanding for grounding
- Example, unit tests (API-key gated), requirements and README update

Opt-in and non-breaking: no existing defaults or behavior change.