Align TruLens Semantic Conventions with OTEL GenAI Semantic Conventions #2424

@joshreini1

Summary

TruLens defines its own OTEL semantic conventions under the ai.observability.* namespace (src/otel/semconv/trulens/otel/semconv/trace.py), whereas the official OpenTelemetry GenAI Semantic Conventions use the gen_ai.* namespace. No gen_ai.* attributes are currently emitted anywhere in the codebase, so standard OTEL GenAI dashboards and backends cannot recognize TruLens telemetry as GenAI data.

The Gap

Namespace mismatch

TruLens uses ai.observability.* for everything. OTEL GenAI uses gen_ai.*. There is zero overlap at the namespace level.

~17 attributes have conceptual equivalents but different keys

| Concept | TruLens Key | OTEL GenAI Key |
| --- | --- | --- |
| Model name | ai.observability.cost.model | gen_ai.request.model |
| Prompt tokens | ai.observability.cost.num_prompt_tokens | gen_ai.usage.input_tokens |
| Completion tokens | ai.observability.cost.num_completion_tokens | gen_ai.usage.output_tokens |
| Temperature | ai.observability.generation.temperature (TS only) | gen_ai.request.temperature |
| Input messages | ai.observability.generation.input_messages (TS only) | gen_ai.input.messages |
| Output messages | ai.observability.generation.output_messages (TS only) | gen_ai.output.messages |
| Retrieval query | ai.observability.retrieval.query_text | gen_ai.retrieval.query.text |
| Retrieved docs | ai.observability.retrieval.retrieved_contexts | gen_ai.retrieval.documents |
| Tool name | ai.observability.mcp.tool_name | gen_ai.tool.name |
| Tool args | ai.observability.mcp.input_arguments | gen_ai.tool.call.arguments |
| Tool result | ai.observability.mcp.output_content | gen_ai.tool.call.result |
| Operation type | ai.observability.span_type (enum attr) | gen_ai.operation.name |
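
The key pairs in the table above could be captured as a plain lookup table. The following is a minimal sketch, not existing TruLens code: TRULENS_TO_GENAI and to_genai_key are hypothetical names, and the keys are copied from the table as written. The span_type row is left out because it would need its enum values translated as well, not only the key renamed.

```python
from typing import Optional

# Hypothetical lookup built from the table above: TruLens key -> OTEL GenAI key.
# ai.observability.span_type is omitted on purpose: mapping it to
# gen_ai.operation.name requires translating values, not only renaming the key.
TRULENS_TO_GENAI = {
    "ai.observability.cost.model": "gen_ai.request.model",
    "ai.observability.cost.num_prompt_tokens": "gen_ai.usage.input_tokens",
    "ai.observability.cost.num_completion_tokens": "gen_ai.usage.output_tokens",
    "ai.observability.generation.temperature": "gen_ai.request.temperature",
    "ai.observability.generation.input_messages": "gen_ai.input.messages",
    "ai.observability.generation.output_messages": "gen_ai.output.messages",
    "ai.observability.retrieval.query_text": "gen_ai.retrieval.query.text",
    "ai.observability.retrieval.retrieved_contexts": "gen_ai.retrieval.documents",
    "ai.observability.mcp.tool_name": "gen_ai.tool.name",
    "ai.observability.mcp.input_arguments": "gen_ai.tool.call.arguments",
    "ai.observability.mcp.output_content": "gen_ai.tool.call.result",
}


def to_genai_key(trulens_key: str) -> Optional[str]:
    """Return the OTEL GenAI key for a TruLens key, or None if it has no mapping."""
    return TRULENS_TO_GENAI.get(trulens_key)
```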

OTEL GenAI attributes TruLens doesn't capture at all

  • gen_ai.operation.name (chat, text_completion, embeddings, etc.)
  • gen_ai.provider.name (e.g., openai, anthropic)
  • gen_ai.response.model (actual model that responded, may differ from requested)
  • gen_ai.response.id, gen_ai.response.finish_reasons
  • gen_ai.request.max_tokens, gen_ai.request.top_p, gen_ai.request.top_k
  • gen_ai.request.frequency_penalty, gen_ai.request.presence_penalty
  • gen_ai.conversation.id (related: #2423, "Add conversation_id Support for Thread-Based Trace Grouping")
  • gen_ai.tool.definitions, gen_ai.tool.call.id, gen_ai.tool.type
  • gen_ai.system_instructions
  • server.address / server.port
  • error.type

Python GENERATION class is empty

The GENERATION inner class in trace.py defines no attributes in Python — generation-specific attributes (model, temperature, messages, tokens) only exist in the TypeScript dashboard constants (src/dashboard/react_components/record_viewer_otel/src/constants/span.ts), suggesting they were planned but never wired into the Python instrumentation layer.

What

Phase 1: Emit gen_ai.* alongside ai.observability.*

  • On generation spans: emit gen_ai.operation.name, gen_ai.request.model, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, gen_ai.request.temperature, gen_ai.provider.name
  • On retrieval spans: emit gen_ai.retrieval.query.text, gen_ai.retrieval.documents
  • On tool spans: emit gen_ai.tool.name, gen_ai.tool.call.arguments, gen_ai.tool.call.result
  • Dual-emit to maintain backward compatibility — ai.observability.* attributes stay as-is
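
One way the dual-emit step could look is sketched below. It assumes only that the span object has an OTEL-style set_attribute(key, value) method; dual_emit, GENAI_ALIASES, and SupportsSetAttribute are hypothetical names for illustration, and the alias table is a small subset of the mapping in this issue.

```python
from typing import Any, Mapping, Protocol


class SupportsSetAttribute(Protocol):
    """Anything with an OTEL-style set_attribute(key, value) method."""

    def set_attribute(self, key: str, value: Any) -> None: ...


# Hypothetical subset of the key mapping, for illustration only.
GENAI_ALIASES = {
    "ai.observability.cost.model": "gen_ai.request.model",
    "ai.observability.cost.num_prompt_tokens": "gen_ai.usage.input_tokens",
    "ai.observability.cost.num_completion_tokens": "gen_ai.usage.output_tokens",
}


def dual_emit(span: SupportsSetAttribute, attributes: Mapping[str, Any]) -> None:
    """Set each TruLens attribute unchanged, plus its gen_ai.* alias if one exists."""
    for key, value in attributes.items():
        span.set_attribute(key, value)  # existing ai.observability.* key stays as-is
        alias = GENAI_ALIASES.get(key)
        if alias is not None:
            span.set_attribute(alias, value)  # additional gen_ai.* copy
```

Because the helper only relies on set_attribute, it should work with an opentelemetry.trace.Span as well as with a test double, and existing consumers of ai.observability.* keys would see no change.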

Phase 2: Populate the empty Python GENERATION class

  • Add model, temperature, input_messages, output_messages, input_token_count, output_token_count to the Python GENERATION class in trace.py
  • Wire them into provider instrumentation (OpenAI, LiteLLM, Google, Bedrock, Cortex endpoints)
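
A populated GENERATION class might look roughly like the sketch below. The layout of trace.py is not shown in this issue, so the class structure and the BASE_SCOPE constant are assumptions. The temperature and message keys match the TypeScript constants quoted in the table above; the model and token-count key names are guesses derived from the attribute names in the first bullet.

```python
# Assumed namespace root; matches the ai.observability.* keys quoted in this issue.
BASE_SCOPE = "ai.observability"


class GENERATION:
    """Hypothetical sketch of a populated GENERATION attribute class."""

    base = BASE_SCOPE + ".generation"

    # Keys that mirror the TypeScript dashboard constants quoted in the issue.
    TEMPERATURE = base + ".temperature"
    INPUT_MESSAGES = base + ".input_messages"
    OUTPUT_MESSAGES = base + ".output_messages"

    # Assumed key names; the issue lists the attributes but not their full keys.
    MODEL = base + ".model"
    INPUT_TOKEN_COUNT = base + ".input_token_count"
    OUTPUT_TOKEN_COUNT = base + ".output_token_count"
```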

Phase 3: Capture missing OTEL GenAI attributes

  • Add gen_ai.response.finish_reasons, gen_ai.response.id
  • Add gen_ai.request.max_tokens, gen_ai.request.top_p where the provider call exposes them
  • Add gen_ai.provider.name based on which TruLens provider is active
  • Add error.type for standardized error classification
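
The Phase 3 attributes could be collected by a small helper such as the one sketched here. It assumes the response is a plain dict shaped like an OpenAI chat completion (an id field and a choices list with finish_reason entries) and that request parameters are available as a dict; real provider objects may differ. The function name genai_response_attributes is hypothetical, and using the exception class name for error.type is one plausible choice, not a confirmed TruLens decision.

```python
from typing import Any, Dict, Mapping, Optional


def genai_response_attributes(
    response: Mapping[str, Any],
    request: Mapping[str, Any],
    provider_name: str,
    error: Optional[BaseException] = None,
) -> Dict[str, Any]:
    """Build gen_ai.* attributes from an assumed OpenAI-like response dict."""
    attrs: Dict[str, Any] = {"gen_ai.provider.name": provider_name}

    # Request parameters are only recorded when the caller actually set them.
    for param in ("max_tokens", "top_p"):
        if request.get(param) is not None:
            attrs["gen_ai.request." + param] = request[param]

    if response.get("id") is not None:
        attrs["gen_ai.response.id"] = response["id"]

    # One finish reason per choice; choices without a reason are skipped.
    finish_reasons = [
        choice["finish_reason"]
        for choice in response.get("choices", [])
        if choice.get("finish_reason") is not None
    ]
    if finish_reasons:
        attrs["gen_ai.response.finish_reasons"] = finish_reasons

    if error is not None:
        # Assumption: classify errors by exception class name.
        attrs["error.type"] = type(error).__qualname__
    return attrs
```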

What to keep as TruLens-specific (ai.observability.*)

These have no OTEL GenAI equivalent and should remain in the TruLens namespace:

  • All evaluation attributes (eval_root.*, eval.*)
  • Cost currency and dollar cost (cost.cost, cost.cost_currency)
  • Reasoning tokens (cost.num_reasoning_tokens)
  • Graph/workflow orchestration (graph_task.*, graph_node.*, workflow.*)
  • Reranking operations (reranking.*)
  • MCP-specific fields (mcp.server_name, mcp.input_schema, mcp.output_is_error, mcp.execution_time_ms)
  • Record/run management (record_id, run.name, input_id, span_groups)
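
To keep these attributes out of any gen_ai.* aliasing, the list above could be encoded as a simple predicate. This is a sketch under the assumption that every entry sits directly under the ai.observability. prefix; is_trulens_only and both constants are hypothetical names.

```python
# Assumed full keys: each list entry prefixed with "ai.observability.".
TRULENS_ONLY_PREFIXES = (
    "ai.observability.eval_root.",
    "ai.observability.eval.",
    "ai.observability.graph_task.",
    "ai.observability.graph_node.",
    "ai.observability.workflow.",
    "ai.observability.reranking.",
)

TRULENS_ONLY_KEYS = frozenset(
    {
        "ai.observability.cost.cost",
        "ai.observability.cost.cost_currency",
        "ai.observability.cost.num_reasoning_tokens",
        "ai.observability.mcp.server_name",
        "ai.observability.mcp.input_schema",
        "ai.observability.mcp.output_is_error",
        "ai.observability.mcp.execution_time_ms",
        "ai.observability.record_id",
        "ai.observability.run.name",
        "ai.observability.input_id",
        "ai.observability.span_groups",
    }
)


def is_trulens_only(key: str) -> bool:
    """True if the key should stay TruLens-specific with no gen_ai.* alias."""
    return key in TRULENS_ONLY_KEYS or key.startswith(TRULENS_ONLY_PREFIXES)
```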

Difficulty

Medium-Hard (phased approach recommended)

Labels

enhancement, help wanted