# Observability with Logfire.jl

[Logfire.jl](https://github.com/svilupp/Logfire.jl) provides OpenTelemetry-based observability for your LLM applications built with PromptingTools.jl. It automatically traces all your AI calls with detailed information about tokens, costs, messages, and latency.

## Installation

Logfire.jl is a separate package that provides a PromptingTools extension. Install it along with DotEnv for loading secrets:

```julia
using Pkg
Pkg.add(["Logfire", "DotEnv"])
```

The extension loads automatically once both Logfire and PromptingTools are loaded in the same session; no additional configuration is needed.
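
If you want to confirm the extension is active, Julia's `Base.get_extension` reports whether an extension module has been loaded. The extension module name below is an assumption; check Logfire.jl's `Project.toml` for the actual name:

```julia
using Logfire, PromptingTools

# Hypothetical extension name -- the real one is listed under [extensions]
# in Logfire.jl's Project.toml.
ext = Base.get_extension(Logfire, :LogfirePromptingToolsExt)
ext === nothing && @warn "PromptingTools extension not loaded yet"
```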

## Quick Start

```julia
using DotEnv
DotEnv.load!() # Load LOGFIRE_TOKEN and API keys from .env file

using Logfire, PromptingTools

# 1. Configure Logfire (uses LOGFIRE_TOKEN env var, or pass token directly)
Logfire.configure(service_name = "my-app")

# 2. Instrument all registered models - wraps them with tracing schema
Logfire.instrument_promptingtools!()

# 3. Use PromptingTools as normal - traces are automatic!
aigenerate("What is 2 + 2?"; model = "gpt4om")
```

## How It Works

The integration works by wrapping registered models in a Logfire tracing schema. When you call `instrument_promptingtools!()`, Logfire modifies the model registry so that every call routes through its tracing layer (see the sketch after this list). This means:

- All `ai*` functions work exactly as before
- No code changes are needed in your existing workflows
- Traces are captured automatically with rich metadata
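
One way to see the wrapping is to inspect the model registry before and after instrumentation. This is only a sketch: it assumes the current `MODEL_REGISTRY` layout in PromptingTools (a `registry` Dict of `ModelSpec`s keyed by model name), and the printed wrapper type is an implementation detail of Logfire.jl:

```julia
using Logfire, PromptingTools

# Before: the provider schema registered for the model (e.g. an OpenAI schema)
println(PromptingTools.MODEL_REGISTRY.registry["gpt-4o-mini"].schema)

Logfire.instrument_promptingtools!()

# After: the same entry holds a tracing schema wrapping the original, so
# `aigenerate(...; model = "gpt4om")` is traced without any code changes.
println(PromptingTools.MODEL_REGISTRY.registry["gpt-4o-mini"].schema)
```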

## What Gets Captured

Each AI call creates a span with:

- **Request parameters**: model, temperature, top_p, max_tokens, stop, penalties
- **Usage metrics**: input/output/total tokens, latency, cost estimates
- **Provider metadata**: model returned, status, finish_reason, response_id
- **Conversation**: full message history (roles + content)
- **Cache & streaming**: flags and chunk counts
- **Tool/function calls**: count and payload
- **Errors**: exceptions with span status set to error
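
The usage metrics mirror fields that PromptingTools already exposes on the returned message, so you can sanity-check a trace locally. Span attribute names follow the OpenTelemetry GenAI semantic conventions (for example `gen_ai.usage.input_tokens`); the exact attribute set emitted is determined by Logfire.jl. A minimal check:

```julia
using PromptingTools

msg = aigenerate("What is 2 + 2?"; model = "gpt4om")

# The same values that feed the span's usage metrics
println("Tokens (prompt, completion): ", msg.tokens)
println("Latency (s): ", msg.elapsed)
println("Estimated cost (USD): ", msg.cost)
println("Finish reason: ", msg.finish_reason)
```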

## Extras Field Reference

PromptingTools populates `AIMessage.extras` with detailed metadata that Logfire.jl maps to OpenTelemetry GenAI semantic convention attributes. The fields use unified naming across providers for consistency.

### Provider Metadata

| Extras Key | Type | Description | OpenAI | Anthropic |
|------------|------|-------------|--------|-----------|
| `:model` | String | Actual model used (may differ from requested) | ✓ | ✓ |
| `:response_id` | String | Provider's unique response identifier | ✓ | ✓ |
| `:system_fingerprint` | String | OpenAI system fingerprint for determinism | ✓ | - |
| `:service_tier` | String | Service tier used (e.g., "default", "standard") | ✓ | ✓ |

### Unified Usage Keys

These keys provide cross-provider compatibility. Use these for provider-agnostic code:

| Extras Key | Type | Description | OpenAI Source | Anthropic Source |
|------------|------|-------------|---------------|------------------|
| `:cache_read_tokens` | Int | Tokens read from cache (cache hits) | `prompt_tokens_details.cached_tokens` | `cache_read_input_tokens` |
| `:cache_write_tokens` | Int | Tokens written to cache | - | `cache_creation_input_tokens` |
| `:reasoning_tokens` | Int | Chain-of-thought/reasoning tokens | `completion_tokens_details.reasoning_tokens` | - |
| `:audio_input_tokens` | Int | Audio tokens in input | `prompt_tokens_details.audio_tokens` | - |
| `:audio_output_tokens` | Int | Audio tokens in output | `completion_tokens_details.audio_tokens` | - |
| `:accepted_prediction_tokens` | Int | Predicted tokens that were accepted | `completion_tokens_details.accepted_prediction_tokens` | - |
| `:rejected_prediction_tokens` | Int | Predicted tokens that were rejected | `completion_tokens_details.rejected_prediction_tokens` | - |

### Anthropic-Specific Keys

| Extras Key | Type | Description |
|------------|------|-------------|
| `:cache_write_1h_tokens` | Int | Ephemeral 1-hour cache tokens |
| `:cache_write_5m_tokens` | Int | Ephemeral 5-minute cache tokens |
| `:web_search_requests` | Int | Server-side web search requests |
| `:cache_creation_input_tokens` | Int | Original Anthropic key (backwards compat) |
| `:cache_read_input_tokens` | Int | Original Anthropic key (backwards compat) |

### Raw Provider Dicts

For debugging or advanced use cases, the original nested structures are preserved:

| Extras Key | Provider | Contents |
|------------|----------|----------|
| `:prompt_tokens_details` | OpenAI | `{:cached_tokens, :audio_tokens}` |
| `:completion_tokens_details` | OpenAI | `{:reasoning_tokens, :audio_tokens, :accepted_prediction_tokens, :rejected_prediction_tokens}` |
| `:cache_creation` | Anthropic | `{:ephemeral_1h_input_tokens, :ephemeral_5m_input_tokens}` |
| `:server_tool_use` | Anthropic | `{:web_search_requests}` |

### Example: Accessing Extras

```julia
using PromptingTools

msg = aigenerate("What is 2+2?"; model="gpt4om")

# Provider metadata
println("Model used: ", msg.extras[:model])
println("Response ID: ", msg.extras[:response_id])

# Unified usage (works across providers)
cache_hits = get(msg.extras, :cache_read_tokens, 0)
reasoning = get(msg.extras, :reasoning_tokens, 0)

# Raw OpenAI details (if needed)
if haskey(msg.extras, :prompt_tokens_details)
    details = msg.extras[:prompt_tokens_details]
    println("Cached: ", get(details, :cached_tokens, 0))
end
```

## Instrument Individual Models

You don't have to instrument all models. For selective tracing, wrap only specific models:

```julia
Logfire.instrument_promptingtools_model!("my-local-llm")
```

This reuses the model's registered PromptingTools schema, so provider-specific behavior is preserved.
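
For example, you might register a local Ollama model with PromptingTools and trace only that model. The model name and registration parameters below are illustrative, not prescriptive:

```julia
using Logfire, PromptingTools

# Register a local model under a custom name (assumes an Ollama model tagged
# "my-local-llm" is available on the local Ollama server)
PromptingTools.register_model!(;
    name = "my-local-llm",
    schema = PromptingTools.OllamaSchema(),
    description = "Local model served by Ollama")

# Trace only this model; all other registered models stay un-instrumented
Logfire.instrument_promptingtools_model!("my-local-llm")

aigenerate("Say hi!"; model = "my-local-llm")
```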

## Alternative Backends

You don't have to use Logfire cloud. Send traces to any OpenTelemetry-compatible backend using standard environment variables:

| Variable | Purpose |
|----------|---------|
| `OTEL_EXPORTER_OTLP_ENDPOINT` | Backend URL (e.g., `http://localhost:4318`) |
| `OTEL_EXPORTER_OTLP_HEADERS` | Custom headers (e.g., `Authorization=Bearer token`) |

### Local Development with Jaeger

```bash
# Start Jaeger
docker run --rm -p 16686:16686 -p 4318:4318 jaegertracing/all-in-one:latest
```

```julia
using Logfire

ENV["OTEL_EXPORTER_OTLP_ENDPOINT"] = "http://localhost:4318"

Logfire.configure(
    service_name = "my-app",
    send_to_logfire = :always # Export even without Logfire token
)

Logfire.instrument_promptingtools!()
# Now use PromptingTools normally - traces go to Jaeger
```

View traces at: http://localhost:16686

### Using with Langfuse

```julia
ENV["OTEL_EXPORTER_OTLP_ENDPOINT"] = "https://cloud.langfuse.com/api/public/otel"
ENV["OTEL_EXPORTER_OTLP_HEADERS"] = "Authorization=Basic <base64-credentials>"

Logfire.configure(service_name = "my-llm-app", send_to_logfire = :always)
```
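
Langfuse's OTLP endpoint expects HTTP Basic auth built from your public and secret keys (base64 of `public_key:secret_key`). A small helper using Julia's standard `Base64` library, assuming you keep the keys in the environment variables shown (the key values are placeholders):

```julia
using Base64

public_key = ENV["LANGFUSE_PUBLIC_KEY"]   # e.g. "pk-lf-..."
secret_key = ENV["LANGFUSE_SECRET_KEY"]   # e.g. "sk-lf-..."
credentials = base64encode("$public_key:$secret_key")

ENV["OTEL_EXPORTER_OTLP_HEADERS"] = "Authorization=Basic $credentials"
```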

## Recommended: Pydantic Logfire

While you can use any OTLP-compatible backend, we strongly recommend [Pydantic Logfire](https://pydantic.dev/logfire). Their free tier provides hundreds of thousands of traced conversations per month, which is more than enough for most use cases. The UI is purpose-built for LLM observability with excellent visualization of conversations, token usage, and costs.

## Authentication

- Provide your Logfire token via `Logfire.configure(token = "...")` or set `ENV["LOGFIRE_TOKEN"]`
- Use `DotEnv.load!()` to load tokens from a project-local `.env` file (recommended for per-project configuration)
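
A typical project-local `.env` file looks like the sketch below. The provider key names follow PromptingTools' conventions (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`); include only the providers you actually use, and keep the file out of version control:

```
LOGFIRE_TOKEN=your-logfire-write-token
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
```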

## Example

See the full example at [`examples/observability_with_logfire.jl`](https://github.com/svilupp/PromptingTools.jl/blob/main/examples/observability_with_logfire.jl).

## Further Reading

- [Logfire.jl Documentation](https://svilupp.github.io/Logfire.jl/dev)
- [Logfire.jl GitHub](https://github.com/svilupp/Logfire.jl)
- [Pydantic Logfire](https://pydantic.dev/logfire)