
Commit d7e105f

Add Logfire.jl observability support (#322)
1 parent 39d8402 commit d7e105f

15 files changed: +1122, -32 lines

CHANGELOG.md

Lines changed: 7 additions & 0 deletions
@@ -8,13 +8,20 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added

### Fixed

### Updated

## [0.88.0]

### Added

- Added support for OpenAI's Responses API (`/responses` endpoint) via `OpenAIResponseSchema`. Supports reasoning traces, multi-turn conversations with `previous_response_id`, and structured extraction with `aiextract`. Use `aigenerate(OpenAIResponseSchema(), prompt; model="o4-mini")` for reasoning models (access via `result.extras[:reasoning_content]`). See `examples/working_with_responses_api.jl`. Note: many features are not yet supported (e.g., built-in tools).
- Added support for streaming responses with `OpenAIResponseSchema` via a dedicated `StreamCallback` flavor. See `examples/working_with_responses_api.jl`.
- Added comprehensive observability metadata to `AIMessage.extras` for Logfire.jl integration (provider metadata, unified usage keys, cache/reasoning tokens). See `examples/observability_with_logfire.jl`.

### Fixed

- Fixed `return_all` parameter not being handled correctly in tracer wrappers for `aiextract`, `aitools`, `aiscan`, and `aiimage`. Previously, when using `TracerSchema` or `SaverSchema`, these functions would pass through the raw vector result instead of returning a single message when `return_all=false` (the default).
- Fixed `aigenerate` and `aiextract` for `OpenAIResponseSchema` ignoring the `return_all` parameter, which broke compatibility with the tracer infrastructure and other patterns that rely on `return_all=true`.

## [0.87.0]

README.md

Lines changed: 60 additions & 0 deletions
@@ -103,6 +103,7 @@ For more practical examples, see the `examples/` folder and the [Advanced Exampl
- [Using MistralAI API and other OpenAI-compatible APIs](#using-mistralai-api-and-other-openai-compatible-apis)
- [Using OpenAI Responses API](#using-openai-responses-api)
- [Using Anthropic Models](#using-anthropic-models)
- [Advanced Observability with Logfire.jl](#advanced-observability-with-logfirejl)
- [More Examples](#more-examples)
- [Package Interface](#package-interface)
- [Frequently Asked Questions](#frequently-asked-questions)
@@ -657,6 +658,65 @@ msg = aigenerate(
```

### Advanced Observability with Logfire.jl

[Logfire.jl](https://github.com/svilupp/Logfire.jl) provides OpenTelemetry-based observability for your LLM applications. It automatically traces all your AI calls with detailed information about tokens, costs, messages, and latency.

**Quick Setup:**

```julia
using Pkg
Pkg.add(["Logfire", "DotEnv"]) # Install Logfire.jl to enable the extension

using DotEnv
DotEnv.load!() # Load LOGFIRE_TOKEN and API keys from .env file

using Logfire, PromptingTools

# 1. Configure Logfire (uses LOGFIRE_TOKEN env var, or pass token directly)
Logfire.configure(service_name = "my-app")

# 2. Instrument all registered models - wraps them with a tracing schema
Logfire.instrument_promptingtools!()

# 3. Use PromptingTools as normal - traces are automatic!
aigenerate("What is 2 + 2?"; model = "gpt4om")
```

**What Gets Captured:**
- Token usage (input/output/total) and cost estimates
- Full conversation history (system, user, assistant messages)
- Model parameters (temperature, max_tokens, etc.)
- Latency measurements and cache/streaming flags
- Tool/function calls and structured extraction results
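
You can spot-check much of this locally on the returned message; Logfire records the same data on each span. A minimal sketch (field availability can vary by provider and model):

```julia
msg = aigenerate("What is 2 + 2?"; model = "gpt4om")

msg.tokens   # (input_tokens, output_tokens)
msg.elapsed  # latency in seconds
msg.cost     # estimated cost in USD (when pricing is known for the model)
msg.extras   # provider metadata + unified usage keys
```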

**Instrument Individual Models:**

You don't have to instrument all models. Wrap only specific models for selective tracing:

```julia
Logfire.instrument_promptingtools_model!("my-local-llm")
```

**Alternative Backends:**

You don't have to use Logfire cloud - send traces to any OpenTelemetry-compatible backend:

```julia
# Local development with Jaeger
ENV["OTEL_EXPORTER_OTLP_ENDPOINT"] = "http://localhost:4318"
Logfire.configure(service_name = "my-app", send_to_logfire = :always)

# Or use Langfuse
ENV["OTEL_EXPORTER_OTLP_ENDPOINT"] = "https://cloud.langfuse.com/api/public/otel"
ENV["OTEL_EXPORTER_OTLP_HEADERS"] = "Authorization=Basic <base64-credentials>"
Logfire.configure(service_name = "my-app", send_to_logfire = :always)
```

That said, I strongly recommend [Pydantic Logfire](https://pydantic.dev/logfire) - their free tier provides hundreds of thousands of traced conversations per month, which is more than enough for most use cases.

See the [Logfire.jl documentation](https://svilupp.github.io/Logfire.jl/dev) and [`examples/observability_with_logfire.jl`](examples/observability_with_logfire.jl) for more details.
### More Examples

TBU...

docs/src/.vitepress/config.mts

Lines changed: 3 additions & 1 deletion
@@ -54,6 +54,7 @@ export default defineConfig({
        { text: 'RAGTools', link: '/extra_tools/rag_tools_intro' },
        { text: 'RAGTools Migration', link: '/ragtools_migration' },
        { text: 'APITools', link: '/extra_tools/api_tools_intro' },
        { text: 'Observability (Logfire)', link: '/extra_tools/observability_logfire' },
      ]
    },
  ],
@@ -92,7 +93,8 @@ export default defineConfig({
    { text: 'Extra Tools', collapsed: true, items: [
      { text: 'Text Utilities', link: '/extra_tools/text_utilities_intro' },
      { text: 'AgentTools', link: '/extra_tools/agent_tools_intro' },
      { text: 'APITools', link: '/extra_tools/api_tools_intro' },
      { text: 'Observability (Logfire)', link: '/extra_tools/observability_logfire' }]
    },
  ],
},
docs/src/extra_tools/observability_logfire.md

Lines changed: 192 additions & 0 deletions
@@ -0,0 +1,192 @@
# Observability with Logfire.jl

[Logfire.jl](https://github.com/svilupp/Logfire.jl) provides OpenTelemetry-based observability for your LLM applications built with PromptingTools.jl. It automatically traces all your AI calls with detailed information about tokens, costs, messages, and latency.

## Installation

Logfire.jl is a separate package that provides a PromptingTools extension. Install it along with DotEnv for loading secrets:

```julia
using Pkg
Pkg.add(["Logfire", "DotEnv"])
```

The extension is loaded automatically when both packages are present - no additional configuration needed.

## Quick Start

```julia
using DotEnv
DotEnv.load!() # Load LOGFIRE_TOKEN and API keys from .env file

using Logfire, PromptingTools

# 1. Configure Logfire (uses LOGFIRE_TOKEN env var, or pass token directly)
Logfire.configure(service_name = "my-app")

# 2. Instrument all registered models - wraps them with a tracing schema
Logfire.instrument_promptingtools!()

# 3. Use PromptingTools as normal - traces are automatic!
aigenerate("What is 2 + 2?"; model = "gpt4om")
```

## How It Works

The integration works by wrapping registered models in a Logfire tracing schema. When you call `instrument_promptingtools!()`, Logfire modifies the model registry to route all calls through its tracing layer. This means:

- All `ai*` functions work exactly as before
- No code changes needed in your existing workflows
- Traces are captured automatically with rich metadata
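
This is the same pattern as PromptingTools' own tracer wrappers: a provider schema is sandwiched inside an outer schema that records metadata around each call. A minimal sketch using the built-in `TracerSchema` for illustration only (Logfire supplies its own tracing schema, not this one):

```julia
using PromptingTools
const PT = PromptingTools

# Wrap the provider schema in a tracer: the outer schema annotates each call
# (timings, metadata), while the inner schema performs the actual API request.
schema = PT.TracerSchema(PT.OpenAISchema())
msg = aigenerate(schema, "What is 2 + 2?"; model = "gpt4om")
```
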
## What Gets Captured

Each AI call creates a span with:

- **Request parameters**: model, temperature, top_p, max_tokens, stop, penalties
- **Usage metrics**: input/output/total tokens, latency, cost estimates
- **Provider metadata**: model returned, status, finish_reason, response_id
- **Conversation**: full message history (roles + content)
- **Cache & streaming**: flags and chunk counts
- **Tool/function calls**: count and payload
- **Errors**: exceptions with span status set to error (calls still throw as usual; see the sketch below)
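
Failures are not swallowed: the exception propagates to your code while the span records it. A minimal sketch (the invalid `api_key` is just a way to force a failure):

```julia
try
    aigenerate("Hello"; model = "gpt4om", api_key = "invalid-key")
catch err
    # The span for this call records the exception and an error status;
    # your application still sees the exception as usual.
    @warn "AI call failed" err
end
```
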
## Extras Field Reference

PromptingTools populates `AIMessage.extras` with detailed metadata that Logfire.jl maps to OpenTelemetry GenAI semantic convention attributes. The fields use unified naming across providers for consistency.

### Provider Metadata

| Extras Key | Type | Description | OpenAI | Anthropic |
|------------|------|-------------|--------|-----------|
| `:model` | String | Actual model used (may differ from requested) | ✓ | ✓ |
| `:response_id` | String | Provider's unique response identifier | ✓ | ✓ |
| `:system_fingerprint` | String | OpenAI system fingerprint for determinism | ✓ | - |
| `:service_tier` | String | Service tier used (e.g., "default", "standard") | ✓ | ✓ |

### Unified Usage Keys

These keys provide cross-provider compatibility. Use these for provider-agnostic code:

| Extras Key | Type | Description | OpenAI Source | Anthropic Source |
|------------|------|-------------|---------------|------------------|
| `:cache_read_tokens` | Int | Tokens read from cache (cache hits) | `prompt_tokens_details.cached_tokens` | `cache_read_input_tokens` |
| `:cache_write_tokens` | Int | Tokens written to cache | - | `cache_creation_input_tokens` |
| `:reasoning_tokens` | Int | Chain-of-thought/reasoning tokens | `completion_tokens_details.reasoning_tokens` | - |
| `:audio_input_tokens` | Int | Audio tokens in input | `prompt_tokens_details.audio_tokens` | - |
| `:audio_output_tokens` | Int | Audio tokens in output | `completion_tokens_details.audio_tokens` | - |
| `:accepted_prediction_tokens` | Int | Predicted tokens that were accepted | `completion_tokens_details.accepted_prediction_tokens` | - |
| `:rejected_prediction_tokens` | Int | Predicted tokens that were rejected | `completion_tokens_details.rejected_prediction_tokens` | - |
### Anthropic-Specific Keys

| Extras Key | Type | Description |
|------------|------|-------------|
| `:cache_write_1h_tokens` | Int | Ephemeral 1-hour cache tokens |
| `:cache_write_5m_tokens` | Int | Ephemeral 5-minute cache tokens |
| `:web_search_requests` | Int | Server-side web search requests |
| `:cache_creation_input_tokens` | Int | Original Anthropic key (backwards compat) |
| `:cache_read_input_tokens` | Int | Original Anthropic key (backwards compat) |

### Raw Provider Dicts

For debugging or advanced use cases, the original nested structures are preserved:

| Extras Key | Provider | Contents |
|------------|----------|----------|
| `:prompt_tokens_details` | OpenAI | `{:cached_tokens, :audio_tokens}` |
| `:completion_tokens_details` | OpenAI | `{:reasoning_tokens, :audio_tokens, :accepted_prediction_tokens, :rejected_prediction_tokens}` |
| `:cache_creation` | Anthropic | `{:ephemeral_1h_input_tokens, :ephemeral_5m_input_tokens}` |
| `:server_tool_use` | Anthropic | `{:web_search_requests}` |

### Example: Accessing Extras

```julia
using PromptingTools

msg = aigenerate("What is 2+2?"; model="gpt4om")

# Provider metadata
println("Model used: ", msg.extras[:model])
println("Response ID: ", msg.extras[:response_id])

# Unified usage (works across providers)
cache_hits = get(msg.extras, :cache_read_tokens, 0)
reasoning = get(msg.extras, :reasoning_tokens, 0)

# Raw OpenAI details (if needed)
if haskey(msg.extras, :prompt_tokens_details)
    details = msg.extras[:prompt_tokens_details]
    println("Cached: ", get(details, :cached_tokens, 0))
end
```

## Instrument Individual Models

You don't have to instrument all models. For selective tracing, wrap only specific models:

```julia
Logfire.instrument_promptingtools_model!("my-local-llm")
```

This reuses the model's registered PromptingTools schema, so provider-specific behavior is preserved.

## Alternative Backends

You don't have to use Logfire cloud. Send traces to any OpenTelemetry-compatible backend using standard environment variables:

| Variable | Purpose |
|----------|---------|
| `OTEL_EXPORTER_OTLP_ENDPOINT` | Backend URL (e.g., `http://localhost:4318`) |
| `OTEL_EXPORTER_OTLP_HEADERS` | Custom headers (e.g., `Authorization=Bearer token`) |

### Local Development with Jaeger

```bash
# Start Jaeger
docker run --rm -p 16686:16686 -p 4318:4318 jaegertracing/all-in-one:latest
```

```julia
using Logfire

ENV["OTEL_EXPORTER_OTLP_ENDPOINT"] = "http://localhost:4318"

Logfire.configure(
    service_name = "my-app",
    send_to_logfire = :always  # Export even without a Logfire token
)

Logfire.instrument_promptingtools!()
# Now use PromptingTools normally - traces go to Jaeger
```

View traces at: http://localhost:16686

### Using with Langfuse

```julia
ENV["OTEL_EXPORTER_OTLP_ENDPOINT"] = "https://cloud.langfuse.com/api/public/otel"
ENV["OTEL_EXPORTER_OTLP_HEADERS"] = "Authorization=Basic <base64-credentials>"

Logfire.configure(service_name = "my-llm-app", send_to_logfire = :always)
```

## Recommended: Pydantic Logfire

While you can use any OTLP-compatible backend, we strongly recommend [Pydantic Logfire](https://pydantic.dev/logfire). Their free tier provides hundreds of thousands of traced conversations per month, which is more than enough for most use cases. The UI is purpose-built for LLM observability with excellent visualization of conversations, token usage, and costs.

## Authentication

- Provide your Logfire token via `Logfire.configure(token = "...")` or set `ENV["LOGFIRE_TOKEN"]` (both options are sketched below)
- Use `DotEnv.load!()` to load tokens from a project-local `.env` file (recommended for per-project configuration)
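
A minimal sketch of both options (the token string is a placeholder, not a real credential):

```julia
using Logfire

# Option 1: pass the token explicitly
Logfire.configure(service_name = "my-app", token = "<your-logfire-token>")

# Option 2: rely on the LOGFIRE_TOKEN environment variable
# (e.g., set in your shell, or loaded from .env via DotEnv.load!())
ENV["LOGFIRE_TOKEN"] = "<your-logfire-token>"
Logfire.configure(service_name = "my-app")
```
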
## Example

See the full example at [`examples/observability_with_logfire.jl`](https://github.com/svilupp/PromptingTools.jl/blob/main/examples/observability_with_logfire.jl).

## Further Reading

- [Logfire.jl Documentation](https://svilupp.github.io/Logfire.jl/dev)
- [Logfire.jl GitHub](https://github.com/svilupp/Logfire.jl)
- [Pydantic Logfire](https://pydantic.dev/logfire)
