ENG-916: Add service.version tracking and tenant identity across all deployments #137

Merged
prathamesh-sonpatki merged 7 commits into main from
prathamesh/eng-916-mcp-add-serviceversion-tracking-across-all-per-customer
Apr 27, 2026

@prathamesh-sonpatki prathamesh-sonpatki commented Apr 24, 2026

Summary

  • Root cause fix: Dockerfile used plain go build with no -ldflags — every Docker image baked in Version="dev". Added ARG VERSION/COMMIT and pass via -ldflags so service.version matches the git tag in production.
  • CI: Pass VERSION and COMMIT build-args to all 4 docker build steps (docker-registry.last9.io and ECR, both release-branch and main-branch paths).
  • Tenant identity on all telemetry: Moved OTel init after PopulateAPICfg so cfg.OrgSlug and cfg.ClusterID are available at resource build time. Added last9.tenant and last9.cluster_id as OTel resource attributes — auto-propagates to every span, metric, and log.
  • Version inventory metric: last9_mcp_server_info observable gauge with version, commit, tenant, cluster_id labels. Query: count by (version, tenant) (last9_mcp_server_info).
  • Restart/deploy detection: Change event fires on every startup (non-dev builds) with version + commit. Same version + same commit = restart; new version = deploy. Read-only token → 403 → warning logged, server continues.
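The restart/deploy rule in the last bullet can be sketched as a small comparison. This is only an illustration: `startEvent` and `classifyStart` are hypothetical names, and the summary does not say how "same version, different commit" should be treated, so the sketch reports that case as `unknown` rather than guessing.

```go
package main

import "fmt"

// startEvent is a hypothetical record of one startup change event.
type startEvent struct {
	Version string
	Commit  string
}

// classifyStart compares the previous and current startup events using the
// rule from the summary: same version and same commit is a restart, a new
// version is a deploy. Any other combination is left as "unknown" because
// the PR does not define it.
func classifyStart(prev, cur startEvent) string {
	switch {
	case prev.Version != cur.Version:
		return "deploy"
	case prev.Commit == cur.Commit:
		return "restart"
	default:
		return "unknown"
	}
}

func main() {
	prev := startEvent{Version: "v1.0.0", Commit: "abc1234"}
	fmt.Println(classifyStart(prev, startEvent{Version: "v1.0.0", Commit: "abc1234"}))
	fmt.Println(classifyStart(prev, startEvent{Version: "v1.1.0", Commit: "def5678"}))
}
```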

Test plan

  • Build with -X main.Version=v0.0.0-test — verify service.version=v0.0.0-test on emitted spans
  • Run locally — verify last9_mcp_server_info{version="v0.0.0-test", commit=..., tenant=..., cluster_id=...} in OTLP output
  • Make a tool call — verify last9.tenant and last9.cluster_id on span resource attributes
  • Goreleaser dry-run — confirm ldflags inject correctly
  • Check change event fires on startup; check 403 path logs warning cleanly
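For readers unfamiliar with the inventory query `count by (version, tenant) (last9_mcp_server_info)`, the aggregation can be modelled in plain Go. This is a sketch under assumptions: it treats each running server as one series carrying the four labels listed in the summary, and the `serverInfo`, `groupKey`, and `countByVersionTenant` names and the sample values are invented for illustration.

```go
package main

import "fmt"

// serverInfo models one last9_mcp_server_info series, assuming one series
// per running server with the labels named in the PR summary.
type serverInfo struct {
	Version   string
	Commit    string
	Tenant    string
	ClusterID string
}

// groupKey is the (version, tenant) pair the query groups by.
type groupKey struct {
	Version string
	Tenant  string
}

// countByVersionTenant counts series per (version, tenant) pair, which is
// what `count by (version, tenant) (...)` computes over the label set.
func countByVersionTenant(series []serverInfo) map[groupKey]int {
	counts := make(map[groupKey]int)
	for _, s := range series {
		counts[groupKey{Version: s.Version, Tenant: s.Tenant}]++
	}
	return counts
}

func main() {
	// Hypothetical fleet: two servers for one tenant, one for another.
	series := []serverInfo{
		{Version: "v1.2.0", Commit: "abc1234", Tenant: "acme", ClusterID: "c-1"},
		{Version: "v1.2.0", Commit: "abc1234", Tenant: "acme", ClusterID: "c-2"},
		{Version: "v1.1.0", Commit: "9f8e7d6", Tenant: "globex", ClusterID: "c-9"},
	}
	for key, n := range countByVersionTenant(series) {
		fmt.Printf("version=%s tenant=%s count=%d\n", key.Version, key.Tenant, n)
	}
}
```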

Closes: https://linear.app/last9/issue/ENG-916

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added deployment change event tracking to report server startup events.
    • Introduced server info metrics for observability.
  • Improvements

    • Enhanced telemetry with tenant and cluster identification support.
    • Version and build metadata now embedded in server binary.
  • Chores

    • Optimized Docker build process to pass build-time arguments for version tracking.

…deployments

- Fix root cause: Dockerfile used plain `go build` with no -ldflags, baking
  Version="dev" into every Docker image. Add ARG VERSION/COMMIT and pass them
  via -ldflags so service.version matches the git tag in production.
- Pass build-args (VERSION, COMMIT) to all four docker build steps in CI
  (docker-registry and ECR, both release-branch and main-branch paths).
- Move OTel init after PopulateAPICfg so cfg.OrgSlug and cfg.ClusterID are
  available at resource build time; add last9.tenant and last9.cluster_id as
  resource attributes — propagates to all spans, metrics, and logs automatically.
- Emit last9_mcp_server_info observable gauge with version, commit, tenant,
  cluster_id labels for cross-customer version inventory dashboards.
- Emit "MCP server start" change event on every startup (non-dev builds) with
  version + commit attributes; same version + same commit = restart detection,
  new version = deploy detection. Non-2xx (e.g. read-only token) logs a warning.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
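As a minimal sketch of the root-cause fix in the first bullet: the Go linker's `-X importpath.name=value` flag overwrites package-level string variables at build time, so a plain `go build` leaves the defaults in place. `main.Version` and its `"dev"` default come from this PR; the `Commit` and `BuildTime` variable names, their `"unknown"` defaults, and the two helper functions are assumptions made for the example.

```go
package main

import "fmt"

// Build metadata. A plain `go build` keeps these defaults; a release build
// can override them at link time, for example:
//
//	go build -ldflags "-X main.Version=v0.0.0-test -X main.Commit=abc1234" .
//
// Only Version and its "dev" default are named in the PR. Commit and
// BuildTime are assumed names for this sketch.
var (
	Version   = "dev"
	Commit    = "unknown"
	BuildTime = "unknown"
)

// isDevBuild reports whether no version was injected at link time.
func isDevBuild() bool {
	return Version == "dev"
}

// buildInfo renders the metadata as one line, e.g. for a startup log.
func buildInfo() string {
	return fmt.Sprintf("version=%s commit=%s built=%s", Version, Commit, BuildTime)
}

func main() {
	fmt.Println(buildInfo())
	fmt.Println("dev build:", isDevBuild())
}
```

Because `-X` only applies to string variables, keeping all three values as strings avoids any parsing at link time.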
@prathamesh-sonpatki

@coderabbitai do a full review

coderabbitai Bot commented Apr 24, 2026

@prathamesh-sonpatki Sure! I'll kick off a full review of this PR right away.

✅ Actions performed

Full review triggered.

coderabbitai Bot commented Apr 24, 2026

Walkthrough

Changes enable Docker builds to embed version and commit metadata into binaries through build-time arguments. Application startup is reordered to initialize auth and API configuration before telemetry setup, enriching telemetry with tenant/cluster attributes and enabling asynchronous deployment change event emission.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Docker Build Configuration | .github/workflows/release.yaml, Dockerfile | Docker builds now receive and inject VERSION, COMMIT, and BUILD_TIME as build-time arguments; release workflow updated to pass version/commit parameters to Buildx and ECR push steps via build-args and environment variables. |
| Telemetry Enhancement | internal/telemetry/setup.go | InitProviders function signature expanded to accept tenant and clusterID parameters, which are injected as OpenTelemetry resource attributes (last9.tenant, last9.cluster_id) for enhanced observability context. |
| Application Initialization | main.go | Startup sequence reordered to initialize auth token manager and API configuration before telemetry setup; adds asynchronous deployment change event emission to /change_events endpoint and registers observable gauge last9_mcp_server_info for server identity metrics. |
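Why the ordering matters can be shown with a stdlib-only sketch. It deliberately does not use the OpenTelemetry SDK: a plain map stands in for the resource, and `config`, `resourceAttributes`, and the choice to skip empty values are assumptions for illustration, not the code in internal/telemetry/setup.go. Only the attribute keys and the cfg.OrgSlug / cfg.ClusterID field names come from the PR.

```go
package main

import "fmt"

// config stands in for the fields filled in by PopulateAPICfg.
type config struct {
	OrgSlug   string
	ClusterID string
}

// resourceAttributes builds the attribute set attached to all telemetry.
// Empty values are skipped (an assumption of this sketch), so calling it
// before the config is populated silently loses the tenant identity.
func resourceAttributes(version string, cfg config) map[string]string {
	attrs := map[string]string{"service.version": version}
	if cfg.OrgSlug != "" {
		attrs["last9.tenant"] = cfg.OrgSlug
	}
	if cfg.ClusterID != "" {
		attrs["last9.cluster_id"] = cfg.ClusterID
	}
	return attrs
}

func main() {
	var cfg config

	// Old order: telemetry initialised before the API config is known.
	fmt.Println(len(resourceAttributes("v1.2.3", cfg)), "attribute(s) before config")

	// New order: config populated first (hypothetical values), then telemetry.
	cfg = config{OrgSlug: "acme", ClusterID: "c-1"}
	fmt.Println(len(resourceAttributes("v1.2.3", cfg)), "attribute(s) after config")
}
```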

Sequence Diagram(s)

sequenceDiagram
    participant main as main.go
    participant auth as Auth Manager
    participant api as API Config
    participant otel as OpenTelemetry
    participant gauge as Observable Gauge
    participant events as Change Events Endpoint

    main->>auth: Initialize token manager
    auth-->>main: Token ready
    
    main->>api: Populate API configuration
    api-->>main: Config initialized (tenant, clusterID)
    
    main->>otel: InitProviders(version, tenant, clusterID)
    otel->>otel: Add resource attributes
    otel-->>main: Providers initialized
    
    main->>gauge: Register last9_mcp_server_info
    gauge-->>main: Gauge registered
    
    main->>events: emitDeployChangeEvent (async)
    events->>events: Marshal event with version/commit
    events-->>main: HTTP PUT /change_events
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 75.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title accurately reflects the main objective of the changeset: adding service version tracking and tenant identity across all deployments through Docker build arguments, telemetry enhancements, and metrics/change event emission. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/release.yaml:
- Around line 165-167: The workflow's build-args block is missing the BUILD_TIME
value expected by the Dockerfile ARG BUILD_TIME=unknown; update the build-args
(alongside VERSION and COMMIT) to pass BUILD_TIME (for example using
github.event.head_commit.timestamp or another CI timestamp) so the Docker build
receives the timestamp instead of defaulting to "unknown"; modify the same
build-args block where VERSION and COMMIT are set to include BUILD_TIME.

In `@main.go`:
- Around line 22-37: The import block in main.go mixes standard library and
local imports; reorder imports so standard library packages (bytes,
encoding/json, net/http) appear first, then a blank line, then third-party
packages (github.com/...), then another blank line and finally local packages
like "last9-mcp/internal/constants"; update the import grouping accordingly and
run goimports/gofmt (or your editor's import organizer) to apply the
conventional stdlib → third-party → local ordering.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: 61dbd42c-b17f-4993-95e3-a3f835c5a30e

📥 Commits

Reviewing files that changed from the base of the PR and between 361aeef and 6657925.

📒 Files selected for processing (4)
  • .github/workflows/release.yaml
  • Dockerfile
  • internal/telemetry/setup.go
  • main.go

prathamesh-sonpatki and others added 2 commits April 24, 2026 20:01
…sistency

- Remove emitDeployChangeEvent (API shape unclear, deferred to follow-up)
- Fix import groups: stdlib → external → internal (goimports order)
- Remove duplicate mcp.server.version resource attribute (service.version already set)
- Use last9. prefix consistently on gauge attributes (last9.tenant, last9.cluster_id)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… double blank line

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@prathamesh-sonpatki

MCP evals: 71/71 passed ✓ (80s)

| Suite | Passed | Failed |
| --- | --- | --- |
| datasource_selection | 4/4 | 0 |
| exception_investigation | 7/7 | 0 |
| get_alert_config | 6/6 | 0 |
| get_alerts | 6/6 | 0 |
| log_query | 13/13 | 0 |
| trace_query | 12/12 | 0 |
| type_coercion | 23/23 | 0 |

Evaluated against SHA 678ea63248867bcaecece540eacd585c3ecfe65f.

coderabbitai Bot previously approved these changes Apr 27, 2026
kneurgao previously approved these changes Apr 27, 2026
…cate resource attributes

- Add BUILD_TIME build-arg to all four docker build invocations in
  release.yaml (both buildx and raw docker CLI paths for ECR and
  docker-registry); previously always baked in "unknown"
- Remove last9.tenant and last9.cluster_id from last9_mcp_server_info
  gauge callback — already present as OTel resource attributes via
  InitProviders, no need to duplicate at metric-point level

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@prathamesh-sonpatki

MCP evals: 71/71 passed ✓ (53s)

| Suite | Passed | Failed |
| --- | --- | --- |
| datasource_selection | 4/4 | 0 |
| exception_investigation | 7/7 | 0 |
| get_alert_config | 6/6 | 0 |
| get_alerts | 6/6 | 0 |
| log_query | 13/13 | 0 |
| trace_query | 12/12 | 0 |
| type_coercion | 23/23 | 0 |

Evaluated against SHA 15630deec38e367fa9f953295a3775f19baedf25.

prathamesh-sonpatki and others added 2 commits April 27, 2026 17:18
The test was asserting Timestamp == startTimeParam (1770649445). After
the fix in #138 where MakePromRangeAPIQuery and MakePromLabelValuesAPIQuery
switched to endTimeParam, the correct expected timestamp is endTimeParam
(1770653045 = 2026-02-09T16:04:05Z). Update assertion accordingly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
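The timestamp arithmetic in this commit message can be checked with a few lines of Go. The two epoch values are taken from the message; `toUTC` is a helper introduced only for this sketch.

```go
package main

import (
	"fmt"
	"time"
)

// Epoch seconds quoted in the commit message.
const (
	startTimeParam int64 = 1770649445
	endTimeParam   int64 = 1770653045
)

// toUTC formats epoch seconds as an RFC 3339 timestamp in UTC.
func toUTC(sec int64) string {
	return time.Unix(sec, 0).UTC().Format(time.RFC3339)
}

func main() {
	fmt.Println(toUTC(startTimeParam))         // 2026-02-09T15:04:05Z
	fmt.Println(toUTC(endTimeParam))           // 2026-02-09T16:04:05Z
	fmt.Println(endTimeParam - startTimeParam) // 3600, a one-hour window
}
```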
…ceversion-tracking-across-all-per-customer' into prathamesh/eng-916-mcp-add-serviceversion-tracking-across-all-per-customer
@prathamesh-sonpatki prathamesh-sonpatki enabled auto-merge (squash) April 27, 2026 11:59
@prathamesh-sonpatki

MCP evals: 71/71 passed ✓ (51s)

| Suite | Passed | Failed |
| --- | --- | --- |
| datasource_selection | 4/4 | 0 |
| exception_investigation | 7/7 | 0 |
| get_alert_config | 6/6 | 0 |
| get_alerts | 6/6 | 0 |
| log_query | 13/13 | 0 |
| trace_query | 12/12 | 0 |
| type_coercion | 23/23 | 0 |

Evaluated against SHA b73af036be2de6f15980cf0f75164109376b9002.

@prathamesh-sonpatki prathamesh-sonpatki merged commit 9a40e29 into main Apr 27, 2026
7 checks passed
@prathamesh-sonpatki prathamesh-sonpatki deleted the prathamesh/eng-916-mcp-add-serviceversion-tracking-across-all-per-customer branch April 27, 2026 12:49
@prathamesh-sonpatki prathamesh-sonpatki mentioned this pull request Apr 28, 2026
4 tasks