ENG-916: Add service.version tracking and tenant identity across all deployments #137
…deployments

- Fix root cause: Dockerfile used plain `go build` with no -ldflags, baking Version="dev" into every Docker image. Add ARG VERSION/COMMIT and pass them via -ldflags so service.version matches the git tag in production.
- Pass build-args (VERSION, COMMIT) to all four docker build steps in CI (docker-registry and ECR, both release-branch and main-branch paths).
- Move OTel init after PopulateAPICfg so cfg.OrgSlug and cfg.ClusterID are available at resource build time; add last9.tenant and last9.cluster_id as resource attributes, which propagate to all spans, metrics, and logs automatically.
- Emit last9_mcp_server_info observable gauge with version, commit, tenant, cluster_id labels for cross-customer version inventory dashboards.
- Emit "MCP server start" change event on every startup (non-dev builds) with version + commit attributes; same version + new commit = restart detection, new version = deploy detection. Non-2xx (e.g. read-only token) logs a warning.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
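The `-ldflags` fix relies on the Go linker's `-X` flag, which overwrites package-level string variables at build time. A minimal sketch of that mechanism, assuming the variables live in package `main`; `main.Version` appears in this PR's test plan, while `Commit`, `BuildTime`, and their defaults are assumed names for illustration:

```go
package main

import "fmt"

// Defaults used when the binary is built with a plain `go build`.
// The linker replaces them when invoked as, for example:
//
//	go build -ldflags "-X main.Version=v1.2.3 -X main.Commit=abc1234"
//
// Commit and BuildTime are assumed names; only main.Version is named in the PR.
var (
	Version   = "dev"
	Commit    = "unknown"
	BuildTime = "unknown"
)

// buildInfo formats the embedded metadata, e.g. for a startup log line.
func buildInfo() string {
	return fmt.Sprintf("version=%s commit=%s built=%s", Version, Commit, BuildTime)
}

func main() {
	fmt.Println(buildInfo())
}
```

Without the flags, the defaults survive into the binary, which is how a Docker image can end up reporting `dev` regardless of the tag it was built from.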
@coderabbitai do a full review

✅ Actions performed: Full review triggered.
📝 Walkthrough
Changes enable Docker builds to embed version and commit metadata into binaries through build-time arguments. Application startup is reordered to initialize auth and API configuration before telemetry setup, enriching telemetry with tenant/cluster attributes and enabling asynchronous deployment change event emission.
Sequence Diagram(s)
sequenceDiagram
participant main as main.go
participant auth as Auth Manager
participant api as API Config
participant otel as OpenTelemetry
participant gauge as Observable Gauge
participant events as Change Events Endpoint
main->>auth: Initialize token manager
auth-->>main: Token ready
main->>api: Populate API configuration
api-->>main: Config initialized (tenant, clusterID)
main->>otel: InitProviders(version, tenant, clusterID)
otel->>otel: Add resource attributes
otel-->>main: Providers initialized
main->>gauge: Register last9_mcp_server_info
gauge-->>main: Gauge registered
main->>events: emitDeployChangeEvent (async)
events->>events: Marshal event with version/commit
events-->>main: HTTP PUT /change_events
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/release.yaml:
- Around line 165-167: The workflow's build-args block is missing the BUILD_TIME
value expected by the Dockerfile ARG BUILD_TIME=unknown; update the build-args
(alongside VERSION and COMMIT) to pass BUILD_TIME (for example using
github.event.head_commit.timestamp or another CI timestamp) so the Docker build
receives the timestamp instead of defaulting to "unknown"; modify the same
build-args block where VERSION and COMMIT are set to include BUILD_TIME.
In `@main.go`:
- Around line 22-37: The import block in main.go mixes standard library and
local imports; reorder imports so standard library packages (bytes,
encoding/json, net/http) appear first, then a blank line, then third-party
packages (github.com/...), then another blank line and finally local packages
like "last9-mcp/internal/constants"; update the import grouping accordingly and
run goimports/gofmt (or your editor's import organizer) to apply the
conventional stdlib → third-party → local ordering.
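For reference, the grouping this comment asks for looks roughly like the sketch below. The third-party and local groups are shown as comments so the snippet compiles on its own, and the `encode` helper is hypothetical, present only so the imports are used.

```go
package main

import (
	// Group 1: standard library.
	"bytes"
	"encoding/json"
	"fmt"

	// Group 2: third-party modules follow after a blank line, e.g.
	//   "github.com/..."

	// Group 3: packages local to this module come last, e.g.
	//   "last9-mcp/internal/constants"
)

// encode is a hypothetical helper that serializes attributes as compact JSON.
func encode(attrs map[string]string) (string, error) {
	var buf bytes.Buffer
	if err := json.NewEncoder(&buf).Encode(attrs); err != nil {
		return "", err
	}
	// Encoder appends a trailing newline; trim it.
	return string(bytes.TrimSpace(buf.Bytes())), nil
}

func main() {
	s, err := encode(map[string]string{"version": "v0.0.0-test", "commit": "abc1234"})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(s)
}
```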
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: 61dbd42c-b17f-4993-95e3-a3f835c5a30e
📒 Files selected for processing (4)
- .github/workflows/release.yaml
- Dockerfile
- internal/telemetry/setup.go
- main.go
…sistency

- Remove emitDeployChangeEvent (API shape unclear, deferred to follow-up)
- Fix import groups: stdlib → external → internal (goimports order)
- Remove duplicate mcp.server.version resource attribute (service.version already set)
- Use last9. prefix consistently on gauge attributes (last9.tenant, last9.cluster_id)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… double blank line Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
MCP evals — 71/71 passed ✓ (80s)
Evaluated against SHA
…cate resource attributes

- Add BUILD_TIME build-arg to all four docker build invocations in release.yaml (both buildx and raw docker CLI paths for ECR and docker-registry); previously always baked in "unknown"
- Remove last9.tenant and last9.cluster_id from last9_mcp_server_info gauge callback; they are already present as OTel resource attributes via InitProviders, so there is no need to duplicate them at metric-point level

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ecffd83
…acking-across-all-per-customer
MCP evals — 71/71 passed ✓ (53s)
Evaluated against SHA
The test was asserting Timestamp == startTimeParam (1770649445). After the fix in #138 where MakePromRangeAPIQuery and MakePromLabelValuesAPIQuery switched to endTimeParam, the correct expected timestamp is endTimeParam (1770653045 = 2026-02-09T16:04:05Z). Update assertion accordingly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
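The two timestamps in this commit message can be checked with a few lines of Go. The constant names mirror the ones quoted above; the helper `rfc3339UTC` is hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

const (
	startTimeParam int64 = 1770649445
	endTimeParam   int64 = 1770653045
)

// rfc3339UTC renders a Unix-seconds timestamp as an RFC 3339 string in UTC.
func rfc3339UTC(sec int64) string {
	return time.Unix(sec, 0).UTC().Format(time.RFC3339)
}

func main() {
	fmt.Println(rfc3339UTC(startTimeParam))                               // 2026-02-09T15:04:05Z
	fmt.Println(rfc3339UTC(endTimeParam))                                 // 2026-02-09T16:04:05Z
	fmt.Println(time.Duration(endTimeParam-startTimeParam) * time.Second) // 1h0m0s
}
```

The two values are exactly one hour apart, so an assertion against the wrong parameter fails by 3600 seconds rather than by a rounding error.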
…ceversion-tracking-across-all-per-customer' into prathamesh/eng-916-mcp-add-serviceversion-tracking-across-all-per-customer
MCP evals — 71/71 passed ✓ (51s)
Evaluated against SHA
Summary

- `Dockerfile` used plain `go build` with no `-ldflags`, so every Docker image baked in `Version="dev"`. Added `ARG VERSION`/`COMMIT` and pass them via `-ldflags` so `service.version` matches the git tag in production.
- Pass `VERSION` and `COMMIT` build-args to all 4 docker build steps (docker-registry.last9.io and ECR, both release-branch and main-branch paths).
- Moved OTel init after `PopulateAPICfg` so `cfg.OrgSlug` and `cfg.ClusterID` are available at resource build time. Added `last9.tenant` and `last9.cluster_id` as OTel resource attributes, which auto-propagate to every span, metric, and log.
- Emit `last9_mcp_server_info` observable gauge with `version`, `commit`, `tenant`, `cluster_id` labels. Query: `count by (version, tenant) (last9_mcp_server_info)`.
- Emit "MCP server start" change event on every startup with `version` + `commit`. Same version + same commit = restart; new version = deploy. Read-only token → 403 → warning logged, server continues.

Test plan

- Build with `-X main.Version=v0.0.0-test`, then verify `service.version=v0.0.0-test` on emitted spans
- `last9_mcp_server_info{version="v0.0.0-test", commit=..., tenant=..., cluster_id=...}` in OTLP output
- `last9.tenant` and `last9.cluster_id` on span resource attributes

Closes: https://linear.app/last9/issue/ENG-916
🤖 Generated with Claude Code