Skip to content

[k8s] Consider k8s.cluster.version and k8s.node.version entities #3529

@perk

Description

@perk

Area(s)

area:k8s

What's missing?

There is currently no semantic convention for Kubernetes version information. Specifically:

  1. No node-level version attribute — The kubelet version running on each node (node.status.nodeInfo.kubeletVersion) has no corresponding attribute. The existing k8s.node.* attributes cover name, uid, label, annotation, and condition, but not the kubelet version.

  2. No cluster-level version attribute — The Kubernetes API server version (returned by the /version endpoint) has no corresponding attribute.

These are critical for observability during cluster upgrades, where Kubernetes' version skew policy allows kubelets to be up to 3 minor versions behind the API server. Without version attributes, operators cannot correlate telemetry with the Kubernetes version running on each node or the control plane.

Describe the solution you'd like

Add two new attributes to the k8s registry:

k8s.cluster.version

  • Type: string
  • Brief: The Kubernetes version of the cluster API server.
  • Note: Reflects the version returned by the Kubernetes API server's /version endpoint. In HA clusters undergoing a rolling control plane upgrade, this may reflect the version of any individual API server instance and can vary between collection cycles until the upgrade completes.
  • Examples: ["1.30.2", "1.29.5"]

k8s.node.version

  • Type: string
  • Brief: The Kubernetes version of the kubelet running on the Node.
  • Note: This is the kubelet version as reported in node.status.nodeInfo.kubeletVersion. During cluster upgrades, different nodes (including control plane nodes) may report different versions due to rolling updates and the Kubernetes version skew policy.
  • Examples: ["1.30.2", "1.29.5"]

Why two separate attributes?

These represent different components with independent version lifecycles:

Attribute Source Scope
k8s.cluster.version API server /version endpoint Per cluster
k8s.node.version nodeInfo.kubeletVersion per node Per node

During a typical cluster upgrade:

  1. Control plane API servers are upgraded first (one at a time in HA setups)
  2. Kubelet on each control plane node is upgraded next
  3. Worker node kubelets are upgraded last

This means a single control plane node can temporarily have its API server at v1.30 while its kubelet is still at v1.29. Worker nodes may lag further behind. Both attributes are needed to fully represent the upgrade state.

Implementors

These attributes would be consumed by:

Tip

React with 👍 to help prioritize this issue. Please use comments to provide useful context, avoiding +1 or me too, to help us triage it. Learn more here.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    Status

    No status

    Status

    Need triage

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions