The _captureGauges() method re-submits every stored gauge value on each 10-second flush. Unlike _captureCounters(), which calls this._counters.clear() (line 272 of the file), it never clears the gauge map:
```js
_captureGauges () {
  this._captureTree(this._gauges, (node, name, tags) => {
    this._client.gauge(name, node.value, tags)
  })
}

_captureCounters () {
  this._captureTree(this._counters, (node, name, tags) => {
    this._client.increment(name, node.value, tags)
  })

  this._counters.clear()
}

_captureHistograms () {
  this._captureTree(this._histograms, (node, name, tags) => {
    let stats = node.value

    // Stats can contain garbage data when a value was never recorded.
    if (stats.count === 0) {
      stats = { max: 0, min: 0, sum: 0, avg: 0, median: 0, p95: 0, count: 0 }
    }

    this._client.gauge(`${name}.min`, stats.min, tags)
    this._client.gauge(`${name}.max`, stats.max, tags)
    this._client.increment(`${name}.sum`, stats.sum, tags)
    this._client.increment(`${name}.total`, stats.sum, tags)
    this._client.gauge(`${name}.avg`, stats.avg, tags)
    this._client.increment(`${name}.count`, stats.count, tags)
    this._client.gauge(`${name}.median`, stats.median, tags)
    this._client.gauge(`${name}.95percentile`, stats.p95, tags)

    node.value.reset()
  })
}
```
Problem: If a gauge stops being updated (e.g. data source fails), the last known value continues to be submitted every 10 seconds for the lifetime of the process. This prevents Datadog "no data" alerts from triggering, effectively masking outages.
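To make the behavior concrete, here is a simplified model of the flush logic. It is only a sketch: FlushModel and its methods are hypothetical names, it uses flat maps instead of the tag tree, and it returns strings instead of sending them to an agent. It keeps the one property that matters here: counters are cleared on flush and gauges are not.

```js
// Simplified model of the flush behavior described above.
// Hypothetical names; this is not the dd-trace implementation.
class FlushModel {
  constructor () {
    this.gauges = new Map()
    this.counters = new Map()
  }

  gauge (name, value) {
    this.gauges.set(name, value)
  }

  increment (name, value = 1) {
    this.counters.set(name, (this.counters.get(name) || 0) + value)
  }

  // Returns the metrics that would be sent on this flush.
  flush () {
    const out = []
    for (const [name, value] of this.gauges) out.push(`${name}:${value}|g`)
    for (const [name, value] of this.counters) out.push(`${name}:${value}|c`)
    this.counters.clear() // counters are reset; the gauge map is left untouched
    return out
  }
}

const model = new FlushModel()
model.gauge('db.connections', 7)
model.increment('requests')

const first = model.flush()  // both metrics were just updated
const second = model.flush() // nothing was updated since the first flush

console.log(first)
console.log(second)
```

In this model the second flush still reports db.connections even though nothing updated it, which is the stale value that keeps a "no data" monitor from firing.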
Requested options (any of these would help):
- A flag to disable client-side gauge aggregation entirely (similar to the Go client's WithoutClientSideAggregation)
- A configurable TTL for cached gauges (e.g. stop re-submitting after 5 minutes of no updates)
- Exposing reset() on the public DogStatsD interface so users can manually clear the cache
Current workaround: periodically restart the Node.js process to clear the in-memory gauge cache.
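As an illustration of the TTL option requested above, here is a minimal sketch of a gauge cache that stops re-submitting values that have not been updated recently. Everything in it is an assumption for illustration: TtlGaugeCache, ttlMs, and the injected now clock are hypothetical names, tags are ignored, and none of this exists in dd-trace today.

```js
// Hypothetical sketch of a TTL-based gauge cache; not part of dd-trace.
class TtlGaugeCache {
  constructor (ttlMs, now = Date.now) {
    this._ttlMs = ttlMs
    this._now = now // injectable clock, so the behavior can be tested
    this._gauges = new Map() // name -> { value, updatedAt }
  }

  // Record the latest value and remember when it arrived.
  gauge (name, value) {
    this._gauges.set(name, { value, updatedAt: this._now() })
  }

  // Called on every flush: emit fresh gauges, evict stale ones.
  flush (emit) {
    const cutoff = this._now() - this._ttlMs
    for (const [name, entry] of this._gauges) {
      if (entry.updatedAt < cutoff) {
        this._gauges.delete(name) // stale: stop re-submitting it
      } else {
        emit(name, entry.value)
      }
    }
  }
}

// Usage with a fake clock so the example is deterministic.
let clock = 0
const cache = new TtlGaugeCache(5 * 60 * 1000, () => clock)
const sent = []
const record = (name, value) => sent.push([name, value])

cache.gauge('queue.depth', 42)

clock = 10 * 1000 // first flush, 10 seconds later: still fresh
cache.flush(record)

clock = 6 * 60 * 1000 // 6 minutes without an update: past the 5-minute TTL
cache.flush(record)

console.log(sent)
```

With this approach a gauge that keeps being updated behaves as it does now, while one whose source has gone quiet drops out after the TTL, so a "no data" alert can trigger. A TTL of zero or a separate flag could cover the "disable aggregation" request, but that is a design choice for the maintainers.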
Source: dd-trace-js/packages/dd-trace/src/dogstatsd.js, lines 261 to 295 at commit 5de82d7.