Skip to content

fix(otelhttp): report correct read bytes in client instrumentation#8684

Open
Xio-Shark wants to merge 4 commits intoopen-telemetry:mainfrom
Xio-Shark:fix-otelhttp-read-bytes
Open

fix(otelhttp): report correct read bytes in client instrumentation#8684
Xio-Shark wants to merge 4 commits intoopen-telemetry:mainfrom
Xio-Shark:fix-otelhttp-read-bytes

Conversation

@Xio-Shark
Copy link
Copy Markdown

Summary: 修复 otelhttp client instrumentation 中 read bytes 统计不准确问题

Changes:

  • 新增回归测试覆盖 body read timing 问题
  • 保证只在 body 最终状态确定后记录 read bytes
  • 避免 EOF / Close 双路径重复记录

Motivation:

Testing:

  • go test ./instrumentation/net/http/otelhttp/...

@Xio-Shark Xio-Shark requested review from a team and dmathieu as code owners March 17, 2026 11:15
@linux-foundation-easycla
Copy link
Copy Markdown

linux-foundation-easycla Bot commented Mar 17, 2026

CLA Signed

The committers listed above are authorized under a signed CLA.

@flc1125 flc1125 added the blocked: CLA Waiting on CLA to be signed before progress can be made label Mar 17, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 14e286e8a8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +312 to +313
_ = wb.closeBody()
wb.recordMetricsOnce()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid closing response body when Read returns EOF

wrappedBody.Read now calls closeBody() on every io.EOF, which changes behavior for upgraded/duplex responses (io.ReadWriteCloser) from transparent wrapping to forced shutdown. In protocol-switch or CONNECT-style streams, a read-side EOF can happen while the write side is still valid (TCP half-close); this automatic close tears down the underlying connection before user code calls Close, so subsequent writes can fail unexpectedly. The previous implementation finalized metrics/span on EOF without closing the body, which preserved caller control over stream lifetime.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Author

Friendly follow-up: EasyCLA already reported success on March 17, 2026, but this PR still has the blocked: CLA label. @flc1125 could a maintainer please remove that label so the PR can proceed to normal review? Thanks!

@github-actions github-actions Bot requested a review from sonalgaud12 April 18, 2026 19:55
Copy link
Copy Markdown
Author

Pushed follow-up commit 890ab4f34 to address the unresolved EOF/Close review concern.

What changed:

  • wrappedBody.Read() no longer calls closeBody() on io.EOF
  • metrics/span finalization still happens on EOF via recordMetricsOnce()
  • added a regression test to verify an io.ReadWriteCloser remains writable after EOF until the caller explicitly closes it

Local validation:

  • go test . -run 'TestWrappedBodyReadEOFThenCloseRecordsOnce|TestWrappedBodyReadEOFKeepsReadWriteCloserOpenUntilClose|TestTransportProtocolSwitch' -count=1
  • go test .

Both passed in instrumentation/net/http/otelhttp.

The blocked: CLA label is still present even though EasyCLA already reported success on March 17, 2026, so after this code-level review concern is out of the way the remaining blocker should still be maintainer-side label cleanup.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 890ab4f346

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +168 to +170
trackedBody = newRequestBodyTracker(body, func() {
if currentTrackedBody() == trackedBody {
recordMetrics()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Synchronize recordMetrics callback publication

The request-body onClose callback can run on the transport write goroutine while RoundTrip later reassigns recordMetrics after t.rt.RoundTrip(r) returns, but this read/write of the captured function variable is unsynchronized. In practice (notably with net/http's concurrent write loop), this is a data race and can cause nondeterministic behavior, including invoking the stale no-op callback and dropping http.client.request.body.size/duration emission for that request. Please publish/access the callback through synchronization (e.g., mutex/atomic or a stable closure that reads synchronized state) before any concurrent close path can observe it.

Useful? React with 👍 / 👎.

@flc1125 flc1125 removed the blocked: CLA Waiting on CLA to be signed before progress can be made label Apr 19, 2026
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is not necessary, please remove it.

Additionally, please read first: https://github.com/open-telemetry/community/blob/main/policies/genai.md#frequently-asked-questions---contributors

@Xio-Shark Xio-Shark force-pushed the fix-otelhttp-read-bytes branch from 890ab4f to 8757c34 Compare April 19, 2026 15:06
@Xio-Shark
Copy link
Copy Markdown
Author

Hi @flc1125, I've pushed the follow-up on top of the rebased branch.

Changes included:

  • kept upgraded bodies open after EOF
  • serialized metrics callback access to avoid the callback/assignment race around recordMetrics
  • preserved error metrics on transport failures

Validation:

  • in instrumentation/net/http/otelhttp:
    • go build ./...
    • go test . -run 'TestWrappedBodyReadEOFKeepsReadWriteCloserOpenUntilClose|TestTransportMetricsReportFinalRequestBodySize|TestTransportErrorTypeMetricAttribute' -count=1

@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@Xio-Shark
Copy link
Copy Markdown
Author

Hi @flc1125, thanks for the review. I've addressed all three points:

  1. Agent-run doc removed — The rebased branch no longer contains docs/agent-runs/2026-03-17-otelhttp-read-bytes.md.

  2. P1 EOF/Close behavior — Fixed in eec1f7d19: wrappedBody.Read() no longer calls closeBody() on io.EOF. A regression test verifies that upgraded bodies (io.ReadWriteCloser) remain writable after EOF until the caller explicitly closes them.

  3. P1 recordMetrics callback race — Fixed in 2aa72ff1f: the callback publication path is now serialized with a mutex so the transport write goroutine and RoundTrip cannot race on the captured function variable.

  4. Error metrics preservation — Added in e91b7421d: ensures body-read errors are still reflected in http.client.request.body.size even when EOF does not trigger close.

I've also read the OpenTelemetry GenAI policy you linked.

Could you please take another look when you have a moment? The blocked: CLA label is still attached even though EasyCLA reported success on March 17—would appreciate it if a maintainer could clear that as well. Thanks!

@Xio-Shark
Copy link
Copy Markdown
Author

Hi @flc1125, friendly reminder that the EasyCLA check was already passed on 2026-03-17 ("The committers listed above are authorized under a signed CLA."). Could you please help remove the "blocked: CLA" label so this PR can move forward to review? Thank you!

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 21, 2026

Codecov Report

❌ Patch coverage is 98.92473% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 82.4%. Comparing base (c5160bb) to head (e91b742).
⚠️ Report is 13 commits behind head on main.

Files with missing lines Patch % Lines
instrumentation/net/http/otelhttp/transport.go 98.9% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##            main   #8684     +/-   ##
=======================================
- Coverage   83.3%   82.4%   -1.0%     
=======================================
  Files        193     193             
  Lines      15826   19794   +3968     
=======================================
+ Hits       13191   16314   +3123     
- Misses      2152    2997    +845     
  Partials     483     483             
Files with missing lines Coverage Δ
instrumentation/net/http/otelhttp/transport.go 94.5% <98.9%> (+1.7%) ⬆️

... and 186 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants