Enable trace methods#2294
Conversation
Add .env/.env.local (the local validation RPC endpoint lives there and is never committed) and CLAUDE.local.md to .gitignore.
Implement trace_block, trace_transaction, trace_call, trace_callMany, trace_replayTransaction and trace_replayBlockTransactions in the Parity/OpenEthereum format, with trace, stateDiff and vmTrace outputs. Output is validated byte-for-byte against Polygon erigon across the block, transaction and replay methods, including Bor specifics: real coinbase via CalculateCoinbase, base-fee burn to the burnt contract, exact intrinsic gas and gross gasUsed, precompile-call filtering, and state-sync transaction tracing. Includes unit and RPC integration tests.
The Parity trace_* methods are expensive and archive-oriented, so they are not exposed by default. Add the rpc.enabletrace flag (default off); when it is unset the trace namespace is not registered.
|
@claude review once |
There was a problem hiding this comment.
Pull request overview
Adds an opt-in Parity/OpenEthereum-compatible trace_* RPC namespace to Bor’s tracer stack, aimed at archive-node operators who need Parity-style execution traces (trace/stateDiff/vmTrace) for indexing and debugging.
Changes:
- Introduces
trace_*RPC methods via a newtracenamespace, gated by--rpc.enabletrace/enabletraceconfig. - Implements Parity-compatible trace conversion,
stateDiff, and opcode-levelvmTrace, plus extensive unit tests. - Updates operator-facing documentation, default config testdata, version patch, and gitignore.
Reviewed changes
Copilot reviewed 23 out of 24 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| README.md | Documents the new opt-in trace_* namespace and operational requirements. |
| params/version.go | Bumps patch version. |
| internal/cli/server/testdata/default.toml | Adds enabletrace default value in JsonRPC test config. |
| internal/cli/server/server.go | Registers trace RPC APIs only when EnableTrace is set. |
| internal/cli/server/flags.go | Adds --rpc.enabletrace CLI flag. |
| internal/cli/server/config.go | Adds JsonRPC.EnableTrace config field and default. |
| eth/tracers/parity.go | Adds Parity replay result types and mined-tx trace implementation. |
| eth/tracers/parity_vmtrace.go | Adds Parity-compatible opcode-level vmTrace tracer implementation. |
| eth/tracers/parity_vmtrace_values_test.go | Adds targeted vmTrace value assertions (push/mem). |
| eth/tracers/parity_vmtrace_test.go | Adds end-to-end vmTrace tests for contract + EOA cases. |
| eth/tracers/parity_test.go | Adds unit tests for traceTypes parsing and metadata stripping. |
| eth/tracers/parity_statediff.go | Implements Parity-compatible stateDiff conversion and feeless adjustments. |
| eth/tracers/parity_statediff_test.go | Adds unit + end-to-end tests for stateDiff behavior. |
| eth/tracers/parity_replay_tx.go | Implements trace_replayTransaction. |
| eth/tracers/parity_replay_tx_test.go | Tests trace_replayTransaction behavior and errors. |
| eth/tracers/parity_replay_block.go | Implements trace_replayBlockTransactions. |
| eth/tracers/parity_replay_block_test.go | Tests trace_replayBlockTransactions behavior and errors. |
| eth/tracers/parity_gas_test.go | Locks gas accounting semantics against expected Parity/Erigon behavior. |
| eth/tracers/parity_calltracer.go | Adds a wrapper tracer to capture post-execution gas-left for gross gasUsed. |
| eth/tracers/parity_call.go | Implements trace_call and trace_callMany. |
| eth/tracers/parity_call_test.go | Adds tests for trace_call / trace_callMany and input decoding. |
| eth/tracers/api.go | Adds Parity trace types, TraceAPI registration, and trace_block parity implementation; adjusts default tracer timeout. |
| eth/tracers/api_test.go | Adds parity trace tests and RPC registration tests; adjusts a global-directory test to avoid parallel races. |
| .gitignore | Ignores local env secret files. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| `trace_callMany`). These methods are **disabled by default** because they are | ||
| expensive and require an archive node. | ||
|
|
||
| Enable with the `--rpc.enabletrace` flag (or `JsonRPC.EnableTrace = true` in TOML): |
| // by default before being forcefully aborted. Parity-style trace_* replay of | ||
| // heavy historical transactions (deep account-abstraction call trees) can | ||
| // exceed a few seconds, so the default is generous; the trace namespace is | ||
| // opt-in (--rpc.enabletrace) and intended for archive nodes. | ||
| defaultTraceTimeout = 60 * time.Second |
| last := addr[common.AddressLength-1] | ||
| if last < 0x01 || last > 0x0a { | ||
| return false |
| // Result / error follow erigon semantics: | ||
| // - success: result present, no error | ||
| // - revert: result present (gasUsed + revert output) AND error "Reverted" | ||
| // - other error (out of gas, ...): result omitted, error = raw EVM string |
| ```toml | ||
| [JsonRPC] | ||
| EnableTrace = true | ||
| ``` |
|
checking on the issues and will push changes after then, thanks |
…FrameToParityTraces
… namespace opt-in
|
@BlockReq-AmyPan yes, addressed in my last commits (including tests). Gonna check sonar issues now |
- Extract parityFrameCtx struct (9-param fn → 3-param), fixing param count warning - Split TestConvertCallFrameToParityTraces (298 lines, complexity 91) into 9 focused per-scenario tests - Add mustConvertFrame/parityTestCtx helpers to reduce test boilerplate duplication - Extract anonymous structs in parity_vmtrace_values_test.go as named types - Add explanatory comment inside OnTxEnd empty function body
…layResult - Add parityExecInput struct bundling (tx, msg, txctx, vmctx, statedb, config) so the 7-param sequence is not repeated in parityTraceTx, parityStateDiffFor and parityVMTraceFor signatures - Add withStateCopy() helper for pre-execution state snapshots - Extract buildReplayResult() helper: the stateDiff/vmTrace/trace dispatch block was repeated in ReplayTransaction, replayBlockTransactions, and traceCallExec; now lives in one place - Update all three callers to use buildReplayResult
trace_block (traceBlockParityByHash) and trace_replayBlockTransactions (replayBlockTransactions) each carried an identical copy of the block pre-state setup and per-transaction message/context construction, which SonarCloud flagged as duplicated new code. Extract the shared logic into parityBlockExec plus setupParityBlockExec (parent state + beacon-root/parent-hash system calls) and txInput (message + per-tx Context). Both methods now build on the single copy; the diverging per-tx loop bodies (flatten vs ReplayResult) remain local. Also apply gofmt to api_test.go.
8fae1ad to
b99944c
Compare
Move all Parity trace_* additions out of api.go into dedicated files (parity_block.go: types, TraceAPI, trace_block; parity_convert.go: call frame conversion), returning api.go to its upstream size. TraceTransaction and transactionParity now share canonicalTxTraceEnv, removing their duplicated mined-tx setup (SonarCloud duplication). Split the four functions over the cognitive-complexity threshold: convertCallFrameToParityTraces into per-concern builders (frame accessors, trace type, action, result, error), buildParityStateDiff into per-account- state helpers (deleted/created/modified), parityTraceTx into config/decode/ gas/context helpers, and vmTraceMemRegion into a table-driven opcode spec. Add unit tests for the extracted helpers (mem-region table, push counts, tracer interrupt/sub-frame/mem capture, stateDiff field encoding, root gas boundary, callMany tuple decoding, block resolution errors, feeless gas debit) to lift the diffguard mutation score above the tier thresholds.
The parity test files each carried a near-identical copy of the chain setup boilerplate (funded accounts, per-block value transfer, backend construction), which SonarCloud flagged as duplicated new code. Consolidate them in parity_harness_test.go: newTransferChainAPI (value- transfer chain), newPrefundedContractAPI (genesis-deployed contract call), replayVMTraceFrame (vmTrace replay + decode) and newRegisteredRPCServer (in-proc RPC registration). Test assertions are unchanged.
|
|
@claude review once |
| func (api *API) TraceBlockParity(ctx context.Context, number rpc.BlockNumber, config *TraceConfig) ([]*ParityTrace, error) { | ||
| block, err := api.blockByNumber(ctx, number) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("failed to get block: %w", err) | ||
| } | ||
|
|
||
| return api.traceBlockParityByHash(ctx, block.Hash(), config) | ||
| } | ||
|
|
||
| // TraceBlockParityByNumber returns Parity-format traces for a block by number. | ||
| // This method is exposed as debug_traceBlockParityByNumber in the RPC API. | ||
| func (api *API) TraceBlockParityByNumber(ctx context.Context, number rpc.BlockNumber, config *TraceConfig) ([]*ParityTrace, error) { | ||
| return api.TraceBlockParity(ctx, number, config) | ||
| } | ||
|
|
||
| // TraceBlockParityByHash returns Parity-format traces for a block by hash. | ||
| // This method is exposed as debug_traceBlockParityByHash in the RPC API. | ||
| func (api *API) TraceBlockParityByHash(ctx context.Context, hash common.Hash, config *TraceConfig) ([]*ParityTrace, error) { |
There was a problem hiding this comment.
🟡 The three block-level Parity methods (TraceBlockParity, TraceBlockParityByNumber, TraceBlockParityByHash at eth/tracers/parity_block.go:86-108) are defined on *API rather than *TraceAPI, so they are registered under the debug namespace unconditionally via tracers.APIs() at server.go:296 — i.e., they are reachable as debug_traceBlockParity[ByNumber|ByHash] without setting --rpc.enabletrace. The five tx/call-level methods (Transaction, Call, CallMany, ReplayTransaction, ReplayBlockTransactions) live on *TraceAPI and are correctly gated. This contradicts the PR summary's claim that "the namespace is opt-in only via --rpc.enabletrace" — strictly, only the trace namespace is opt-in. Practical exposure is limited (debug is not in the default --http.api list), and the behavior appears intentional (doc comments name the debug_ methods explicitly and TestDebugTraceBlockParityRPCRegistration at api_test.go:1788 asserts the debug registration), so this is a documentation/consistency inconsistency rather than a code defect. Consider either moving the three methods to *TraceAPI (dropping the debug alias) or updating the README to note that block-level Parity traces are also reachable via debug.
Extended reasoning...
What's happening
Three of the six Parity-format trace operations are exposed via two RPC surfaces, and only one of them is gated by --rpc.enabletrace:
func (api *API) TraceBlockParity(...)ateth/tracers/parity_block.go:86→debug_traceBlockParityfunc (api *API) TraceBlockParityByNumber(...)ateth/tracers/parity_block.go:97→debug_traceBlockParityByNumberfunc (api *API) TraceBlockParityByHash(...)ateth/tracers/parity_block.go:103→debug_traceBlockParityByHash
These are methods on *API. In internal/cli/server/server.go:296, tracers.APIs(srv.backend.APIBackend) is registered unconditionally, and tracers.APIs returns *API under the debug namespace. Only after this does the code check config.JsonRPC.EnableTrace (server.go:302) to conditionally register tracers.TraceAPIs — which returns *TraceAPI under the trace namespace. Since Go's RPC framework reflects on the receiver and exports every RPC-compatible exported method, any exported method on *API becomes debug_<lowerFirstName> automatically.
The other five Parity operations (Transaction, Call, CallMany, ReplayTransaction, ReplayBlockTransactions) are methods on *TraceAPI, so they only appear when --rpc.enabletrace is set. The gating is therefore inconsistent for the block-level trio.
Why this contradicts the PR summary
The PR summary states: "The namespace is opt-in only via --rpc.enabletrace (default off) so existing nodes are unaffected." Strictly, only the trace namespace is opt-in. An operator who exposes the debug namespace (e.g., --http.api=eth,net,web3,debug on an archive node) gets three block-level Parity trace operations without ever setting --rpc.enabletrace, and would reasonably not expect them based on the summary.
Step-by-step proof
- Start bor with the debug namespace exposed and
--rpc.enabletraceunset (or absent):bor server --http --http.api=eth,net,web3,debug. tracers.APIs(...)runs at server.go:296 → registers a{Namespace: "debug", Service: NewAPI(backend)}entry.- The subsequent
if config.JsonRPC.EnableTraceguard at server.go:302 is skipped, sotracers.TraceAPIsis not registered — thetracenamespace is absent (as intended). - Go's RPC framework reflects on the
*APIservice and, becauseTraceBlockParity(ctx, number, config)has an RPC-compatible signature (returns([]*ParityTrace, error)) and starts with an uppercase letter, exports it asdebug_traceBlockParity. Same for the ByNumber/ByHash variants. - A call to
debug_traceBlockParity" 0x1succeeds (or returns an execution error), whereastrace_block "0x1"returns method-not-found. The testTestDebugTraceBlockParityRPCRegistrationat api_test.go:1788 baselines exactly this behavior (registers onlyAPIs, assertsdebug_traceBlockParityis present).
Why the refutations are correct on intent but the finding still stands
The refuting verifiers point out — accurately — that this is deliberate:
- Each of the three method's doc comments explicitly says "This method is exposed as debug_traceBlockParity in the RPC API." (parity_block.go:85, 96, 102).
TestDebugTraceBlockParityRPCRegistrationat api_test.go:1788 exercises the default (APIs) API set and asserts thatdebug_traceBlockParityis registered — this is a positive assertion of the debug exposure.- The
debugnamespace is off by default (default--http.api" = "eth,net,web3,txpool,bor"), and it already contains equally expensive tracers (debug_traceBlock,debug_traceTransaction, etc.).
Given all three signals — comments, tests, and existing debug tracers — this is intentional dual-exposure, not a bug. The remaining concern is purely a documentation/consistency one: the PR summary's blanket "opt-in only" claim is misleading for the three block-level methods.
Impact
- No new attack surface: operators already accept expensive tracers when they enable the
debugnamespace, anddebug_traceBlockis comparable in cost. - No functional regression: the code paths and outputs match the gated
trace_blockpath. - The only real risk is operator surprise: an archive operator reading the PR summary may not realize that block-level Parity traces are already reachable via
debugif they enable that namespace for any other reason.
Suggested fix (either is fine)
- Move the three methods to
*TraceAPI, dropping thedebug_alias. This makes gating uniform across all six Parity methods and matches the PR summary. - Keep the current design and update the README's rollout notes to mention that the three block-level methods are additionally reachable as
debug_traceBlockParity[ByNumber|ByHash]for operators who cannot flip--rpc.enabletrace. UpdateTestDebugTraceBlockParityRPCRegistration's comment to spell out why the debug exposure is intentional.
|
|
||
| **Requirements:** | ||
| - Archive node (`--gcmode archive`) | ||
| - Sufficient RPC timeout for deep blocks (`--rpc.evmtimeout` ≥ 60s recommended) |
There was a problem hiding this comment.
🟡 The README's --rpc.evmtimeout ≥ 60s recommended line under the trace namespace is misleading: --rpc.evmtimeout maps to RPCEVMTimeout, which is only read by eth_call / eth_simulate / graphql — never by anything under eth/tracers/. Setting it has no effect on trace_* methods (they use their own defaultParityTraceTimeout = 60s internally). Consider dropping the line entirely (the 60s trace-phase default already gives operators what the README claims to want), or replacing it with the knob that actually bounds a long trace call's wall-clock — the HTTP server write timeout (--http.timeout.write, default 30s).
Extended reasoning...
What is wrong
README.md:69 tells operators enabling the new Parity trace namespace to set --rpc.evmtimeout ≥ 60s. That flag is wired to ethconfig.RPCEVMTimeout and is consumed only by:
internal/ethapi/api.go(DoCall—eth_call,eth_estimateGas)internal/ethapi/simulate.go(eth_simulate)graphql/graphql.go
grep -r RPCEVMTimeout eth/tracers/ returns zero hits. Neither debug_trace* nor the new trace_* methods read it.
How the trace path actually times out
The tracers ship their own constants:
eth/tracers/api.go:54—defaultTraceTimeout = 5 * time.Second(used bydebug_trace*when the caller does not setTraceConfig.Timeout).eth/tracers/parity.go:28—defaultParityTraceTimeout = 60 * time.Second(used byparityTraceConfigfor everytrace_*call, since the Parity RPC methods do not exposeTimeoutto the caller).
So the trace phase is already capped at 60s internally, regardless of --rpc.evmtimeout. The only external knob that governs a long trace_* call is the HTTP server write timeout (--http.timeout.write, default 30s), which will cut the RPC response off well before the 60s tracer cap on default settings.
Step-by-step proof
- Operator follows README:
bor server --rpc.enabletrace --rpc.evmtimeout 60s --gcmode archive. --rpc.evmtimeout 60ssetsJsonRPC.RPCEVMTimeout = 60sin the config, threaded intoethconfig.Config.RPCEVMTimeout.- Operator calls
trace_replayBlockTransactionsfor a heavy historical block. - Request routes to
TraceAPI.ReplayBlockTransactions→replayBlockTransactions→buildReplayResult→parityTraceTx→traceTx(cfg = parityTraceConfig(userCfg)). - In
parityTraceConfig(parity.go:184),cfg.Timeoutis set todefaultParityTraceTimeout.String()(60s).RPCEVMTimeoutis not read anywhere on this path. - In
traceTxthe timeout comes fromcfg.Timeout— the 60s constant — not fromapi.b.RPCEVMTimeout(). - Meanwhile the HTTP handler enforces its own write timeout (default 30s,
jsonrpc.timeouts.write); if the reply is not written in time the client sees a broken connection well before the 60s tracer cap.
The operator has set a flag that changes literally nothing about their trace timeouts.
Impact
Documentation-only. The trace namespace works correctly out of the box because the internal 60s default is already what the README claims to be recommending. Nothing crashes, no wrong data is returned; the advice is simply a no-op for the namespace it targets. The practical harm is that an operator debugging a hung trace_* call will look at the flag they set (--rpc.evmtimeout) instead of the flag that actually matters (--http.timeout.write).
How to fix
Two good options in README.md (and in the PR description, which repeats the same line under "Recommended companion flags"):
- Drop the
--rpc.evmtimeoutbullet entirely. The internaldefaultParityTraceTimeout = 60salready gives the intended behavior. - Replace it with
--http.timeout.write(default 30s), which is the real ceiling on a longtrace_*call. Suggested wording: "Sufficient HTTP write timeout for deep blocks (--http.timeout.write≥ 60s recommended, since heavy trace_replay* calls can take longer than the 30s default before the internal 60s trace cap trips)."
Severity
nit — documentation error, not a code-correctness bug. Merging as-is does not break the feature. Worth fixing before release notes go out so operators are not chasing a no-op flag.
| func (api *API) parityStateDiffFor(ctx context.Context, in parityExecInput, feeless bool) (parityStateDiff, error) { | ||
| prestate := "prestateTracer" | ||
| cfg := &TraceConfig{ | ||
| Tracer: &prestate, | ||
| TracerConfig: json.RawMessage(`{"diffMode":true}`), | ||
| } | ||
| if in.config != nil { | ||
| cfg.Reexec = in.config.Reexec | ||
| cfg.Timeout = in.config.Timeout | ||
| } |
There was a problem hiding this comment.
🟡 The Parity trace namespace applies a 60s per-tx timeout (defaultParityTraceTimeout) only inside parityTraceTx via parityTraceConfig. The stateDiff phase at eth/tracers/parity_statediff.go:239-248 and the vmTrace phase at eth/tracers/parity_vmtrace.go:446-452 build their own TraceConfig and only copy Timeout when in.config.Timeout is non-nil — but every trace_* RPC entry point (ReplayTransaction, replayBlockTransactions, traceCallExec, Block) constructs parityExecInput with a nil config, so those two phases fall through to defaultTraceTimeout = 5s. Effect: trace_replayTransaction(hash, ["trace","stateDiff","vmTrace"]) on a heavy tx passes the 60s trace phase but times out on the heavier stateDiff/vmTrace phases at 5s — the very case the 60s default was introduced for. RPC signatures do not accept a caller Timeout, so there is no workaround. Fix: reuse parityTraceConfig's default-timeout logic in both helpers (or thread all three configs through one builder) so all phases share the 60s ceiling.
Extended reasoning...
What's wrong
parityTraceConfig at eth/tracers/parity.go:211-227 is explicit about the design intent — its comment states "Use the Parity-specific (longer) timeout when the caller hasn't set one explicitly. This avoids inflating defaultTraceTimeout for debug_trace callers."* — and it applies the 60s defaultParityTraceTimeout when cfg.Timeout == nil. That default is applied only where parityTraceConfig is called, which is only inside parityTraceTx.
The two peer phases build their own configs without reusing this helper:
parityStateDiffFor(eth/tracers/parity_statediff.go:239-248) —cfg := &TraceConfig{Tracer: &prestate, TracerConfig: ...}; thenif in.config != nil { cfg.Reexec = in.config.Reexec; cfg.Timeout = in.config.Timeout }. No fallback if the copied Timeout is nil.parityVMTraceFor(eth/tracers/parity_vmtrace.go:446-452) — same shape.
When either in.config or in.config.Timeout is nil, traceTx at eth/tracers/api.go:1039-1042 falls back to defaultTraceTimeout = 5 * time.Second.
Every RPC path leaves in.config nil
trace_replayTransaction—eth/tracers/parity_replay_tx.go:60:in := parityExecInput{tx: tx, msg: msg, txctx: txctx, vmctx: vmctx, statedb: statedb}(no config).trace_replayBlockTransactions—eth/tracers/parity_replay_block.go:73: same shape, no config.trace_call/trace_callMany—eth/tracers/parity_call.go:176:in := parityExecInput{tx: tx, msg: msg, txctx: new(Context), vmctx: blockCtx, statedb: statedb}(no config).trace_blocksetsconfigonparityExecInputinparity_block.go:145, but the RPC-visibleBlockmethod atparity_block.go:69callsTraceBlockParity(ctx, number, nil), so the config is nil in practice.
The RPC method signatures do not accept a TraceConfig, so a caller cannot override the timeout to work around this.
Step-by-step proof
- Operator calls
trace_replayTransaction(0xdeadbeef, ["trace","stateDiff","vmTrace"])on a heavy historical tx (deep account-abstraction call tree) whose real execution costs, say, ~10s. ReplayTransactionbuildsinwithconfig: niland callsbuildReplayResult(..., set, false, false).buildReplayResultruns the requested tracers in order.stateDiffis first:parityStateDiffForexecutes withcfg.Timeout == nil.traceTxsees a nil timeout and setstimeout = defaultTraceTimeout = 5s(api.go:1039-1042). After 5s the deadline fires,tracer.Stop(errors.New("execution timeout"))runs, andtraceTxreturns"tracing failed: execution timeout".buildReplayResultpropagates the error; the whole RPC call returns an error with no result — even though the trace phase (had it run) would have completed within the intended 60s budget.
Why this is a real gap, not just a doc issue
defaultParityTraceTimeout's own comment says "Parity-style replay of heavy historical transactions (deep account-abstraction call trees) can take well over 5s". vmTrace is the heaviest of the three tracers (it captures every opcode's stack/memory/storage), so it's ironically the one most likely to hit the 5s ceiling — the opposite of the design.
Impact
- Non-consensus, opt-in namespace (
--rpc.enabletrace, off by default). - Deterministic error (
context.DeadlineExceeded), not silent bad data — an operator gets a clear failure. - No workaround from the RPC surface: the three methods don't accept a caller-supplied Timeout.
- The failure mode contradicts the PR's stated 60s design contract for heavy traces.
Fix
Replace the ad-hoc cfg := &TraceConfig{...} in both helpers with a call to a small shared builder (analogous to parityTraceConfig) that sets Tracer/TracerConfig for the specific tracer and applies the same default-timeout logic. One-liner alternative in each site: after copying from in.config, add
if cfg.Timeout == nil {
s := defaultParityTraceTimeout.String()
cfg.Timeout = &s
}so all three phases share the 60s ceiling.
| parent := t.stack[len(t.stack)-1] | ||
| prefix := t.rootPrefix | ||
| if parent.pending != nil { | ||
| // Attach the sub-frame to the call op that opened it. | ||
| parent.pending.op.Sub = frame | ||
| prefix = parent.pending.op.Idx | ||
| } | ||
| t.stack = append(t.stack, &vmTraceState{frame: frame, prefix: prefix}) | ||
| } |
There was a problem hiding this comment.
🟡 🟡 The Parity vmTrace tracer produces an empty ops list ({"code":"0x...","ops":[]}) for post-Madhugiri Bor state-sync transactions when replayed via trace_replayTransaction / trace_replayBlockTransactions with the "vmTrace" trace type. WrapStateSyncHooks injects a synthetic OnEnter(depth=0) at OnTxStart so a root frame exists, but no OnOpcode ever fires at that scope — parityVMTracer attaches sub-frames only via parent.pending.op.Sub, so every real per-event OnEnter creates an orphan frame that is pushed onto the stack, populated, then popped without ever being linked into t.root. Fix: inject a synthetic call op into the root when the first child arrives without a pending op, or detect the state-sync case at OnTxStart and route event frames as top-level ops. Contradicts the PR summary byte-identical parity claim for post-Madhugiri state-sync txs with vmTrace, but the trace namespace is opt-in (--rpc.enabletrace, default off) and non-consensus.
Extended reasoning...
What is wrong
parityVMTracer.OnEnter (eth/tracers/parity_vmtrace.go:172-179) attaches a sub-frame to its parent only via parent.pending.op.Sub, and parent.pending is set exclusively inside OnOpcode. For a normal transaction this is fine because the root call runs real EVM ops before any sub-call fires, so an opening call op is always pending when a child arrives. Bor state-sync transactions break that invariant.
For a post-Madhugiri state-sync tx, traceTx (eth/tracers/api.go) wraps the tracer via WrapStateSyncHooks. That wrapper does two things at OnTxStart (state_sync_tracing_hooks.go:66-83): forwards to inner.OnTxStart and then synthetically fires inner.OnEnter(depth=0, CALL, BorSystemAddress, stateReceiverAddress) to give the tracer a single root frame. Every real subsequent OnEnter/OnExit/OnOpcode is shifted by +1 in depth. But the synthetic root scope is never actually executed — ApplyStateSyncEvents iterates through each event and calls vmenv.Call at unwrapped depth 0, which becomes depth 1 after shift. Nothing ever calls OnOpcode at depth 0, so root_state.pending stays nil for the entire tx.
Step-by-step proof
OnTxStartfires. WrapStateSyncHooks forwards toparityVMTracer.OnTxStart(captures statedb), then synthetically callsinner.OnEnter(depth=0, CALL, ...). parity_vmtrace.go:166 branches ondepth==0: setst.root = frameandt.stack = [{root_state}].root_state.pendingis nil.ApplyStateSyncEventsprocesses the first event.vmenv.CalltriggersOnEnter(depth=0, ...); wrapper shifts to depth=1.- parity_vmtrace.go:172-179 runs with
parent = root_state,parent.pending == nil. Theif parent.pending != nilblock at line 174-178 is SKIPPED —parent.pending.op.Sub = framenever executes. The frame is only pushed tot.stack. - The event EVM execution now fires OnOpcodes into the orphan frame — its Ops slice fills, but nothing references it from
t.root. - Event ends →
OnExit(depth=0)shifted to depth=1 → parity_vmtrace.go:184 pops the orphan offt.stack. No reference reachable fromt.rootremains. - Pattern repeats for every subsequent state-sync event. All frames orphaned.
OnTxEnd→ syntheticOnExit(depth=0)pops the root off the stack.GetResultmarshalst.root:{"code":"0x...<stateReceiverContract code>","ops":[]}. Ops is [], and noops[i].Subsubtree references any of the recorded event data.
Why other paths work
The callTracer wrapped by parityCallTracer (used for the "trace" and "stateDiff" trace types) builds its call tree by depth alone — a +1 depth shift is sufficient, so state-sync txs are correctly represented in that output. The vmTrace tracer uses a fundamentally different attachment model (pending-op.Sub), which needs an owning op at the synthetic root — one that does not exist.
The 520-sample validation campaign reports 40/40 for trace_replayBlockTransactions, but that either did not include a post-Madhugiri state-sync tx with vmTrace requested, or Erigon returns the same empty ops and it appeared as a symmetric match. The unit tests (parity_vmtrace_*_test.go) all use hand-driven scopes or transfer-chain harnesses — none exercise WrapStateSyncHooks + vmTrace.
Impact
Silent correctness bug. Trigger scenario: post-Madhugiri (mainnet block ≥ 80,084,800) archive node with --rpc.enabletrace, calling trace_replayTransaction or trace_replayBlockTransactions with ["vmTrace"] on a state-sync tx hash returns {"vmTrace":{"code":"0x...","ops":[]}, ...} instead of the actual per-event opcode sequence. Downstream indexers reading Bor vmTrace of state-sync txs receive an empty ops list and will not detect the missing data unless they cross-check against another client.
Fix
Three viable options: (1) at OnEnter, when parent.pending is nil and depth > 0, add a synthetic call op record to the parent frame and use it as pending for the child sub-frame attachment; (2) detect the state-sync case at OnTxStart (via tx.Type()) and treat the shifted depth-1 frames as top-level ops rather than nested subs; (3) add a subframes []*vmTraceFrame field to vmTraceFrame and populate it directly when there is no pending op. Option (1) is the least invasive.
Severity
nit. Opt-in namespace (--rpc.enabletrace, default off), non-consensus, narrow surface (only state-sync tx + vmTrace subset — a real Parity-parity divergence but only on Bor pseudo-transactions). Fixing it is an incremental improvement, not a merge blocker.



Summary
Adds a Parity/OpenEthereum-compatible
trace_*RPC namespace to Bor, targeting archive-node operators who need OpenEthereum-style execution traces for indexing, debugging, and MEV tooling. Six methods are implemented:trace_block,trace_transaction,trace_replayTransaction,trace_replayBlockTransactions,trace_call, andtrace_callMany— covering all three Parity trace types (trace, stateDiff, vmTrace). Output is validated byte-identical against a live Polygon Erigon v3.3.7 reference (Chainstack) across 520 random mainnet samples spanning the full chain history: 499/520 identical (95.96%); all mined-tx methods (trace_transaction,trace_replayTransaction,trace_replayBlockTransactions) reach 120/120 (100%). The remaining 21 diffs are two accepted gaps documented below. The namespace is opt-in only via--rpc.enabletrace(default off) so existing nodes are unaffected.Executed tests
go test -race ./eth/tracers/...— all pass. Table-driven tests cover parity trace conversion, stateDiff encoding (created/deleted/modified/storage), vmTrace push and mem regions, gas accounting, and the feeless sender correction.scripts_trace_validate.pydrew random blocks across the full mainnet chain and compared every method's output byte-for-byte against Chainstack Polygon Erigon v3.3.7. Results:trace_transactiontrace_replayTransactiontrace_replayBlockTransactionstrace_blocktrace_call/trace_callMany*
trace_blockdiffs = pre-Madhugiri state-sync gap (see Rollout notes).trace_calldiffs = underfunded-sender bailout gap + 3 Chainstack-side "first run for tx" errors.0x4cf4912,0x2277c00,0x3545626,0x4acf9d3inspected individually withjqdiff vs reference for edge cases (SELFDESTRUCT, CREATE revert, precompile calls, state-sync tx).Rollout notes
Not consensus-affecting. All new code is in
eth/tracers/(RPC layer only); no changes to block production, state transition, or consensus. Existing call sites (debug_traceTransaction,eth_call, etc.) are byte-for-byte unchanged.Backwards-compatible. Opt-in only. The
tracenamespace is not registered unless--rpc.enabletraceis set (orJsonRPC.EnableTrace = truein TOML). No operator impact without the flag.Requires archive node.
--gcmode archiveis a hard prerequisite; pruned nodes will return errors on historical state lookups.Recommended companion flags:
Known gaps (documented, not bugs):
trace_blockomits the state-sync pseudo-tx for pre-Madhugiri blocks (< 80,084,800 mainnet). Post-Madhugiri (all recent blocks) is correct.trace_call/trace_callManyerrors on underfunded simulated senders instead of returning a bailout trace (gas bailout deferred — requires changes tocore/state_transition.go).