Skip to content

Conversation

@0xForerunner
Copy link
Collaborator

Audit findings:

All requests which are to be forwarded should not be treated equally. Specifically, there are eth_ and miner_ requests. If for some reason miner_ succeeds on the L2-EL but fails on the builder (or vice versa), it can result in discrepancy with critical information for consensus, such as GasPrice and GasLimit. This can lead to a block created by the builder with a higher gas limit than what L2-EL expects. As a result, the L2-EL will mark this block invalid and potentially flag all the future builder blocks as invalid invalid as well, effectively disabling builder functionality.

Recommendation(s): Increase the robustness of the forwarding mechanism. The miner_ should be forwarder to the builder only if the call to EL-L2 succeeds. If the builder fails to process the request, the failure should be logged. In addition, the system should set the execution_status to Disabled to prevent further issues.

@0xForerunner
Copy link
Collaborator Author

I've set execution mode to fallback here, although I'm not convinced this is the best way to handle this situation. I think setting the health to partial content is probably sufficient. Let me know what you think!

@0xForerunner 0xForerunner force-pushed the forerunner/inconsistant-miner-api branch from 8a2b679 to 6ea2e32 Compare May 15, 2025 20:25
@vercel
Copy link

vercel bot commented May 15, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
rollup-boost ⬜️ Ignored (Inspect) Visit Preview May 28, 2025 7:02pm

@0xForerunner 0xForerunner changed the base branch from kit/engine-loop to main May 15, 2025 20:26
@0xForerunner
Copy link
Collaborator Author

Since #201 was closed I cherry-picked this back onto main.

src/proxy.rs Outdated
builder_client.forward(builder_req, builder_method),
service.l2_client.forward(l2_req, method)
);
if builder_res.is_ok() != l2_res.is_ok() {
Copy link
Contributor

@karankurbur karankurbur May 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If both builder and l2 are not_ok(), shouldn't we still update execution mode and health?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No because we can return that error to the caller, and they can handle that as they wish.

src/proxy.rs Outdated
let mut execution_mode = service.execution_mode.lock();
if *execution_mode == ExecutionMode::Enabled {
*execution_mode = ExecutionMode::Disabled;
// Drop before aquiring health lock
Copy link
Collaborator

@0xOsiris 0xOsiris May 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The health status here isn't sticky. It will switch back to Healthy on the next health check interval in the HealthHandle if the builders unsafe head is up to date. So, I'm not sure if it's even worth updating the probes unless you can inform the Health handle not to update the health until the DA limits discrepancy is resolved.

Although it will likely trigger a conductor failover as is, so maybe worth it

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally the health status here sticks until there is no discrepancy in DA limits between the sequencer, and the builder

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think letting health status revert should be okay, as long as a new leader election is triggered.

@ferranbt
Copy link
Contributor

I will check more in detail next week but can we add a unit test to check this behaviour?

@0xForerunner
Copy link
Collaborator Author

@ferranbt Yeah will definitely add tests. Was just waiting to get settled on the solution. Just pushed a couple commits and I think this is currently as good as we can do. Added a retry layer into our HTTP client which should hopefully prevent ever running into this situation.

@0xForerunner 0xForerunner requested review from 0xOsiris and ferranbt May 21, 2025 02:48
@0xForerunner 0xForerunner force-pushed the forerunner/inconsistant-miner-api branch from d6d1d9a to 20538b1 Compare May 28, 2025 01:42
@0xForerunner 0xForerunner requested a review from ferranbt May 28, 2025 02:16
@0xForerunner 0xForerunner requested a review from avalonche May 28, 2025 04:25
params: Value,
_result: Value,
) -> Pin<Box<dyn Future<Output = Option<Value>> + Send>> {
println!("method: {:?}", method);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
println!("method: {:?}", method);

// restart the builder
client.unpause_container(harness.builder.id()).await?;

// let _res = engine_api.set_max_da_size(val, val).await?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stray comment?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, TY!

@0xForerunner 0xForerunner enabled auto-merge (squash) May 28, 2025 19:03
@0xForerunner 0xForerunner merged commit 535a4ae into main May 28, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants