
wasmparser: merge VisitSimdOperator back into VisitOperator #2184


Open — wants to merge 1 commit into base: main

Conversation


@nagisa nagisa commented May 9, 2025

The simd feature remains in place, but it no longer affects the API the way it used to. Instead, the parts of the wasmparser implementation pertaining to SIMD get disabled, which reduces the compilation-time and artifact-size impact somewhat, though less than in the original approach.

The results on my machine are roughly:

  • origin/main simd: 4.26s producing 30M .rlib
  • origin/main !simd: 3.49s producing 23M .rlib
  • this simd: 4.19s producing a 26M .rlib
  • this !simd: 3.96s producing a 25M .rlib

I'm not quite sure why the rlib size decreased so significantly when the SIMD feature is enabled (I guess less inlining is occurring?), and there is indeed a 0.5s hit on compilation time from this PR.

Are those half a second worth a clunkier API? I'm not convinced :)

Fixes #2181
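To illustrate the shape of the change described above, here is a hedged sketch (hypothetical names and signatures, not wasmparser's actual definitions): the trait exposes its SIMD methods unconditionally, so downstream code is unaffected by the feature, while only the implementation behind them is swapped out when `simd` is disabled.

```rust
// Sketch only: illustrative trait/method names, not wasmparser's real API.
trait VisitOperator {
    fn visit_i32_add(&mut self) -> Result<(), String>;
    // Part of the API regardless of crate features.
    fn visit_v128_load(&mut self) -> Result<(), String>;
}

struct Validator;

impl VisitOperator for Validator {
    fn visit_i32_add(&mut self) -> Result<(), String> {
        Ok(())
    }

    // Only the implementation is feature-gated, not the trait surface.
    #[cfg(feature = "simd")]
    fn visit_v128_load(&mut self) -> Result<(), String> {
        Ok(()) // real SIMD validation would live here
    }

    #[cfg(not(feature = "simd"))]
    fn visit_v128_load(&mut self) -> Result<(), String> {
        Err("SIMD support was compiled out".to_string())
    }
}
```

With this shape, a generic consumer can be bounded on a single trait, which is what issue #2181 asks for.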


nagisa commented May 9, 2025

cc @Robbepop

@nagisa force-pushed the merges-simd-visitor branch from f2f7b31 to 45d4974 on May 9, 2025 at 12:37

Robbepop commented May 9, 2025

Are those half a second worth a clunkier API? I'm not convinced :)

Since the API has to be mirrored on the user side, the cost is significantly more than 0.5s. In Wasmi's case, the difference between SIMD and non-SIMD builds is very much worth the effort.


nagisa commented May 9, 2025

How bad is the impact if you implement all of those methods with a plain call to a #[cold] method that returns "disabled"/panics when SIMD is disabled?
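A minimal sketch of that idea (hypothetical names, not wasmparser's API): the SIMD methods get default bodies that forward to a single shared `#[cold]` helper, so implementors without SIMD support never have to mention those methods at all.

```rust
// Sketch only: the helper returns an error; a real implementation
// might panic!() instead, as suggested above.
#[derive(Debug, PartialEq)]
enum Outcome {
    Handled,
    SimdDisabled,
}

// One cold, never-inlined fallback shared by every SIMD method keeps
// the codegen cost of the defaults small.
#[cold]
#[inline(never)]
fn simd_disabled() -> Outcome {
    Outcome::SimdDisabled
}

trait VisitOperator {
    fn visit_i32_add(&mut self) -> Outcome;

    // Default body: implementors that don't care about SIMD write nothing.
    fn visit_v128_load(&mut self) -> Outcome {
        simd_disabled()
    }
}

struct Plain;

impl VisitOperator for Plain {
    fn visit_i32_add(&mut self) -> Outcome {
        Outcome::Handled
    }
}
```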


Robbepop commented May 9, 2025

How bad is the impact if you implement all of those methods with a plain call to a #[cold] method that returns "disabled"/panics when SIMD is disabled?

That's basically what I did before the change to an optional simd crate feature (via macro), and as mentioned earlier, the impact was large enough to justify the change from the perspective of users such as Wasmi.

To provide you with concrete numbers for Wasmi:

  • cargo build -p wasmi: ~4.8s
  • cargo build -p wasmi --features simd: ~7.3s

Given that most Wasmi users do not need Wasm SIMD, this is a huge win; the overhead would be unnecessary if SIMD support were baked in.

@nagisa force-pushed the merges-simd-visitor branch from 45d4974 to 6c9ef22 on May 9, 2025 at 14:35
@alexcrichton alexcrichton left a comment


@Robbepop I'm curious if you can detail the numbers a bit more? This is part of what I was afraid of with the original split: maintaining such a split is pretty cumbersome, and IMO it is only worth it if it has few downsides. Handling the split has enough downsides already, @nagisa's use case seems reasonable to me to support, and it's easily supported with the split we have today.

For example, are you measuring the total compile time of wasmi from scratch? Those numbers are small enough that 2.5s doesn't seem to me like it would really move the needle. You say this is a huge win, but can you clarify how? In my experience fresh builds happen rarely enough, or in situations where 2.5s doesn't really matter, but what are you thinking of? Is there some other operation of which 2.5s is a significant portion, such that reducing the overall time would be beneficial?

Personally, I prefer the split in this PR, where the feature gate only affects runtime behavior and snips out the validator and other portions where possible. I think API-wise this is a much better way to carry conditional features. I'll also point out that your measurements weren't against this style of API, so the exact discrepancy may be quite different with this API.

@@ -924,7 +915,7 @@ macro_rules! define_for_each_simd_operator {
}
};
}
#[cfg(feature = "simd")]

_for_each_operator_group!(define_for_each_simd_operator);

I think this could get dropped in theory with this change, right? The idea being that the macro would always have all the instructions?


Robbepop commented May 10, 2025

Those numbers are small enough that 2.5s doesn't seem to me like it would really move the needle. You say this is a huge win, but can you clarify how? In my experience fresh builds happen rarely enough, or in situations where 2.5s doesn't really matter, but what are you thinking of? Is there some other operation of which 2.5s is a significant portion, such that reducing the overall time would be beneficial?

I feel like we are operating from different angles. Wasmi's niche, compared to other Wasm runtimes such as Wasmer and Wasmtime, is being as lightweight as possible. So while a 2.5s compile-time reduction is insignificant for those other runtimes, it certainly is not for Wasmi, which needs it to remain a viable alternative for certain users.

I thought it was possible for wasmparser to provide an implementation for a wide variety of users, yet I consistently find myself in the unfortunate position of being the "annoying party" that hinders wasmparser feature development or progress because it is not aligned with Wasmi's somewhat unique or niche goals, and I really do feel sorry about this.

tl;dr: maybe it would be better for everyone involved if Wasmi forked wasmparser again, since Wasmi's goals seem to be too unaligned with others'. I would still contribute to wasmparser where valuable to upstream users.

@alexcrichton
Member

I agree wasmi/wasmtime have different constraints, but I also want to clarify that, IMO, that's a good thing. Neither should necessarily take precedence over the other where possible, and the goal is to satisfy as broad a set of use cases as we can. In that sense, while I do want to acknowledge that you're swimming upstream a bit (insofar as wasmparser doesn't automatically fit your use case exactly), you most definitely should not feel sorry for feature requests that risk coming at a cost to other embeddings. While I do think it's possible for a change to go "too far", I don't think we're necessarily there yet.

I'm trying to, for myself at least, explore the design space here. I don't know much about wasmi's constraints so I'm leaning on you to tell me things. I try to infer what I can but I'm unable to infer why ~2s of more build time would have such a large impact, hence my probing for more information. Apologies if that comes off as something along the lines of "surely this use case isn't that important" as that's most definitely not what I mean! To me 2s depends a lot on context. If Cargo slept 2 seconds every time I typed cargo build I'd be very sad as an example.

Any large-ish change in service of one embedding vs. another always needs to be balanced. For example, I realize that not all embeddings are interested in supporting the component model, which is why I've tried to ensure that everything is separated so it can be turned off at any time. I do find it tough to satisfy the two needs of "I want to support everything" in Wasmtime vs. "I want to support only a subset" in wasmi (at least for simd-less builds) from the API-design perspective of this crate. I think it's worth iterating on designs where possible, though; for example, the one proposed by @nagisa here is, I think, objectively better from an API-design standpoint, but there's still the open question of whether it solves the build-time/build-size issues the same way the current solution does.

@Robbepop would you be up for testing out wasmi against a change like this? Basically comparing the simd-less build of today with the simd-less build of a PR like this?

@Robbepop
Collaborator

I too highly prefer a common uniform solution such as wasmparser over maintaining my own fork. Yet, I completely understand that Wasmi's needs are kind of niche, which sometimes makes finding abstractions and solutions that satisfy all users extremely difficult and time consuming. So I felt sorry because Wasmi's needs seem to cause so many difficulties.

Though I want to make clear that I am also very thankful that wasmparser still tries its best to support even those niche use cases, e.g. simd and component-model features, the btree-based collections, visitor-trait-based dispatch, etc. I am looking forward to the trait-based WasmFeatures, as it sounds very promising for solving many performance issues in a generic way.

I agree that the API proposed by @nagisa is better than the current one from a usability perspective, and I would love for wasmparser to change (back) to it if it did not regress compile times.

I try to infer what I can but I'm unable to infer why ~2s of more build time would have such a large impact, hence my probing for more information.

Again, I am very sorry about the sparse information from my side. The only answer I can give is that there were Wasmi users in the past who criticized Wasmi for not being lightweight enough. The reality is that Wasmi competes not only with JIT-based Wasm runtimes such as Wasmtime and Wasmer but mostly with other Wasm interpreters and "singlepass" JITs such as WAMR's fast interpreter, Wasmer's singlepass JIT, Wasm3, etc. The other interpreters especially are even more lightweight than Wasmi, and I have been trying for some time, more or less successfully, to bring the numbers down to be more competitive in that regard. E.g. Wasmi used to compile in 8s (debug) or 15s (release) and now compiles in 5s (debug) or 10s (release). With these numbers in mind, I hope 2.5s of overhead seems big again. The same argument applies to artifact size, though I have not invested as heavily in that area so far.

would you be up for testing out wasmi against a change like this? Basically comparing the simd-less build of today with the simd-less build of a PR like this?

Yes, I am very willing to do more tests. Though, I am currently at RustWeek so not sure how much time I will have this week. Great talk by the way!


nagisa commented May 13, 2025

FWIW I just went with an unconditional features = ["simd"] and bounded my JoinVisitor trait impl on VisitOperator + VisitSimdOperator. Doing so does not really regress anything in particular compared to the situation I had with the version of wasmparser I'm migrating from, other than that users now get to implement not one but two traits.
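A hedged sketch of that workaround, with illustrative trait and method names rather than wasmparser's actual definitions: a generic wrapper bounded on both traits works, but every concrete visitor must now implement two traits.

```rust
// Sketch only: hypothetical names mirroring the split-trait shape.
trait VisitOperator {
    fn visit_i32_add(&mut self) -> u64;
}

trait VisitSimdOperator: VisitOperator {
    fn visit_v128_load(&mut self) -> u64;
}

// The wrapper is only usable with visitors implementing both traits,
// which is what forces downstream users to implement two traits.
struct JoinVisitor<V>(V);

impl<V: VisitOperator + VisitSimdOperator> JoinVisitor<V> {
    fn visit_both(&mut self) -> u64 {
        self.0.visit_i32_add() + self.0.visit_v128_load()
    }
}

// A toy visitor that counts how many operators it has seen.
struct Counting(u64);

impl VisitOperator for Counting {
    fn visit_i32_add(&mut self) -> u64 {
        self.0 += 1;
        1
    }
}

impl VisitSimdOperator for Counting {
    fn visit_v128_load(&mut self) -> u64 {
        self.0 += 1;
        2
    }
}
```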

I'm not currently blocked on anything in particular and there is no rush from my side to get some resolution here. I'm happy to make myself available to help drive some sort of an improvement to the current situation, but at the same time I'm fine if we decide that we should leave the situation as-is for the time being.

@alexcrichton
Member

With these numbers in mind, 2.5s overhead seems big again I hope. Same argumentation applies to artifact size, though I have not as heavily invested in that area so far.

Thanks for the explanation, that makes sense!

Yes, I am very willing to do more tests. Though, I am currently at RustWeek so not sure how much time I will have this week. Great talk by the way!

Heh, thanks! Ok, and with what @nagisa said as well, let's put this on hold until @Robbepop gets a chance to evaluate it. Sounds like there's no rush though, so don't sweat it too much.

Successfully merging this pull request may close these issues.

Impossible to implement VisitOperator::simd_visitor generically