
Cache locality and speed #7453

Open
@kripken

Description


Talking with @tlively , we realized that --roundtrip will restructure code back into a cache-friendly form, since it serializes the module and then reads it back, and when we read it, we allocate adjacent instructions contiguously in an arena. Imagine we begin with unoptimized code; optimizations then quickly scatter pointers to arbitrary places in memory, but doing a --roundtrip can "fix" that, and might be worth it if we run more optimizations afterwards.

To measure this, I took a large unoptimized Kotlin testcase I have. -O3 takes 50 seconds, a second -O3 after it takes 25 seconds (it makes sense it would be faster, since after the first cycle, there is a lot less code). Adding a --roundtrip between the two adds 2 seconds for the roundtrip itself, but makes the total time 2 seconds faster. So ignoring the roundtrip's time, we gain 4 seconds on the second -O3, which is something like 15% faster.

Perhaps we should try harder to reuse existing instructions when rewriting - we already do that in OptimizeInstructions in some places, but it does make the code more complex. Perhaps helper utilities could do that in nice ways, though.

In theory we could consider doing some reordering ("defrag") that is more efficient than a roundtrip, automatically after enough passes have been run.

cc #4165
