
Cache locality and speed #7453

Open
@kripken

Description


Talking with @tlively , we realized that --roundtrip will restructure code back into a cache-friendly form, since it serializes the module and then reads it back, and when we read it, we allocate adjacent instructions contiguously in an arena. Imagine we begin with unoptimized code; optimizations then quickly scatter pointers to arbitrary places in memory, but doing a --roundtrip can "fix" that, and might be worth it if we run more optimizations afterwards.

To measure this, I took a large unoptimized Kotlin testcase I have. -O3 takes 50 seconds, a second -O3 after it takes 25 seconds (it makes sense it would be faster, since after the first cycle, there is a lot less code). Adding a --roundtrip between the two adds 2 seconds for the roundtrip itself, but makes the total time 2 seconds faster. So ignoring the roundtrip's time, we gain 4 seconds on the second -O3, which is something like 15% faster.

Perhaps we should try harder to reuse existing instructions when rewriting - we already do that in OptimizeInstructions in some places, but it does make the code more complex. Perhaps helper utilities could do that in nice ways, though.

In theory we could consider doing some reordering ("defrag") that is more efficient than a roundtrip, automatically after enough passes have been run.

cc #4165
