Skip to content

[AP][MassLegalizer] Resolved Issues in Partitioner #3199

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

AlexandreSinger
Copy link
Contributor

Found some low-hanging fruit in the partial legalizer regarding how we merge the windows and how we partition the blocks.

I found that the windows were not being fully merged (i.e. there still existed overlap between windows). I updated the algorithm to iteratively retry to merge if the bounding box of the window changes. To maintain good performance, I added a spatial lookup to narrow down the region where overlap can occur.

The partitioning of blocks had a major bug in it where it was not actually moving blocks over most of the time. This was causing the partial legalizer to struggle to resolve of a legal solution. To resolve this, I completely rewrote the partitioning algorithm to make it more stable. The new algorithm starts by putting the blocks on the side they want (if their solved position is on the left of the partition, the will go into the left partition). It then moves the blocks closest to the partition line until the overfill of both partitions is as balanced as can be.

These two optimizations together seem to have improved WL by around 2% and CPD by around 1%.

Found some low-hanging fruit in the partial legalizer regarding how we
merge the windows and how we partition the blocks.

I found that the windows were not being fully merged (i.e. there still
existed overlap between windows). I updated the algorithm to iteratively
retry to merge if the bounding box of the window changes. To maintain
good performance, I added a spatial lookup to narrow down the region
where overlap can occur.

The partitioning of blocks had a major bug in it where it was not
actually moving blocks over most of the time. This was causing the
partial legalizer to struggle to resolve of a legal solution. To resolve
this, I completely rewrote the partitioning algorithm to make it more
stable. The new algorithm starts by putting the blocks on the side they
want (if their solved position is on the left of the partition, the will
go into the left partition). It then moves the blocks closest to the
partition line until the overfill of both partitions is as balanced as
can be.

These two optimizations together seem to have improved WL by around 2%
and CPD by around 1%.
@github-actions github-actions bot added VPR VPR FPGA Placement & Routing Tool lang-cpp C/C++ code labels Jul 11, 2025
@AlexandreSinger
Copy link
Contributor Author

Results on Titan compared to Baseline AP (no fixed blocks):

circuit post_gp_hpwl post_fl_hpwl post_dp_hpwl total_wirelength post_gp_cpd post_fl_cpd post_dp_cpd crit_path_delay
LU230_stratixiv_arch_timing.blif 0.94 0.97 1.03 1.03 1.00 1.13 1.00 0.97
LU_Network_stratixiv_arch_timing.blif 1.13 1.10 0.91 0.94 1.07 1.15 0.92 0.88
SLAM_spheric_stratixiv_arch_timing.blif 0.26 0.70 0.79 0.79 0.70 0.96 0.97 0.96
bitcoin_miner_stratixiv_arch_timing.blif 0.95 1.01 1.03 1.02 1.12 0.69 0.89 0.95
bitonic_mesh_stratixiv_arch_timing.blif 0.82 0.91 0.99 0.98 0.85 0.97 0.96 1.03
cholesky_bdti_stratixiv_arch_timing.blif 1.02 1.05 1.03 1.01 0.90 1.03 0.97 0.95
cholesky_mc_stratixiv_arch_timing.blif 1.04 1.04 0.98 1.00 1.52 0.98 0.98 0.98
dart_stratixiv_arch_timing.blif 0.99 0.99 1.03 1.02 0.90 0.92 1.01 1.04
denoise_stratixiv_arch_timing.blif 0.82 0.83 0.96 0.99 0.97 0.97 1.01 1.01
des90_stratixiv_arch_timing.blif 1.01 0.90 0.95 0.96 0.99 1.01 1.03 1.04
directrf_stratixiv_arch_timing.blif 1.12 1.18 0.89 0.89 0.99 1.10 1.06 1.07
gsm_switch_stratixiv_arch_timing.blif 1.01 1.03 0.96 0.98 1.36 1.00 1.07 1.09
mes_noc_stratixiv_arch_timing.blif 0.94 0.92 1.04 1.03 1.04 1.02 0.96 0.97
minres_stratixiv_arch_timing.blif 1.01 1.05 1.03 1.03 1.02 1.06 1.10 0.89
neuron_stratixiv_arch_timing.blif 0.98 0.93 0.96 0.97 0.96 1.03 0.98 0.94
openCV_stratixiv_arch_timing.blif 1.04 1.01 1.01 1.01 1.01 0.91 1.03 1.08
segmentation_stratixiv_arch_timing.blif 0.94 1.04 0.98 0.97 0.98 0.99 0.99 0.98
sparcT1_chip2_stratixiv_arch_timing.blif 0.96 0.98 1.00 1.00 0.86 0.97 0.88 0.90
sparcT1_core_stratixiv_arch_timing.blif 0.98 1.04 1.02 1.01 0.96 1.07 0.96 1.00
sparcT2_core_stratixiv_arch_timing.blif 0.99 0.96 0.99 0.99 1.19 0.94 1.00 0.98
stap_qrd_stratixiv_arch_timing.blif 1.13 1.08 0.95 0.96 1.27 1.14 1.02 1.05
stereo_vision_stratixiv_arch_timing.blif 1.03 0.98 1.00 1.00 0.91 0.94 0.99 1.01
                 
GEOMEAN 0.93 0.98 0.98 0.98 1.01 0.99 0.99 0.99

Overall a healthy improvement across the board. This shows that the full legalizer is doing its job well, since improving the partial legalizer improved the overall QoR.

Copy link
Contributor

@vaughnbetz vaughnbetz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few possible changes and a commenting clarification (or maybe a change to xmax behaviour, depending on what the answer is).

// Get the lower and upper bounds in the spatial lookup. This will
// give a list of all windows which could possibly overlap right now
// (at least in the x dimension).
// NOTE: There may be diplicates; but thats ok since these checks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

diplicates -> duplicates

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't you inert region.xmax into the map, and make upper a function of region.xmax, instead of using region.xmax-1 on both insertion and checking for the bound? Aren't the regions inclusive where they include their xmax too?
If there's a reason you're not including the xmax boundary you should comment what it is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ill improve the comments. It is correct to subtract the 1 here. You are correct that the regions are inclusive (they spill right over to the crust from xmin to xmax); however we do not consider two windows to overlap if the xmax of one window is the same as the xmin of another window (i.e. they are touching on one side). Hence, for this map, we can exclude searching for any windows that begin at the xmax coordinate since we do not care if any windows are found there.

Thinking about this more though, this only works if the windows are always at integer locations (which they are right now, but I would like to keep this general if I can). So really this should be xmax - epsilon, where epsilon is some reasonably small number.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a really good point. Looking at my code, this was very unclear. I have used a more general way of handling this which is much better now. We now exclude the border from being included in this map.

Made the window merging boundary condition a bit more obvious.

Outlined common code into helper methods.
Copy link
Contributor

@vaughnbetz vaughnbetz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just one typo ...

@AlexandreSinger AlexandreSinger merged commit bf921db into verilog-to-routing:master Jul 14, 2025
29 checks passed
@AlexandreSinger AlexandreSinger deleted the feature-ap-partial-legalizer branch July 14, 2025 20:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lang-cpp C/C++ code VPR VPR FPGA Placement & Routing Tool
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants