
[AP][MassLegalizer] Revisited Mass Legalizer #2997


Open
wants to merge 1 commit into
base: master
from

Conversation

AlexandreSinger
Contributor

Found that the mass legalizer was not spreading out the blocks well enough according to the mass.

Revisited the spatial partitioning in the mass legalizer. Before, we just cut the window in half along its larger dimension. This was fine, but it may create an imbalanced cut, which can prevent the blocks from spreading well. Instead, we now search for the best partition by trying different partition lines and computing how balanced each resulting partition is. Although this is more expensive than before, creating more balanced partitions should allow the mass legalizer to converge faster. Time in the mass legalizer is also dominated by partitioning the blocks, so increasing the time to choose the partition line should not have a large effect anyway.
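To make the idea of searching for a balanced partition line concrete, here is a minimal sketch, not the code in this PR. It assumes a one-dimensional window split into columns, a scalar mass per block, and a scalar capacity per column; the `Block` struct, the `find_balanced_cut` function, and the balance measure (difference between the lower partition's share of mass and its share of capacity) are all hypothetical names and choices made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical block record for illustration; not a type from the VPR code base.
struct Block {
    double pos;   // coordinate along the dimension being cut
    double mass;  // scalar stand-in for the block's mass
};

// Returns a cut index c in [1, num_cols - 1]: columns [0, c) form the lower
// partition and columns [c, num_cols) form the upper partition. The chosen
// cut minimizes |mass share - capacity share| of the lower partition.
// (Assumed balance measure; the PR does not spell out its exact metric.)
std::size_t find_balanced_cut(const std::vector<Block>& blocks,
                              const std::vector<double>& col_capacity) {
    const std::size_t num_cols = col_capacity.size();
    assert(num_cols >= 2 && "need at least two columns to cut a window");

    // Accumulate the mass that currently sits in each column.
    std::vector<double> col_mass(num_cols, 0.0);
    double total_mass = 0.0;
    for (const Block& b : blocks) {
        double clamped = std::clamp(b.pos, 0.0, static_cast<double>(num_cols - 1));
        col_mass[static_cast<std::size_t>(clamped)] += b.mass;
        total_mass += b.mass;
    }
    double total_cap = 0.0;
    for (double cap : col_capacity) total_cap += cap;

    // Nothing to balance: fall back to cutting the window in half.
    if (total_mass <= 0.0 || total_cap <= 0.0) return num_cols / 2;

    std::size_t best_cut = num_cols / 2;
    double best_imbalance = std::numeric_limits<double>::infinity();
    double lower_mass = 0.0;
    double lower_cap = 0.0;
    for (std::size_t cut = 1; cut < num_cols; ++cut) {
        // Moving the cut one column to the right adds column (cut - 1)
        // to the lower partition.
        lower_mass += col_mass[cut - 1];
        lower_cap += col_capacity[cut - 1];
        double imbalance = std::abs(lower_mass / total_mass - lower_cap / total_cap);
        if (imbalance < best_imbalance) {
            best_imbalance = imbalance;
            best_cut = cut;
        }
    }
    return best_cut;
}
```

With column capacities {4, 1, 1, 2} and column masses {4, 2, 1, 1}, cutting in half (cut index 2) would give the lower partition 75% of the mass but only 62.5% of the capacity, while cut index 1 gives it 50% of both, so the sketch picks cut index 1. The search is linear in the number of columns once the per-column sums are known, which fits the observation that choosing the line is cheap relative to partitioning the blocks.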

Found an oversight in how blocks were partitioned when one of the partitions becomes overfilled. Fixed this issue.

@github-actions bot added the VPR (VPR FPGA Placement & Routing Tool) and lang-cpp (C/C++ code) labels on Apr 22, 2025
@AlexandreSinger
Contributor Author

Results on the largest VTR circuits with fixed IO and timing turned off:

| Metric | Change over Baseline |
| --- | --- |
| post_gp_hpwl | 1.266305542 |
| post_fl_hpwl | 1.078644755 |
| post_dp_hpwl | 1.009669516 |
| total_wirelength | 1.00871561 |
| post_gp_overfilled_bins | 1.374204126 |
| post_gp_avg_overfill | 0.2909285826 |
| post_fl_avg_disp | 1.101741122 |
| post_fl_max_disp | 1.053551428 |
| ap_gp_runtime | 0.9868260052 |

The important thing to notice here is that the average overfill per overfilled bin after global placement decreased by about 71% (ratio 0.29). The number of overfilled bins after global placement increased by about 37% (ratio 1.37); however, this demonstrates that the mass legalizer is now able to spread blocks out more, across more bins, instead of leaving them piled up in a few. The result is still not perfectly mass legal, though.
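The two overfill metrics can move in opposite directions, which the following sketch illustrates. It assumes that `post_gp_overfilled_bins` counts bins whose utilization exceeds capacity and that `post_gp_avg_overfill` averages the excess over those bins only; these definitions, the scalar mass model, and the names `OverfillStats` and `compute_overfill_stats` are assumptions for illustration, not taken from the VPR source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical summary of the two overfill metrics discussed above.
struct OverfillStats {
    std::size_t num_overfilled_bins;  // bins with utilization > capacity
    double avg_overfill;              // mean excess over overfilled bins only
};

// Assumed definitions: a bin's overfill is (utilization - capacity) when
// positive, and the average is taken over overfilled bins, not all bins.
OverfillStats compute_overfill_stats(const std::vector<double>& bin_utilization,
                                     const std::vector<double>& bin_capacity) {
    assert(bin_utilization.size() == bin_capacity.size());

    OverfillStats stats{0, 0.0};
    double total_overfill = 0.0;
    for (std::size_t i = 0; i < bin_utilization.size(); ++i) {
        double overfill = bin_utilization[i] - bin_capacity[i];
        if (overfill > 0.0) {
            ++stats.num_overfilled_bins;
            total_overfill += overfill;
        }
    }
    if (stats.num_overfilled_bins > 0) {
        stats.avg_overfill =
            total_overfill / static_cast<double>(stats.num_overfilled_bins);
    }
    return stats;
}
```

For example, with four bins of capacity 1, utilizations {5, 0, 0, 0} give one overfilled bin with an average overfill of 4, while the more spread-out {2, 2, 1, 0} gives two overfilled bins with an average overfill of 1: more overfilled bins, but each one much closer to legal.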

This had a slightly negative effect on wirelength; however, this is likely due to the mass abstraction not being good. Since we were not following the mass abstraction exactly before, we may have been getting lucky.

Running Titan now to see how it gets affected by these changes.

@AlexandreSinger force-pushed the feature-ap-mass-legalizer branch from 6efc7ac to f336abb on April 22, 2025 20:20
@AlexandreSinger
Contributor Author

Titan results (no fixed IOs, no timing analysis):

| Metric | Change over Baseline |
| --- | --- |
| post_gp_hpwl | 1.420 |
| post_fl_hpwl | 0.804 |
| post_dp_hpwl | 1.006 |
| total_wirelength | 1.002 |
| post_gp_overfilled_bins | 1.403 |
| post_gp_avg_overfill | 0.766 |
| post_fl_avg_disp | 0.709 |
| post_fl_max_disp | 0.824 |
| ap_gp_runtime | 1.076 |

The number of overfilled bins increased, but the average overfill decreased. This implies that blocks are spreading better across the device. Since the solution is becoming more "legal", the post-GP HPWL got worse, but that is to be expected.

Since GP is producing a more legal solution, the post-FL HPWL improved by around 20%. This is likely because the atoms are more spread out, which also shows in the improved average and max displacements (about 29% and 18% lower, respectively).

The GP runtime took an 8% hit on average. To be transparent, a couple of the smaller circuits were hit hard by this change. For example, stereo vision (the smallest circuit) saw a 50% increase in GP runtime, because the legalizer dominates the time in the GP stage and there are not many blocks to spread. Overall, I still think this is a good change.

@vaughnbetz What do you think about these results? For the VTR benchmarks I did not notice a runtime increase at all, but for a couple of the Titan circuits the more expensive way of finding partitions seems to show itself.

@AlexandreSinger
Contributor Author

@amin1377 FYI.

@vaughnbetz
Contributor

For results I'll go with whatever you think is best. The flow is rapidly changing, so if you think this is a good long-term change (and it seems to be), we should merge it even if some QoR metrics (final wirelength) are a tie or slightly worse. Legality looks to be improved, and wirelength looks to be roughly a wash between Titan and VTR.

Labels
lang-cpp C/C++ code VPR VPR FPGA Placement & Routing Tool
2 participants