Use amrex::Scan::PrefixSum in InitPlasmaParticles #1314
Merged
MaxThevenet merged 10 commits into Hi-PACE:development on Mar 27, 2026
This PR replaces the many amrex::Scan::ExclusiveSum calls (one per max_ppc) with three amrex::Scan::PrefixSum calls (one per fine patch level), while keeping the order of the initialized plasma particles the same. amrex::Scan::PrefixSum has the advantage that it does not need to allocate memory for each element on the GPU, so we can scan over all ppc in a single operation.

With a fine patch, typically only a small region in the center contains the max_ppc. To limit the number of elements in the scan, the scan is split by fine patch level. The bounding box of each level is determined in the initial reduction, which limits the number of cells the scan operates on for the higher levels. This method also works on the base level if a plasma radius is used. The scan also gives more flexibility to change the order in which plasma particles are laid out in memory.

Because the bounds of a ParallelForRNG are now different, this changes the output of RNG-related quantities for the plasma, the beam, etc.
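The reduce-then-scan idea described above can be sketched on the CPU in plain C++. This is only an illustrative, serial stand-in and not the code in this PR: `CellBox`, `exclusive_prefix_sum`, `bounding_box` and `particle_offsets` are hypothetical names, the grid is a flat 2D array, and the callback-style scan merely mimics the general shape of a prefix sum that reads its input through a functor instead of a pre-allocated array.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical 2D box of cell indices; the *hi bounds are exclusive.
struct CellBox {
    int xlo, ylo, xhi, yhi;
    int nx() const { return std::max(0, xhi - xlo); }
    int ny() const { return std::max(0, yhi - ylo); }
    long size() const { return static_cast<long>(nx()) * ny(); }
};

// Serial stand-in for a callback-style exclusive prefix sum (an assumption,
// not the AMReX implementation): fin(i) supplies element i on the fly and
// fout(i, x) receives the sum of elements [0, i). Returns the total sum.
template <class FIn, class FOut>
long exclusive_prefix_sum(long n, FIn&& fin, FOut&& fout) {
    long running = 0;
    for (long i = 0; i < n; ++i) {
        const long value = fin(i);
        fout(i, running);
        running += value;
    }
    return running;
}

// Reduction step: smallest box that contains every cell with ppc > 0.
// Returns an empty box (size() == 0) if no cell holds particles.
CellBox bounding_box(const std::vector<int>& ppc, int nx, int ny) {
    CellBox b{nx, ny, 0, 0};
    for (int j = 0; j < ny; ++j) {
        for (int i = 0; i < nx; ++i) {
            if (ppc[j * nx + i] > 0) {
                b.xlo = std::min(b.xlo, i);
                b.ylo = std::min(b.ylo, j);
                b.xhi = std::max(b.xhi, i + 1);
                b.yhi = std::max(b.yhi, j + 1);
            }
        }
    }
    return b;
}

// Scan step: one scan over the cells inside the box only. Each cell gets the
// index of its first particle; cells outside the box are left untouched.
long particle_offsets(const std::vector<int>& ppc, int nx, const CellBox& box,
                      std::vector<long>& offsets) {
    assert(offsets.size() == ppc.size());
    const int bnx = box.nx();
    // Map the linear scan index k back to a cell of the full grid.
    auto cell = [&](long k) {
        const int i = box.xlo + static_cast<int>(k % bnx);
        const int j = box.ylo + static_cast<int>(k / bnx);
        return j * nx + i;
    };
    return exclusive_prefix_sum(
        box.size(),
        [&](long k) { return static_cast<long>(ppc[cell(k)]); },
        [&](long k, long x) { offsets[cell(k)] = x; });
}
```

In this sketch, a 4x4 grid whose particles sit only in the central 2x2 cells is scanned over 4 elements instead of 16, and the returned total gives the number of particles to allocate.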
Using examples/get_started/inputs_mesh_refinement, which has a plasma fine patch with a 32^2 refinement ratio, and with elec.ppc = 4 4 ion.ppc = 4 4 added, the new version is more than an order of magnitude faster while using a little less memory.