Use amrex::Scan::PrefixSum in InitPlasmaParticles #1314

Merged
MaxThevenet merged 10 commits into Hi-PACE:development from AlexanderSinn:Use_amrex__Scan__PrefixSum_in_InitPlasmaParticles on Mar 27, 2026

AlexanderSinn commented on Nov 20, 2025

This PR replaces the many amrex::Scan::ExclusiveSum calls (one per max_ppc) with three amrex::Scan::PrefixSum calls (one per fine patch level) while keeping the order of the initialized plasma particles the same. amrex::Scan::PrefixSum has the advantage that it does not need to allocate memory for each element on the GPU, so we can scan over all ppc in a single operation.

With a fine patch, typically only a small region in the center reaches max_ppc. To limit the number of elements in the scan, the scan is split by fine patch level. The bounding box of each level is determined in the initial reduction, which limits the number of cells the scan operates on for the higher levels. This method also works on the base level if a plasma radius is used. The scan also allows more flexibility in changing the order in which plasma particles are laid out in memory.

Because the bounds of a ParallelForRNG are now different, this changes the output of RNG-related quantities for plasma, beam, etc.
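The single-scan idea can be sketched on the CPU as follows. This is a minimal serial stand-in, not HiPACE++ or AMReX code: `exclusive_prefix_sum` and `particle_slots` are hypothetical helpers, and the element ordering (ppc index outermost, cell innermost) is an assumption chosen so that the result matches what one scan per ppc index would produce. What it illustrates is that the scan reads its input from a callable and hands its output to a callable, so no per-element input buffer is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Serial stand-in for a callable-based exclusive prefix sum (hypothetical
// helper, not the AMReX implementation). in(i) yields the value of element i,
// out(i, s) receives the sum of all elements before i. Returns the total.
template <class In, class Out>
std::size_t exclusive_prefix_sum(std::size_t n, In in, Out out) {
    std::size_t running = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t value = in(i);
        out(i, running);
        running += value;
    }
    return running;
}

// Assign a destination slot to every (ippc, cell) element that creates a
// particle; elements that create none keep -1. Assumed ordering: element
// index e = ippc * num_cells + cell.
std::vector<long> particle_slots(const std::vector<int>& ppc_in_cell,
                                 int max_ppc, std::size_t& total) {
    assert(max_ppc >= 0);
    const std::size_t num_cells = ppc_in_cell.size();
    const std::size_t n = num_cells * static_cast<std::size_t>(max_ppc);
    std::vector<long> slot(n, -1);

    // True if this (ippc, cell) element corresponds to a real particle.
    auto creates_particle = [&](std::size_t e) {
        const std::size_t cell = e % num_cells;
        const int ippc = static_cast<int>(e / num_cells);
        return ippc < ppc_in_cell[cell];
    };

    // One scan over all (ippc, cell) elements instead of one scan per ippc.
    total = exclusive_prefix_sum(
        n,
        [&](std::size_t e) {
            return creates_particle(e) ? std::size_t{1} : std::size_t{0};
        },
        [&](std::size_t e, std::size_t sum) {
            if (creates_particle(e)) slot[e] = static_cast<long>(sum);
        });
    return slot;
}
```

For `ppc_in_cell = {1, 4, 0}` and `max_ppc = 4`, five of the twelve elements receive a slot, numbered 0 to 4 in element order, and `total` is 5.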

Using examples/get_started/inputs_mesh_refinement, which has a plasma fine patch with a 32^2 refinement ratio, and with elec.ppc = 4 4 and ion.ppc = 4 4 added, the new version is more than an order of magnitude faster while using slightly less memory.

dev:

Name                                                   NCalls  Excl. Min  Excl. Avg  Excl. Max   Max %
PlasmaParticleContainer::InitParticles()                    2      3.469      3.469      3.469   3.42%

Name                                             Nalloc  Nfree    AvgMem    MaxMem
PlasmaParticleContainer::InitParticles()          98334  98334   464 MiB  5763 MiB

PR:

Scan ilev 0, bx ((0,0,0) (1022,1022,16) (0,0,0)), numPts 17790993
Scan ilev 1, bx ((495,495,17) (527,527,16384) (0,0,0)), numPts 17824752
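
The numPts values in these two log lines are consistent with the number of points in the printed box, i.e. the product of (hi - lo + 1) over the three index directions. A small check, assuming the first two triples in bx are the inclusive lower and upper corners (`num_pts` is a hypothetical helper written for this check, not an AMReX call):

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Number of points in a box with inclusive integer corners lo and hi
// (hypothetical helper, assumes hi >= lo in every direction).
std::int64_t num_pts(const std::array<int, 3>& lo, const std::array<int, 3>& hi) {
    std::int64_t n = 1;
    for (int d = 0; d < 3; ++d) {
        assert(hi[d] >= lo[d]);
        n *= static_cast<std::int64_t>(hi[d]) - lo[d] + 1;
    }
    return n;
}
```

Both scans therefore cover a similar number of elements (about 17.8 million each), even though the level 1 box spans only 33 x 33 points in the first two directions.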

Name                                                   NCalls  Excl. Min  Excl. Avg  Excl. Max   Max %
PlasmaParticleContainer::InitParticles()                    2     0.2886     0.2886     0.2886   0.29%

Name                                             Nalloc  Nfree    AvgMem    MaxMem
PlasmaParticleContainer::InitParticles()             34     34   336 MiB  5757 MiB
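
The comparison between the two profiles can be made explicit with a short calculation. The helpers `ratio` and `reduction` below are hypothetical names introduced only for this arithmetic; the inputs are the values from the dev and PR tables above.

```cpp
#include <cassert>

// Ratio of a baseline measurement to a new measurement (hypothetical helper).
double ratio(double baseline, double updated) {
    assert(updated > 0.0);
    return baseline / updated;
}

// Fractional reduction relative to the baseline, e.g. 0.25 means 25 % less.
double reduction(double baseline, double updated) {
    assert(baseline > 0.0);
    return (baseline - updated) / baseline;
}
```

With the numbers above, `ratio(3.469, 0.2886)` is about 12 for the exclusive time and `ratio(98334, 34)` is about 2900 for the number of allocations. `reduction(464, 336)` is about 0.28 for the average memory, while MaxMem changes only from 5763 MiB to 5757 MiB.
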
  • Small enough (< few 100s of lines), otherwise it should probably be split into smaller PRs
  • Tested (describe the tests in the PR description)
  • Runs on GPU (basic: the code compiles and runs well with the new module)
  • Contains an automated test (checksum and/or comparison with theory)
  • Documented: all elements (classes and their members, functions, namespaces, etc.) are documented
  • Constified (All that can be const is const)
  • Code is clean (no unwanted comments)
  • Style and code conventions listed at the bottom of https://github.com/Hi-PACE/hipace are respected
  • Proper label and GitHub project, if applicable

AlexanderSinn added the labels "component: plasma" (About the plasma species) and "performance" (optimization, benchmark, profiling, etc.) on Dec 3, 2025

MaxThevenet left a comment

Thanks for this PR!

MaxThevenet merged commit d47a7d7 into Hi-PACE:development on Mar 27, 2026
11 checks passed