Skip to content

Parallel HDF5: Try fixing strict collective requirements of HDF5 >= 2.0#1862

Draft
franzpoeschel wants to merge 85 commits intoopenPMD:devfrom
franzpoeschel:fix-parallel-hdf5
Draft

Parallel HDF5: Try fixing strict collective requirements of HDF5 >= 2.0#1862
franzpoeschel wants to merge 85 commits intoopenPMD:devfrom
franzpoeschel:fix-parallel-hdf5

Conversation

@franzpoeschel
Copy link
Contributor

@franzpoeschel franzpoeschel commented Mar 11, 2026

It seems that HDF5 has become quite a bit pickier about metadata definitions in parallel setups with versions 2.0 and 2.1, leading to hangups.
Earlier, it was enough to define them consistently across ranks, now we apparently have to keep the exact same order of operations.

This is bad for the Span API which runs internal flushes for structure setup.

  • No longer flush inside the Span API, instead flush already at resetDataset()
  • Try if we can only enqueue operations and run them later, should be fine for the ordering
  • Ensure that only a select type of operation runs in structural setup flushes
  • Remove flushParticlesPath and flushMeshesPath functions, these unnecessarily leaked attribute flushes into the structure setup
  • Fix tests...
  • Guard setDirty calls
  • There are some commits deactivating a handful of tests, deal with them

}
} // namespace

std::future<void> AbstractIOHandlerImpl::flush(FlushLevel l)

Check warning

Code scanning / CodeQL

Poorly documented large function Warning

Poorly documented function: fewer than 2% comments for a function of 428 lines.
while (!(*m_handler).m_work.empty())
{
IOTask &i = (*m_handler).m_work.front();
// verifyFlushType(i.operation, l);

Check notice

Code scanning / CodeQL

Commented-out code Note

This comment appears to contain commented-out code.
{
template <typename T, typename RecordType>
PostProcessConvertedAttributeImpl<T, RecordType>::
PostProcessConvertedAttributeImpl(RecordType record_in, handler_t reader_in)

Check notice

Code scanning / CodeQL

Large object passed by value Note

This parameter of type
Mesh
is 112 bytes - consider passing a const pointer/reference instead.
Comment on lines +698 to +701
REQUIRE(r.numAttributes() == 0);
// TODO: unitSI
// REQUIRE(r["x"].numAttributes() == 0);
// REQUIRE(r["y"].numAttributes() == 0);

Check notice

Code scanning / CodeQL

Commented-out code Note test

This comment appears to contain commented-out code.
MPI_COMM_WORLD);
i.parseBase();
REQUIRE(i.rankTable(/* collective = */ true) == compare);
// REQUIRE(i.rankTable(/* collective = */ true) == compare);

Check notice

Code scanning / CodeQL

Commented-out code Note test

This comment appears to contain commented-out code.
MPI_COMM_WORLD);
i.parseBase();
REQUIRE(i.rankTable(/* collective = */ false) == compare);
// REQUIRE(i.rankTable(/* collective = */ false) == compare);

Check notice

Code scanning / CodeQL

Commented-out code Note test

This comment appears to contain commented-out code.
Access::READ_RANDOM_ACCESS,
MPI_COMM_WORLD);
REQUIRE(i.rankTable(/* collective = */ true) == compare);
// REQUIRE(i.rankTable(/* collective = */ true) == compare);

Check notice

Code scanning / CodeQL

Commented-out code Note test

This comment appears to contain commented-out code.
Access::READ_RANDOM_ACCESS,
MPI_COMM_WORLD);
REQUIRE(i.rankTable(/* collective = */ false) == compare);
// REQUIRE(i.rankTable(/* collective = */ false) == compare);

Check notice

Code scanning / CodeQL

Commented-out code Note test

This comment appears to contain commented-out code.
*/
auto rankMetaIn = series.rankTable(/* collective = */ true);
OPENPMD_REQUIRE_GUARD_WINDOWS(rankMetaIn == writingRanksHostnames);
// OPENPMD_REQUIRE_GUARD_WINDOWS(rankMetaIn == writingRanksHostnames);

Check notice

Code scanning / CodeQL

Commented-out code Note test

This comment appears to contain commented-out code.
TEST_CASE("adios2_chunk_distribution", "[parallel][adios2]")
{
adios2_chunk_distribution::run_test();
// adios2_chunk_distribution::run_test();

Check notice

Code scanning / CodeQL

Commented-out code Note test

This comment appears to contain commented-out code.
if (test_rank_table)
{
REQUIRE(read.rankTable(/* collective = */ false) == compare);
// REQUIRE(read.rankTable(/* collective = */ false) == compare);

Check notice

Code scanning / CodeQL

Commented-out code Note test

This comment appears to contain commented-out code.
<< host << "'\n";
}
REQUIRE(rank_table.size() == 1);
// REQUIRE(rank_table.size() == 1);

Check notice

Code scanning / CodeQL

Commented-out code Note test

This comment appears to contain commented-out code.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant