Allow Stores to opt out of consolidated metadata. #3119

Merged
merged 13 commits into from
Jun 11, 2025
Changes from 3 commits
16 changes: 16 additions & 0 deletions docs/user-guide/consolidated_metadata.rst
@@ -114,3 +114,19 @@ removed, or modified, consolidated metadata may not be desirable.
metadata.

.. _Consolidated Metadata: https://github.com/zarr-developers/zarr-specs/pull/309

Stores Without Support for Consolidated Metadata
------------------------------------------------

Some stores may want to opt out of the consolidated metadata mechanism, for
reasons such as:

* They want to maintain read-write consistency, which is challenging with
consolidated metadata.
* They have their own consolidated metadata mechanism.
* They offer good enough performance without need for consolidation.

A store of this type can declare that it doesn't want consolidation by
overriding `Store.supports_consolidated_metadata` to return `False`. For stores
that don't support consolidation, Zarr will silently ignore any
`consolidate_metadata` calls, leaving the store in its unconsolidated state.
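
The opt-out pattern described above can be sketched without Zarr itself. The block below is a minimal illustration, assuming hypothetical stand-in names (`SketchStore`, `SelfConsolidatingStore`, `consolidate`, and the `"consolidated"` key); none of these are part of the Zarr API, and only the `supports_consolidated_metadata` property name comes from this change.

```python
import asyncio


class SketchStore:
    """Hypothetical stand-in for a store; this is not zarr's Store class."""

    def __init__(self) -> None:
        self.data: dict[str, bytes] = {}

    @property
    def supports_consolidated_metadata(self) -> bool:
        # Default mirrors the diff: stores support consolidation unless they opt out.
        return True

    async def set(self, key: str, value: bytes) -> None:
        self.data[key] = value


class SelfConsolidatingStore(SketchStore):
    """A store that opts out by overriding the property."""

    @property
    def supports_consolidated_metadata(self) -> bool:
        return False


async def consolidate(store: SketchStore) -> bool:
    """Return True if consolidated metadata was written, False if skipped."""
    if not store.supports_consolidated_metadata:
        # Opted-out stores are left untouched; no write is attempted.
        return False
    # Hypothetical key and payload, standing in for the real consolidated document.
    await store.set("consolidated", b"{}")
    return True


plain = SketchStore()
opted_out = SelfConsolidatingStore()
wrote_plain = asyncio.run(consolidate(plain))
wrote_opted_out = asyncio.run(consolidate(opted_out))
```

Checking the flag before any write means an opted-out store never sees a `set` call, which is the property the test in this change relies on.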
12 changes: 12 additions & 0 deletions src/zarr/abc/store.py
@@ -264,6 +264,18 @@
"""
await gather(*starmap(self.set, values))

@property
def supports_consolidated_metadata(self) -> bool:
"""
Does the store support and benefit from consolidated metadata?

If it doesn't, Zarr will ignore requests to consolidate the metadata.
Stores that would return `False` are the ones that implement their own
consolidation mechanism, which allows fast querying of metadata keys.
"""

return True

Codecov (codecov/patch): added line #L277 in src/zarr/abc/store.py was not covered by tests.

@property
@abstractmethod
def supports_deletes(self) -> bool:
10 changes: 8 additions & 2 deletions src/zarr/api/asynchronous.py
@@ -174,7 +174,8 @@
Consolidate the metadata of all nodes in a hierarchy.

Upon completion, the metadata of the root node in the Zarr hierarchy will be
updated to include all the metadata of child nodes.
updated to include all the metadata of child nodes. For Stores that prefer
not to use consolidated metadata, this operation does nothing.

Parameters
----------
@@ -194,11 +195,16 @@
-------
group: AsyncGroup
The group, with the ``consolidated_metadata`` field set to include
the metadata of each child node.
the metadata of each child node. If the Store doesn't prefer
consolidated metadata, this function does nothing and returns
the group without modifications. See ``Store.supports_consolidated_metadata``.
"""
store_path = await make_store_path(store, path=path)

group = await AsyncGroup.open(store_path, zarr_format=zarr_format, use_consolidated=False)
if not store_path.store.supports_consolidated_metadata:
return group

Codecov (codecov/patch): added lines #L205-L206 in src/zarr/api/asynchronous.py were not covered by tests.

group.store_path.store._check_writable()

members_metadata = {
8 changes: 6 additions & 2 deletions src/zarr/api/synchronous.py
@@ -81,7 +81,8 @@ def consolidate_metadata(
Consolidate the metadata of all nodes in a hierarchy.

Upon completion, the metadata of the root node in the Zarr hierarchy will be
updated to include all the metadata of child nodes.
updated to include all the metadata of child nodes. For Stores that prefer
not to use consolidated metadata, this operation does nothing.

Parameters
----------
@@ -101,7 +102,10 @@ def consolidate_metadata(
-------
group: Group
The group, with the ``consolidated_metadata`` field set to include
the metadata of each child node.
the metadata of each child node. If the Store doesn't prefer
consolidated metadata, this function does nothing and returns
the group without modifications. See ``Store.supports_consolidated_metadata``.

"""
return Group(sync(async_api.consolidate_metadata(store, path=path, zarr_format=zarr_format)))

9 changes: 7 additions & 2 deletions src/zarr/core/group.py
@@ -492,8 +492,11 @@
store (in the ``zarr.json`` for Zarr format 3 and in the ``.zmetadata`` file
for Zarr format 2).

To explicitly require consolidated metadata, set ``use_consolidated=True``,
which will raise an exception if consolidated metadata is not found.
To explicitly require consolidated metadata, set ``use_consolidated=True``.
If the Store supports consolidated metadata, this will raise an
exception if consolidated metadata is not found. If the Store doesn't want
to use consolidated metadata, we assume it implements its own consolidation,
so this is equivalent to ``use_consolidated=False``.

To explicitly *not* use consolidated metadata, set ``use_consolidated=False``,
which will fall back to using the regular, non consolidated metadata.
@@ -503,6 +506,8 @@
to load consolidated metadata from a non-default key.
"""
store_path = await make_store_path(store)
if not store_path.store.supports_consolidated_metadata:
use_consolidated = False

Codecov (codecov/patch): added lines #L509-L510 in src/zarr/core/group.py were not covered by tests.

consolidated_key = ZMETADATA_V2_JSON

28 changes: 26 additions & 2 deletions tests/test_metadata/test_consolidated.py
@@ -17,14 +17,14 @@
open,
open_consolidated,
)
from zarr.core.buffer import cpu, default_buffer_prototype
from zarr.core.buffer import Buffer, cpu, default_buffer_prototype
from zarr.core.group import ConsolidatedMetadata, GroupMetadata
from zarr.core.metadata import ArrayV3Metadata
from zarr.core.metadata.v2 import ArrayV2Metadata
from zarr.storage import StorePath

if TYPE_CHECKING:
from zarr.abc.store import Store
from zarr.abc.store import ByteRequest, Store
from zarr.core.common import ZarrFormat


@@ -651,3 +651,27 @@ async def test_consolidated_metadata_encodes_special_chars(
elif zarr_format == 3:
assert root_metadata["child"]["attributes"]["test"] == expected_fill_value
assert root_metadata["time"]["fill_value"] == expected_fill_value


async def test_consolidate_metadata_is_noop_for_self_consolidating_stores():
"""Verify that calling consolidate_metadata on a non-supporting store does nothing"""

# We create a store that doesn't support consolidated metadata
class Store(zarr.storage.MemoryStore):
@property
def supports_consolidated_metadata(self) -> bool:
return False

memory_store = Store()
root = await zarr.api.asynchronous.create_group(store=memory_store)
await root.create_group("a/b")

# now we monkey patch the store so it raises if `Store.set` is called
async def set_raises(key: str, value: Buffer, byte_range: ByteRequest | None = None) -> None:
raise ValueError("consolidated metadata called")

memory_store.set = set_raises

# consolidate_metadata would call `set` if the store supported consolidated metadata
# if this doesn't raise, it means consolidate_metadata is NOOP
await zarr.api.asynchronous.consolidate_metadata(memory_store)