
Compare and document downscaling methods available across libraries #30

@LucaMarconato

Description

An outcome of the hackathon could be a comprehensive benchmark, together with a document describing where the bottlenecks and incompatibilities are. We can then report the findings to the respective developers of dask, ome-zarr, etc.

Here is a starting point:

  • multiscale-spatial-image <= 2.0.3 (i.e. excluding the latest version) calls .compute() when downscaling images.
  • multiscale-spatial-image == 2.1.0 overcomes the problem above by relying on ngff-zarr. Currently, multiscale-spatial-image == 2.1.0 has dependency constraints with spatialdata. Also see the next point:
  • ngff-zarr: a performance bottleneck has been reported for large data, caused by a large dask computational graph. See "Dask task graphs are huge (non-performant) for large data", fideus-labs/ngff-zarr#48.
  • ome-zarr-py implements some “slow” downscaling, which uses dask_image zoom. Downscaling shows performance bottlenecks around rechunk() and astype(): https://github.com/ome/ome-zarr-py/blob/cade24ed81440d02d721966e0f766e2ee5e043d9/ome_zarr/dask_utils.py#L130. With a recent PR in ome-zarr-py, other downscaling functions are available (they are also in dask_utils.py).
  • spatialdata uses dask_image zoom (as wrapped in ome-zarr-py) but adds extra functionality, such as downscaling with different scale factors per dimension.
  • Marvin uses a manual approach based on dask-image.
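
As a possible starting point for the benchmark, here is a minimal sketch that records two of the quantities mentioned above for a lazy downscale: the size of the dask task graph and the wall time of the final compute. It is not the method used by any of the libraries listed; it uses dask.array.coarsen (block averaging) as a stand-in, with a different factor per axis. The helper names downscale_mean and profile, the array shape, and the chunk sizes are all assumptions made for illustration.

```python
import time

import dask.array as da
import numpy as np


def downscale_mean(image, factors):
    """Lazily downscale a dask array by block averaging.

    `factors` maps axis index -> integer factor, so every axis can use a
    different factor (axes that are not listed are left untouched).
    Hypothetical helper: a stand-in, not the API of any library above.
    """
    return da.coarsen(np.mean, image, factors)


def profile(name, build):
    """Print and return task-graph size and wall time for one candidate."""
    lazy = build()
    n_tasks = len(lazy.__dask_graph__())  # graph size before computing
    start = time.perf_counter()
    result = lazy.compute()  # the only eager step
    elapsed = time.perf_counter() - start
    print(f"{name}: {n_tasks} tasks, {elapsed:.3f} s, shape {result.shape}")
    return n_tasks, elapsed, result


# Synthetic (c, y, x) image; shape and chunks are arbitrary choices that
# align with the coarsening factors used below.
data = np.random.default_rng(0).random((3, 1024, 1024))
image = da.from_array(data, chunks=(1, 256, 256))

# Downscale 2x along y and 4x along x, leaving the channel axis as is.
n_tasks, elapsed, small = profile(
    "coarsen (y: 2, x: 4)",
    lambda: downscale_mean(image, {1: 2, 2: 4}),
)
```

Any other downscaling function that returns a lazy dask array could be passed to profile in the same way, which would make the task counts and timings comparable across libraries.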

Status: 🚨 Before the hackathon