-
Notifications
You must be signed in to change notification settings - Fork 42
Update bounds_to_vertices
to handle descending arrays
#579
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Update bounds_to_vertices
to handle descending arrays
#579
Conversation
Add `_get_ordered_vertices()` to handle extraction of unique vertex values in ascending order
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #579 +/- ##
==========================================
+ Coverage 85.78% 93.60% +7.81%
==========================================
Files 13 14 +1
Lines 2364 2252 -112
Branches 183 0 -183
==========================================
+ Hits 2028 2108 +80
+ Misses 303 144 -159
+ Partials 33 0 -33
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Hey @aulemahal, this PR is ready for review when you get the chance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tomvothecoder for the PR. While this works well with the usual case, it seems to drop support for some other cases (duplicated coordinates, extra non-core dimensions).
- Also handle strictly monotonic bounds
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @aulemahal, this PR is ready for another review. Please make any changes or suggestions if you think the code could be better written or simplified.
I added new tests to cover additional cases and provided some examples below.
Example script:
import xarray as xr
import cf_xarray as cfxr
def print_section(title):
print(f"\n{title}")
print("=" * len(title))
def print_bounds_and_vertices(bounds, vertices):
print("Bounds:")
print(bounds)
print("Vertices:")
print(vertices)
print("-" * 40)
# 0a. Strictly monotonic bounds (increasing)
print_section("0a. Strictly monotonic bounds (increasing)")
bounds = xr.DataArray(
[[10.0, 10.5], [10.5, 11.0], [11.0, 11.5], [11.5, 12.0], [12.0, 12.5]],
dims=("lat", "bounds"),
)
vertices = cfxr.bounds_to_vertices(bounds, "bounds")
print_bounds_and_vertices(bounds, vertices)
# 0b. Strictly monotonic bounds (decreasing)
print_section("0b. Strictly monotonic bounds (decreasing)")
bounds = xr.DataArray(
[[12.5, 12.0], [12.0, 11.5], [11.5, 11.0], [11.0, 10.5], [10.5, 10.0]],
dims=("lat", "bounds"),
)
vertices = cfxr.bounds_to_vertices(bounds, "bounds")
print_bounds_and_vertices(bounds, vertices)
# 1. Descending coordinates
print_section("1. Descending coordinates")
bounds = xr.DataArray(
[[50.5, 50.0], [51.0, 50.5], [51.5, 51.0], [52.0, 51.5], [52.5, 52.0]],
dims=("lat", "bounds"),
)
vertices = cfxr.bounds_to_vertices(bounds, "bounds")
print_bounds_and_vertices(bounds, vertices)
# 2. Descending coordinates with duplicated values
print_section("2. Descending coordinates with duplicated values")
bounds = xr.DataArray(
[[50.5, 50.0], [51.0, 50.5], [51.0, 50.5], [52.0, 51.5], [52.5, 52.0]],
dims=("lat", "bounds"),
)
vertices = cfxr.bounds_to_vertices(bounds, "bounds")
print_bounds_and_vertices(bounds, vertices)
# 3. Ascending coordinates
print_section("3. Ascending coordinates")
bounds = xr.DataArray(
[[50.0, 50.5], [50.5, 51.0], [51.0, 51.5], [51.5, 52.0], [52.0, 52.5]],
dims=("lat", "bounds"),
)
vertices = cfxr.bounds_to_vertices(bounds, "bounds")
print_bounds_and_vertices(bounds, vertices)
# 4. Ascending coordinates with duplicated values
print_section("4. Ascending coordinates with duplicated values")
bounds = xr.DataArray(
[[50.0, 50.5], [50.5, 51.0], [50.5, 51.0], [51.0, 51.5], [51.5, 52.0]],
dims=("lat", "bounds"),
)
vertices = cfxr.bounds_to_vertices(bounds, "bounds")
print_bounds_and_vertices(bounds, vertices)
# 5. 3D array (extra non-core dim)
print_section("5. 3D array (extra non-core dim)")
bounds_3d = xr.DataArray(
[
[
[50.0, 50.5],
[50.5, 51.0],
[51.0, 51.5],
[51.5, 52.0],
[52.0, 52.5],
],
[
[60.0, 60.5],
[60.5, 61.0],
[61.0, 61.5],
[61.5, 62.0],
[62.0, 62.5],
],
],
dims=("extra", "lat", "bounds"),
)
vertices_3d = cfxr.bounds_to_vertices(bounds_3d, "bounds", core_dims=["lat"])
print_bounds_and_vertices(bounds_3d, vertices_3d)
# 6. 4D array (time, extra, lat, bounds)
print_section("6. 4D array (time, extra, lat, bounds)")
bounds_4d = xr.DataArray(
[
[
[
[50.0, 50.5],
[50.5, 51.0],
[51.0, 51.5],
[51.5, 52.0],
[52.0, 52.5],
],
[
[60.0, 60.5],
[60.5, 61.0],
[61.0, 61.5],
[61.5, 62.0],
[62.0, 62.5],
],
],
[
[
[70.0, 70.5],
[70.5, 71.0],
[71.0, 71.5],
[71.5, 72.0],
[72.0, 72.5],
],
[
[80.0, 80.5],
[80.5, 81.0],
[81.0, 81.5],
[81.5, 82.0],
[82.0, 82.5],
],
],
],
dims=("time", "extra", "lat", "bounds"),
)
vertices_4d = cfxr.bounds_to_vertices(bounds_4d, "bounds", core_dims=["lat"])
print_bounds_and_vertices(bounds_4d, vertices_4d)
Example result (all seem good)
0a. Strictly monotonic bounds (increasing)
==========================================
Bounds:
<xarray.DataArray (lat: 5, bounds: 2)> Size: 80B
array([[10. , 10.5],
[10.5, 11. ],
[11. , 11.5],
[11.5, 12. ],
[12. , 12.5]])
Dimensions without coordinates: lat, bounds
Vertices:
<xarray.DataArray (lat_vertices: 6)> Size: 48B
array([10. , 10.5, 11. , 11.5, 12. , 12.5])
Dimensions without coordinates: lat_vertices
----------------------------------------
0b. Strictly monotonic bounds (decreasing)
==========================================
Bounds:
<xarray.DataArray (lat: 5, bounds: 2)> Size: 80B
array([[12.5, 12. ],
[12. , 11.5],
[11.5, 11. ],
[11. , 10.5],
[10.5, 10. ]])
Dimensions without coordinates: lat, bounds
Vertices:
<xarray.DataArray (lat_vertices: 6)> Size: 48B
array([12.5, 12. , 11.5, 11. , 10.5, 10. ])
Dimensions without coordinates: lat_vertices
----------------------------------------
1. Descending coordinates
=========================
Bounds:
<xarray.DataArray (lat: 5, bounds: 2)> Size: 80B
array([[50.5, 50. ],
[51. , 50.5],
[51.5, 51. ],
[52. , 51.5],
[52.5, 52. ]])
Dimensions without coordinates: lat, bounds
Vertices:
<xarray.DataArray (lat_vertices: 6)> Size: 48B
array([52.5, 52. , 51.5, 51. , 50.5, 50. ])
Dimensions without coordinates: lat_vertices
----------------------------------------
2. Descending coordinates with duplicated values
================================================
Bounds:
<xarray.DataArray (lat: 5, bounds: 2)> Size: 80B
array([[50.5, 50. ],
[51. , 50.5],
[51. , 50.5],
[52. , 51.5],
[52.5, 52. ]])
Dimensions without coordinates: lat, bounds
Vertices:
<xarray.DataArray (lat_vertices: 6)> Size: 48B
array([52.5, 52. , 51.5, 50.5, 50.5, 50. ])
Dimensions without coordinates: lat_vertices
----------------------------------------
3. Ascending coordinates
========================
Bounds:
<xarray.DataArray (lat: 5, bounds: 2)> Size: 80B
array([[50. , 50.5],
[50.5, 51. ],
[51. , 51.5],
[51.5, 52. ],
[52. , 52.5]])
Dimensions without coordinates: lat, bounds
Vertices:
<xarray.DataArray (lat_vertices: 6)> Size: 48B
array([50. , 50.5, 51. , 51.5, 52. , 52.5])
Dimensions without coordinates: lat_vertices
----------------------------------------
4. Ascending coordinates with duplicated values
===============================================
Bounds:
<xarray.DataArray (lat: 5, bounds: 2)> Size: 80B
array([[50. , 50.5],
[50.5, 51. ],
[50.5, 51. ],
[51. , 51.5],
[51.5, 52. ]])
Dimensions without coordinates: lat, bounds
Vertices:
<xarray.DataArray (lat_vertices: 6)> Size: 48B
array([50. , 50.5, 50.5, 51. , 51.5, 52. ])
Dimensions without coordinates: lat_vertices
----------------------------------------
5. 3D array (extra non-core dim)
================================
Bounds:
<xarray.DataArray (extra: 2, lat: 5, bounds: 2)> Size: 160B
array([[[50. , 50.5],
[50.5, 51. ],
[51. , 51.5],
[51.5, 52. ],
[52. , 52.5]],
[[60. , 60.5],
[60.5, 61. ],
[61. , 61.5],
[61.5, 62. ],
[62. , 62.5]]])
Dimensions without coordinates: extra, lat, bounds
Vertices:
<xarray.DataArray (extra: 2, lat_vertices: 6)> Size: 96B
array([[50. , 50.5, 51. , 51.5, 52. , 52.5],
[60. , 60.5, 61. , 61.5, 62. , 62.5]])
Dimensions without coordinates: extra, lat_vertices
----------------------------------------
6. 4D array (time, extra, lat, bounds)
======================================
Bounds:
<xarray.DataArray (time: 2, extra: 2, lat: 5, bounds: 2)> Size: 320B
array([[[[50. , 50.5],
[50.5, 51. ],
[51. , 51.5],
[51.5, 52. ],
[52. , 52.5]],
[[60. , 60.5],
[60.5, 61. ],
[61. , 61.5],
[61.5, 62. ],
[62. , 62.5]]],
[[[70. , 70.5],
[70.5, 71. ],
[71. , 71.5],
[71.5, 72. ],
[72. , 72.5]],
[[80. , 80.5],
[80.5, 81. ],
[81. , 81.5],
[81.5, 82. ],
[82. , 82.5]]]])
Dimensions without coordinates: time, extra, lat, bounds
Vertices:
<xarray.DataArray (time: 2, extra: 2, lat_vertices: 6)> Size: 192B
array([[[50. , 50.5, 51. , 51.5, 52. , 52.5],
[60. , 60.5, 61. , 61.5, 62. , 62.5]],
[[70. , 70.5, 71. , 71.5, 72. , 72.5],
[80. , 80.5, 81. , 81.5, 82. , 82.5]]])
Dimensions without coordinates: time, extra, lat_vertices
----------------------------------------
if _is_bounds_strictly_monotonic(bounds): | ||
# Example: [[51.0, 50.5], [50.5, 50.0]] | ||
# Example Result: [51.0, 50.5, 50.0] | ||
vertices = np.concatenate((bounds[..., :, 0], bounds[..., -1:, 1]), axis=-1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This portion of code now handles "strictly" monotonic bounds, which are when values are all descending/ascending if flattened. For example:
- strictly monotonic (descending)
[[51.5, 51.0], [51.0, 50.5], [50.5, 50.0]]
- Non-strictly monotonic (descending):
[[50.5, 50.0], [51.0, 50.5], [51.5, 51.0]]
I noticed the 1D test case was failing with because the values are "strictly" monotonic. Now it passes.
cf-xarray/cf_xarray/tests/test_helpers.py
Lines 14 to 18 in 195fc2e
def test_bounds_to_vertices() -> None: | |
# 1D case | |
ds = airds.cf.add_bounds(["lon", "lat", "time"]) | |
lat_c = cfxr.bounds_to_vertices(ds.lat_bounds, bounds_dim="bounds") | |
assert_array_equal(ds.lat.values + 1.25, lat_c.values[:-1]) |
# 2D case (descending) | ||
bounds_2d_desc = xr.DataArray( | ||
[[50.5, 50.0], [51.0, 50.5], [51.0, 50.5], [52.0, 51.5], [52.5, 52.0]], | ||
dims=("lat", "bounds"), | ||
) | ||
expected_vertices_2d_desc = xr.DataArray( | ||
[52.5, 52.0, 51.5, 50.5, 50.5, 50.0], | ||
dims=["lat_vertices"], | ||
) | ||
vertices_2d_desc = cfxr.bounds_to_vertices(bounds_2d_desc, bounds_dim="bounds") | ||
assert_equal(expected_vertices_2d_desc, vertices_2d_desc) | ||
|
||
# 3D case (ascending, "extra" non-core dim should be preserved) | ||
bounds_3d = xr.DataArray( | ||
[ | ||
[ | ||
[50.0, 50.5], | ||
[50.5, 51.0], | ||
[51.0, 51.5], | ||
[51.5, 52.0], | ||
[52.0, 52.5], | ||
], | ||
[ | ||
[60.0, 60.5], | ||
[60.5, 61.0], | ||
[61.0, 61.5], | ||
[61.5, 62.0], | ||
[62.0, 62.5], | ||
], | ||
], | ||
dims=("extra", "lat", "bounds"), | ||
) | ||
expected_vertices_3d = xr.DataArray( | ||
[ | ||
[50.0, 50.5, 51.0, 51.5, 52.0, 52.5], | ||
[60.0, 60.5, 61.0, 61.5, 62.0, 62.5], | ||
], | ||
dims=("extra", "lat_vertices"), | ||
) | ||
vertices_3d = cfxr.bounds_to_vertices( | ||
bounds_3d, bounds_dim="bounds", core_dims=["lat"] | ||
) | ||
assert_equal(vertices_3d, expected_vertices_3d) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some new test cases that were missing.
def _get_ordered_vertices(bounds: np.ndarray) -> np.ndarray: | ||
""" | ||
Convert a bounds array of shape (..., N, 2) or (N, 2) into a 1D array of vertices. | ||
|
||
This function reconstructs the vertices from a bounds array, handling both | ||
strictly monotonic and non-strictly monotonic bounds. | ||
|
||
- For strictly monotonic bounds (all values increase or decrease when flattened), | ||
it concatenates the left endpoints and the last right endpoint. | ||
- For non-strictly monotonic bounds (bounds are consistently ascending or descending | ||
within intervals, but not strictly so), it: | ||
- Uses the minimum of each interval as the lower endpoint. | ||
- Uses the maximum of the last interval as the final vertex. | ||
- Sorts the vertices in ascending or descending order to match the direction of the bounds. | ||
|
||
Features: | ||
- Handles both ascending and descending bounds. | ||
- Does not require bounds to be strictly monotonic. | ||
- Preserves repeated coordinates if present. | ||
- Output shape is (..., N+1) or (N+1,). | ||
|
||
Parameters | ||
---------- | ||
bounds : np.ndarray | ||
Array of bounds, typically with shape (N, 2) or (..., N, 2). | ||
|
||
Returns | ||
------- | ||
np.ndarray | ||
Array of vertices with shape (..., N+1) or (N+1,). | ||
""" | ||
if _is_bounds_strictly_monotonic(bounds): | ||
# Example: [[51.0, 50.5], [50.5, 50.0]] | ||
# Example Result: [51.0, 50.5, 50.0] | ||
vertices = np.concatenate((bounds[..., :, 0], bounds[..., -1:, 1]), axis=-1) | ||
else: | ||
# Example with bounds (descending) [[50.5, 50.0], [51.0, 50.5]] | ||
# Get the lower endpoints of each bounds interval | ||
# Example Result: [50, 50.5] | ||
lower_endpoints = np.minimum(bounds[..., :, 0], bounds[..., :, 1]) | ||
|
||
# Get the upper endpoint of the last interval. | ||
# Example Result: 51.0 | ||
last_upper_endpoint = np.maximum(bounds[..., -1, 0], bounds[..., -1, 1]) | ||
|
||
# Concatenate lower endpoints and the last upper endpoint. | ||
# Example Result: [50.0, 50.5, 51.0] | ||
vertices = np.concatenate( | ||
[lower_endpoints, np.expand_dims(last_upper_endpoint, axis=-1)], axis=-1 | ||
) | ||
|
||
# Sort vertices based on the direction of the bounds | ||
# Example Result: [51.0, 50.5, 50.0] | ||
ascending = is_bounds_ascending(bounds) | ||
if ascending: | ||
vertices = np.sort(vertices, axis=-1) | ||
else: | ||
vertices = np.sort(vertices, axis=-1)[..., ::-1] | ||
|
||
return vertices |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function now handles strictly monotonic bounds (all values descend or ascend) and non-strictly monotonic bounds (values ascend/descend with each bounds interval). It also operates on the last two dimensions, preserving the leading dimensions (related #579 (comment)) and does not remove repeated coordinates (related #579 (comment)).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I will remove all of the example comments if it looks good.
# NOTE: Python 3.10 uses numpy 1.26.4. If the input is a datetime64 array, | ||
# numpy 1.26.4 may raise: numpy.core._exceptions._UFuncInputCastingError: | ||
# Cannot cast ufunc 'greater' input 0 from dtype('<m8[ns]') to dtype('<m8') | ||
# with casting rule 'same_kind' To avoid this, always cast to float64 before | ||
# np.diff. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note about Python 3.10 and numpy 1.26.4 issue in the build.
Can somebody help address the failing build (tests are passing) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tomvothecoder for the changes!
Sadly I think there are still some cases lacking. Sorry if I missed that earlier!
# Sort vertices based on the direction of the bounds | ||
# Example Result: [51.0, 50.5, 50.0] | ||
ascending = is_bounds_ascending(bounds) | ||
if ascending: | ||
vertices = np.sort(vertices, axis=-1) | ||
else: | ||
vertices = np.sort(vertices, axis=-1)[..., ::-1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we don't want to sort on "bounds". I think the vertices should be in the same order as the coordinate, not the same as the bounds.
My comprehension of the idea of "vertices" is that if you have bounds of shape (
Thus if we sort, it should be on the coordinate dim, but this current implementation assumes a ascending coordinate as it takes the "min" of all bounds and then the max of the last one. For a descending coordinate, we would need to do the opposite : max of bounds and then min of the last one.
Also, this change would make the monotonic and non-monotonic cases match. Your 0b example does what I expected for descending coordinate.
Here are two tests cases that do not work with the current implementation, if it can help:
# Non-monotonic, descending bounds, descending coord
bounds = xr.DataArray(
[[12.5, 12.0], [12.5, 12.0], [12.0, 11.5], [11.5, 11.0], [11.0, 10.5], [10.5, 10.0]],
dims=("lat", "bounds"),
)
# should be : array([12.5 , 12.5 , 12.0. 11.5, 11. , 10.5, 10.])
# Non-monotonic, ascending bounds, descending coord
bounds = xr.DataArray(
[[12.0, 12.5], [12.0, 12.5], [11.5, 12.0], [11.0, 11.5], [10.5, 11.0], [10.0, 10.5]],
dims=("lat", "bounds"),
)
# should be : array([12.5 , 12.5 , 12.0. 11.5, 11. , 10.5, 10.])
This pull request updates the
_bounds_helper
function incf_xarray/helpers.py
to be able to handle descending arrays (bounds). The output vertices will be sorted in the same direction as the input bounds, as shown in the example.cf_xarray.bounds_to_vertices
produces incorrect results with descending 2D array (e.g., bounds) #576Example
Related to