Open
Description
I was helping out a labmate recently and thought something like this would be a good example for the docs. I was showing her how to use xskillscore
to sift through a lot of ensemble members and pull out automatically a few members that do a very good job and very poor job at replicating ENSO from observations. This beats out the standard practice of doing it manually, of course.
We can adapt the following demo using real data. Could just perturb it or could pull down just the tropical Pacific for CESM-LE vs. observations.
The idea here was to show how you could use pattern correlations and RMSE and use xarray
to select the single best/worst performers or sort and select a range so you could plot a few to eyeball from there.
import xskillscore as xs
import xarray as xr
import numpy as np
np.random.seed(333)
reference_data = np.random.rand(5, 5)
ds1 = xr.DataArray(reference_data, dims=['x', 'y'])
ensemble_data = np.random.rand(5, 5, 10)
# way off the mark.
ensemble_data[:, :, 9] = ensemble_data[:, :, 9] + 5
# perfect fit
ensemble_data[:, :, 0] = reference_data
ds2 = xr.DataArray(ensemble_data, dims=['x', 'y', 'member'])
ds2['member'] = np.arange(ds2.member.size) + 1
# assess performance
patterncorr = xs.pearson_r(ds1, ds2, dim=['x', 'y'], skipna=True)
rmse = xs.rmse(ds1, ds2, dim=['x', 'y'], skipna=True)
# SELECTING MEMBERS
# 1. just find the single best/worst.
# argmin/argmax give 0-based indices so use .isel()
ds2.isel(member=patterncorr.argmin())
ds2.isel(member=rmse.argmax())
ds2.isel(member=patterncorr.argmax())
ds2.isel(member=rmse.argmin())
# 2. select a few from top and bottom.
# sorting by the values in this array.
# will shift the member coordinate label so we can use that.
best_to_worst = rmse.sortby(rmse)
best_members = best_to_worst.isel(member=slice(0, 3)).member
worst_members = best_to_worst.isel(member=slice(7, None)).member
# note we use `.sel()` here since we are referencing the actual
# member label
ds2.sel(member=best_members)