I have multiple 2D xarray.DataArray objects which do not (necessarily) share common coordinate values. Imagine these arrays having come from slicing multiple (non-overlapping) bounding boxes out of a larger gridded dataset.
I'd like to combine the arrays along a new axis fid (a string identifier for the bounding box used to slice each array) without the original 2D coordinates "expanding" and filling with nan.
For example:
import xarray as xr
import numpy as np
# create some toy gridded data
nx = 9
ny = 9
data = np.random.randint(5, size=(nx, ny))
x_coord = np.linspace(0, 1, nx)
y_coord = np.linspace(0, 1, ny)
da = xr.DataArray(
    data,
    dims=("x_coord", "y_coord"),
    coords={"x_coord": x_coord, "y_coord": y_coord},
)
# slice out two subsets of the gridded data
a = da.isel(x_coord=[1, 2, 3], y_coord=[2, 3, 4]).expand_dims(fid=["abc123"])
b = da.isel(x_coord=[6, 7, 8], y_coord=[5, 6, 7]).expand_dims(fid=["def456"])
>>> a
<xarray.DataArray (fid: 1, x_coord: 3, y_coord: 3)>
array([[[3, 4, 3],
        [3, 1, 0],
        [3, 2, 4]]])
Coordinates:
  * fid      (fid) object 'abc123'
  * x_coord  (x_coord) float64 0.125 0.25 0.375
  * y_coord  (y_coord) float64 0.25 0.375 0.5
>>> b
<xarray.DataArray (fid: 1, x_coord: 3, y_coord: 3)>
array([[[4, 3, 0],
        [3, 2, 2],
        [4, 2, 1]]])
Coordinates:
  * fid      (fid) object 'def456'
  * x_coord  (x_coord) float64 0.75 0.875 1.0
  * y_coord  (y_coord) float64 0.625 0.75 0.875
If I naively try to concatenate these along the fid dimension, x_coord and y_coord expand to encompass all coordinate values from both sources, resulting in a (2, 6, 6) shaped array that is filled with nan in most places:
>>> xr.concat([a, b], dim="fid")
<xarray.DataArray (fid: 2, x_coord: 6, y_coord: 6)>
array([[[ 3.,  4.,  3., nan, nan, nan],
        [ 3.,  1.,  0., nan, nan, nan],
        [ 3.,  2.,  4., nan, nan, nan],
        [nan, nan, nan, nan, nan, nan],
        [nan, nan, nan, nan, nan, nan],
        [nan, nan, nan, nan, nan, nan]],

       [[nan, nan, nan, nan, nan, nan],
        [nan, nan, nan, nan, nan, nan],
        [nan, nan, nan, nan, nan, nan],
        [nan, nan, nan,  4.,  3.,  0.],
        [nan, nan, nan,  3.,  2.,  2.],
        [nan, nan, nan,  4.,  2.,  1.]]])
Coordinates:
  * fid      (fid) object 'abc123' 'def456'
  * x_coord  (x_coord) float64 0.125 0.25 0.375 0.75 0.875 1.0
  * y_coord  (y_coord) float64 0.25 0.375 0.5 0.625 0.75 0.875
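The expansion comes from alignment: by default xr.concat aligns the dimensions that are not being concatenated with an outer join, so each input is first reindexed onto the union of the x_coord and y_coord values. A minimal sketch that reproduces just that step with xr.align, assuming the same toy grid as above (the seeded generator is my addition, only there to make the example repeatable):

```python
import numpy as np
import xarray as xr

# Toy grid as above, but seeded so the values are repeatable (assumption)
rng = np.random.default_rng(0)
grid = np.linspace(0, 1, 9)
da = xr.DataArray(
    rng.integers(5, size=(9, 9)),
    dims=("x_coord", "y_coord"),
    coords={"x_coord": grid, "y_coord": grid},
)
a = da.isel(x_coord=[1, 2, 3], y_coord=[2, 3, 4]).expand_dims(fid=["abc123"])
b = da.isel(x_coord=[6, 7, 8], y_coord=[5, 6, 7]).expand_dims(fid=["def456"])

# Outer-join alignment on every dimension except fid:
# both arrays are reindexed onto the union of coordinate values
a_aligned, b_aligned = xr.align(a, b, join="outer", exclude=["fid"])
print(a_aligned.shape, b_aligned.shape)  # both (1, 6, 6), padded with nan
```

Since the two boxes share no coordinate values, every cell outside the original 3x3 block is filled with nan, which also forces the integer data to float.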
I want the resulting array to remain (2, 3, 3) shaped. My idea is to pre-process each individual array to use a "local" coordinate system (xi, yi), e.g.
a_ = a.assign_coords(x_coord=a.x_coord-a.x_coord.min(), y_coord=a.y_coord-a.y_coord.min())
b_ = b.assign_coords(x_coord=b.x_coord-b.x_coord.min(), y_coord=b.y_coord-b.y_coord.min())
>>> xr.concat([a_, b_], dim="fid").rename(x_coord="xi", y_coord="yi")
<xarray.DataArray (fid: 2, xi: 3, yi: 3)>
array([[[3, 4, 3],
        [3, 1, 0],
        [3, 2, 4]],

       [[4, 3, 0],
        [3, 2, 2],
        [4, 2, 1]]])
Coordinates:
  * fid      (fid) object 'abc123' 'def456'
  * xi       (xi) float64 0.0 0.125 0.25
  * yi       (yi) float64 0.0 0.125 0.25
... but I also want to keep the "original" coordinate system for each array. I imagine this will involve creating a multidimensional coordinate such that the resulting array has coordinates that go something like:
Coordinates:
  * fid      (fid) object
      array(['abc123', 'def456'])
  * xi       (xi) float64
      array([0.0, 0.125, 0.25])
  * yi       (yi) float64
      array([0.0, 0.125, 0.25])
    x_coord  (fid, xi) float64
      array([[0.125, 0.25, 0.375], [0.75, 0.875, 1.0]])
    y_coord  (fid, yi) float64
      array([[0.25, 0.375, 0.5], [0.625, 0.75, 0.875]])
I'm just not quite sure of a neat way to go about doing this!
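One possible way to get there, as a rough sketch rather than a definitive recipe: rebuild each array on the local (xi, yi) dimensions and attach the original values as non-index coordinates with dims (fid, xi) and (fid, yi) before concatenating. The helper name to_local is hypothetical, and the seeded toy grid is an assumption made so the example runs on its own:

```python
import numpy as np
import xarray as xr

# Toy grid as in the example above, seeded for repeatability (assumption)
rng = np.random.default_rng(0)
grid = np.linspace(0, 1, 9)
da = xr.DataArray(
    rng.integers(5, size=(9, 9)),
    dims=("x_coord", "y_coord"),
    coords={"x_coord": grid, "y_coord": grid},
)
a = da.isel(x_coord=[1, 2, 3], y_coord=[2, 3, 4]).expand_dims(fid=["abc123"])
b = da.isel(x_coord=[6, 7, 8], y_coord=[5, 6, 7]).expand_dims(fid=["def456"])


def to_local(arr):
    """Hypothetical helper: move arr onto local (xi, yi) dimensions and
    keep the original coordinate values as 2D coordinates indexed by fid."""
    arr = arr.transpose("fid", "x_coord", "y_coord")
    x = arr["x_coord"].values
    y = arr["y_coord"].values
    return xr.DataArray(
        arr.values,
        dims=("fid", "xi", "yi"),
        coords={
            "fid": arr["fid"].values,
            # local offsets from the lower-left corner of each box
            "xi": x - x.min(),
            "yi": y - y.min(),
            # original coordinates, one row per fid
            "x_coord": (("fid", "xi"), x[np.newaxis, :]),
            "y_coord": (("fid", "yi"), y[np.newaxis, :]),
        },
    )


combined = xr.concat([to_local(a), to_local(b)], dim="fid")
print(combined)
```

Selecting a single box, e.g. combined.sel(fid="def456"), then carries that box's original x_coord and y_coord along with it. One caveat: this relies on the local offsets being exactly equal across boxes, which holds for this grid because the spacing of 0.125 is exactly representable. For grids where rounding could make the offsets differ slightly, using integer positions such as np.arange(x.size) for xi and yi would avoid the outer join expanding the result again.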