I have multiple 2D xarray.DataArray objects which do not (necessarily) share common coordinate values. Imagine these DataArrays having come from slicing multiple (non-overlapping) bounding boxes out of a larger gridded dataset.

I'd like to combine the arrays along a new axis fid (i.e. a string identifier for the bounding box used to slice each array) without the original 2d coordinates "expanding" and filling with nan.

e.g.

import xarray as xr
import numpy as np

# create some toy gridded data
nx = 9
ny = 9
data = np.random.randint(5, size=(nx, ny))
x_coord = np.linspace(0, 1, nx)
y_coord = np.linspace(0, 1, ny)
da = xr.DataArray(
    data,
    dims=("x_coord", "y_coord"),
    coords={"x_coord": x_coord, "y_coord": y_coord}
)

# slice out two subsets of the gridded data
a = da.isel(x_coord=[1, 2, 3], y_coord=[2, 3, 4]).expand_dims(fid=["abc123"])
b = da.isel(x_coord=[6, 7, 8], y_coord=[5, 6, 7]).expand_dims(fid=["def456"])
>>> a
<xarray.DataArray (fid: 1, x_coord: 3, y_coord: 3)>
array([[[3, 4, 3],
        [3, 1, 0],
        [3, 2, 4]]])
Coordinates:
  * fid      (fid) object 'abc123'
  * x_coord  (x_coord) float64 0.125 0.25 0.375
  * y_coord  (y_coord) float64 0.25 0.375 0.5

>>> b
<xarray.DataArray (fid: 1, x_coord: 3, y_coord: 3)>
array([[[4, 3, 0],
        [3, 2, 2],
        [4, 2, 1]]])
Coordinates:
  * fid      (fid) object 'def456'
  * x_coord  (x_coord) float64 0.75 0.875 1.0
  * y_coord  (y_coord) float64 0.625 0.75 0.875

If I naively try to concatenate these along the fid dimension, x_coord and y_coord expand to encompass all coordinate values from both sources, resulting in a (2, 6, 6) shaped array that is filled with nan in most places:

>>> xr.concat([a, b], dim="fid")
<xarray.DataArray (fid: 2, x_coord: 6, y_coord: 6)>
array([[[ 3.,  4.,  3., nan, nan, nan],
        [ 3.,  1.,  0., nan, nan, nan],
        [ 3.,  2.,  4., nan, nan, nan],
        [nan, nan, nan, nan, nan, nan],
        [nan, nan, nan, nan, nan, nan],
        [nan, nan, nan, nan, nan, nan]],

       [[nan, nan, nan, nan, nan, nan],
        [nan, nan, nan, nan, nan, nan],
        [nan, nan, nan, nan, nan, nan],
        [nan, nan, nan,  4.,  3.,  0.],
        [nan, nan, nan,  3.,  2.,  2.],
        [nan, nan, nan,  4.,  2.,  1.]]])
Coordinates:
  * fid      (fid) object 'abc123' 'def456'
  * x_coord  (x_coord) float64 0.125 0.25 0.375 0.75 0.875 1.0
  * y_coord  (y_coord) float64 0.25 0.375 0.5 0.625 0.75 0.875

I want the resulting array to remain (2, 3, 3) shaped. My idea is to pre-process each individual array to use a "local" coordinate system (xi, yi), e.g.

a_ = a.assign_coords(x_coord=a.x_coord-a.x_coord.min(), y_coord=a.y_coord-a.y_coord.min())
b_ = b.assign_coords(x_coord=b.x_coord-b.x_coord.min(), y_coord=b.y_coord-b.y_coord.min())
>>> xr.concat([a_, b_], dim="fid").rename(x_coord="xi", y_coord="yi")
<xarray.DataArray (fid: 2, xi: 3, yi: 3)>
array([[[3, 4, 3],
        [3, 1, 0],
        [3, 2, 4]],

       [[4, 3, 0],
        [3, 2, 2],
        [4, 2, 1]]])
Coordinates:
  * fid      (fid) object 'abc123' 'def456'
  * xi       (xi) float64 0.0 0.125 0.25
  * yi       (yi) float64 0.0 0.125 0.25

... but I also want to keep the "original" coordinate system for each array. I imagine this will involve creating a multidimensional coordinate such that the resulting array has coordinates that go something like:

Coordinates:
  * fid      (fid) object 
                array(['abc123', 'def456'])
  * xi       (xi) float64 
                array([0.0, 0.125, 0.25])
  * yi       (yi) float64 
                array([0.0, 0.125, 0.25])
    x_coord  (fid, xi) float64 
                array([[0.125 0.25 0.375], [0.75 0.875 1.0]])
    y_coord  (fid, yi) float64 
                array([[0.25 0.375 0.5], [0.625 0.75 0.875]])

I'm just not quite sure of a neat way to go about doing this!

1 Answer

Almost there! I’d take the same approach in combination with xr.DataArray.swap_dims:


a_ = a.assign_coords(
    xi=(a.x_coord - a.x_coord.min()), 
    yi=(a.y_coord - a.y_coord.min()),
).swap_dims({
    "x_coord": "xi",
    "y_coord": "yi",
})

b_ = b.assign_coords(
    xi=(b.x_coord - b.x_coord.min()), 
    yi=(b.y_coord - b.y_coord.min()),
).swap_dims({
    "x_coord": "xi",
    "y_coord": "yi",
})

result = xr.concat([a_, b_], dim="fid")

This preserves the original coordinates (as non-index coordinates) while the arrays are indexed by xi and yi. Note that you will no longer be able to slice or select with the original label values, e.g. with .sel. Instead, you’ll need to use the new dimension labels (fid, xi, yi).
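To avoid repeating the assign_coords/swap_dims pair for every bounding box, the same steps can be wrapped in a small helper. The sketch below is only an illustration under a few assumptions: to_local_grid and the boxes dict are hypothetical names, not part of xarray, and coords="different" with compat="equals" are passed explicitly to spell out the concat behaviour this approach relies on (coordinates that differ between the inputs get stacked along fid).

```python
import numpy as np
import xarray as xr


def to_local_grid(arr, x="x_coord", y="y_coord"):
    """Hypothetical helper: index a 2-D slice by its offset from its own minimum x/y."""
    return arr.assign_coords(
        xi=arr[x] - arr[x].min(),
        yi=arr[y] - arr[y].min(),
    ).swap_dims({x: "xi", y: "yi"})


# Rebuild the toy grid and the two slices from the question.
grid = np.linspace(0, 1, 9)
da = xr.DataArray(
    np.random.randint(5, size=(9, 9)),
    dims=("x_coord", "y_coord"),
    coords={"x_coord": grid, "y_coord": grid},
)
boxes = {
    "abc123": da.isel(x_coord=[1, 2, 3], y_coord=[2, 3, 4]),
    "def456": da.isel(x_coord=[6, 7, 8], y_coord=[5, 6, 7]),
}

# Give each slice its fid label, move it onto the local grid, then stack.
result = xr.concat(
    [to_local_grid(box.expand_dims(fid=[fid])) for fid, box in boxes.items()],
    dim="fid",
    coords="different",  # stack coordinates that differ between the inputs
    compat="equals",
)

print(result.sizes)          # sizes of the fid, xi and yi dimensions
print(result.x_coord.dims)   # dimensions of the preserved original x coordinate

# Select with the new labels, then read the original coordinates back.
first = result.sel(fid="abc123")
print(first.x_coord.values)
```

One caveat with this design: xi and yi are floating-point offsets, so the slices only line up if those offsets compare exactly equal, as they do on this evenly spaced toy grid. If that cannot be guaranteed, integer positions such as np.arange(arr.sizes[x]) may be a safer choice for the local index.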

1 Comment

Glad I’ve now found a use case for swap_dims! Thanks