Combine 2d xarray dataarray with no common coordinate values

Question

I have multiple 2d xarray.DataArray objects which do not (necessarily) share common coordinate values. Imagine these DataArrays having come from slicing out multiple (non-overlapping) bounding boxes in a larger gridded dataset.

I'd like to combine the arrays along a new axis fid (i.e. a string identifier for the bounding box used to slice each array) without the original 2d coordinates "expanding" and filling with nan.

e.g.

import xarray as xr
import numpy as np

# create some toy gridded data
nx = 9
ny = 9
data = np.random.randint(5, size=(nx, ny))
x_coord = np.linspace(0, 1, nx)
y_coord = np.linspace(0, 1, ny)
da = xr.DataArray(
    data,
    dims=("x_coord", "y_coord"),
    coords={"x_coord": x_coord, "y_coord": y_coord}
)

# slice out two subsets of the gridded data
a = da.isel(x_coord=[1, 2, 3], y_coord=[2, 3, 4]).expand_dims(fid=["abc123"])
b = da.isel(x_coord=[6, 7, 8], y_coord=[5, 6, 7]).expand_dims(fid=["def456"])

>>> a
<xarray.DataArray (fid: 1, x_coord: 3, y_coord: 3)>
array([[[3, 4, 3],
        [3, 1, 0],
        [3, 2, 4]]])
Coordinates:
  * fid      (fid) object 'abc123'
  * x_coord  (x_coord) float64 0.125 0.25 0.375
  * y_coord  (y_coord) float64 0.25 0.375 0.5

>>> b
<xarray.DataArray (fid: 1, x_coord: 3, y_coord: 3)>
array([[[4, 3, 0],
        [3, 2, 2],
        [4, 2, 1]]])
Coordinates:
  * fid      (fid) object 'def456'
  * x_coord  (x_coord) float64 0.75 0.875 1.0
  * y_coord  (y_coord) float64 0.625 0.75 0.875

If I naively try to concatenate these along the fid dimension, the x_coord and y_coord expand to encompass all coordinate values from both sources, resulting in a (2, 6, 6) shaped array that is filled with nan in most places:

>>> xr.concat([a, b], dim="fid")
<xarray.DataArray (fid: 2, x_coord: 6, y_coord: 6)>
array([[[ 3.,  4.,  3., nan, nan, nan],
        [ 3.,  1.,  0., nan, nan, nan],
        [ 3.,  2.,  4., nan, nan, nan],
        [nan, nan, nan, nan, nan, nan],
        [nan, nan, nan, nan, nan, nan],
        [nan, nan, nan, nan, nan, nan]],

       [[nan, nan, nan, nan, nan, nan],
        [nan, nan, nan, nan, nan, nan],
        [nan, nan, nan, nan, nan, nan],
        [nan, nan, nan,  4.,  3.,  0.],
        [nan, nan, nan,  3.,  2.,  2.],
        [nan, nan, nan,  4.,  2.,  1.]]])
Coordinates:
  * fid      (fid) object 'abc123' 'def456'
  * x_coord  (x_coord) float64 0.125 0.25 0.375 0.75 0.875 1.0
  * y_coord  (y_coord) float64 0.25 0.375 0.5 0.625 0.75 0.875
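
The padding above is consistent with xr.concat aligning the non-concatenated dimensions using an outer join, i.e. taking the union of the x_coord and y_coord labels. A minimal sketch, assuming deterministic toy data (np.arange instead of the random integers above) and passing join explicitly, of how join="exact" can be used to make the mismatch raise instead of silently padding:

```python
import numpy as np
import xarray as xr

# Same grid layout as above; deterministic values are an assumption for this sketch.
da = xr.DataArray(
    np.arange(81).reshape(9, 9),
    dims=("x_coord", "y_coord"),
    coords={"x_coord": np.linspace(0, 1, 9), "y_coord": np.linspace(0, 1, 9)},
)
a = da.isel(x_coord=[1, 2, 3], y_coord=[2, 3, 4]).expand_dims(fid=["abc123"])
b = da.isel(x_coord=[6, 7, 8], y_coord=[5, 6, 7]).expand_dims(fid=["def456"])

# Outer join: x_coord/y_coord become the union of both label sets,
# and the gaps are filled with nan (which also promotes ints to floats).
padded = xr.concat([a, b], dim="fid", join="outer")
print(padded.shape, padded.dtype)  # (2, 6, 6) float64

# Exact join: refuse to align when the indexes differ.
try:
    xr.concat([a, b], dim="fid", join="exact")
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
print(mismatch_detected)  # True
```

So the arrays need a shared index along each spatial dimension before they can be stacked without padding.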

I want the resulting array to remain (2, 3, 3) shaped. My idea is to pre-process each individual array to use a "local" coordinates system (xi, yi), e.g.

a_ = a.assign_coords(x_coord=a.x_coord-a.x_coord.min(), y_coord=a.y_coord-a.y_coord.min())
b_ = b.assign_coords(x_coord=b.x_coord-b.x_coord.min(), y_coord=b.y_coord-b.y_coord.min())
>>> xr.concat([a_, b_], dim="fid").rename(x_coord="xi", y_coord="yi")
<xarray.DataArray (fid: 2, xi: 3, yi: 3)>
array([[[3, 4, 3],
        [3, 1, 0],
        [3, 2, 4]],

       [[4, 3, 0],
        [3, 2, 2],
        [4, 2, 1]]])
Coordinates:
  * fid      (fid) object 'abc123' 'def456'
  * xi       (xi) float64 0.0 0.125 0.25
  * yi       (yi) float64 0.0 0.125 0.25

... but I also want to keep the "original" coordinate system for each array. I imagine this will involve creating a multidimensional coordinate such that the resulting array has coordinates that go something like:

Coordinates:
  * fid      (fid) object 
                array(['abc123', 'def456'])
  * xi       (xi) float64 
                array([0.0, 0.125, 0.25])
  * yi       (yi) float64 
                array([0.0, 0.125, 0.25])
    x_coord  (fid, xi) float64 
                array([[0.125 0.25 0.375], [0.75 0.875 1.0]])
    y_coord  (fid, yi) float64 
                array([[0.25 0.375 0.5], [0.625 0.75 0.875]])

I'm just not quite sure of a neat way to go about doing this!

Michael Delgado · Accepted Answer · 2022-12-27 00:22:20Z


Almost there! I’d take the same approach in combination with xr.DataArray.swap_dims:


a_ = a.assign_coords(
    xi=(a.x_coord - a.x_coord.min()), 
    yi=(a.y_coord - a.y_coord.min()),
).swap_dims({
    "x_coord": "xi",
    "y_coord": "yi",
})

b_ = b.assign_coords(
    xi=(b.x_coord - b.x_coord.min()), 
    yi=(b.y_coord - b.y_coord.min()),
).swap_dims({
    "x_coord": "xi",
    "y_coord": "yi",
})

result = xr.concat([a_, b_], dim="fid")

This will preserve the original coordinates while indexing by xi, yi. Note that you will not be able to use the original label values to slice or select, e.g. with .sel. Instead, you’ll need to use your new dim labels (fid, xi, yi).
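
As a rough end-to-end sketch of the approach above: to_local_grid is a hypothetical helper name (not part of xarray), the seeded random generator is only there for reproducibility, and coords="different" is passed explicitly on the assumption that the differing x_coord/y_coord values should be stacked along fid rather than taken from the first array.

```python
import numpy as np
import xarray as xr

# Toy grid as in the question (seed is an assumption, for reproducibility only).
rng = np.random.default_rng(0)
da = xr.DataArray(
    rng.integers(5, size=(9, 9)),
    dims=("x_coord", "y_coord"),
    coords={"x_coord": np.linspace(0, 1, 9), "y_coord": np.linspace(0, 1, 9)},
)
a = da.isel(x_coord=[1, 2, 3], y_coord=[2, 3, 4]).expand_dims(fid=["abc123"])
b = da.isel(x_coord=[6, 7, 8], y_coord=[5, 6, 7]).expand_dims(fid=["def456"])


def to_local_grid(arr):
    """Hypothetical helper: index by local offsets (xi, yi) and keep
    x_coord/y_coord as non-index coordinates."""
    return arr.assign_coords(
        xi=arr.x_coord - arr.x_coord.min(),
        yi=arr.y_coord - arr.y_coord.min(),
    ).swap_dims({"x_coord": "xi", "y_coord": "yi"})


result = xr.concat(
    [to_local_grid(a), to_local_grid(b)], dim="fid", coords="different"
)
print(dict(result.sizes))   # {'fid': 2, 'xi': 3, 'yi': 3}
print(result.x_coord.dims)  # ('fid', 'xi')

# Select with the new dimension labels, then read back the original coordinates.
cell = result.sel(fid="def456", xi=0.125, yi=0.25)
print(float(cell.x_coord), float(cell.y_coord))  # 0.875 0.875
```

The offsets here are exact because the grid spacing (0.125) is exactly representable in floating point. On a grid where that is not the case, the subtracted offsets of two boxes can differ by rounding error, which would bring the nan padding back; using integer positions such as np.arange(arr.sizes["x_coord"]) for xi and yi is one way to sidestep that.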


1 Comment

ogb119 Over a year ago

Glad I’ve now found a use case for swap_dims! Thanks
