
Suppose I have the following DataArray

import numpy as np
import xarray

arr = xarray.DataArray(np.arange(6).reshape(2, 3),
                       dims=['A', 'B'],
                       coords=dict(A=['a0', 'a1'],
                                   B=['b0', 'b1', 'b2']))

I want to iterate over the first dimension and do something like the following (in practice, something more complex than printing):

for coor in arr.A.values:
    print(coor, arr.sel(A=coor).values)

and get

a0 [0 1 2]
a1 [3 4 5]

I am new to xarray, so I was wondering whether there is a more natural way to achieve this, something like:

for coor, sub_arr in arr.some_method():
    print(coor, sub_arr)

2 Answers


You can simply iterate over the DataArray. Each element yielded by the iterator is itself a DataArray with a single value for the first coordinate:

for a in arr:
    print(a.A.item(), a.values)

prints

a0 [0 1 2]
a1 [3 4 5]

Note the use of the .item() method to access the scalar value of the zero-dimensional array a.A.
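
To make the role of .item() concrete, here is a minimal, self-contained sketch. It rebuilds arr from the question; the names labels and rows are only illustrative and not part of the original answer.

```python
import numpy as np
import xarray as xr

arr = xr.DataArray(
    np.arange(6).reshape(2, 3),
    dims=["A", "B"],
    coords=dict(A=["a0", "a1"], B=["b0", "b1", "b2"]),
)

labels, rows = [], []  # illustrative containers, not part of xarray
for a in arr:
    # a.A is a zero-dimensional DataArray, not a plain string
    assert a.A.ndim == 0
    labels.append(a.A.item())       # .item() unwraps it to a Python str
    rows.append(a.values.tolist())  # the slice along B as a plain list

print(labels)
print(rows)
```

Collecting a.A directly instead of a.A.item() would store small DataArray objects rather than plain labels, which is usually not what you want for dictionary keys or printing.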

To iterate over the second dimension, you can just transpose the data:

for b in arr.T: # or arr.transpose()
    print(b.B.item(), b.values)

prints

b0 [0 3]
b1 [1 4]
b2 [2 5]

For multidimensional data, you can move the dimension you want to iterate over to the first position using an ellipsis:

for x in arr.transpose("B", ...):
    # x has one less dimension than arr, and x.B is a scalar
    do_stuff_with(x)

The documentation on reshaping and reorganizing data has further details.
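
As a runnable illustration of the ellipsis form, here is a sketch on a three-dimensional array. The array cube, its extra dimension C, and the dictionary seen are assumptions made up for this example. It also assumes an xarray version that accepts an ellipsis in transpose.

```python
import numpy as np
import xarray as xr

# Hypothetical 3-D array; dimension C has no coordinate labels
cube = xr.DataArray(
    np.arange(24).reshape(2, 3, 4),
    dims=["A", "B", "C"],
    coords=dict(A=["a0", "a1"], B=["b0", "b1", "b2"]),
)

seen = {}
for x in cube.transpose("B", ...):
    # B has been moved to the front, so each x is a slice at one B label
    # and keeps the remaining dimensions in their original order
    seen[x.B.item()] = (x.dims, x.shape)

print(seen)
```

Only the dimension of interest has to be named; the ellipsis stands for all the others.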


4 Comments

That's kind of cool, but still kind of ugly and unintuitive. It's a shame that xarray doesn't have a more natural way of doing this.
I get an error when I do this: "ValueError: Ellipsis not found in array dimensions ('time', 'ens_member', 'station', 'lead_time')". My xarray version might be a little old perhaps?
To me, it makes sense that iterating over the array takes slices along the first dimension, I don't know what else it would do. Looks like support for transpose using ellipsis was added in version 0.14.1.
Oh yeah, iterating over the first dimension is OK-ish, although I still find it a bit ugly because xarray is generally more abstract than that: most of the time you shouldn't need to care about the order of the dimensions in the underlying array, and it is less readable in code since that ordering is often non-obvious. The transpose version has the abstraction benefit, but it is kind of ugly since you need to deal with all the dimensions rather than just the one you care about. I feel better using groupby, especially in older versions where the ellipsis doesn't work in transpose.

It's an old question, but I find that using groupby is cleaner and makes more intuitive sense to me than using transpose when you want to iterate over some dimension other than the first:

for coor, sub_arr in arr.groupby('A'):
    print(coor)
    print(sub_arr)

a0
<xarray.DataArray (B: 3)>
array([0, 1, 2])
Coordinates:
  * B        (B) <U2 'b0' 'b1' 'b2'
    A        <U2 'a0'
a1
<xarray.DataArray (B: 3)>
array([3, 4, 5])
Coordinates:
  * B        (B) <U2 'b0' 'b1' 'b2'
    A        <U2 'a1'

Also, older versions of xarray (before 0.14.1, according to the comments on mgunyho's answer) do not accept the ellipsis in transpose, but groupby still works there.
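
If you would rather not depend on either transpose or groupby, the same (label, sub-array) pairs can be produced by hand with isel. This is only a sketch: the helper name iter_along is hypothetical and not part of xarray, and it assumes the chosen dimension has coordinate labels.

```python
import numpy as np
import xarray as xr


def iter_along(da, dim):
    """Yield (label, sub_array) pairs along `dim` (hypothetical helper).

    Assumes `dim` has a coordinate, so that a label exists for each position.
    """
    for i in range(da.sizes[dim]):
        # Integer indexing drops `dim` but keeps its label as a scalar coordinate
        sub = da.isel({dim: i})
        yield sub[dim].item(), sub


arr = xr.DataArray(
    np.arange(6).reshape(2, 3),
    dims=["A", "B"],
    coords=dict(A=["a0", "a1"], B=["b0", "b1", "b2"]),
)

result = {label: sub.values.tolist() for label, sub in iter_along(arr, "B")}
print(result)
```

This matches the shape of the loop asked for in the question (for coor, sub_arr in ...) and works for any dimension by name, regardless of its position.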
