I'm working in a Jupyter Notebook using Python 3.12. I have a 2D xarray (in reality, it's 3D, but we can treat it as 2D). I want to pull out the values based on indices I acquire elsewhere, and then store those values in a list (or numpy array; haven't decided yet, but the particular container doesn't matter, just that I have the values stored in a container that I can easily access).
I have code that works, but it is painfully slow. See below.
myValues = []
for idx_pair in myIndices:
    myValues.append(da[idx_pair[0], idx_pair[1]].item())
In the above, myIndices is a two-column array, with each row holding the x and y index into da. The number of index pairs in myIndices can reach 100,000, and I need to loop through several different sets of myIndices. da is a 3D xarray (da.shape ~ (20, 2000, 2000), corresponding to time, x direction, y direction), though it can be treated as a 2D array of about 2000x2000 (just x and y). Getting the values for a 2D da takes about 35 seconds. So this code works (it's the only method I've found that does), but it is far too slow to be useful.
**How can I more efficiently and rapidly access values from an xarray using index locations?**
I have tried da.load(). Supposedly, that takes care of the lazy loading issue inherent with a lot of xarrays, but it does nothing to reduce the run time in my case.
I've tried different ways of accessing the values from the xarray (e.g., .isel(x=myXs, y=myYs) or da[ myXs, myYs ]), but I get weird and insanely large matrices (e.g., ~100,000x100,000 with most of the values being zero). I feel like the solution to my problem is in how I'm accessing the values, but I can't figure out any other method to do it.
.item() is killing performance. You're making 100,000 individual element extractions, each with Python overhead. You want vectorized extraction. Try:
idx_x = myIndices[:, 0]
idx_y = myIndices[:, 1]
myValues = da.values[idx_x, idx_y]
About this: see 'Integer array indexing' under 'Advanced indexing' in the NumPy documentation. It is also commonly called 'fancy indexing'.
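
Here is a slightly fuller sketch of the same idea, in case it helps. The data is made up (a small random array with dims named time, x, y, and random index pairs), so adjust the dimension names to whatever your da actually uses. It shows the NumPy route for the 3D case, and also what I believe is the xarray-native equivalent: passing the indexers to .isel() as DataArrays that share a new dimension (called "points" here, an arbitrary name). When you pass plain arrays to .isel(), xarray indexes each dimension independently, so you get every x combined with every y, which would explain the huge ~100,000x100,000 result in the question.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in data: a small (time, x, y) array instead of (20, 2000, 2000).
rng = np.random.default_rng(0)
da = xr.DataArray(rng.random((3, 50, 60)), dims=("time", "x", "y"))

# Hypothetical index pairs: column 0 is the x index, column 1 is the y index.
n_points = 1000
myIndices = np.column_stack([
    rng.integers(0, da.sizes["x"], n_points),
    rng.integers(0, da.sizes["y"], n_points),
])
idx_x = myIndices[:, 0]
idx_y = myIndices[:, 1]

# Option 1: NumPy integer array indexing on the underlying array.
# The leading slice keeps the time axis; idx_x and idx_y are paired element-wise.
values_np = da.values[:, idx_x, idx_y]   # shape (time, n_points)

# Option 2: xarray pointwise indexing. Indexers that share a dimension
# ("points") are paired up instead of being combined as an outer product.
x_sel = xr.DataArray(idx_x, dims="points")
y_sel = xr.DataArray(idx_y, dims="points")
values_xr = da.isel(x=x_sel, y=y_sel)    # dims ("time", "points")

# Reference: the original per-element loop, for the first time step only.
values_loop = [da[0, i, j].item() for i, j in myIndices]
```

For a single time step, values_np[0] (or values_xr.isel(time=0)) should match what the original loop produces. Option 1 returns a bare NumPy array; Option 2 keeps the result as a DataArray, which may be handier if you still need the time coordinate.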