
My data consist of 10 ensembles, each with 35 time files. One of these files looks like:

>>> xr.open_dataset('ens1/CCSM4_ens1_07ic_19820701-19820731_NPac_Jul.nc')
<xarray.Dataset>
Dimensions:    (ensemble: 1, latitude: 66, longitude: 191, time: 31)
Coordinates:
  * ensemble   (ensemble) int32 1
  * latitude   (latitude) float32 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 ...
  * longitude  (longitude) float32 100.0 101.0 102.0 103.0 104.0 105.0 106.0 ...
  * time       (time) datetime64[ns] 1982-07-01 1982-07-02 1982-07-03 ...
Data variables:
    u10m       (time, latitude, longitude) float64 -1.471 -0.05933 -1.923 ...
Attributes:
    CDI:                       Climate Data Interface version 1.6.5 (http://c...
    history:                   Wed Nov 22 21:54:08 2017: ncks -O -d longitude...
    Conventions:               CF-1.4
    CDO:                       Climate Data Operators version 1.6.5 (http://c...
    nco_openmp_thread_number:  1
    NCO:                       4.3.7

When I use open_mfdataset, the files are concatenated along the time dimension and the ensemble dimension is dropped (possibly because it has a size of 1?):

>>> xr.open_mfdataset('ens*/*NPac*.nc')
<xarray.Dataset>
Dimensions:    (latitude: 66, longitude: 191, time: 10850)
Coordinates:
  * latitude   (latitude) float32 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 ...
  * longitude  (longitude) float32 100.0 101.0 102.0 103.0 104.0 105.0 106.0 ...
  * time       (time) datetime64[ns] 1982-07-01 1982-07-02 1982-07-03 ...
Data variables:
    u10m       (time, latitude, longitude) float64 -1.471 -0.05933 -1.923 ...

Is it possible to concatenate along the ensemble dimension as well?

I did a simple test using merge, as suggested in "Error on using xarray open_mfdataset function", but it fails:

>>> ds = xr.open_mfdataset('ens1/*NPac*')
<xarray.Dataset>
Dimensions:    (ensemble: 1, latitude: 66, longitude: 191, time: 1085)
Coordinates:
  * ensemble   (ensemble) int32 1
  * latitude   (latitude) float32 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 ...
  * longitude  (longitude) float32 100.0 101.0 102.0 103.0 104.0 105.0 106.0 ...
  * time       (time) datetime64[ns] 1982-07-01 1982-07-02 1982-07-03 ...
Data variables:
    u10m       (time, latitude, longitude) float64 -1.471 -0.05933 -1.923 ...
>>> ds2 = xr.open_mfdataset('ens2/*NPac*')
<xarray.Dataset>
Dimensions:    (ensemble: 1, latitude: 66, longitude: 191, time: 1085)
Coordinates:
  * ensemble   (ensemble) int32 2
  * latitude   (latitude) float32 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 ...
  * longitude  (longitude) float32 100.0 101.0 102.0 103.0 104.0 105.0 106.0 ...
  * time       (time) datetime64[ns] 1982-07-01 1982-07-02 1982-07-03 ...
Data variables:
    u10m       (time, latitude, longitude) float64 3.992 2.099 -0.3162 ...
>>> ds3 = xr.merge([ds, ds2])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/nethome/rxb826/local/bin/miniconda3/lib/python3.6/site-packages/xarray/core/merge.py", line 513, in merge
    variables, coord_names, dims = merge_core(dict_like_objects, compat, join)
  File "/nethome/rxb826/local/bin/miniconda3/lib/python3.6/site-packages/xarray/core/merge.py", line 432, in merge_core
    variables = merge_variables(expanded, priority_vars, compat=compat)
  File "/nethome/rxb826/local/bin/miniconda3/lib/python3.6/site-packages/xarray/core/merge.py", line 166, in merge_variables
    merged[name] = unique_variable(name, variables, compat)
  File "/nethome/rxb826/local/bin/miniconda3/lib/python3.6/site-packages/xarray/core/merge.py", line 85, in unique_variable
    % (name, out, var))
xarray.core.merge.MergeError: conflicting values for variable 'u10m' on objects to be combined:
first value: <xarray.Variable (time: 1085, latitude: 66, longitude: 191)>
dask.array<shape=(1085, 66, 191), dtype=float64, chunksize=(31, 66, 191)>
Attributes:
    long_name:  10m U component of wind
    units:      m s**-1
second value: <xarray.Variable (time: 1085, latitude: 66, longitude: 191)>
dask.array<shape=(1085, 66, 191), dtype=float64, chunksize=(31, 66, 191)>
Attributes:
    long_name:  10m U component of wind
    units:      m s**-1

I'm using v0.10.0 (thanks for the recent update!)

3 Answers

xarray.open_mfdataset does not support 2-D merges. Instead, open each ensemble member with open_mfdataset (which concatenates along time) and then use concat along the second dimension:

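import os
import xarray as xr

ens_list = []
for num in range(1, 11):
    # directories are named ens1 ... ens10
    ens = 'ens%d' % num
    # each call concatenates one member's files along time
    ens_list.append(xr.open_mfdataset(os.path.join(ens, '*NPac*')))
# stack the ten members along the ensemble dimension
ds = xr.concat(ens_list, dim='ensemble')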

This is a common problem that xarray users run into. It is quite difficult, however, to write a generalized ND concat routine.
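
To see why merge fails while concat works, here is a minimal in-memory sketch. It assumes tiny synthetic datasets shaped like the ones in the question (a size-1 ensemble coordinate, and a u10m variable that does not itself carry the ensemble dimension). The make_member helper and the values are made up for illustration, and compat, join and data_vars are passed explicitly because I am assuming the defaults may differ between xarray versions:

```python
import numpy as np
import xarray as xr


def make_member(ens, values):
    """Build a tiny stand-in for one ensemble member (hypothetical data)."""
    return xr.Dataset(
        {"u10m": ("time", np.asarray(values, dtype=float))},
        coords={"ensemble": [ens], "time": [0, 1, 2]},
    )


ds1 = make_member(1, [1.0, 2.0, 3.0])
ds2 = make_member(2, [4.0, 5.0, 6.0])

# merge sees two variables named u10m with identical dims but different
# values, so it reports a conflict instead of stacking them.
try:
    xr.merge([ds1, ds2], compat="no_conflicts", join="outer")
    merge_failed = False
except xr.MergeError:
    merge_failed = True

# concat adds the ensemble dimension to u10m and stacks the members along it.
combined = xr.concat([ds1, ds2], dim="ensemble", data_vars="all")
print(merge_failed, combined.u10m.dims)
```

In other words, merge is for combining different variables, whereas concat is the operation that grows a dimension.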


1 Comment

Thanks for the solution @jhamman. How would it change if the netCDFs do not have an ensemble dimension, and just have lat, lon and time?
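
One possible approach for files without an ensemble dimension, sketched here on synthetic in-memory data rather than real netCDFs: xr.concat accepts a pandas Index as dim, which creates the new dimension and labels it in one step. The member values and the labels 1 and 2 are placeholders; with real files each list entry would presumably come from open_mfdataset as in the answer above.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two hypothetical members with only a time dimension (no ensemble anywhere).
members = [
    xr.Dataset(
        {"u10m": ("time", np.full(3, float(n)))},
        coords={"time": [0, 1, 2]},
    )
    for n in (1, 2)
]

# The Index name becomes the new dimension; its values become the coordinate.
ensemble_index = pd.Index([1, 2], name="ensemble")
stacked = xr.concat(members, dim=ensemble_index)
print(stacked.u10m.dims)
```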

I wrote the following function as a workaround for my own use case: https://gist.github.com/jnhansen/fa474a536201561653f60ea33045f4e2

It works with arbitrary dimensions, but currently requires that the same variables exist in each file/dataset.

In my case I have a number of tiles (split along e.g. lat, lon, and time):

ds = auto_merge('data/part*.nc')

This will execute immediately as it returns only a view of the data (just like xarray.open_mfdataset would do).

3 Comments

thanks @jhansen, so in the example in this post, does that mean u10m should exist in each netCDF file? That seems like a reasonable assumption
Can you add a small example of how to call your code? I will test it with my own datasets as well and accept. Thanks!
Sure, I added a single line example in my reply. Should be as simple as that! And yes, exactly, the variable u10m (or whatever) should exist in each file. I haven't actually tested what happens if it doesn't...

xarray does now support N-D concatenation. As your data has 1-D dimension coordinates, you can simply do

ds = xr.open_mfdataset('ens*/*NPac*.nc', combine='by_coords')

and it should combine them in order automatically! It should even work for the ensemble dimension, as you gave that a coordinate too.
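
As a rough illustration of what combining by coordinates does, here is a sketch using xr.combine_by_coords on small in-memory tiles instead of files. Note the assumption: each tile's u10m carries the ensemble dimension explicitly, which is not the case in the file listing in the question, and I have not checked how the files there would behave. The make_tile helper and the values are hypothetical.

```python
import numpy as np
import xarray as xr


def make_tile(ens, t0):
    """One hypothetical tile: a single ensemble member and three time steps."""
    times = np.arange(t0, t0 + 3)
    # Encode member and time in the value so the result is easy to check.
    data = (ens * 100 + times).astype(float)[np.newaxis, :]
    return xr.Dataset(
        {"u10m": (("ensemble", "time"), data)},
        coords={"ensemble": [ens], "time": times},
    )


# Deliberately out of order: the coordinates, not list order, decide placement.
tiles = [make_tile(ens, t0) for ens in (2, 1) for t0 in (3, 0)]
combined = xr.combine_by_coords(tiles)
print(dict(combined.sizes))
```

The tiles are arranged on a 2-D (ensemble, time) grid purely from their coordinate values, which is why no explicit loop over members is needed.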

Also see this answer to a very similar question.
