I want to save an xarray.Dataset as a .zarr store, but I cannot make its chunks uniform, so the save fails.

I have tried:

Changing the chunk size in xarray.open_mfdataset -> it still uses automatic chunks, which do not work.

Changing the chunk size with dataset.chunk(n) -> the dataset still has the automatic chunks from when it was opened.

CODE:

import xarray as xr
import glob
import zarr

local_dir = "/directory/"
data_dir = local_dir + 'folder/'

files = glob.glob(data_dir + '*.nc')
n = 1320123
data_files = xr.open_mfdataset(files,concat_dim='TIME',chunks={'TIME': n}) # does not specify chunks, uses automatic chunks
data_files.chunk(n) # try modifying here, still uses automatic chunks
data_files.to_zarr(store=data_dir + 'test.zarr',mode='w') # I get an error about non-uniform chunks - see below

ValueError: Zarr requires uniform chunk sizes except for final chunk. Variable dask chunks ((1143410, 512447, 1170473, 281220, 852819),) are incompatible. Consider rechunking using chunk().

I expect the .zarr store to be saved with the new chunks, but the dataset reverts to the original automatic chunk sizes.

1 Answer

Xarray's Dataset.chunk method returns a new dataset rather than modifying the existing one in place, so you need to keep its result, something more like:

ds = xr.open_mfdataset(files, concat_dim='TIME').chunk({'TIME': n})
ds.to_zarr(...)
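
To make the "returns a new dataset" point concrete, here is a minimal sketch on a small in-memory dataset. The variable name temp, the ten-element array, and the chunk sizes are made up for illustration and stand in for the concatenated files; it assumes dask is installed alongside xarray.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the concatenated files: 10 values along TIME,
# deliberately split into irregular dask chunks (4, 1, 5).
ds = xr.Dataset({"temp": ("TIME", np.arange(10.0))}).chunk({"TIME": (4, 1, 5)})

# Calling chunk() without keeping the result leaves ds untouched.
ds.chunk({"TIME": 5})
unchanged = ds.chunks["TIME"]

# Keeping the returned dataset gives the new, uniform chunks.
rechunked = ds.chunk({"TIME": 5})

print(unchanged)
print(rechunked.chunks["TIME"])
```

The dataset to pass to to_zarr is the one returned by chunk, not the original.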

A few other details to note:

  • Why the chunks kwarg to open_mfdataset doesn't behave as desired: currently, chunks along the concat_dim are fixed to the length of the data in each file. I suspect this is also why you have irregular chunk sizes: each one is the TIME length of one input file.

  • open_mfdataset will do the glob for you. This is a minor time saver, but worth knowing for the future: you can just call xr.open_mfdataset('/directory/folder/*.nc', ...).
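
The rule quoted in the error message can be checked by hand with plain Python. The sketch below is a simplified restatement of that rule, not xarray's own validation code; the helper names is_zarr_uniform and uniform_chunks are hypothetical. It uses the chunk tuple from the error and the n from the question.

```python
def is_zarr_uniform(chunks):
    """True if all chunks except the last are equal and the last is no larger."""
    if len(chunks) <= 1:
        return True
    first = chunks[0]
    return all(c == first for c in chunks[:-1]) and chunks[-1] <= first


def uniform_chunks(total, size):
    """Chunk sizes obtained by splitting `total` elements into blocks of `size`."""
    full, rest = divmod(total, size)
    return (size,) * full + ((rest,) if rest else ())


# Per-file chunks reported in the ValueError from the question.
per_file = (1143410, 512447, 1170473, 281220, 852819)
n = 1320123  # chunk size requested in the question

rechunked = uniform_chunks(sum(per_file), n)

print(is_zarr_uniform(per_file))   # per-file lengths differ
print(rechunked, is_zarr_uniform(rechunked))
```

With these numbers the TIME dimension has 3960369 elements in total, which is exactly 3 × n, so rechunking to n gives three equal chunks that satisfy the rule.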
