I have 3D data with dimensions (time, lon, lat), where time is a daily date (year, month, day). I need to split the time dimension into two dimensions, year and month-day, so that the data becomes 4-dimensional. How should I do this?

Here is a simple example dataset:

import xarray as xr
import numpy as np
import pandas as pd

time = pd.date_range("2000-01-01", "2001-12-31", freq="D")
time = time[~((time.month == 2) & (time.day == 29))] 

lon = np.linspace(100, 110, 5)
lat = np.linspace(30, 35, 4)
data = np.random.rand(len(time), len(lon), len(lat))

da = xr.DataArray(
    data,
    coords={"time": time, "lon": lon, "lat": lat},
    dims=["time", "lon", "lat"],
    name="pr"
)

Expected dims:

year: 2000, 2001

monthly: 01-01, 01-02,...12-31

lon: ...

lat: ...


One additional question: why do .first() and .last() raise errors here? How should I use them?

da.assign_coords(year = da.time.dt.year, monthday = da.time.dt.strftime("%m-%d")).groupby(['year', 'monthday']).first()
da.assign_coords(year = da.time.dt.year, monthday = da.time.dt.strftime("%m-%d")).groupby(['year', 'monthday']).last()

2 Answers

Here is one solution: put a (year, month-day) MultiIndex on the time dimension and unstack it:

import xarray as xr
import numpy as np
import pandas as pd

time = pd.date_range("2000-01-01", "2001-12-31", freq="D")
time = time[~((time.month == 2) & (time.day == 29))] 

lon = np.linspace(100, 110, 5)
lat = np.linspace(30, 35, 4)
data = np.random.rand(len(time), len(lon), len(lat))

da = xr.DataArray(
    data,
    coords={"time": time, "lon": lon, "lat": lat},
    dims=["time", "lon", "lat"],
    name="pr"
)

# Labels for the two new dimensions, one per time step
years = da.time.dt.year.values
month_day = da.time.dt.strftime('%m-%d').values

# Pair each time step with its (year, month-day) label
multi_index = pd.MultiIndex.from_arrays([years, month_day], names=('year', 'monthly'))

# Swap the datetime index for the MultiIndex, then unstack it into two dimensions
da_4d = da.copy()
da_4d.coords['time'] = multi_index
da_4d = da_4d.unstack('time')

print(da_4d)
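
A variant of the same idea, offered only as a sketch and not as part of the original answer: instead of assigning the MultiIndex by hand, attach `year` and `monthly` as coordinates along `time`, let `set_index` build the MultiIndex, and transpose so the dimensions come out in the order the question expects. The seeded generator replaces `np.random.rand` purely to make the example reproducible, and the name `labelled` is illustrative.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Same toy data as in the question (leap days removed), but seeded.
time = pd.date_range("2000-01-01", "2001-12-31", freq="D")
time = time[~((time.month == 2) & (time.day == 29))]
lon = np.linspace(100, 110, 5)
lat = np.linspace(30, 35, 4)
rng = np.random.default_rng(0)
da = xr.DataArray(
    rng.random((len(time), len(lon), len(lat))),
    coords={"time": time, "lon": lon, "lat": lat},
    dims=["time", "lon", "lat"],
    name="pr",
)

# Attach year and month-day labels as coordinates along "time".
labelled = da.assign_coords(
    year=("time", da.time.dt.year.values),
    monthly=("time", da.time.dt.strftime("%m-%d").values),
)

# Replace the datetime index with a (year, monthly) MultiIndex,
# unstack it into two dimensions, and fix the dimension order.
da_4d = (
    labelled.set_index(time=["year", "monthly"])
    .unstack("time")
    .transpose("year", "monthly", "lon", "lat")
)

print(da_4d.dims)
print(da_4d.sel(year=2001, monthly="03-01").shape)
```

After the transpose, a single day can be selected with `da_4d.sel(year=..., monthly="MM-DD")`, which returns a (lon, lat) slice.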

I tried the following approach and wonder whether there is another way:

da.assign_coords(year = da.time.dt.year, monthday = da.time.dt.strftime("%m-%d")).groupby(['year', 'monthday']).mean('time')
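
One other route, sketched here under the assumption that leap days have been removed as in the question, is a plain reshape: every year then has exactly 365 time steps in chronological order, so the time axis can be split into (year, monthday) directly. The coordinate names follow the snippet above, and the seeded generator is only there for reproducibility. This shortcut would not hold for data with gaps or with Feb 29 retained.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Same toy data as in the question (leap days removed), but seeded.
time = pd.date_range("2000-01-01", "2001-12-31", freq="D")
time = time[~((time.month == 2) & (time.day == 29))]
lon = np.linspace(100, 110, 5)
lat = np.linspace(30, 35, 4)
rng = np.random.default_rng(0)
da = xr.DataArray(
    rng.random((len(time), len(lon), len(lat))),
    coords={"time": time, "lon": lon, "lat": lat},
    dims=["time", "lon", "lat"],
    name="pr",
)

# Labels for the new axes: the distinct years, and the month-day
# strings taken from the first year (identical for every year here).
years = np.unique(time.year)
monthday = np.asarray(time[time.year == years[0]].strftime("%m-%d"))

# Assumption: time is sorted and every year has len(monthday) steps,
# so a C-order reshape splits time into (year, monthday).
reshaped = da.values.reshape(len(years), len(monthday), len(lon), len(lat))

da_4d = xr.DataArray(
    reshaped,
    coords={"year": years, "monthday": monthday, "lon": lon, "lat": lat},
    dims=["year", "monthday", "lon", "lat"],
    name="pr",
)

print(da_4d.shape)
```

Unlike a groupby reduction, the reshape does not aggregate anything, so it only makes sense when each (year, monthday) pair occurs exactly once.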
