Storing Xarray.Datasets in single Pandas.DataFrame cells

Question

I have an existing DataFrame with metadata, and I am trying to add a column holding the data itself. For each row in the DataFrame I want to store a subset of my xarray.Dataset. However, pandas seems to try to convert the xr.Dataset into a numpy array, which fails. Is there any way to do this?

Here is some example code:

    import pandas as pd
    import xarray as xr
    import numpy as np

    # Create a DataFrame
    df = pd.DataFrame({"id": [1, 2, 3]})

    # Initialize an empty column with dtype=object (CRUCIAL!)
    df["xr_dataset"] = None  # Automatically becomes object dtype
    # Or explicitly:
    df["xr_dataset"] = pd.Series(dtype=object)

    for idx in df.index:
        # Create a unique xarray Dataset for each row
        ds = xr.Dataset({"temperature": xr.DataArray(np.random.rand(2)),
                         "pressure": xr.DataArray(np.random.rand(2))}
        )
        # Use .loc to prevent pandas from auto-converting
        df.loc[idx, "xr_dataset"] = ds

I have also tried to store my subsets in a list and to assign that to the dataframe but that fails as well.

This is not super important for me to solve, as I can use other ways to handle my data. But at this point I'm just curious if this is possible at all.

Thanks for your time!

Polarimetric · Accepted Answer · 2025-07-09 23:10:49Z

I think you are most of the way there. You want df.at instead of df.loc. I took your code and did the following:

    # Create a DataFrame
    df = pd.DataFrame({"id": [1, 2, 3]})

    # Initialize an empty column with dtype=object
    df["xr_dataset"] = pd.Series(dtype=object)

    for idx in df.index:
        # Create a unique xarray Dataset for each row
        ds = xr.Dataset({"temperature": xr.DataArray(np.random.rand(2)),
                        "pressure": xr.DataArray(np.random.rand(2))}
        )
        # Use .at to assign the xarray Dataset directly
        df.at[idx, "xr_dataset"] = ds

Then when I run print(type(df.loc[0, "xr_dataset"])) I get <class 'xarray.core.dataset.Dataset'>. I am sure there are other ways, but I think you were looking for .at, not .loc.
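For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach with the imports included and a quick check of what ends up in each cell. The names cell_types and first are only illustrative. A plausible reason .at works where .loc does not (an assumption, not something confirmed in this thread) is that .at sets exactly one cell as a scalar, while .loc may treat the Dataset as array-like and try to unpack it.

    import numpy as np
    import pandas as pd
    import xarray as xr

    # Metadata frame plus an empty object-dtype column for the Datasets
    df = pd.DataFrame({"id": [1, 2, 3]})
    df["xr_dataset"] = pd.Series(dtype=object)

    for idx in df.index:
        # One small Dataset per row (random placeholder values)
        ds = xr.Dataset({
            "temperature": xr.DataArray(np.random.rand(2)),
            "pressure": xr.DataArray(np.random.rand(2)),
        })
        # .at writes a single cell, so the Dataset is stored as one object
        df.at[idx, "xr_dataset"] = ds

    # Inspect what was stored
    cell_types = [type(cell) for cell in df["xr_dataset"]]
    first = df.at[0, "xr_dataset"]
    print(cell_types)
    print(first["temperature"].shape)

Reading a cell back with df.at[row, "xr_dataset"] returns the Dataset itself, so the usual xarray indexing works on it from there.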

1 Comment

BlueScr33n Jul 10 at 11:11

yep, that sure is working, thanks.
