
Hello, I want to store a dataframe in a cell of another dataframe. My data looks like this: [image]

I have daily data consisting of date, steps, and calories. In addition, I have minute-by-minute HR data for a specific date. It would be easy to put the minute-by-minute data in a 2-dimensional list, but I fear that would be harder to analyze later.
What would be the best practice when I want to have both sets of data in one dataframe? Is it even possible to nest dataframes?
Any better ideas? Thanks!

  • You might want to use zarr or xarray instead of pandas. They provide N-dimensional arrays and dataframes, which seems to me to be what you need. Commented Jul 24, 2018 at 18:42
  • R is able to do this very well, JSYK. It is a little harder with pandas because you can't store the data frame in a data frame. Commented Jul 24, 2018 at 18:43
  • Thank you for your comment. Currently pandas is my only option. Commented Jul 24, 2018 at 18:45
  • @DemetriP, why not? See my answer below. Commented Jul 25, 2018 at 1:47
  • @sacul Oh! I tried a naive way of doing this, but I seem to be mistaken. Commented Jul 25, 2018 at 15:22

1 Answer


Yes, it is possible to nest dataframes, but I would instead recommend rethinking how you structure your data. The right structure depends on your application and on the analyses you want to run afterwards.
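
One possible restructuring, as a sketch only: keep the daily summary and the minute-by-minute HR readings in two flat dataframes that share a date key, and join them when an analysis needs both. The column names (`date`, `steps`, `calories`, `time`, `hr`) and all values below are invented for illustration; they are not taken from the question's data.

```python
import pandas as pd

# Hypothetical daily summary: one row per date (values are invented).
daily = pd.DataFrame({
    "date": pd.to_datetime(["2018-07-23", "2018-07-24"]),
    "steps": [8000, 10500],
    "calories": [2100, 2400],
})

# Hypothetical minute-by-minute heart rate for one of those dates.
hr = pd.DataFrame({
    "time": pd.to_datetime([
        "2018-07-24 09:00", "2018-07-24 09:01", "2018-07-24 09:02",
    ]),
    "hr": [62, 64, 63],
})
# Derive the join key from the timestamp (midnight of the same day).
hr["date"] = hr["time"].dt.normalize()

# Left join keeps every HR reading and attaches that day's summary to it.
combined = hr.merge(daily, on="date", how="left")

# Per-day aggregates of the minute data can be pulled back to the daily table.
daily_hr = hr.groupby("date")["hr"].mean().rename("mean_hr").reset_index()
daily_with_hr = daily.merge(daily_hr, on="date", how="left")
```

With this layout, ordinary `groupby`, `merge`, and filtering work on both tables, which is harder when the minute data is hidden inside a single cell.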

How to "nest" dataframes into another dataframe

A dataframe containing nested "sub-dataframes" won't be displayed very nicely. However, just to show that nesting is possible, take a look at this mini-example:

Here we have 3 random dataframes:

>>> df1
          0         1         2
0  0.614679  0.401098  0.379667
1  0.459064  0.328259  0.592180
2  0.916509  0.717322  0.319057
>>> df2
          0         1         2
0  0.090917  0.457668  0.598548
1  0.748639  0.729935  0.680409
2  0.301244  0.024004  0.361283
>>> df3
          0         1         2
0  0.200375  0.059798  0.665323
1  0.086708  0.320635  0.594862
2  0.299289  0.014134  0.085295

We can make a main dataframe that includes these dataframes as values in individual "cells":

import pandas as pd
df = pd.DataFrame({'idx': [1, 2, 3], 'dfs': [df1, df2, df3]})

We can then access these nested dataframes as we would access any value in any other dataframe:

>>> df['dfs'].iloc[0]
          0         1         2
0  0.614679  0.401098  0.379667
1  0.459064  0.328259  0.592180
2  0.916509  0.717322  0.319057
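
For completeness, here is a self-contained sketch of the example above. The three frames are generated with `numpy.random.rand`, which is an assumption about how `df1`, `df2`, and `df3` were built (the answer only says they are random), so the values will differ from those printed. The `mean_of_col0` column is likewise a hypothetical addition, included only to show one way of working with the nested frames.

```python
import numpy as np
import pandas as pd

# Assumption: three 3x3 frames of uniform random numbers.
df1, df2, df3 = (pd.DataFrame(np.random.rand(3, 3)) for _ in range(3))

# Passing a list of DataFrames as a column stores them as Python objects.
df = pd.DataFrame({"idx": [1, 2, 3], "dfs": [df1, df2, df3]})

# The column has dtype object; each cell holds a whole DataFrame.
print(df["dfs"].dtype)
first = df["dfs"].iloc[0]
print(first.equals(df1))

# Nested frames can be processed one cell at a time, e.g. one number per row.
df["mean_of_col0"] = df["dfs"].apply(lambda sub: sub[0].mean())
```

Because the column is of dtype object, vectorized pandas operations do not reach inside the nested frames; each one has to be handled through `apply` or an explicit loop, which is part of why a flat structure is usually easier to analyze.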