I know there are a few questions about converting nested dictionaries to a DataFrame, but their solutions do not work for me. I have DataFrames stored in a dictionary, which is itself stored in another dictionary, like this:

import pandas as pd

# one row per frame, one column per date
df1 = pd.DataFrame({'2019-01-01':[38],'2019-01-02':[43]},index = [1])
df2 = pd.DataFrame({'2019-01-01':[108],'2019-01-02':[313]},index = [1])
da = {}
da['ES']={}
da['ES']['TV']=df1
da['ES']['WEB']=df2

What I want to obtain is the following:

df_final = pd.DataFrame({'market':['ES','ES','ES','ES'],'device':['TV','TV','WEB','WEB'],
                     'ds':['2019-01-01','2019-01-02','2019-01-01','2019-01-02'],
                     'yhat':[38,43,108,313]})

Using code from another SO question, I tried this:

market_ids = []
frames = []
for market_id,d in da.items():
  market_ids.append(market_id)
  frames.append(pd.DataFrame.from_dict(da,orient = 'index'))    
df = pd.concat(frames, keys=market_ids)

This gives me a DataFrame with a MultiIndex and the devices as column names.

Thank you

  • Okay, I get your question, and I think it shouldn't be that difficult. I am working on it, so just correct me if I am wrong: df_final is how you want your DataFrame to look, and da above is how the values are currently available, right? Commented Jan 15, 2019 at 16:40
  • That is exactly right Commented Jan 15, 2019 at 16:49
  • I am able to get the output, and it works well, though I am not sure how you will apply it to your actual data. To put it in a loop, I would need a good amount of actual data (or some changed values) to work with. Still, I'll share what I got in case it is useful to you. Commented Jan 15, 2019 at 17:00

1 Answer

The code below works well and gives the desired output:

t1=da['ES']['TV'].melt(var_name='ds', value_name='yhat')
t1['market']='ES'
t1['device']='TV'

t2=da['ES']['WEB'].melt(var_name='ds', value_name='yhat')
t2['market']='ES'
t2['device']='WEB'

m = pd.concat([t1,t2], ignore_index=True)

print(m)

And the output is:

           ds  yhat market device
0  2019-01-01    38     ES     TV
1  2019-01-02    43     ES     TV
2  2019-01-01   108     ES    WEB
3  2019-01-02   313     ES    WEB

The main takeaway here is the melt function: it turns each date column into a row, with the column name going into ds and the value into yhat. As I mentioned in the comment above, this can be done iteratively over the whole da dictionary, but to write that properly I'd need a replica of the actual data. The idea is to take the first frame, t1, as the starting DataFrame and keep concatenating the others onto it, which should be easy. I don't know what your actual values look like, but you should be able to work out from the above how to put this in a loop.
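To see what melt does on its own, here is a minimal sketch on a single frame. The names wide and long_df are just illustrative, and the frame is assumed to have one row with one column per date, like the ones stored in da:

```python
import pandas as pd

# Illustrative frame: one row, one column per date (assumed shape of the real data).
wide = pd.DataFrame({'2019-01-01': [38], '2019-01-02': [43]})

# melt turns every column into a row: the old column name goes into 'ds',
# the old cell value goes into 'yhat'.
long_df = wide.melt(var_name='ds', value_name='yhat')

print(long_df)
```

The result has one row per original date column, which is the shape that the market and device columns are then attached to.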

A rough sketch of that loop would look like this:

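real = t1  # start from the TV frame built above
for a in da['ES']:
    if a != 'TV':  # TV is already in real
        p = da['ES'][a].melt(var_name='ds', value_name='yhat')
        p['market'] = 'ES'
        p['device'] = a
        real = pd.concat([real, p], axis=0, sort=True)

# reset_index returns a new frame, so assign the result back
real = real.reset_index(drop=True)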
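If the real dictionary holds several markets, the same idea can probably be written as two nested loops that collect the melted pieces in a list and concatenate once at the end. This is only a sketch: flatten_nested_frames is a hypothetical helper name, and the sample da below assumes every frame has one column per date, as in the question.

```python
import pandas as pd

# Sample data shaped like the question's (assumed: one row per frame, one column per date).
da = {
    'ES': {
        'TV': pd.DataFrame({'2019-01-01': [38], '2019-01-02': [43]}),
        'WEB': pd.DataFrame({'2019-01-01': [108], '2019-01-02': [313]}),
    }
}

def flatten_nested_frames(nested):
    """Hypothetical helper: melt each frame and tag it with its market and device keys."""
    pieces = []
    for market, devices in nested.items():
        for device, frame in devices.items():
            piece = frame.melt(var_name='ds', value_name='yhat')
            # Insert at position 0 twice so the columns end up as market, device, ds, yhat.
            piece.insert(0, 'device', device)
            piece.insert(0, 'market', market)
            pieces.append(piece)
    # A single concat at the end avoids re-copying the growing frame on every iteration.
    return pd.concat(pieces, ignore_index=True)

df_final = flatten_nested_frames(da)
print(df_final)
```

Note that pd.concat raises a ValueError when given an empty list, so an empty da would need to be handled separately.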