0

I have a dictionary, data, with a structure like this:

{
    1: {
        'title': 'Test x Miss LaFamilia - All Mine [Music Video] | Link Up TV',
        'time': '2020-06-28T18:30:06Z',
        'channel': 'Link Up TV',
        'description': 'SUB & ENABLE NOTIFICATIONS for more:  Visit our clothing store:  Visit our website for the latest videos: ...',
        'url': 'youtube',
        'region_searched': 'US',
        'time_searched': datetime.datetime(2020, 8, 6, 13, 6, 5, 188727, tzinfo = < UTC > )
    },
    2: {
        'title': 'Day 1 Highlights | England Frustrated by Rain as Babar Impresses | England v Pakistan 1st Test 2020',
        'time': '2020-08-05T18:29:43Z',
        'channel': 'England & Wales Cricket Board',
        'description': 'Watch match highlights of Day 1 from the 1st Test between England and Pakistan at Old Trafford. Find out more at ecb.co.uk This is the official channel of the ...',
        'url': 'youtube',
        'region_searched': 'US',
        'time_searched': datetime.datetime(2020, 8, 6, 13, 6, 5, 188750, tzinfo = < UTC > )
    }
}

I am trying to make a pandas DataFrame which would look like this:

rank    title                             time                      channel             description                                     url                             region_searched         time_searched
1       Test x Miss LaFamilia...          2020-06-28T18:30:06Z      Link Up TV          SUB & ENABLE NOTIFICATIONS for more...          youtube.com                     US                      2020-8-6 13:06:05
2       Day 1 Highlights | E...           2020-08-05T18:29:43       England & ..        Watch match highlights of D                     youtube.com                     US                      2020-8-6 13:06:05

In my data dictionary, each top-level key should become the rank of a row in the DataFrame. Each key of the nested dictionary should become a column name, with its value as the cell value for that row.

When I simply run:

df = pd.DataFrame(data)

The df looks like this:

                 1                                                  2
title            Test x Miss LaFamilia - All Mine [Music Video]...  Day 1 Highlights | England Frustrated by Rain ...
time             2020-06-28T18:30:06Z                               2020-08-05T18:29:43Z
channel          Link Up TV                                         England & Wales Cricket Board
description      SUB & ENABLE NOTIFICATIONS for more: http://go...  Watch match highlights of Day 1 from the 1st T...
url              youtube.com/watch?v=YB3xASruJHE                    youtube.com/watch?v=xABoyLxWc7c
region_searched  US                                                 US
time_searched    2020-08-06                                         2020-08-06

This feels like it is only a few clever pivot lines away from what I need, but I can't figure out how to get that structure cleanly.

After you create your df, try df.T to transpose it. Commented Aug 6, 2020 at 13:37

4 Answers

4

As @dm2 mentioned in the comments, this can be done much more simply. Here d is the dictionary that holds the data:

df = pd.DataFrame(d)
dfz = df.T  # transpose so each outer key becomes a row

To create the rank column:

dfz['rank'] = dfz.index
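As a rough sketch of the steps above on a cut-down dictionary (the sample d below is made up for illustration and keeps only two of the fields from the question):

```python
import pandas as pd

# Hypothetical, shortened stand-in for the dictionary in the question
d = {
    1: {'title': 'Video A', 'channel': 'Link Up TV'},
    2: {'title': 'Video B', 'channel': 'England & Wales Cricket Board'},
}

df = pd.DataFrame(d)     # outer keys 1 and 2 become the columns
dfz = df.T               # transpose: outer keys become the row index
dfz['rank'] = dfz.index  # copy the index into a regular column

print(dfz)
```

Note that rank ends up as the last column here, and the index still holds the same values.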

1 Comment

Oh yeah, pandas Transpose would fit this issue nicely
2

Try this:

import pandas as pd

pd.DataFrame(list(data.values())).assign(rank=list(data.keys()))

                                               title  ... rank
0  Test x Miss LaFamilia - All Mine [Music Video]...  ...    1
1  Day 1 Highlights | England Frustrated by Rain ...  ...    2
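
The output above has rank as the last column. If it should come first, as in the table in the question, one option is DataFrame.insert. A minimal sketch, assuming a shortened, made-up dictionary in place of the real data:

```python
import pandas as pd

# Hypothetical, shortened stand-in for the dictionary in the question
data = {
    1: {'title': 'Video A', 'region_searched': 'US'},
    2: {'title': 'Video B', 'region_searched': 'US'},
}

df = pd.DataFrame(list(data.values()))
# Insert the outer keys as the first column (position 0)
df.insert(0, 'rank', list(data.keys()))

print(df.columns.tolist())
```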


1

If you want the index and rank to be two different columns:

  1. Create a DataFrame from the data
df = pd.DataFrame(list(data.values()))
  2. Add a rank column to the DataFrame
df['rank'] = list(data.keys())

OR

To do this in one line, use the assign method:

df = pd.DataFrame(list(data.values())).assign(rank=list(data.keys()))

If you want the index and rank to be the same column:

  1. Create the DataFrame, then transpose it
df = pd.DataFrame(data).T
  2. Name the index
df.index.names = ['rank']

It should work.
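
A different route to a similar result, not part of the steps above, is pd.DataFrame.from_dict with orient='index', which builds the rows from the outer keys directly without a transpose. The sketch below assumes a shortened, made-up dictionary and turns the index into a regular rank column:

```python
import pandas as pd

# Hypothetical, shortened stand-in for the dictionary in the question
data = {
    1: {'title': 'Video A', 'channel': 'Link Up TV'},
    2: {'title': 'Video B', 'channel': 'England & Wales Cricket Board'},
}

# orient='index' treats each outer key as a row label
df = pd.DataFrame.from_dict(data, orient='index')

# Name the index 'rank', then move it into an ordinary column
df = df.rename_axis('rank').reset_index()

print(df)
```

Leaving out reset_index() keeps rank as the index instead of a column.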


0

Try looping through the dict keys, building a one-row DataFrame for each value, and concatenating them. (Replace data with your own variable.)

frames = []
for key in data:
    # Each value is a dict of scalars, so an index is needed to build a one-row frame
    frames.append(pd.DataFrame(data[key], index=[key]))
df_full = pd.concat(frames, axis=0)
