Add rows to pandas dataframe using column of dictionaries

Question

I have a dataframe like this:

import pandas as pd

matrix = [(222, {'a': 1, 'b':3, 'c':2, 'd':1}),
          (333, {'a': 1, 'b':0, 'c':0, 'd':1})]

df = pd.DataFrame(matrix, columns=['ordernum', 'dict_of item_counts'])

   ordernum               dict_of item_counts
0       222  {'a': 1, 'b': 3, 'c': 2, 'd': 1}
1       333  {'a': 1, 'b': 0, 'c': 0, 'd': 1}

and I would like to create a dataframe in which each ordernum is repeated once for each dictionary key in 'dict_of item_counts' whose value is not 0. I would also like a key column that shows the corresponding dictionary key for each row, as well as a value column that contains the dictionary value. Finally, I would also like an ordernum_index column that numbers the rows within each ordernum.

The final dataframe should look like this:

ordernum      ordernum_index      key     value
222           1                   a       1
222           2                   b       3
222           3                   c       2
222           4                   d       1
333           1                   a       1
333           2                   d       1

Any help would be much appreciated :)

Have you tried anything?

– Hackaholic, commented May 26, 2019 at 20:11

Hackaholic · Accepted Answer · 2019-05-26 20:47:15Z

Always try to structure your data first. It can be done easily like this:

>>> matrix
[(222, {'a': 1, 'b': 3, 'c': 2, 'd': 1}), (333, {'a': 1, 'b': 0, 'c': 0, 'd': 1})]
>>> # Filter out zero counts before enumerating, so ordernum_index has no gaps
>>> data = [[ordernum, i, key, value]
...         for ordernum, counts in matrix
...         for i, (key, value) in enumerate(
...             ((k, v) for k, v in counts.items() if v != 0), start=1)]
>>> data
[[222, 1, 'a', 1], [222, 2, 'b', 3], [222, 3, 'c', 2], [222, 4, 'd', 1], [333, 1, 'a', 1], [333, 2, 'd', 1]]
>>> pd.DataFrame(data, columns=['ordernum', 'ordernum_index', 'key', 'value'])
   ordernum  ordernum_index key  value
0       222               1   a      1
1       222               2   b      3
2       222               3   c      2
3       222               4   d      1
4       333               1   a      1
5       333               2   d      1


Jondiedoop · Accepted Answer · 2019-05-26 20:19:09Z

Expand the dictionaries into columns by using apply with pd.Series, and use concat to join the result to your other column (ordernum); the intermediate result, df2, is shown after the code. To turn every column into a row, use melt, then use query to drop all the rows where value is 0. Finally, after sorting, assign the cumcount per ordernum to get the index, adding 1 so that counting starts from 1, not 0.

df2 = pd.concat([df[['ordernum']], df['dict_of item_counts'].apply(pd.Series)], axis=1)
(df2.melt(id_vars='ordernum', var_name='key')
.query('value != 0')
.sort_values(['ordernum', 'key'])
.assign(ordernum_index = lambda df: df.groupby('ordernum').cumcount().add(1)))
#   ordernum key  value  ordernum_index
#0       222   a      1               1
#2       222   b      3               2
#4       222   c      2               3
#6       222   d      1               4
#1       333   a      1               1
#7       333   d      1               2

The intermediate df2 looks like:

#   ordernum  a  b  c  d
#0       222  1  3  2  1
#1       333  1  0  0  1
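
For anyone who wants to inspect each stage, here is a minimal self-contained sketch of the same steps with named intermediates. The names wide, long_df and nonzero are illustrative and not part of the answer above, and the final column reordering is only there to match the layout requested in the question.

```python
import pandas as pd

matrix = [(222, {'a': 1, 'b': 3, 'c': 2, 'd': 1}),
          (333, {'a': 1, 'b': 0, 'c': 0, 'd': 1})]
df = pd.DataFrame(matrix, columns=['ordernum', 'dict_of item_counts'])

# Step 1: expand each dictionary into one column per key (wide format).
wide = pd.concat([df[['ordernum']],
                  df['dict_of item_counts'].apply(pd.Series)], axis=1)

# Step 2: melt to long format, one row per (ordernum, key) pair.
long_df = wide.melt(id_vars='ordernum', var_name='key')

# Step 3: drop zero counts, then sort so numbering follows key order per order.
nonzero = long_df.query('value != 0').sort_values(['ordernum', 'key'])

# Step 4: number the remaining rows within each ordernum, starting at 1.
result = (nonzero
          .assign(ordernum_index=nonzero.groupby('ordernum').cumcount() + 1)
          .reset_index(drop=True)
          [['ordernum', 'ordernum_index', 'key', 'value']])
print(result)
```

Printing wide, long_df and nonzero in turn shows what each step contributes.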

Erfan · Accepted Answer · 2019-05-26 21:05:04Z

You can do this by unpacking your dictionaries while accessing them with iterrows, creating a tuple out of the ordernum, key and value for each item.

Finally, to create your ordernum_index, group by ordernum and do a cumcount:

data = [(r['ordernum'], k, v) for _, r in df.iterrows() for k, v in r['dict_of item_counts'].items() ]

new = pd.DataFrame(data, columns=['ordernum', 'key', 'value']).sort_values('ordernum').reset_index(drop=True)

new['ordernum_index'] = new[new['value'].ne(0)].groupby('ordernum').cumcount().add(1)
new.dropna(inplace=True)

   ordernum key  value  ordernum_index
0       222   a      1             1.0
1       222   b      3             2.0
2       222   c      2             3.0
3       222   d      1             4.0
4       333   a      1             1.0
7       333   d      1             2.0
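
The ordernum_index column above comes out as float because the rows with value 0 briefly hold NaN before dropna removes them. As a sketch of one way around that (a variation, not the code in this answer), the zero counts can be skipped inside the comprehension, so cumcount only ever sees rows that are kept. It leaves out the sort_values step, so it assumes the rows for each ordernum should keep the order in which they appear in df.

```python
import pandas as pd

matrix = [(222, {'a': 1, 'b': 3, 'c': 2, 'd': 1}),
          (333, {'a': 1, 'b': 0, 'c': 0, 'd': 1})]
df = pd.DataFrame(matrix, columns=['ordernum', 'dict_of item_counts'])

# Skip zero counts while unpacking, so no NaN placeholders are created later.
data = [(r['ordernum'], k, v)
        for _, r in df.iterrows()
        for k, v in r['dict_of item_counts'].items()
        if v != 0]

new = pd.DataFrame(data, columns=['ordernum', 'key', 'value'])

# cumcount numbers rows within each ordernum from 0; add 1 to start at 1.
new['ordernum_index'] = new.groupby('ordernum').cumcount() + 1
print(new)
```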

edited May 26, 2019 at 21:05

answered May 26, 2019 at 20:35

Keys that have a value of 0 must be excluded, as per the OP.

– Hackaholic

Andy L. · Accepted Answer · 2019-05-27 07:08:29Z

Construct dataframe df1 using df['dict_of item_counts'].tolist() for the values and df.ordernum for the index. Replace 0 with np.nan and stack with dropna=True to drop the 0 values, then call reset_index to turn the index levels into columns.

Next, create the ordernum_index column by using groupby and cumcount.

Finally, rename the columns appropriately.

df1 = pd.DataFrame(df['dict_of item_counts'].tolist(), index=df.ordernum).replace(0, np.nan).stack(dropna=True).reset_index(name='value')
df1['ordernum_index'] = df1.groupby('ordernum')['value'].cumcount() + 1
df1 = df1.rename(columns={'level_1': 'key'})

Out[732]:
   ordernum key  value  ordernum_index
0       222   a    1.0               1
1       222   b    3.0               2
2       222   c    2.0               3
3       222   d    1.0               4
4       333   a    1.0               1
5       333   d    1.0               2
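
Replacing 0 with np.nan turns the value column into floats, as the output above shows. Here is a minimal sketch of a variation that keeps integer values, assuming every dictionary has the same keys (so no NaN appears when the frame is built): stack first, then filter out the zeros explicitly. The names stacked and out are illustrative.

```python
import pandas as pd

matrix = [(222, {'a': 1, 'b': 3, 'c': 2, 'd': 1}),
          (333, {'a': 1, 'b': 0, 'c': 0, 'd': 1})]
df = pd.DataFrame(matrix, columns=['ordernum', 'dict_of item_counts'])

# One row per (ordernum, key) pair; values stay integers since no NaN is introduced.
stacked = (pd.DataFrame(df['dict_of item_counts'].tolist(), index=df['ordernum'])
             .stack()
             .rename_axis(['ordernum', 'key'])
             .reset_index(name='value'))

# Drop zero counts explicitly, then number the rows within each ordernum from 1.
out = stacked[stacked['value'] != 0].reset_index(drop=True)
out = out.assign(ordernum_index=out.groupby('ordernum').cumcount() + 1)
print(out)
```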

G.G · Accepted Answer · 2024-04-29 06:38:30Z

# Assumes df is the question's dataframe, with dict objects (not strings) in the dict column
dd1=df.set_index("ordernum")["dict_of item_counts"].apply(pd.Series).stack().reset_index().rename(columns={'level_1':"key",0:"value"}).query("value>0")
dd1.assign(ordernum_index=dd1.groupby("ordernum").key.transform('rank',method='first').astype(int))

   ordernum key  value  ordernum_index
0       222   a      1               1
1       222   b      3               2
2       222   c      2               3
3       222   d      1               4
4       333   a      1               1
7       333   d      1               2
