Grouping header column from specific key values in list of dicts to pandas dataframe

Question

Hi, I would like to group the column headers of a pivoted dataframe by a specific key value, starting from a list of dicts:

  my_lists = [
      {'rank': 2, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B01MG0ORBL', 'parent': 'parent1'},
      {'rank': 18, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B0735C9RDZ', 'parent': 'parent1'},
      {'rank': 21, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B07FPVR858', 'parent': 'parent2'},
      # 'volume' was left blank in the original post; None marks the missing value
      {'rank': 126, 'keyword_name': 'mens wallet', 'volume': None, 'asin': 'B01MG0ORBL', 'parent': 'parent2'},
      {'rank': 128, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B0735C9RDZ', 'parent': 'parent2'},
      {'rank': 136, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B07FPVR858', 'parent': 'parent2'},
      # 'volume' was left blank in the original post; None marks the missing value
      {'rank': 19, 'keyword_name': 'leather wallets', 'volume': None, 'asin': 'B0735C9RDZ', 'parent': 'parent2'},
      {'rank': 10, 'keyword_name': 'wallets for men', 'volume': 566, 'asin': 'B07FPVR858', 'parent': 'parent2'},
      {'rank': 16, 'keyword_name': 'wallets for men', 'volume': 566, 'asin': 'B0735C9RDZ', 'parent': 'parent2'},
  ]

I create the dataframe and pivot it, collecting the ranks into lists:

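  import pandas as pd

  df = pd.DataFrame(my_lists)

  # One row per (keyword_name, volume), one column per asin, ranks collected into lists
  df = df.pivot_table(index=['keyword_name', 'volume'],
                      columns='asin',
                      values='rank',
                      aggfunc=list)
  print(df)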
   asin                   B01MG0ORBL B0735C9RDZ B07FPVR858
   keyword_name    volume                                 
   leather wallets 23            NaN       [19]        NaN
   mens wallet     456677   [2, 126]  [18, 128]  [21, 136]
   wallets for men 566           NaN       [16]       [10]

But what I need is to group the header columns by the value of the 'parent' key, since the asin in each dict belongs to that parent, like this:

                                  parent1             parent2             parent3
   asin                   B01MG0ORBL B0735C9RDZ B07FPVR858 xxxxxxxxx  xxxxx   xxxx  xxx
   keyword_name    volume                                 
   leather wallets 23            NaN       [19]        NaN
   mens wallet     456677   [2, 126]  [18, 128]  [21, 136]
   wallets for men 566           NaN       [16]       [10]

Example desired output

Any ideas?

deponovo · Accepted Answer · 2022-06-21 15:46:41Z


The problem description and the expected result you present do not quite match each other. If I understood correctly, you need MultiIndex columns?

Then simply pass both keys to columns (note that parent comes first in the list, so it becomes the outer column level):

df2 = df.pivot_table(index=['keyword_name', 'volume'],
                     columns=['parent', 'asin'],
                     values='rank',
                     aggfunc=list)

print(df2.columns)

MultiIndex([('parent1', 'B01MG0ORBL'),
            ('parent1', 'B0735C9RDZ'),
            ('parent2', 'B0735C9RDZ'),
            ('parent2', 'B07FPVR858')],
           names=['parent', 'asin'])
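For reference, here is a self-contained sketch of the same call. It assumes the question's data with quoting fixed and with the two blank volume entries represented as None (that choice is an assumption, since the post does not give those values). With that input the column index matches the one printed above, and a single parent's block can be selected by its outer label.

```python
import pandas as pd

# Data from the question; the blank 'volume' entries are assumed to be None.
my_lists = [
    {'rank': 2, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B01MG0ORBL', 'parent': 'parent1'},
    {'rank': 18, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B0735C9RDZ', 'parent': 'parent1'},
    {'rank': 21, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B07FPVR858', 'parent': 'parent2'},
    {'rank': 126, 'keyword_name': 'mens wallet', 'volume': None, 'asin': 'B01MG0ORBL', 'parent': 'parent2'},
    {'rank': 128, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B0735C9RDZ', 'parent': 'parent2'},
    {'rank': 136, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B07FPVR858', 'parent': 'parent2'},
    {'rank': 19, 'keyword_name': 'leather wallets', 'volume': None, 'asin': 'B0735C9RDZ', 'parent': 'parent2'},
    {'rank': 10, 'keyword_name': 'wallets for men', 'volume': 566, 'asin': 'B07FPVR858', 'parent': 'parent2'},
    {'rank': 16, 'keyword_name': 'wallets for men', 'volume': 566, 'asin': 'B0735C9RDZ', 'parent': 'parent2'},
]

df = pd.DataFrame(my_lists)

# 'parent' listed first becomes the outer column level, 'asin' the inner one.
df2 = df.pivot_table(index=['keyword_name', 'volume'],
                     columns=['parent', 'asin'],
                     values='rank',
                     aggfunc=list)

print(df2.columns)

# Selecting by the outer label returns only that parent's asin columns.
parent1_block = df2['parent1']
print(parent1_block)
```

Indexing with `df2['parent2']` works the same way for the other group.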

PS. I could not reproduce your output with the shared data.
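One possible reason for the mismatch (a guess, not something confirmed in the post): if the blank volume entries are loaded as missing values, pivot_table with its default settings leaves those rows out, because volume is one of the grouping keys. The sketch below uses a reduced, hypothetical three-row sample to show the effect.

```python
import pandas as pd

# Hypothetical reduced sample: two of the three rows have no volume.
rows = [
    {'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B01MG0ORBL', 'rank': 2},
    {'keyword_name': 'mens wallet', 'volume': None, 'asin': 'B01MG0ORBL', 'rank': 126},
    {'keyword_name': 'leather wallets', 'volume': None, 'asin': 'B0735C9RDZ', 'rank': 19},
]
demo = pd.DataFrame(rows)

# volume is part of the index, so rows where it is NaN are not grouped.
pivoted = demo.pivot_table(index=['keyword_name', 'volume'],
                           columns='asin',
                           values='rank',
                           aggfunc=list)
print(pivoted)
```

Only the row with a volume survives here, so ranks 126 and 19 never reach the result.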
