I have the following data:

  board_href_deals       items  test1
0            test2  {'x': 'a'}  test1
1            test2  {'x': 'b'}  test2

After grouping by "board_href_deals", I would like to collect the values of the remaining columns into lists, as follows:

 board_href_deals                     items     test1
0            test2  [{'x': 'a'}, {'x': 'b'}]    ['test1', 'test2']

thank you

How does df.groupby('board_href_deals').agg(list) work for you? Commented Aug 29, 2018 at 10:43

2 Answers

Use DataFrameGroupBy.agg, tested in pandas 0.23.4:

df = df.groupby('board_href_deals', as_index=False).agg(list)
print(df)
  board_href_deals                     items           test1
0            test2  [{'x': 'a'}, {'x': 'b'}]  [test1, test2]
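
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample frame from the question by hand and assumes a reasonably recent pandas in which `agg(list)` is accepted; the variable name `result` is only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "board_href_deals": ["test2", "test2"],
    "items": [{"x": "a"}, {"x": "b"}],
    "test1": ["test1", "test2"],
})

# Collect every non-key column into one list per group.
# as_index=False keeps the grouping key as a regular column.
result = df.groupby("board_href_deals", as_index=False).agg(list)

print(result.loc[0, "items"])
print(result.loc[0, "test1"])
```

Each cell in the non-key columns now holds a plain Python list, in the original row order within the group.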

Thanks to @jpp for a solution that works on older pandas versions:

df = df.groupby('board_href_deals').agg(lambda x: list(x))

An alternative solution, especially on older versions of Pandas, is to use GroupBy + apply on a sequence, then combine via concat.

Benchmarked on Python 3.6.0 / Pandas 0.19.2. This contrived example has a small number of groups; you should test with your own data if efficiency is a concern.

import pandas as pd

df = pd.DataFrame({'A': ['test2', 'test2', 'test4', 'test4'],
                   'B': [{'x': 'a'}, {'x': 'b'}, {'y': 'a'}, {'y': 'b'}],
                   'C': ['test1', 'test2', 'test3', 'test4']})

df = pd.concat([df]*10000)

def jpp(df):
    g = df.groupby('A')
    L = [g[col].apply(list) for col in ['B', 'C']]
    return pd.concat(L, axis=1).reset_index()

%timeit jpp(df)                                 # 11.3 ms per loop
%timeit df.groupby('A').agg(lambda x: list(x))  # 20.5 ms per loop
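
`%timeit` only works inside IPython/Jupyter. As a rough sketch for a plain script, the same comparison can be run with the standard-library `timeit` module. The function names `via_concat` and `via_agg` are made up for this example, the frame is smaller than above to keep the run short, and no timings are quoted because they depend on your machine and pandas version.

```python
import timeit

import pandas as pd

df = pd.DataFrame({'A': ['test2', 'test2', 'test4', 'test4'],
                   'B': [{'x': 'a'}, {'x': 'b'}, {'y': 'a'}, {'y': 'b'}],
                   'C': ['test1', 'test2', 'test3', 'test4']})
df = pd.concat([df] * 1000, ignore_index=True)


def via_concat(df):
    # One groupby, then build a list per group for each column separately.
    g = df.groupby('A')
    parts = [g[col].apply(list) for col in ['B', 'C']]
    return pd.concat(parts, axis=1).reset_index()


def via_agg(df):
    # Single agg call with a lambda that turns each group into a list.
    return df.groupby('A').agg(lambda x: list(x)).reset_index()


# Sanity check: both approaches should produce the same lists.
assert via_concat(df).to_dict('list') == via_agg(df).to_dict('list')

for fn in (via_concat, via_agg):
    seconds = timeit.timeit(lambda: fn(df), number=5)
    print(f"{fn.__name__}: {seconds / 5 * 1000:.1f} ms per call")
```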

5 Comments

Be careful, there are only 4 big groups, so it is faster. I think with more groups your solution should be slower.
@jezrael, Yep, as always users should test on their own data. It may be better depending on the ratio of the number of groups to the items per group.
I'll add a comment. We don't have enough data to make a judgement one way or the other. Best to show all solutions :)
Sure, so if you add a solution for more groups, that would be really good ;)
@jezrael, Disagree, that's not necessary. Different solutions can cater for different edge cases. It's common practice on SO. It's also one reason we don't close down questions once they're answered!
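
To check the point raised in these comments on your own machine, the two approaches can be timed on frames with few large groups and with many small groups. This is only a sketch: `make_frame`, `via_concat` and `via_agg` are hypothetical helper names, the data is synthetic, and which approach wins is deliberately left for you to measure.

```python
import timeit

import pandas as pd


def make_frame(n_groups, rows_per_group):
    # Synthetic data: group key 'A', dict column 'B', string column 'C'.
    n = n_groups * rows_per_group
    return pd.DataFrame({
        'A': [f"g{i % n_groups}" for i in range(n)],
        'B': [{'x': i} for i in range(n)],
        'C': [f"c{i}" for i in range(n)],
    })


def via_concat(df):
    # Per-column apply(list), combined with concat.
    g = df.groupby('A')
    parts = [g[col].apply(list) for col in ['B', 'C']]
    return pd.concat(parts, axis=1).reset_index()


def via_agg(df):
    # Single agg call with a list-building lambda.
    return df.groupby('A').agg(lambda x: list(x)).reset_index()


# Same number of rows, different group structure.
for n_groups, rows_per_group in [(4, 5000), (5000, 4)]:
    frame = make_frame(n_groups, rows_per_group)
    for fn in (via_concat, via_agg):
        seconds = timeit.timeit(lambda: fn(frame), number=3)
        print(f"{n_groups} groups, {fn.__name__}: "
              f"{seconds / 3 * 1000:.1f} ms per call")
```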
