pandas merge and group concat

Question

I want to do something like GROUP BY / GROUP_CONCAT in MySQL using pandas. Let's say I have:

table_a

col_a col_b
A     1
B     2
C     2

table_b

col_a col_c
A     VALUE_1
A     VALUE_2
B     VALUE_3
C     VALUE_4

I want a new table_c as follows:

col_a col_b col_c
A     1      VALUE_1, VALUE_2
B     2      VALUE_3
C     2      VALUE_4

I've been using pd.merge but I cannot find a way to do the concatenation and avoid duplicates.

BENY · Accepted Answer · 2018-12-13 16:27:42Z

5

Alternatively, use agg after the merge:

df1.merge(df2).groupby('col_a',as_index=False).agg({'col_b':'first','col_c':','.join})
Out[46]: 
  col_a  col_b            col_c
0     A      1  VALUE_1,VALUE_2
1     B      2          VALUE_3
2     C      2          VALUE_4
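
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The frames df1 and df2 are built from the question's table_a and table_b. Passing on='col_a' explicitly and using ', ' as the separator (to match the spacing in the requested table_c) are my assumptions, not part of the one-liner above.

```python
import pandas as pd

# Sample data copied from the question (table_a and table_b).
df1 = pd.DataFrame({"col_a": ["A", "B", "C"], "col_b": [1, 2, 2]})
df2 = pd.DataFrame({
    "col_a": ["A", "A", "B", "C"],
    "col_c": ["VALUE_1", "VALUE_2", "VALUE_3", "VALUE_4"],
})

# Merge first (this duplicates the 'A' row), then collapse back to one row per key:
# keep the first col_b in each group and join the col_c strings.
table_c = (
    df1.merge(df2, on="col_a")
       .groupby("col_a", as_index=False)
       .agg({"col_b": "first", "col_c": ", ".join})
)
print(table_c)
```

Note that 'first' is only safe here because col_b is constant within each col_a group after the merge.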


1 Comment

user1532587 Over a year ago

I'll select this just because I like the fact that I can use 'first'

ALollz · Answer · 2018-12-13 16:25:14Z

5

groupby before merge, ensuring 'col_a' is unique in the right Frame:

df1.merge(df2.groupby('col_a').col_c.apply(', '.join).reset_index())

  col_a  col_b             col_c
0     A      1  VALUE_1, VALUE_2
1     B      2           VALUE_3
2     C      2           VALUE_4
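
A runnable sketch of the same idea, assuming df1 and df2 hold the question's table_a and table_b. The how='left' argument is an addition of mine: it should keep rows of df1 that have no match in df2 (with NaN in col_c), whereas the default inner merge would drop them. The intermediate name joined is hypothetical.

```python
import pandas as pd

# Sample data copied from the question (table_a and table_b).
df1 = pd.DataFrame({"col_a": ["A", "B", "C"], "col_b": [1, 2, 2]})
df2 = pd.DataFrame({
    "col_a": ["A", "A", "B", "C"],
    "col_c": ["VALUE_1", "VALUE_2", "VALUE_3", "VALUE_4"],
})

# Collapse the right frame to one row per key before merging,
# so the merge cannot multiply rows of df1.
joined = df2.groupby("col_a")["col_c"].agg(", ".join).reset_index()

# how="left" is an assumption: it preserves unmatched keys from df1.
table_c = df1.merge(joined, on="col_a", how="left")
print(table_c)
```

Aggregating first also means the other columns of df1 need no aggregation rule at all, which can be convenient when df1 has many columns.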
