How to combine multiple rows into a single row with pandas [duplicate]

Question

I need to combine multiple rows into a single row by concatenating their values with a space.

View of my dataframe:

            tempx  value
    0    picture1    1.5
    1  picture555    1.5
    2  picture255    1.5
    3  picture365    1.5
    4  picture112    1.5

I want the dataframe converted so that the tempx values end up space-separated in a single row.

Expected output:

                                                      tempx  value
    0  picture1 picture555 picture255 picture365 picture112    1.5

or, as a Python dict:

    {1.5: 'picture1 picture555 picture255 picture365 picture112'}

What I have tried:

    df_test['tempx'] = df_test['tempx'].str.cat(sep=' ')

This runs, but it writes the full concatenated string into every row instead of collapsing the rows into one:

                                                      tempx  value
    0  picture1 picture555 picture255 picture365 picture112    1.5
    1  picture1 picture555 picture255 picture365 picture112    1.5
    2  picture1 picture555 picture255 picture365 picture112    1.5
    3  picture1 picture555 picture255 picture365 picture112    1.5
    4  picture1 picture555 picture255 picture365 picture112    1.5

Is there any elegant solution?

Also, is there a way to combine rows conditionally, based on the value column?
– Sandeep Raikar, Commented Apr 3, 2016 at 23:56
What is your expected output? Can you edit an example into your question? Do you want to "group by" the value column, so that you join the picture names within each value?
– Marius, Commented Apr 4, 2016 at 0:48
I have applied groupby using pandas. The next step I would like to do is to have a single row for each value. Please check the expected output.
– Sandeep Raikar, Commented Apr 4, 2016 at 2:31

jezrael · Accepted Answer · 2016-04-04 05:07:56Z

87

You can use groupby and apply the join function to each group:

print(df.groupby('value')['tempx'].apply(' '.join).reset_index())
   value                                              tempx
0    1.5  picture1 picture555 picture255 picture365 pict...
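
For reference, here is a minimal self-contained sketch of the same approach, assuming a DataFrame rebuilt from the sample data in the question. The variable names `combined` and `as_dict` are only illustrative, and the `to_dict()` step is one possible way to get the dict form the question mentions, not something taken from the answer itself.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to match the asker's frame).
df = pd.DataFrame({
    "tempx": ["picture1", "picture555", "picture255", "picture365", "picture112"],
    "value": [1.5, 1.5, 1.5, 1.5, 1.5],
})

# One row per distinct `value`, with the tempx strings joined by a space.
combined = df.groupby("value")["tempx"].apply(" ".join).reset_index()
print(combined)

# Dict form: keys are the distinct values, entries are the joined strings.
as_dict = df.groupby("value")["tempx"].agg(" ".join).to_dict()
print(as_dict)
```

Because the grouping key is the value column, a frame with several distinct values would produce one combined row per value.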


7 Comments

user8560167 Over a year ago

@jezrael Hi, is there a way to merge more than one column? Besides tempx, I also want to merge other columns. I am trying df.groupby('value')['tempx','second_column','third_column'].apply(' '.join).reset_index() but I only get the grouped column names back.

jezrael Over a year ago

@sygneto - Use df.groupby('value')['tempx','second_column','third_column'].agg(' '.join).reset_index()

user8560167 Over a year ago

Thank you, I forgot about .agg again ^^. Good to have you here.

Ivo Over a year ago

For me, the call with multiple columns raises

FutureWarning: Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead.

and does not include the second column. Am I doing something wrong?

jezrael Over a year ago

@Ivo Pass the columns as a list (double brackets), like df.groupby('value')[['tempx','second_column','third_column']].agg(' '.join).reset_index()
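
A short sketch of the multi-column case discussed in these comments, assuming every selected column holds strings (`' '.join` raises a TypeError on non-string values). The sample rows and the name `merged` are made up for illustration; only the column names come from the comments.

```python
import pandas as pd

# Hypothetical data: two string columns to merge, grouped by `value`.
df = pd.DataFrame({
    "value": [1.5, 1.5, 2.0],
    "tempx": ["picture1", "picture555", "picture9"],
    "second_column": ["a", "b", "c"],
})

# Select the columns with a list (double brackets), then join each
# column's strings within every group.
merged = (
    df.groupby("value")[["tempx", "second_column"]]
    .agg(" ".join)
    .reset_index()
)
print(merged)
```

If some columns are numeric, converting them first with `astype(str)` is one way to make the join applicable.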
