Merge multiple rows in Pandas

Question

I have a dataset with 5 rows that I wish to merge into one, so that I can use the result as unique column identifiers. For example:

Name          Unique No.  Summary  Nominal Voltage  Nominal Voltage  Upstream    Upstream
NaN           NaN         Class    Upstream         Downstream       Constraint  Oppurtunity
(non unique)  NaN         NaN      NaN              NaN              Physical    Nan

I would like the columns to be named:

Name (non unique)
Unique No.
Summary Class
Nominal Voltage Upstream
Nominal Voltage Downstream
Upstream Constraint Physical
Upstream Oppurtunity

So the rows (there are actually 5) would be merged, ignoring NaNs, into a single row that I could then use as unique column names.

Thanks in advance.

As far as I can understand, groupby requires something common between the things being grouped, so can't be used here? The whole database is currently of string type because I thought that would make it easier to join them, but I couldn't figure out a way.

I may be misreading or misunderstanding the documentation, but I didn't think that merge, join, or concat could do what is required here. They seem to join dataframes, rather than taking the contents of multiple rows and returning them as one row.
– J.Komodo, Commented Mar 28, 2017 at 13:49

jezrael · Accepted Answer · 2017-03-28 13:26:05Z

I think you need apply with dropna:

df.columns = df.apply(lambda x: ' '.join([x.name] + x.dropna().tolist()))

print (df.columns.tolist())

['Name (non unique)',
 'Unique No.',
 'Summary Class',
 'Nominal Voltage Upstream',
 'Nominal Voltage Downstream',
 'Upstream Constraint Physical',
 'Upstream Oppurtunity Nan']
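
For anyone who wants to reproduce this, here is a minimal sketch. The frame is an assumption rebuilt from the sample in the question (the header plus two of the five rows), and new_names is only an illustrative variable name.

```python
import numpy as np
import pandas as pd

# Sample rebuilt from the question; the last cell is the literal string 'Nan'.
header = ['Name', 'Unique No.', 'Summary', 'Nominal Voltage',
          'Nominal Voltage', 'Upstream', 'Upstream']
rows = [
    [np.nan, np.nan, 'Class', 'Upstream', 'Downstream', 'Constraint', 'Oppurtunity'],
    ['(non unique)', np.nan, np.nan, np.nan, np.nan, 'Physical', 'Nan'],
]
df = pd.DataFrame(rows, columns=header)

# For each column, join its current name with the non-missing cells below it.
new_names = df.apply(lambda x: ' '.join([x.name] + x.dropna().tolist()))
df.columns = new_names.tolist()

print(df.columns.tolist())
```

Because the last cell holds the text 'Nan' rather than a real missing value, dropna() keeps it, which is why it ends up in the final column name.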

If some cells contain the string 'Nan' instead of a real missing value, replace them first:

import numpy as np

# Parentheses are needed so the chained call can span two lines.
df.columns = (df.replace('Nan', np.nan)
                .apply(lambda x: ' '.join([x.name] + x.dropna().tolist())))
print (df.columns.tolist())
['Name (non unique)',
 'Unique No.',
 'Summary Class',
 'Nominal Voltage Upstream',
 'Nominal Voltage Downstream',
 'Upstream Constraint Physical',
 'Upstream Oppurtunity']
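
If the header rows sit on top of real data, the same idea could be wrapped in a small function that also strips those rows afterwards. This is only a sketch: flatten_header_rows, its parameters, and the data row in the usage part are hypothetical and not from the question.

```python
import numpy as np
import pandas as pd


def flatten_header_rows(df, n_rows, placeholders=('Nan',)):
    """Fold the first n_rows of df into its column names (hypothetical helper)."""
    # Turn placeholder strings into real missing values so dropna() skips them.
    head = df.iloc[:n_rows].replace(list(placeholders), np.nan)
    # Walk columns by position so duplicate names such as 'Upstream' stay separate.
    names = [
        ' '.join([str(name)] + head.iloc[:, i].dropna().astype(str).tolist())
        for i, name in enumerate(df.columns)
    ]
    # Keep only the rows below the header block and attach the merged names.
    body = df.iloc[n_rows:].reset_index(drop=True)
    body.columns = names
    return body


header = ['Name', 'Unique No.', 'Summary', 'Nominal Voltage',
          'Nominal Voltage', 'Upstream', 'Upstream']
rows = [
    [np.nan, np.nan, 'Class', 'Upstream', 'Downstream', 'Constraint', 'Oppurtunity'],
    ['(non unique)', np.nan, np.nan, np.nan, np.nan, 'Physical', 'Nan'],
    # Made-up data row, only here to show that the header rows are removed.
    ['Feeder A', '001', 'ok', '11', '0.4', 'none', 'none'],
]
df = pd.DataFrame(rows, columns=header)

result = flatten_header_rows(df, n_rows=2)
print(result.columns.tolist())
print(result.shape)
```

For the dataset in the question, n_rows would presumably be 5 rather than 2.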

But if you only need unique column names, the simplest way is:

df.columns = range(len(df.columns))
print (df.columns.tolist())
[0, 1, 2, 3, 4, 5, 6]

Or assign new unique column names yourself:

df.columns = list('abcdefg')
print (df.columns.tolist())
['a', 'b', 'c', 'd', 'e', 'f', 'g']
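
If the descriptive titles are kept instead, it may be worth confirming that the merged names really are unique before relying on them for lookups. A small sketch using Index.is_unique and Index.duplicated, with the names copied in by hand so the snippet runs on its own (the variable names are illustrative):

```python
import pandas as pd

# Merged names from the answer, copied here so the snippet is self-contained.
names = pd.Index([
    'Name (non unique)', 'Unique No.', 'Summary Class',
    'Nominal Voltage Upstream', 'Nominal Voltage Downstream',
    'Upstream Constraint Physical', 'Upstream Oppurtunity',
])

# Original header from the question, before the rows were merged in.
raw = pd.Index(['Name', 'Unique No.', 'Summary', 'Nominal Voltage',
                'Nominal Voltage', 'Upstream', 'Upstream'])

print(names.is_unique)   # merged names
print(raw.is_unique)     # original header

# duplicated() flags every repeat after the first occurrence of a label.
dupes = raw[raw.duplicated()].tolist()
print(dupes)
```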

edited Mar 28, 2017 at 13:26

answered Mar 28, 2017 at 13:07

1 Comment

J.Komodo Over a year ago

Thank you, so apply is the way! (I appreciate columns a-z etc would be easier but I need the titles for the later code to check and identify, as the columns aren't always in the same order)
