I have N dataframes:

df1:
time  data
1.0   a1
2.0   b1
3.0   c1


df2:
time  data
1.0   a2
2.0   b2
3.0   c2



df3:
time  data
1.0   a3
2.0   b3
3.0   c3

I want to merge all of them on id, thus getting

time  data1    data2    data3
1.0   a1       a2       a3
2.0   b1       b2       b3
3.0   c1       c2       c3

I can guarantee that all the ids are the same in all dataframes.

How can I do this in pandas?

1 Answer

One idea is to use concat on a list of DataFrames; the only requirement is to set id as the index of each DataFrame. To avoid duplicated column names, the keys parameter is added, but it creates a MultiIndex in the output columns, so map with format is used to flatten it:

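import pandas as pd

dfs = [df1, df2, df3]
# Use id as the index so concat aligns rows on it
dfs = [x.set_index('id') for x in dfs]

# keys=1..N labels each frame, giving MultiIndex columns like (1, 'data')
df = pd.concat(dfs, axis=1, keys=range(1, len(dfs) + 1))
# Flatten each (key, name) tuple into name + key, e.g. 'data1'
df.columns = df.columns.map('{0[1]}{0[0]}'.format)
df = df.reset_index()
print(df)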
   id data1 data2 data3
0   1    a1    a2    a3
1   2    b1    b2    b3
2   3    c1    c2    c3
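
To make the '{0[1]}{0[0]}' part more concrete, here is a minimal sketch. It assumes the sample frames have an id column with values 1, 2, 3, as in the output above; the names wide, multi_labels and flat_labels are only illustrative. With keys, every column label becomes a tuple (key, original name). In the format string, 0 refers to that tuple, so {0[1]} is the column name and {0[0]} is the key.

```python
import pandas as pd

# Sample frames rebuilt from the question (id values 1, 2, 3 are assumed).
df1 = pd.DataFrame({'id': [1, 2, 3], 'data': ['a1', 'b1', 'c1']})
df2 = pd.DataFrame({'id': [1, 2, 3], 'data': ['a2', 'b2', 'c2']})
df3 = pd.DataFrame({'id': [1, 2, 3], 'data': ['a3', 'b3', 'c3']})

dfs = [x.set_index('id') for x in (df1, df2, df3)]
wide = pd.concat(dfs, axis=1, keys=range(1, len(dfs) + 1))

# With keys, each column label is a tuple: (key, original column name).
multi_labels = list(wide.columns)

# '{0[1]}{0[0]}'.format(t) writes t[1] (the name) followed by t[0] (the key).
flat_labels = ['{0[1]}{0[0]}'.format(t) for t in multi_labels]

print(multi_labels)
print(flat_labels)
```

Passing the bound method '{0[1]}{0[0]}'.format to columns.map applies the same call to every label tuple.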

7 Comments

Can you explain, please? What is the '{0[1]}{0[0]}' part? I would also like to understand the rest.
On pd.concat(...) I get: ValueError: Shape of passed values is (9, 92994), indices imply (9, 89954). All of my 9 dfs are of length 10000. What did I do wrong?
@Gulzar - there are duplicates in id values
Well, in that case my question doesn't fit my data. Should I ask a new one or edit this one? My 'id' is actually 'timestamp', which holds float values. I guess some tolerance is needed.
@Gulzar - I can edit the answer, but what should the output be if df3 is like {'id': {0: 1, 1: 3, 2: 3}, 'data': {0: 'a3', 1: 'b3', 2: 'c3'}} ?
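
A quick way to check whether a frame has repeated id values before calling concat is sketched below, using the df3 from the comment above. The variable names has_duplicates and dupes are only illustrative, and this just detects the problem; it does not decide how duplicates should be merged.

```python
import pandas as pd

# df3 exactly as given in the comment above: id 3 appears twice.
df3 = pd.DataFrame({'id': {0: 1, 1: 3, 2: 3},
                    'data': {0: 'a3', 1: 'b3', 2: 'c3'}})

# True if any id value occurs more than once.
has_duplicates = df3['id'].duplicated().any()

# keep=False marks every occurrence of a repeated id, not only the later ones.
dupes = df3[df3['id'].duplicated(keep=False)]

print(has_duplicates)
print(dupes)
```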