I am trying to concatenate two CSV files, loaded as df1b (2214, 4) and df2b (2262, 4). A large portion of the index values are shared between the two files, so I expect those rows to line up; where an index value appears in only one file, the columns from the other file should be filled with NaN. Example below:

df1b

Index Col1,  2,  3     
A .      Data in all columns     
B .      Data in all columns      
D .      Data in all columns      
E .      Data in all columns

df2b

Index, ColX, Y, Z

A .      Data in all columns     
B .      Data in all columns      
C .      Data in all columns      
E .      Data in all columns

Desired final concat:

Index, Col1, 2, 3, X, Y, Z

A .      Data in all columns
B .      Data in all columns
C .      NaN, NaN, NaN, Data, Data, Data
D .      Data, Data, Data, NaN, NaN, NaN
E .      Data in all columns

When I concatenate with df3 = pd.concat([df1b, df2b], axis=1), the result has shape (4800, 4), so concat does not seem to recognize that a large portion of the index values are the same in both files. Does anyone know why this might happen?

import pandas as pd

df = pd.read_csv('XX.csv')

# Young columns: write to CSV, then reload with Gene as the index
df1 = df[['Gene', 'Young_Q1', 'Young_Q2', 'Young_Q3']]
df1.to_csv('Young_Q.csv', index=False)
df1b = pd.read_csv('Young_Q.csv', index_col='Gene', encoding='utf-8')

# Old columns: same round trip, with OldQ_Gene as the index
df2 = df[['OldQ_Gene', 'Old_Q1', 'Old_Q2', 'Old_Q3']]
df2.to_csv('Old_Q.csv', index=False)
df2b = pd.read_csv('Old_Q.csv', index_col='OldQ_Gene', encoding='utf-8')

df3 = pd.concat([df1b, df2b], axis=1)

The result looks like this:

df3

A .  NaN, NaN, NaN,  Data, Data, Data
B .  NaN, NaN, NaN,  Data, Data, Data
D .  NaN, NaN, NaN,  Data, Data, Data
E .  NaN, NaN, NaN,  Data, Data, Data
A .  Data, Data, Data, NaN, NaN, NaN
B .  Data, Data, Data, NaN, NaN, NaN
C .  Data, Data, Data, NaN, NaN, NaN
E .  Data, Data, Data, NaN, NaN, NaN

1 Answer

You could use merging:

df3 = df1b.merge(df2b, on='Gene', how='outer')

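For this to work, Gene needs to be a regular column rather than the index, and it must have the same name in both DataFrames (so OldQ_Gene would have to be renamed to Gene).

More information here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html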
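
As a rough sketch of the merge approach, here is a small self-contained example. The two toy frames stand in for df1b and df2b and their values are made up; only the column and index names come from the question. It assumes the gene labels are unique within each frame and match exactly (same spelling, no stray whitespace), which may not hold for your real files.

```python
import pandas as pd

# Toy stand-ins for df1b and df2b (values are invented for illustration).
df1b = pd.DataFrame(
    {"Young_Q1": [1.0, 2.0, 3.0, 4.0],
     "Young_Q2": [1.1, 2.1, 3.1, 4.1],
     "Young_Q3": [1.2, 2.2, 3.2, 4.2]},
    index=pd.Index(["A", "B", "D", "E"], name="Gene"),
)
df2b = pd.DataFrame(
    {"Old_Q1": [5.0, 6.0, 7.0, 8.0],
     "Old_Q2": [5.1, 6.1, 7.1, 8.1],
     "Old_Q3": [5.2, 6.2, 7.2, 8.2]},
    index=pd.Index(["A", "B", "C", "E"], name="OldQ_Gene"),
)

# Turn both indexes into ordinary columns and give the key the same name.
left = df1b.reset_index()
right = df2b.reset_index().rename(columns={"OldQ_Gene": "Gene"})

# Sanity check: how many gene labels actually appear in both frames?
shared = int(left["Gene"].isin(right["Gene"]).sum())

# Outer merge keeps every gene; columns from the missing side become NaN.
df3 = (
    left.merge(right, on="Gene", how="outer")
        .sort_values("Gene")
        .set_index("Gene")
)
print(df3)
```

If the shared count comes out far lower than you expect on the real data, the labels in the two files probably do not match exactly, and that would be worth checking before merging.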
