I have two DataFrames and I want to subset df2 based on the column names that intersect with the column names of df1. In R this is easy.

R code:

df1 <- data.frame(a=rnorm(5), b=rnorm(5))
df2 <- data.frame(a=rnorm(5), b=rnorm(5), c=rnorm(5))

df2[names(df2) %in% names(df1)]
           a          b
1 -0.8173361  0.6450052
2 -0.8046676  0.6441492
3 -0.3545996 -1.6545289
4  1.3364769 -0.4340254
5 -0.6013046  1.6118360

However, I'm not sure how to do this in pandas.

pandas attempt:

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'a': np.random.standard_normal((5,)), 'b': np.random.standard_normal((5,))})
df2 = pd.DataFrame({'a': np.random.standard_normal((5,)), 'b': np.random.standard_normal((5,)), 'c': np.random.standard_normal((5,))})

df2[df2.columns in df1.columns]

This results in TypeError: unhashable type: 'Index'. What's the right way to do this?

2 Answers

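If you need a true intersection, use the set operations that the Index object returned by .columns supports. In older pandas versions you could use & for this, e.g.

df2[df1.columns & df2.columns]

but that use of & was deprecated in the 1.x series, and since pandas 2.0 it is an element-wise logical operator rather than a set intersection. The version-independent spelling is Index.intersection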

df2[df1.columns.intersection(df2.columns)]
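
As a rough, self-contained sketch of the intersection approach: the block below also shows a boolean-mask variant built with Index.isin, which is not part of the answer above but mirrors the R `%in%` idiom more literally. The seed and the variable names (`common`, `mask`, `by_intersection`, `by_mask`) are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Seed chosen arbitrarily so the sketch is reproducible
rng = np.random.default_rng(0)
df1 = pd.DataFrame({"a": rng.standard_normal(5), "b": rng.standard_normal(5)})
df2 = pd.DataFrame({
    "a": rng.standard_normal(5),
    "b": rng.standard_normal(5),
    "c": rng.standard_normal(5),
})

# Set intersection of the two sets of column labels
common = df1.columns.intersection(df2.columns)
by_intersection = df2[common]

# Boolean mask over df2's columns, closest to R's names(df2) %in% names(df1)
mask = df2.columns.isin(df1.columns)
by_mask = df2.loc[:, mask]

print(list(by_intersection.columns))
print(list(by_mask.columns))
```

Both selections pick the same columns here. The mask variant always keeps df2's column order, whereas the intersection follows the order of the Index it is called on.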

However, if you are guaranteed that the columns of df1 are a subset of the columns of df2, you can index with them directly:

df2[df1.columns]

or, if you intend to assign to the selection,

df2.loc[:, df1.columns]
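
To illustrate the assignment case, here is a small sketch with made-up values. It assumes every column of df1 exists in df2; the frames and the fill value 0.0 are hypothetical.

```python
import pandas as pd

# Small frames with made-up values; df1's columns are a subset of df2's
df1 = pd.DataFrame({"a": [0.0, 0.0], "b": [0.0, 0.0]})
df2 = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0], "c": [5.0, 6.0]})

# Assign through .loc so the write goes to df2 itself rather than to a copy
df2.loc[:, df1.columns] = 0.0

print(df2)
```

Only the shared columns are overwritten; column c is left as it was.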

Demo

>>> df2[df1.columns.intersection(df2.columns)]
          a         b
0  1.952230 -0.641574
1  0.804606 -1.509773
2 -0.360106  0.939992
3  0.471858 -0.025248
4 -0.663493  2.031343

>>> df2.loc[:, df1.columns]
          a         b
0  1.952230 -0.641574
1  0.804606 -1.509773
2 -0.360106  0.939992
3  0.471858 -0.025248
4 -0.663493  2.031343

The pandas equivalent of the R code would be:

df2[df1.columns.intersection(df2.columns)]
Out: 
          a         b
0 -0.019703  0.379820
1  0.040658  0.243309
2  1.103032  0.066454
3 -0.921378  1.016017
4  0.188666 -0.626612

With this, you will not get a KeyError if a column in df1 does not exist in df2.
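
A short sketch of that difference, using hypothetical frames in which df1 has a column "z" that df2 lacks:

```python
import numpy as np
import pandas as pd

# "z" exists only in df1, so it cannot be looked up in df2
df1 = pd.DataFrame({"a": np.zeros(3), "z": np.zeros(3)})
df2 = pd.DataFrame({"a": np.ones(3), "b": np.ones(3), "c": np.ones(3)})

# Direct indexing fails because "z" is not a column of df2
try:
    df2[df1.columns]
    raised = False
except KeyError:
    raised = True

# The intersection silently drops labels that df2 does not have
safe = df2[df1.columns.intersection(df2.columns)]

print(raised)
print(list(safe.columns))
```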
