Combining multiple data types in pandas DataFrame

Question

I want to combine columns in a dataframe depending on whether the data is numeric or not, for example:

import pandas as pd
import numpy as np

x = {'a':[1,2], 'b':['foo','bar'],'c':[np.pi,np.e]}
y = pd.DataFrame.from_dict(x)
y.apply(lambda x: x.sum() if x.dtype in (np.int64,np.float64) else x.min())

This gives the desired output, but it seems like there should be a nicer way to write the last line. Is there a simple way to check whether the data is a numpy scalar (numeric) type, instead of checking whether the dtype is in a hand-written list of numpy dtypes?

Andy Hayden · Accepted Answer · 2014-03-28 19:38:57Z

2

Rather than use apply here, I would check whether each column is numeric with a simple list comprehension, handle the numeric and non-numeric columns separately, and then concat the results back together. This will be more efficient for larger frames.

In [11]: numeric = np.array([dtype in [np.int64, np.float64] for dtype in y.dtypes])

In [12]: numeric
Out[12]: array([True, False, True])

There is also an is_numeric_dtype function; in current pandas versions it is exposed as pandas.api.types.is_numeric_dtype.
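As a rough sketch of the same mask built without a hard-coded dtype list, assuming a pandas version that exposes pandas.api.types.is_numeric_dtype (the frame y is rebuilt here so the snippet runs on its own):

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype

y = pd.DataFrame({'a': [1, 2], 'b': ['foo', 'bar'], 'c': [np.pi, np.e]})

# One boolean per column: True where the column's dtype is numeric.
numeric = np.array([is_numeric_dtype(dtype) for dtype in y.dtypes])
print(numeric)
```

The resulting array can be used with y.iloc[:, numeric] and y.iloc[:, ~numeric] in the same way as the mask above.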

In [13]: y.iloc[:, numeric].sum()
Out[13]: 
a    3.000000
c    5.859874
dtype: float64

In [14]: y.iloc[:, ~numeric].min()
Out[14]: 
b    bar
dtype: object

Now you can concat these and potentially reindex:

In [15]: pd.concat([y.iloc[:, numeric].sum(), y.iloc[:, ~numeric].min()]).reindex(y.columns)
Out[15]: 
a           3
b         bar
c    5.859874
dtype: object
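The split, aggregate, and concat steps above could be wrapped in a small helper. This is only a sketch: combine_columns is a hypothetical name, and it assumes DataFrame.select_dtypes with "number" is an acceptable stand-in for the boolean mask used in the answer.

```python
import numpy as np
import pandas as pd


def combine_columns(df):
    """Sum the numeric columns, take the min of the rest, keep column order."""
    numeric_part = df.select_dtypes(include="number").sum()
    other_part = df.select_dtypes(exclude="number").min()
    # Reindex so the result follows the original column order.
    return pd.concat([numeric_part, other_part]).reindex(df.columns)


y = pd.DataFrame({'a': [1, 2], 'b': ['foo', 'bar'], 'c': [np.pi, np.e]})
result = combine_columns(y)
print(result)
```

Because the result mixes numbers and strings, it comes back as an object-dtype Series, as in Out[15].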


2 Comments

Jeff Over a year ago

df._get_numeric_data()

Michael K Over a year ago

Thanks, both of you. That private method really does the trick.

Lee · Accepted Answer · 2014-03-28 19:18:37Z

2

You could use isscalar:

y.apply(lambda x: x.sum() if np.isscalar(x) else x.min())
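One caveat worth checking before relying on this: apply passes each column to the lambda as a Series, and np.isscalar returns False for a Series, so the line above would take the min branch for every column. A possible variant, sketched here with pandas.api.types.is_numeric_dtype (not part of the original answer), branches on the column's dtype instead:

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype

y = pd.DataFrame({'a': [1, 2], 'b': ['foo', 'bar'], 'c': [np.pi, np.e]})

# Each column is a Series, so np.isscalar reports False for all of them.
scalar_flags = [np.isscalar(y[col]) for col in y.columns]
print(scalar_flags)

# Branch on the dtype of the column rather than on the column object itself.
result = y.apply(lambda col: col.sum() if is_numeric_dtype(col) else col.min())
print(result)
```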
