How to get pandas.DataFrame columns containing specific dtype

Question

I'm using df.columns.values to make a list of column names which I then iterate over and make charts, etc... but when I set this up I overlooked the non-numeric columns in the df. Now, I'd much rather not simply drop those columns from the df (or a copy of it). Instead, I would like to find a slick way to eliminate them from the list of column names.

Now I have:

names = df.columns.values

what I'd like to get to is something that behaves like:

names = df.columns.values(column_type=float64)

Is there any slick way to do this? I suppose I could make a copy of the df, and drop those non-numeric columns before doing columns.values, but that strikes me as clunky.

Welcome any inputs/suggestions. Thanks.

stackoverflow.com/questions/25039626/…

– Gusev Slava, Commented Feb 8, 2019 at 10:05

Woody Pride · Accepted Answer · 2014-07-23 05:09:47Z

25

Someone may well give you a better answer than this, but if all my numeric data are int64 or float64, one thing I tend to do is look up the column data types and use them to build the list of columns.

For example, in a dataframe with columns of type float64, int64 and object, you can first look at the data types like so:

DF.dtypes

and if they conform to the standard whereby the non-numeric columns of data are all object types (as they are in my dataframes), then you can do the following to get a list of the numeric columns:

[col for col, dtype in DF.dtypes.items() if dtype in ('float64', 'int64')]

It's just a simple list comprehension. Nothing fancy. Again, though, whether this works for you will depend on how you set up your dataframe...
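As a rough sketch of how this behaves on a toy frame (the column names and values below are invented for illustration), the comprehension can be run next to a check based on pandas.api.types.is_numeric_dtype. That helper is a swapped-in alternative to the hard-coded dtype list, not something the answer itself uses:

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical toy data: two numeric columns and one string column.
DF = pd.DataFrame({
    "price": [1.5, 2.5, 3.0],
    "count": [1, 2, 3],
    "label": ["a", "b", "c"],
})

# Keep the columns whose dtype is exactly float64 or int64.
by_name = [col for col, dtype in DF.dtypes.items() if dtype in ("float64", "int64")]

# Alternative: let pandas decide what counts as numeric. This also accepts
# other widths such as int32 or float32, and it treats bool as numeric.
by_check = [col for col in DF.columns if is_numeric_dtype(DF[col])]

print(by_name)
print(by_check)
```

The second form is less sensitive to how the frame was built, at the cost of being broader than "float64 or int64 only".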

1 Comment

Charlie_M Over a year ago

I ended up using this because it works and because I'm running 0.14.0 and didn't want to upgrade to 0.14.1 in the middle of my project. Thanks.

Arthur Zennig · Accepted Answer · 2015-09-24 11:40:57Z

24

dtypes is a Pandas Series. That means it contains index & values attributes. If you only need the column names:

headers = df.dtypes.index

This returns an Index (not a plain list) holding the column names of the "df" dataframe; wrap it in list() if you need a list.
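Since dtypes is a Series indexed by column name, it can also be filtered with a boolean mask to get only the columns of one dtype, which is what the question asks for. A minimal sketch, with made-up column names:

```python
import pandas as pd

# Hypothetical frame with one float, one integer and one string column.
df = pd.DataFrame({
    "x": [0.1, 0.2],
    "n": [1, 2],
    "s": ["a", "b"],
})

# df.dtypes maps column name -> dtype, so comparing it against a dtype name
# gives a boolean mask over the columns.
float_cols = df.dtypes[df.dtypes == "float64"].index.tolist()

# Without the mask, the index of dtypes is simply every column name.
all_cols = list(df.dtypes.index)

print(float_cols)
print(all_cols)
```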

chrisb · Accepted Answer · 2014-07-23 10:06:18Z

19

pandas 0.14.1 added a new method, select_dtypes, which selects columns by dtype: you pass it a list of dtypes to include or to exclude.

For example:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': np.random.randn(1000),
                   'b': range(1000),
                   'c': ['a'] * 1000,
                   'd': pd.date_range('2000-1-1', periods=1000)})

df.select_dtypes(include=['float64', 'int64'])

Out[129]: 
            a    b
0    0.153070    0
1    0.887256    1
2   -1.456037    2
3   -1.147014    3
...
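select_dtypes returns a DataFrame, so to get just the list of names the question asks for, one option is to take .columns from the result. A small sketch on a shortened version of the frame above (5 rows instead of 1000, purely to keep it quick):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": np.random.randn(5),
    "b": np.arange(5, dtype="int64"),
    "c": ["a"] * 5,
    "d": pd.date_range("2000-01-01", periods=5),
})

# Names of the float64/int64 columns only; df itself is left unchanged.
numeric_names = df.select_dtypes(include=["float64", "int64"]).columns.tolist()

# The exclude argument works the other way round.
other_names = df.select_dtypes(exclude=["float64", "int64"]).columns.tolist()

print(numeric_names)
print(other_names)
```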

1 Comment

user2285236 Over a year ago

select_dtypes now also allows selecting more general categories (df.select_dtypes('number'), df.select_dtypes('object') or df.select_dtypes('datetime'), for example).
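To illustrate why the general categories can matter, here is a sketch with invented columns of mixed widths: an explicit ['float64', 'int64'] list skips float32 and int8 columns, while 'number' picks them up.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mixing several numeric widths with text and dates.
df = pd.DataFrame({
    "f32": np.array([1.0, 2.0], dtype="float32"),
    "i8": np.array([1, 2], dtype="int8"),
    "f64": [0.5, 1.5],
    "text": ["x", "y"],
    "when": pd.date_range("2000-01-01", periods=2),
})

# Exact dtype names match only columns of exactly those dtypes.
exact = df.select_dtypes(include=["float64", "int64"]).columns.tolist()

# The generic 'number' category covers every integer and float width.
generic = df.select_dtypes(include="number").columns.tolist()

# 'datetime' selects the datetime64 column.
dates = df.select_dtypes(include="datetime").columns.tolist()

print(exact, generic, dates)
```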

J11 · Accepted Answer · 2018-07-19 17:25:09Z

8

To get the column names from a pandas dataframe in Python 3, you can do the following. Here I am creating the data frame from a fileName.csv file:

>>> import pandas as pd
>>> df = pd.read_csv('fileName.csv')
>>> columnNames = list(df.head(0)) 
>>> print(columnNames)
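This works because iterating over a DataFrame yields its column labels, so list(df.head(0)) gives the same result as list(df) or df.columns.tolist(). A sketch of the equivalents, using an in-memory string as a stand-in for fileName.csv (the contents are invented):

```python
import io
import pandas as pd

# Hypothetical CSV contents standing in for fileName.csv.
csv_text = "id,score,name\n1,0.5,ann\n2,0.7,bob\n"
df = pd.read_csv(io.StringIO(csv_text))

names_a = list(df.head(0))        # the form used in the answer
names_b = list(df)                # iterating a DataFrame yields column labels
names_c = df.columns.tolist()     # explicit version

# If only the names are needed, nrows=0 parses the header row and no data rows.
header_only = pd.read_csv(io.StringIO(csv_text), nrows=0)
names_d = list(header_only)

print(names_a)
```

Note that none of these filter by dtype; they return every column name.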

Meloman · Accepted Answer · 2018-09-27 14:43:05Z

0

You can also get the column names from a pandas data frame by printing df.columns, which shows the column names as well as the dtype of the column index. Here I'll read a CSV file from https://mlearn.ics.uci.edu/databases/autos/imports-85.data. The file has no header row, so you have to define a header list that contains the column names yourself.

import pandas as pd

url="https://mlearn.ics.uci.edu/databases/autos/imports-85.data"

df=pd.read_csv(url,header = None)

headers=["symboling","normalized-losses","make","fuel-type","aspiration","num-of-doors","body-style",
         "drive-wheels","engine-location","wheel-base","length","width","height","curb-weight","engine-type",
         "num-of-cylinders","engine-size","fuel-system","bore","stroke","compression-ratio","horsepower","peak-rpm"
         ,"city-mpg","highway-mpg","price"]

df.columns=headers

print(df.columns)
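As a possible shortcut, read_csv accepts a names argument, so the header list can be supplied while reading instead of being assigned afterwards. The sketch below uses a tiny in-memory stand-in for the data file, with invented rows and only four of the columns, so that it runs without network access:

```python
import io
import pandas as pd

# Hypothetical, shortened stand-in for the headerless data file.
raw = "3,alfa-romero,gas,13495\n1,audi,gas,13950\n"
headers = ["symboling", "make", "fuel-type", "price"]

# header=None says the file has no header row; names supplies the labels.
df = pd.read_csv(io.StringIO(raw), header=None, names=headers)

print(df.columns)   # the column names
print(df.dtypes)    # the dtype of each column
```

Printing df.dtypes next to df.columns shows the per-column dtypes, which is what the original question filters on.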

answered Sep 27, 2018 at 13:52 by Ritik Raj Srivastava; edited Sep 27, 2018 at 14:43 by Meloman
