Identifying multiple columns by name in Pandas

Question

Is there a way to select a subset of columns using text matching or regular expressions?

In R it would be like this:

attach(iris) #Load the 'Stairway to Heaven' of R's built-in data sets
iris[grep(names(iris),pattern="Length")] #Prints only columns containing the word "Length"

joris · Accepted Answer · 2014-04-11 14:50:59Z

6

You can use the filter method for this (use axis=1 to filter on the column names). It offers several options:

Using like, which is equivalent to testing if 'Length' in col:
```
df.filter(like='Length', axis=1)
```
Using a regex (note that filter uses re.search rather than re.match, so you may need to adjust the regex accordingly):
```
df.filter(regex=r'\.Length$', axis=1)
```
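To make the two calls above concrete, here is a minimal sketch on a small hand-built frame. The column names mimic R's iris data set (Sepal.Length and so on) and the values are made up for illustration; both are assumptions, not part of the original answer.

```python
import pandas as pd

# Hypothetical stand-in for the iris data: names follow R's iris, values are made up.
df = pd.DataFrame({
    "Sepal.Length": [5.1, 4.9],
    "Sepal.Width": [3.5, 3.0],
    "Petal.Length": [1.4, 1.4],
    "Petal.Width": [0.2, 0.2],
    "Species": ["setosa", "setosa"],
})

# Substring match: keeps every column whose name contains "Length".
by_like = df.filter(like="Length", axis=1)

# Regex match (re.search semantics): keeps columns whose name ends in ".Length".
by_regex = df.filter(regex=r"\.Length$", axis=1)

print(list(by_like.columns))   # ['Sepal.Length', 'Petal.Length']
print(list(by_regex.columns))  # ['Sepal.Length', 'Petal.Length']
```

On this frame both calls select the same two columns; they would differ if a name contained "Length" somewhere other than at the end.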

edited Apr 11, 2014 at 14:50

answered Apr 11, 2014 at 14:45

joris


2 Comments

JKC Over a year ago

Very good info, @joris. But I also need to select columns whose names contain additional characters along with the base name. For example, my column names are "Length_1", "Length_2", "Width_1", "Width_2", etc. My filter call is df.filter(like=col+'_', axis=1), where col takes values like "Length", "Width", etc., but it is not returning any columns. Any idea what I should correct?

joris Over a year ago

You should be able to do that with a regular expression, e.g. regex=r"Length|Width"
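A minimal sketch of that suggestion, assuming a frame with the column names from the comment (the Id column and all values are made up for illustration). The anchored second pattern is an optional, stricter variant and not something the commenters proposed.

```python
import pandas as pd

# Hypothetical frame using the column names mentioned in the comment; values are made up.
df = pd.DataFrame({
    "Id": [1, 2],
    "Length_1": [10, 11],
    "Length_2": [12, 13],
    "Width_1": [3, 4],
    "Width_2": [5, 6],
})

# Alternation: a column is kept if its name contains "Length" or "Width".
both = df.filter(regex=r"Length|Width", axis=1)

# Stricter variant: only names of the form Length_<digits> or Width_<digits>.
strict = df.filter(regex=r"^(Length|Width)_\d+$", axis=1)

print(list(both.columns))
print(list(strict.columns))
```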

duber · Accepted Answer · 2014-04-11 14:22:06Z

1

Using Python's in operator, it would work like this:

# Assuming iris is already loaded as a DataFrame called 'iris' and has a proper header
iris = iris[[col for col in iris.columns if 'Length' in col]]
print(iris.head())

Or, using regular expressions,

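import re
# re.search scans the whole name; re.match only tests from the start and would find nothing here
iris = iris[[col for col in iris.columns if re.search(r'\.Length$', col)]]
print(iris.head())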

The first will typically run faster, but the second is more precise, because the pattern can be anchored (here, to names ending in .Length).
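Because the pattern \.Length$ does not begin at the start of a column name, the choice between re.match and re.search matters. Here is a small standard-library sketch of the difference, assuming the iris column names as they appear in R:

```python
import re

# Column names as in R's iris data set (assumed here for illustration).
columns = ["Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width", "Species"]

pattern = r"\.Length$"

# re.match only tries the pattern at position 0, so no name matches.
matched = [col for col in columns if re.match(pattern, col)]

# re.search scans the whole string, so names ending in ".Length" are found.
searched = [col for col in columns if re.search(pattern, col)]

print(matched)   # []
print(searched)  # ['Sepal.Length', 'Petal.Length']
```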

answered Apr 11, 2014 at 14:22

duber
