I have a large dataset from which I want to select all the rows that meet a certain criterion.

My data looks like the following:

Name  a  b  c 
target-01  5196     24     24  
target-02  5950    150    150 
target-03  5598     50     50 
object-01  6558     44     -1 
object-02  6190     60     60 

I want to select all the rows whose Name starts with target.

So the selected df would be:

target-01  5196     24     24  
target-02  5950    150    150 
target-03  5598     50     50 

I am reading the data using:

data = pd.read_csv('catalog.txt', sep=r'\s+', header=None, skiprows=1)

How can I select the data I want?

  • If you would like to apply further filtering to your data, you may take a look at the pandas library: pandas.pydata.org Commented Aug 29, 2016 at 8:48

1 Answer

Use str.startswith and boolean indexing:

print(df[df.Name.str.startswith('target')])
        Name     a    b    c
0  target-01  5196   24   24
1  target-02  5950  150  150
2  target-03  5598   50   50
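
Note that the snippet above assumes the DataFrame has a column labelled Name. The read_csv call in the question uses header=None with skiprows=1, which would give integer column labels instead. A minimal end-to-end sketch, assuming the file's first line is the header row shown in the question (io.StringIO stands in for 'catalog.txt' here so the example is self-contained):

```python
import io
import pandas as pd

# Sample data copied from the question; io.StringIO stands in for 'catalog.txt'.
raw = """Name  a  b  c
target-01  5196     24     24
target-02  5950    150    150
target-03  5598     50     50
object-01  6558     44     -1
object-02  6190     60     60
"""

# Keep the header row so the first column is labelled 'Name'.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Boolean mask: True for rows whose Name begins with 'target'.
mask = df["Name"].str.startswith("target")
selected = df[mask]
print(selected)
```

With the real file, the same call would presumably be pd.read_csv('catalog.txt', sep=r'\s+'), provided the header line is present.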

Another solution with str.contains:

print(df[df.Name.str.contains(r'^target')])
        Name     a    b    c
0  target-01  5196   24   24
1  target-02  5950  150  150
2  target-03  5598   50   50
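
The ^ anchor matters with str.contains, because without it the pattern matches anywhere in the string. If the Name column may contain missing values, passing na=False keeps the mask usable for boolean indexing. A small sketch of both points, using a made-up Series (the None entry and 'my-target-02' are hypothetical values, not from the question):

```python
import pandas as pd

# Hypothetical names, including a missing value and a non-prefix match.
names = pd.Series(["target-01", "object-01", None, "my-target-02"])

# na=False treats missing names as non-matches instead of propagating NaN.
starts = names.str.startswith("target", na=False)

# Anchored regex: equivalent to a prefix test.
anchored = names.str.contains(r"^target", na=False)

# Unanchored pattern: also matches 'my-target-02'.
unanchored = names.str.contains("target", na=False)

print(names[starts])
print(names[unanchored])
```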

Last solution with filter:

df.set_index('Name', inplace=True)

print(df.filter(regex=r'^target', axis=0))
              a    b    c
Name                     
target-01  5196   24   24
target-02  5950  150  150
target-03  5598   50   50

print(df.filter(regex=r'^target', axis=0).reset_index())
        Name     a    b    c
0  target-01  5196   24   24
1  target-02  5950  150  150
2  target-03  5598   50   50
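
If you would rather not modify df in place, the filter approach can also be written as a single chain. This is only a sketch of the same idea, with the DataFrame built by hand from the question's sample data:

```python
import pandas as pd

# Sample data from the question, built directly for a self-contained example.
df = pd.DataFrame({
    "Name": ["target-01", "target-02", "target-03", "object-01", "object-02"],
    "a": [5196, 5950, 5598, 6558, 6190],
    "b": [24, 150, 50, 44, 60],
    "c": [24, 150, 50, -1, 60],
})

# set_index returns a new DataFrame here, so df itself keeps its Name column.
result = (
    df.set_index("Name")
      .filter(regex=r"^target", axis=0)  # match the regex against index labels
      .reset_index()
)
print(result)
```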