My DataFrame has many columns, and I want to extract only the columns whose names start with 9. My attempt:

import pandas as pd

df_columns = pd.Index(['_id', 'Time', '2.1', '2.2', '2.3', '2.4', '2.5', '2.6', '2.7', '2.8',
       '2.9', '2.10', '2.11', '2.12', '2.13', '2.14', '2.15', '2.16', '2.17',
       '2.18', '2.19', '2.20', '9.1', '9.2', '9.3', '9.4', '9.5', '9.6', '9.7',
       '9.8', '9.9', '9.10', '9.11', '9.12', '9.13', '9.14', '9.15', '9.16',
       '9.17', '9.18', '9.19', '9.20'],
      dtype='object')
col9s = df_columns.str.findall(r'\b9.')

Current output:

col9s = 
Index([    [],     [],     [],     [],     [],     [],     [],     [],     [],
           [],     [],     [],     [],     [],     [],     [],     [],     [],
           [],     [],     [],     [], ['9.'], ['9.'], ['9.'], ['9.'], ['9.'],
       ['9.'], ['9.'], ['9.'], ['9.'], ['9.'], ['9.'], ['9.'], ['9.'], ['9.'],
       ['9.'], ['9.'], ['9.'], ['9.'], ['9.'], ['9.']],
      dtype='object')

Expected answer:

col9s = ['9.1', '9.2', '9.3', '9.4', '9.5', '9.6', '9.7',
       '9.8', '9.9', '9.10', '9.11', '9.12', '9.13', '9.14', '9.15', '9.16',
       '9.17', '9.18', '9.19', '9.20']

1 Answer

Use filter with a regex:

col9s = df.filter(regex=r'^9\.').columns

Or, to directly subset the columns:

df2 = df.filter(regex=r'^9\.')
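
To make the pattern concrete, here is a minimal sketch on a small made-up frame (the data values and the shortened column list are assumptions for illustration, not the asker's real data). In the regex, ^ anchors the match at the start of the column name and \. matches a literal dot, so a name such as '2.9' is not picked up.

```python
import pandas as pd

# Hypothetical one-row frame that reuses a few column names from the question
df = pd.DataFrame(
    [[0, 1, 2, 3, 4, 5]],
    columns=['_id', 'Time', '2.1', '2.9', '9.1', '9.10'],
)

# filter() keeps the columns whose label matches the regex
df2 = df.filter(regex=r'^9\.')

# Column names as a plain Python list
col9s = df2.columns.tolist()
print(col9s)  # ['9.1', '9.10']
```

Calling .tolist() is only needed if a plain list is wanted, as in the expected answer; otherwise the Index returned by .columns can be used directly.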

2 Comments

Simply superb solution. This is the first time I have heard of filter. Also, if it were a plain list of strings, as in my question, what would the regex be? This is for my own learning.
You could use the same regex as mine. The real issue is that findall is not the right tool; better to use a list comprehension with an if filter and re.match ;)
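
A minimal sketch of the comprehension suggested in the comment above, assuming the names are held in a plain Python list (the shortened list here is made up for illustration). Because re.match only matches at the start of the string, the leading ^ is optional in this case.

```python
import re

# Hypothetical shortened list of column names
columns = ['_id', 'Time', '2.1', '2.9', '9.1', '9.2', '9.10']

# re.match anchors at the beginning of each string; \. is a literal dot
pattern = re.compile(r'9\.')
col9s = [c for c in columns if pattern.match(c)]
print(col9s)  # ['9.1', '9.2', '9.10']

# For a fixed prefix like this one, str.startswith avoids the regex entirely
col9s_no_regex = [c for c in columns if c.startswith('9.')]
```

Both variants return one flat list of matching names, whereas findall returns a list of matches for every element, which is why the original attempt produced nested lists.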
