0

A question in Python (3.9.5) and Pandas:

Suppose I have an array of strings x and I want to extract all the elements that contain a certain substring, e.g. feb05. Is there a Pythonic way to do it in one line, possibly using Pandas functions?

An example of what I mean:

x = ["2023_jan05", "2023_jan_27", "2023_feb04", "2023_feb05", "2024_feb05"]
must_contain = "feb05"
desired_output = ["2023_feb05", "2024_feb05"]

I can run a loop,

import numpy as np
import pandas as pd

desired_output = []
indices_bool = np.zeros(len(x), dtype=bool)  # boolean mask, one entry per element
for idx, test in enumerate(x):
    if must_contain in test:
        desired_output.append(test)
        indices_bool[idx] = True

but I am looking for a more Pythonic way to do it.

In my application x is a column in a Pandas dataframe, so answers using Pandas functions are also welcome. The goal is to keep only the rows that have must_contain in the field x (e.g. x = df["names"]).

2 Answers

1

Since you are using pandas, you can use str.contains to get a boolean mask:

import pandas as pd
df = pd.DataFrame({'x': ["2023_jan05", "2023_jan_27", "2023_feb04", "2023_feb05", "2024_feb05"]})
must_contain = "feb05"

df.x.str.contains(must_contain)
#0    False
#1    False
#2    False
#3     True
#4     True
#Name: x, dtype: bool

Then filter the dataframe with that mask:

df[df.x.str.contains(must_contain)]
#            x
#3  2023_feb05
#4  2024_feb05
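
One caveat worth adding: by default str.contains interprets the pattern as a regular expression, and it returns a missing value (rather than False) for missing entries, which cannot be used directly as a boolean mask. Below is a minimal sketch, assuming the column may contain missing values or that the search string may contain regex metacharacters. The sample data (the None entry and the dotted name) and the ".feb05" search string are hypothetical and only there for illustration.

```python
import pandas as pd

# Hypothetical data: names like those above, plus a missing value and a dotted name.
df = pd.DataFrame({"x": ["2023_jan05", "2023_feb05", None, "2024.feb05", "2024_feb05"]})
must_contain = ".feb05"  # as a regex, "." would match any character

# regex=False matches the substring literally;
# na=False treats missing values as non-matches so the mask is purely boolean.
mask = df["x"].str.contains(must_contain, regex=False, na=False)
filtered = df[mask]
print(filtered["x"].to_list())
```

With regex=False only the row that literally contains ".feb05" is kept. Leaving the default regex=True would also keep the rows ending in "_feb05".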

1

Without pandas:

list(filter(lambda y: must_contain in y, x))

["2023_feb05", "2024_feb05"]
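
A list comprehension is a common alternative to filter with a lambda, and the same membership test can also produce the boolean mask that the loop in the question builds. This is only a sketch of that idea, reusing the names x, must_contain and desired_output from the question; mask is a name introduced here for illustration.

```python
x = ["2023_jan05", "2023_jan_27", "2023_feb04", "2023_feb05", "2024_feb05"]
must_contain = "feb05"

# Boolean mask, one entry per element (plays the role of indices_bool in the question).
mask = [must_contain in s for s in x]

# Matching elements, kept in their original order.
desired_output = [s for s in x if must_contain in s]
print(desired_output)  # ['2023_feb05', '2024_feb05']
```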

With pandas:

import pandas as pd

series = pd.Series(["2023_jan05", "2023_jan_27", "2023_feb04", "2023_feb05", "2024_feb05"])
must_contain = "feb05"
series[series.str.contains(must_contain)].to_list()

["2023_feb05", "2024_feb05"]
