I am trying to filter my dataframe. Usually I use something like

df1 = df[df.url.str.contains("avito.ru/*/telefony/")]

But what if I have a lot of conditions? I want to keep rows that contain any of more than 100 strings. How can I do that?

Dataframe

анонс кинофильмов 2016
анонс кинофильмов 2016
"выборок имеют величину момента сопротивления"
"выборок имеют величину момента сопротивления"
ансамбль 9 человек
ансамбль 9 человек
ансамбль 9 человек
"Времена года в музыке, литературе, живописи"
"Времена года в музыке, литературе, живописи"
"Времена года в музыке, литературе, живописи"
apple iphone
samsumg
facebook
None
None
None

And a list with some of the words

lst = ['iphone', 'sony', 'alcatel', 'galaxy', 'samsumg']

Desired output

apple iphone
samsumg
None
None
None

I mean that if a string does not contain any of the words, I want to drop that row. (But I want to keep the rows with None values too.)

  • Sorry, are you this user: stackoverflow.com/users/6065920/ldevyataykina? This question is really similar to that user's questions. Commented Aug 25, 2016 at 10:49
  • Also, your question is a little unclear: are you looking for rows that match all 100 strings, or rows that match any of the 100 strings? Commented Aug 25, 2016 at 10:50
  • @EdChum I added the desired output. Commented Aug 25, 2016 at 10:58
  • You can create a search string by doing '|'.join(lst) and passing it to str.contains. Commented Aug 25, 2016 at 11:00

1 Answer

You can create a pattern by joining all your list items with | and pass this to str.contains. Here 'None' is added to the list, which only works if those values are the literal string 'None':

In [31]:
lst = ['iphone', 'sony', 'alcatel', 'galaxy', 'samsumg','None']
pat = '|'.join(lst)
df[df['url'].str.contains(pat)]

Out[31]:
             url
10  apple iphone
11       samsumg
13          None
14          None
15          None

To handle missing values (real NaN/None rather than the string 'None'), include pd.isnull(df['url']) in the boolean condition:

In [54]:
lst = ['iphone', 'sony', 'alcatel', 'galaxy', 'samsumg']
pat = '|'.join(lst)
df[pd.isnull(df['url']) | df['url'].str.contains(pat) ]

Out[54]:
             url
10  apple iphone
11       samsumg
13           NaN
14           NaN
15           NaN
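
As an alternative sketch, str.contains also accepts an na argument that sets the result for missing entries, so the null check can be folded into the same call. The frame below is made-up sample data, and it assumes the empty values are real missing values rather than the string 'None'.

```python
import pandas as pd

# Hypothetical sample data; the last two entries are real missing values.
df = pd.DataFrame({'url': ['apple iphone', 'samsumg', 'facebook', None, None]})

lst = ['iphone', 'sony', 'alcatel', 'galaxy', 'samsumg']
pat = '|'.join(lst)

# na=True makes missing entries evaluate to True, so those rows are kept.
mask = df['url'].str.contains(pat, na=True)
result = df[mask]
print(result)
```

Passing na=False instead would drop the missing rows along with the non-matching ones.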

7 Comments

But how can I avoid losing the values with None?
Are those values the string 'None', or are they really NaN/None? It's unclear from your question.
They are empty and I want to keep them.
Please don't send me error codes. Edit your question with raw data, code, and errors that reproduce the problem; what you posted is still ambiguous.
What do you mean by empty: empty strings or NaN? Do you understand the difference? Also post raw data that correctly reproduces your df; at the moment it looks like the string 'None', which is completely different.