df is a DataFrame with a Weather column:

 Weather
 Rain, freezing cold
 Rain, and thunder
 Thunderstorm, and dust
 Drizzle, for half an hour
 Drizzle, for sometime
 Rain, non stop
 Slight rain

Code:

heavy_rain_indicator = ['Rain,','Thunderstorm,',]
light_rain_indicator = ['Drizzle,','Slight rain']

df['Heavy Rain Indicator'] = (df['Weather'].str.contains(heavy_rain_indicator))
df['Light Rain Indicator'] = (df['Weather'].str.contains(light_rain_indicator))

Expected output:

 Weather                  Heavy Rain Indicator    Light Rain Indicator
 Rain, freezing cold         TRUE                      FALSE
 Rain, and thunder           TRUE                      FALSE
 Thunderstorm, and dust       TRUE                      FALSE
 Drizzle, for half an hour   FALSE                     TRUE
 Drizzle, for sometime       FALSE                     TRUE
 Rain, non stop              TRUE                      FALSE
 Slight rain                 FALSE                     TRUE

Actual output:

 TypeError: unhashable type: 'list'
 ----> 4     df['Heavy Rain Indicator'] = (df['Weather'].str.contains(heavy_rain_indicator))

I want the Heavy Rain Indicator column to be TRUE when any heavy rain indicator is present, and the Light Rain Indicator column to be TRUE when any light rain indicator is present.

Someone suggested using isin (and then deleted the post), but I cannot type out every exact expression. For example, I want every value beginning with "Rain," to be flagged in the heavy rain indicator column, and so on. Please answer accordingly.

3 Answers

A more pandas-idiomatic answer:

df['Heavy Rain Indicator'] = df['Weather'].str.startswith(tuple(heavy_rain_indicator))
df['Light Rain Indicator'] = df['Weather'].str.startswith(tuple(light_rain_indicator))

Or, if you want to find matches anywhere in the string and not only at the beginning:

df['Heavy Rain Indicator'] = df['Weather'].str.contains('|'.join(heavy_rain_indicator))
df['Light Rain Indicator'] = df['Weather'].str.contains('|'.join(light_rain_indicator))
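One caveat with the joined pattern: str.contains treats it as a regular expression by default, so an indicator containing characters such as "." or "(" would be read as regex syntax. A minimal sketch, assuming the sample data from the question and a hypothetical helper named to_pattern, escapes each indicator first:

```python
import re

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Weather": [
    "Rain, freezing cold",
    "Rain, and thunder",
    "Thunderstorm, and dust",
    "Drizzle, for half an hour",
    "Drizzle, for sometime",
    "Rain, non stop",
    "Slight rain",
]})

heavy_rain_indicator = ["Rain,", "Thunderstorm,"]
light_rain_indicator = ["Drizzle,", "Slight rain"]


def to_pattern(indicators):
    """Hypothetical helper: build a regex alternation of literal strings."""
    # re.escape makes any regex metacharacters match literally.
    return "|".join(re.escape(s) for s in indicators)


df["Heavy Rain Indicator"] = df["Weather"].str.contains(to_pattern(heavy_rain_indicator))
df["Light Rain Indicator"] = df["Weather"].str.contains(to_pattern(light_rain_indicator))
```

With the indicators from the question the escaping does not change the result; it only matters once an indicator contains regex metacharacters.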

str.contains takes a string argument, but you are passing a list.

You can use a list comprehension with any, like below:

df['Heavy Rain Indicator'] = [any(i.lower() in j.lower() for i in heavy_rain_indicator) for j in df["Weather"].values]

df['Light Rain Indicator'] = [any(i.lower() in j.lower() for i in light_rain_indicator) for j in df["Weather"].values]

1 Comment

Why are you using .values? It shouldn't be a huge issue here, but the correct way of performing a case-insensitive comparison is using str.casefold().
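Following up on that comment, here is a rough sketch of the same list comprehension using str.casefold() and iterating over the column directly. The helper name contains_any is an assumption made for illustration, and the sample data is copied from the question:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Weather": [
    "Rain, freezing cold",
    "Rain, and thunder",
    "Thunderstorm, and dust",
    "Drizzle, for half an hour",
    "Drizzle, for sometime",
    "Rain, non stop",
    "Slight rain",
]})

heavy_rain_indicator = ["Rain,", "Thunderstorm,"]
light_rain_indicator = ["Drizzle,", "Slight rain"]


def contains_any(text, indicators):
    """Hypothetical helper: case-insensitive substring test against a list."""
    folded = text.casefold()
    # The generator lets any() stop at the first matching indicator.
    return any(ind.casefold() in folded for ind in indicators)


# Iterating over the Series yields its values, so .values is not needed.
df["Heavy Rain Indicator"] = [contains_any(w, heavy_rain_indicator) for w in df["Weather"]]
df["Light Rain Indicator"] = [contains_any(w, light_rain_indicator) for w in df["Weather"]]
```

Note that "Slight rain" is still not flagged as heavy rain, because the heavy indicator "Rain," includes the trailing comma.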

You can try this:

def get_TF(x, info_list):
    return any([True for i in info_list if i in x])
heavy_rain_indicator = ['Rain,','Thunderstorm,']
light_rain_indicator = ['Drizzle,','Slight rain']

df['Heavy Rain Indicator'] = df['Weather'].apply(lambda x : get_TF(x, heavy_rain_indicator))
df['Light Rain Indicator'] = df['Weather'].apply(lambda x : get_TF(x, light_rain_indicator))
df


                     Weather  Heavy Rain Indicator  Light Rain Indicator
0        Rain, freezing cold                  True                 False
1          Rain, and thunder                  True                 False
2     Thunderstorm, and dust                  True                 False
3  Drizzle, for half an hour                 False                  True
4      Drizzle, for sometime                 False                  True
5             Rain, non stop                  True                 False
6                Slight rain                 False                  True

1 Comment

You really don't need .apply(), an entire function, and a lambda for this. Why use any([True for i in info_list if i in x]) instead of just any(i in x for i in info_list)?
