2

I have a dataframe,

DF,
Name    Stage   Description
Sri     1       Sri is one of the good singer in this two
        2       Thanks for reading
Ram     1       Ram is one of the good cricket player
ganesh  1       good driver

and a list,

my_list=["one","driver"]

I tried, names=df.loc[df["Description"].str.contains("|".join(my_list),na=False), 'Name']

achieved everything except the keyvalue column.

 output_DF.
Name    Stage   Description
Sri     1       Sri is one of the good singer in this two
Ram     1       Ram is one of the good cricket player

My desired output is,
desired_DF,
Name    Stage   Description                                 keyvalue
Sri     1       Sri is one of the good singer in this two    one
        2       Thanks for reading                           
Ram     1       Ram is one of the good cricket player        one
ganesh  1       good driver                                  driver

some one help me with generating keyvalue column

1 Answer 1

2

I think you can use previous solution from here and then extract:

pat = "|".join(my_list)

df['keyvalue'] = df['Description'].str.extract("(" + pat + ')', expand=False).fillna('')
print (df)
     Name  Stage                                Description keyvalue
0     Sri      1  Sri is one of the good singer in this two      one
1     Sri      2                         Thanks for reading         
2     Ram      1      Ram is one of the good cricket player      one
3  ganesh      1                                good driver   driver

All together:

print (df)
#     Name  Stage                                Description
#0     Sri      1  Sri is one of the good singer in this two
#1              2                         Thanks for reading
#2     Ram      1      Ram is one of the good cricket player
#3  ganesh      1                            good Driver one

my_list=["ONE","driver"]
df['Name'] = df['Name'].mask(df['Name'].str.strip() == '').ffill()

pat = "|".join(my_list).lower()

names=df.loc[df["Description"].str.lower().str.contains(pat,na=False), 'Name']

df = df[df['Name'].isin(names)]

df['keyvalue'] = (df['Description'].str.lower()
                                   .str.extract("(" + pat + ')', expand=False)
                                   .fillna(''))
print (df)
#     Name  Stage                                Description keyvalue
#0     Sri      1  Sri is one of the good singer in this two      one
#1     Sri      2                         Thanks for reading         
#2     Ram      1      Ram is one of the good cricket player      one
#3  ganesh      1                            good Driver one   driver
Sign up to request clarification or add additional context in comments.

17 Comments

no Jezrael, i dont want to add "one" for all the rows. It has to add the exact matching keyword
ops, give me some time.
Edited the final expected output, please check
What is 2 or more values match?
we can give the first value
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.