Mapping keyword with a dataframe column using pandas in python

Question

I have a dataframe,

DF,
Name    Stage   Description
Sri     1       Sri is one of the good singer in this two
        2       Thanks for reading
Ram     1       Ram is one of the good cricket player
ganesh  1       good driver

and a list,

my_list=["one"]

I tried mask=df["Description"].str.contains('|'.join(my_list),na=False)

but filtering with it gives only the rows whose own description matches,

output_DF,
Name    Stage   Description
Sri     1       Sri is one of the good singer in this two
Ram     1       Ram is one of the good cricket player

My desired output is,
desired_DF,
Name    Stage   Description
Sri     1       Sri is one of the good singer in this two
        2       Thanks for reading
Ram     1       Ram is one of the good cricket player

It has to take the Stage column into account: I want all the rows associated with a matching description, including the later stages that follow it.

No, mask=df["Description"].str.contains(my_list,na=False) works well, but I want to pick the other row too, continuing until the stages are finished or the stage is 1 again.
– Pyd, Commented Oct 4, 2017 at 7:38

jezrael · Accepted Answer · 2017-10-04 07:47:50Z

2

I think you need:

print (df)
     Name  Stage                                Description
0     Sri      1  Sri is one of the good singer in this two
1              2                         Thanks for reading
2     Ram      1      Ram is one of the good cricket player
3  ganesh      1                                good driver

# replace empty or whitespace-only names with the previous name
df['Name'] = df['Name'].mask(df['Name'].str.strip() == '').ffill()
print(df)
     Name  Stage                                Description
0     Sri      1  Sri is one of the good singer in this two
1     Sri      2                         Thanks for reading
2     Ram      1      Ram is one of the good cricket player
3  ganesh      1                                good driver

# get the names of all rows whose description contains a keyword
my_list = ["one"]
names = df.loc[df["Description"].str.contains("|".join(my_list), na=False), 'Name']
print(names)
0    Sri
2    Ram
Name: Name, dtype: object

# select all rows whose name is among the matched names
df = df[df['Name'].isin(names)]
print(df)
  Name  Stage                                Description
0  Sri      1  Sri is one of the good singer in this two
1  Sri      2                         Thanks for reading
2  Ram      1      Ram is one of the good cricket player
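For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample in the question, assuming the blank Name cell is an empty string. Unlike the answer, it keeps the filled names in a separate Series (`filled`, a name chosen here) so the original Name column is left untouched.

```python
import pandas as pd

# Sample data reconstructed from the question (assumption: blank name is "").
df = pd.DataFrame({
    "Name": ["Sri", "", "Ram", "ganesh"],
    "Stage": [1, 2, 1, 1],
    "Description": [
        "Sri is one of the good singer in this two",
        "Thanks for reading",
        "Ram is one of the good cricket player",
        "good driver",
    ],
})
my_list = ["one"]

# Blank or whitespace-only names become NaN, then inherit the previous name.
filled = df["Name"].mask(df["Name"].str.strip() == "").ffill()

# Names of the rows whose description contains any keyword.
hit = df["Description"].str.contains("|".join(my_list), na=False)
names = filled[hit]

# Keep every row that belongs to one of those names.
result = df[filled.isin(names)]
print(result)
```

Note that the keywords are joined into a regular expression, so a keyword containing characters such as `.` or `(` would need escaping first, for example with `re.escape`.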


10 Comments

Pyd Over a year ago

If we have my_list = ["Thanks"], it gives me the "Thanks for reading" row. But I don't want to match when the stage is other than 1. Is there a way?

Pyd Over a year ago

I want to match my_list against df["Description"] only when the stage is 1. If we find a match, we should get all the stages for that particular description.

jezrael Over a year ago

Yes, I think instead of df["Description"].str.contains("|".join(my_list),na=False) you need df["Description"].str.contains("|".join(my_list),na=False) & (df['Stage'] == 1)

Pyd Over a year ago

Do you mean I should replace this: names=df.loc[df["Description"].str.contains("|".join(my_list),na=False), 'Name']? When I did, I got only the column headers, with no values.

Pyd Over a year ago

Yes jezrael, it works fine. I didn't check properly. Thanks for your answer.
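The Stage restriction discussed in these comments can be sketched as follows. This is an illustration under assumptions, not code from the thread: the helper name `rows_for_keywords` is hypothetical, and the sample frame is written with the names already forward-filled.

```python
import pandas as pd

# Sample data from the question, with names already forward-filled.
df = pd.DataFrame({
    "Name": ["Sri", "Sri", "Ram", "ganesh"],
    "Stage": [1, 2, 1, 1],
    "Description": [
        "Sri is one of the good singer in this two",
        "Thanks for reading",
        "Ram is one of the good cricket player",
        "good driver",
    ],
})

def rows_for_keywords(frame, keywords):
    """Hypothetical helper: match keywords only on Stage 1 rows,
    then return every row that shares a name with a matching row."""
    pattern = "|".join(keywords)
    hit = frame["Description"].str.contains(pattern, na=False) & (frame["Stage"] == 1)
    names = frame.loc[hit, "Name"]
    return frame[frame["Name"].isin(names)]

# "Thanks" only appears on a Stage 2 row, so nothing is selected.
print(rows_for_keywords(df, ["Thanks"]))

# "one" appears on two Stage 1 rows, so all rows for Sri and Ram come back.
print(rows_for_keywords(df, ["one"]))
```

An empty result prints only the column headers, which is consistent with what the comment above describes seeing when no Stage 1 row matches.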


Calvin Taylor · Answer · 2017-10-04 07:45:57Z

0

It looks like it is finding "one" in the Description column of the dataframe and returning only the rows whose descriptions match.

If you also want the "Thanks for reading" row, you will have to add a list element that matches it,

e.g. 'Thanks', so something like my_list=["one", "Thanks"]
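A small sketch of this suggestion, using only the descriptions from the question, shows what the extended list matches. Note the assumption it relies on: you have to know the text of the follow-up row in advance, and the Stage column plays no part in the match.

```python
import pandas as pd

# Descriptions copied from the sample data in the question.
descriptions = pd.Series([
    "Sri is one of the good singer in this two",
    "Thanks for reading",
    "Ram is one of the good cricket player",
    "good driver",
])

my_list = ["one", "Thanks"]

# One boolean per row: True where any keyword occurs in the description.
mask = descriptions.str.contains("|".join(my_list), na=False)
print(mask.tolist())  # [True, True, True, False]
```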
