
I have a DataFrame, df:

Name    Stage   Description
Sri     1       Sri is one of the good singer in this two
        2       Thanks for reading
Ram     1       Ram is one of the good cricket player
ganesh  1       good driver

and a list:

my_list = ["one"]

I tried

mask = df["Description"].str.contains('|'.join(my_list), na=False)

but filtering with this mask gives only the rows whose description matches:

Name    Stage   Description
Sri     1       Sri is one of the good singer in this two
Ram     1       Ram is one of the good cricket player

My desired output is:

Name    Stage   Description
Sri     1       Sri is one of the good singer in this two
        2       Thanks for reading
Ram     1       Ram is one of the good cricket player

It has to take the Stage column into account: I want all the rows (every stage) associated with a matching description.

  • What does print(df.index) show? Commented Oct 4, 2017 at 7:38
  • No, mask = df["Description"].str.contains('|'.join(my_list), na=False) works fine, but I want to pick up the following rows too, until the stages run out or Stage is 1 again. Commented Oct 4, 2017 at 7:38
  • My PC is hanging; I'll restart and answer. Give me some time. Commented Oct 4, 2017 at 7:39
  • Is it possible to match the data by the Name column? Commented Oct 4, 2017 at 7:48
  • No, see: where Stage is 2, the Name column is empty. Commented Oct 4, 2017 at 8:58

2 Answers


I think you need:

print (df)
     Name  Stage                                Description
0     Sri      1  Sri is one of the good singer in this two
1              2                         Thanks for reading
2     Ram      1      Ram is one of the good cricket player
3  ganesh      1                                good driver

# replace empty or whitespace-only names with the previous non-empty name
df['Name'] = df['Name'].mask(df['Name'].str.strip() == '').ffill()
print(df)
     Name  Stage                                Description
0     Sri      1  Sri is one of the good singer in this two
1     Sri      2                         Thanks for reading
2     Ram      1      Ram is one of the good cricket player
3  ganesh      1                                good driver

# get the names of all rows whose description matches any word in my_list
my_list = ["one"]
names = df.loc[df["Description"].str.contains("|".join(my_list), na=False), 'Name']
print(names)
0    Sri
2    Ram
Name: Name, dtype: object

# select every row whose name is among the matched names
df = df[df['Name'].isin(names)]
print(df)
  Name  Stage                                Description
0  Sri      1  Sri is one of the good singer in this two
1  Sri      2                         Thanks for reading
2  Ram      1      Ram is one of the good cricket player
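
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same steps. It assumes the blank Name cells are empty strings (if they are NaN the mask step is harmless), and it keeps the forward-filled names in a temporary Series called filled (a name chosen for this sketch) instead of overwriting df['Name'], so the blank name in the stage 2 row survives as in the desired output.

```python
import pandas as pd

# Rebuild the sample data from the question (blank Name assumed to be "").
df = pd.DataFrame({
    "Name": ["Sri", "", "Ram", "ganesh"],
    "Stage": [1, 2, 1, 1],
    "Description": [
        "Sri is one of the good singer in this two",
        "Thanks for reading",
        "Ram is one of the good cricket player",
        "good driver",
    ],
})
my_list = ["one"]

# Blank or whitespace-only names become NaN, then inherit the previous name.
filled = df["Name"].mask(df["Name"].str.strip() == "").ffill()

# Rows whose description contains any word from my_list.
hits = df["Description"].str.contains("|".join(my_list), na=False)
names = filled[hits]

# Keep every row that belongs to one of the matched names.
result = df[filled.isin(names)]
print(result)
```

With the sample data this keeps rows 0, 1 and 2 and drops the ganesh row. Note that grouping by name only works if each name identifies exactly one block of stages.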

10 Comments

If we have my_list = ["Thanks"], it gives me the "Thanks for reading" row. But I don't want to match when the stage is other than 1. Is there a way?
I want to match my_list against df["Description"] only when the stage is 1. If we find a match, we get all the stages for that particular description.
Yes, I think instead of df["Description"].str.contains("|".join(my_list),na=False) you need df["Description"].str.contains("|".join(my_list),na=False) & (df['Stage'] == 1)
Do you mean to replace this: names=df.loc[df["Description"].str.contains("|".join(my_list),na=False), 'Name']? When I did, I got only the column headers, with no values.
Yes Jezrael, it works fine. I didn't check properly. Thanks for your answer.
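
Putting the stage condition from this thread together with the answer, a rough sketch could look like the following. The helper name select_groups is made up for this example, and the sample data assumes blank names are empty strings.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Sri", "", "Ram", "ganesh"],
    "Stage": [1, 2, 1, 1],
    "Description": [
        "Sri is one of the good singer in this two",
        "Thanks for reading",
        "Ram is one of the good cricket player",
        "good driver",
    ],
})


def select_groups(frame, words):
    """Hypothetical helper: match words only on Stage 1 rows,
    then return every row that shares a name with a match."""
    # Forward-fill blank names so later stages inherit the stage 1 name.
    filled = frame["Name"].mask(frame["Name"].str.strip() == "").ffill()
    # A row counts as a hit only if it matches AND is a stage 1 row.
    hits = (
        frame["Description"].str.contains("|".join(words), na=False)
        & (frame["Stage"] == 1)
    )
    return frame[filled.isin(filled[hits])]


only_stage_one = select_groups(df, ["Thanks"])
with_one = select_groups(df, ["one"])
print(len(only_stage_one))        # 0: "Thanks" appears only in a stage 2 row
print(with_one.index.tolist())    # [0, 1, 2]
```

The parentheses around frame["Stage"] == 1 matter, because & binds more tightly than == in Python.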

It looks to be finding "one" in the Description fields of the dataframe and returning the matching descriptions.

If you want the third row as well, you will have to add a list element that matches it,

e.g. 'Thanks', so something like my_list = ["one", "Thanks"]
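
A small sketch of what that looks like on the sample data, assuming blank names are empty strings. The re.escape call is an addition of this sketch, not part of the suggestion; it only guards against list entries that contain regex metacharacters.

```python
import re

import pandas as pd

df = pd.DataFrame({
    "Name": ["Sri", "", "Ram", "ganesh"],
    "Stage": [1, 2, 1, 1],
    "Description": [
        "Sri is one of the good singer in this two",
        "Thanks for reading",
        "Ram is one of the good cricket player",
        "good driver",
    ],
})

my_list = ["one", "Thanks"]

# Escape each entry so it is matched literally, then join as alternatives.
pattern = "|".join(re.escape(word) for word in my_list)
mask = df["Description"].str.contains(pattern, na=False)
print(df[mask])
```

This returns rows 0, 1 and 2 for this data, but only because the stage 2 row happens to contain "Thanks": each follow-up row is matched by its own text, not by its link to a stage 1 row.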
