
I'm iterating through Excel files and logging rows that contain a string, like so:

def find_string_row(toFind, dframe):
    index = 0
    while toFind not in dframe.iloc[index].values:
        print(dframe.iloc[index].values)
        index = index+1
    return index  

I have some new files that are quite messy, and contain strings with whitespace inside and outside the text. So while the above works for exact matches, it fails for loose matches. How can I rewrite this to find strings in this manner:

toFind.replace(" ", "").lower()  

So if I input a string like "Address 1 23 " and the Excel file contains " add ress 12 3", they will match?

2
  • When you replace " " with "" and add lower(), only address123 will match. Commented Oct 12, 2021 at 13:40
  • Yep, I tried applying the same string functions on the other side (dframe.iloc[index].values), but I think it's a pandas object, not a string, so it errors out. How would I make it symmetric? Commented Oct 12, 2021 at 13:46

1 Answer

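def find_string_row(toFind, df):
    # Normalize the search term once: drop spaces and lowercase.
    stripped_tofind = toFind.replace(" ", "").lower()
    index = 0
    while index < len(df):
        # Normalize each cell in the current row the same way; str() guards against non-string cells.
        stripped_dfrow = [str(x).replace(" ", "").lower() for x in df.iloc[index].values]
        if stripped_tofind in stripped_dfrow:
            return index
        index = index + 1
    return None  # no row matched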

You're locating all of the values within a row, which returns a list-like object (a NumPy array), not a string. So to use the string replace method, you have to iterate over that array. The list comprehension above removes all spaces from each item in the row's values and lowercases it.
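One way to keep the comparison symmetric, as asked in the comments, might be to route both the search term and every cell through a single helper. This is only a minimal sketch: `normalize`, the snake_case parameter names, and the sample DataFrame are hypothetical and not from the original post, and it assumes that returning None when nothing matches is acceptable.

```python
import pandas as pd


def normalize(value):
    """Drop spaces and lowercase; str() lets non-string cells (numbers, NaN) through."""
    return str(value).replace(" ", "").lower()


def find_string_row(to_find, df):
    """Return the position of the first row with a loosely matching cell, or None."""
    target = normalize(to_find)
    for index in range(len(df)):
        # Apply the same normalization to every cell in the row before comparing.
        if target in [normalize(cell) for cell in df.iloc[index].values]:
            return index
    return None


# Hypothetical sample data standing in for a messy Excel sheet.
sample = pd.DataFrame({"a": ["Name", " add ress 12 3", "x"], "b": [1, 2, 3]})
result = find_string_row("Address 1 23 ", sample)
print(result)
```

Because both sides go through the same function, any later change to the matching rule (for example, also stripping tabs) only has to be made in one place.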


1 Comment

thanks, the list comprehension on the pandas object did the trick!
