I have the following function to check whether a row in a DataFrame contains a string. It works, but it only matches when the search string is exactly equal to a cell's value. I need it to match when a cell merely contains the string.

e.g. searching for 'fox' when a cell holds 'a quick brown fox' returns nothing.

def search_excel_files(file_list, search_term):
    #list of row indexes that contain the search term
    rows = {}
    for file in file_list:
        df = pd.read_excel("files/" + file)
        for row in df.iterrows():
            if search_term in row[1].values:
                #get row index
                row_index = row[0]
                #add row index to dictionary
                rows = df.iloc[row_index].to_dict()
    return rows

How can I check if the row contains the provided string in this instance?

  • pandas.pydata.org/docs/reference/api/… is of no use for you? Commented Jul 5, 2022 at 13:51
  • @9769953 if row[1].str.contains(search_term, regex=False).any(): returns the error "Can only use .str accessor with string values!". How can I access the value as a string? Commented Jul 5, 2022 at 14:15
  • Operate on the dataframe column in total: df['column_name'].str.contains(....). Your code iterates over the rows, thus operates on single values (row[1]). Commented Jul 5, 2022 at 14:47
  • @9769953 OK, that makes sense. However, I still get the same error, "Can only use .str accessor with string values!", when looping over the columns with for col in df.columns: in place of 'column_name'. Any idea why that could be? Commented Jul 5, 2022 at 15:06
  • Because you are now doing this for all your columns (with the for loop)! If any of those columns is not of a string type (has a None, or is of integer type or floating point), then it obviously fails. That should be obvious from the error message. Commented Jul 5, 2022 at 17:17
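To illustrate the point made in the comments above, here is a minimal sketch with made-up data (the column names "text" and "count" are assumptions, not from the question). It shows .str failing on a non-string column and one possible workaround, casting to str first:

```python
import pandas as pd

# Hypothetical data: one string column and one integer column
df = pd.DataFrame({
    "text": ["a quick brown fox", "lazy dog"],
    "count": [1, 2],
})

# The string column supports the .str accessor directly
text_hits = df["text"].str.contains("fox", regex=False)

# The integer column raises AttributeError as soon as .str is accessed
try:
    df["count"].str.contains("fox", regex=False)
    str_failed = False
except AttributeError:
    str_failed = True

# Casting to str first makes the column searchable as text
count_hits = df["count"].astype(str).str.contains("fox", regex=False)
```

Casting with astype(str) is only one option; skipping non-string columns entirely may be more appropriate, depending on the data.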

1 Answer

It's better to think in columns than in rows when using pandas:

df.your_col.str.contains("fox")

which returns a boolean Series, one value per row.

The following returns a DataFrame containing only the rows where your_col contains "fox":

df[df.your_col.str.contains("fox")]
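The same idea can be extended to search every column at once, which is closer to what the question asks for. The sketch below is one possible approach, not the only one. The helper name rows_containing and the sample data are made up for illustration, and it assumes that treating every cell as text is acceptable:

```python
import pandas as pd

def rows_containing(df, search_term):
    """Return the rows of df in which any cell contains search_term."""
    # Cast every cell to str so numeric columns do not break the .str accessor
    as_text = df.astype(str)
    # One boolean column per original column; regex=False means a literal match,
    # na=False treats missing values as non-matches
    hits = as_text.apply(
        lambda col: col.str.contains(search_term, regex=False, na=False)
    )
    # Keep the rows where at least one column matched
    return df[hits.any(axis=1)]

# Hypothetical data with string, missing, and integer values
df = pd.DataFrame({
    "sentence": ["a quick brown fox", "lazy dog", None],
    "tag": ["animal", "foxhound", "none"],
    "n": [1, 2, 3],
})
matches = rows_containing(df, "fox")
```

One caveat: depending on the pandas version, casting can turn missing values into text such as "None" or "nan", so a search for those literal strings may match empty cells.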

3 Comments

The "regex" argument should be set to "False" if regex search isn't wanted.
@PascalVKooten in your example with "your_col", how can I iterate through the columns? I am currently using iterrows, but I do not see an equivalent for columns.
[x for x in df.columns]
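As a sketch of what iterating over columns could look like: DataFrame.items() yields (column name, Series) pairs and is the column-wise counterpart of iterrows. The data and the variable name columns_with_match below are made up for illustration:

```python
import pandas as pd

# Hypothetical data: one string column and one integer column
df = pd.DataFrame({
    "sentence": ["a quick brown fox", "lazy dog"],
    "n": [1, 2],
})

columns_with_match = []
for name, column in df.items():
    # astype(str) guards against non-string columns; na=False skips missing values
    if column.astype(str).str.contains("fox", regex=False, na=False).any():
        columns_with_match.append(name)
```

Here columns_with_match ends up holding the names of the columns in which at least one cell contains the search term.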
