Python Pandas - replace all values in dataframe where the value meets certain condition

Question

I have a dataframe that contains numbers represented as strings, with a comma as the thousands separator (e.g. 150,000). Some values are represented by "-" instead.

I'm trying to convert all the numbers that are represented as strings into floats. The "-" values should remain as they are.

My current code uses a for loop to iterate over each column and row and check whether each cell contains a comma. If so, it removes the comma and then converts the value to a number.

This works fine most of the time except some of the dataframes have duplicated column names and that's when it falls apart.

Is there a more efficient way of doing this update (i.e. without loops) that also avoids the problem with duplicated column names?

Current code:

    for col in statement_df.columns:
        row = 0
        while row < len(statement_df.index):

            row_name = statement_df.index[row]

            if statement_df[col][row] == "-":
                # do nothing
                print(statement_df[col][row])

            elif statement_df[col][row].find(",") >= 0:
                #statement_df.loc[col][row] = float(statement_df[col][row].replace(",",""))
                x = float(statement_df[col][row].replace(",",""))
                statement_df.at[row_name, col] = x
                print(statement_df[col][row])

            else:

                x = float(statement_df[col][row])
                statement_df.at[row_name, col] = x
                print(statement_df[col][row])

            row = row + 1

Prince Francis · Accepted Answer · 2020-04-27 14:47:30Z

1

Use str.replace(',', '') directly on the dataframe column.

For a dataframe like the one below:

Name  Count
Josh  12,33
Eric  24,57
Dany  9,678

apply it like this:

df['Count'] = df['Count'].str.replace(',', '')
df

It will give you the following output

   Name Count
0  Josh  1233
1  Eric  2457
2  Dany  9678
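The same idea can be extended to every column at once. The following is a minimal sketch, assuming every cell is a string; the frame and its column names (including the repeated "Q1") are made up for illustration. It uses DataFrame.replace with regex=True, which substitutes inside each string and operates on the whole frame without selecting columns by label, so duplicated names should not get in the way.

```python
import pandas as pd

# Hypothetical frame with a duplicated column name, as described in the question.
df = pd.DataFrame(
    [["150,000", "-", "1,250"], ["2,000", "7", "-"]],
    columns=["Q1", "Q2", "Q1"],
)

# regex=True makes replace() substitute inside each string
# instead of matching whole cell values only.
no_commas = df.replace(",", "", regex=True)
print(no_commas)
```

The values are still strings at this point; only the commas are gone, and the "-" cells are untouched.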


1 Comment

spongey Over a year ago

Thanks, that works - at least it removes all occurrences of ",". However, once the "," has been removed, the "number" is still technically a string. How can I convert to float numbers, bearing in mind that I still have occurrences of "-" which I want to leave unchanged?
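One possible way to handle the follow-up in this comment is sketched below. It assumes the cells are strings, and the helper name to_float_keep_dash and the sample frame are hypothetical. pd.to_numeric with errors="coerce" turns anything non-numeric (such as "-") into NaN, and Series.where then keeps the original text in exactly those cells.

```python
import pandas as pd

def to_float_keep_dash(col):
    """Convert numeric-looking strings in one column to floats; leave the rest as-is."""
    raw = col.astype(object)                            # object dtype can hold floats and strings
    cleaned = raw.str.replace(",", "", regex=False)     # drop thousands separators
    nums = pd.to_numeric(cleaned, errors="coerce").astype(float)  # "-" becomes NaN
    # Keep the original cell where conversion failed, otherwise take the float.
    return raw.where(nums.isna(), nums)

# Hypothetical frame with a duplicated column name.
df = pd.DataFrame(
    [["150,000", "-", "1,250"], ["2,000", "7", "-"]],
    columns=["Q1", "Q2", "Q1"],
)

# DataFrame.apply passes each column to the helper in turn.
converted = df.apply(to_float_keep_dash)
print(converted)
```

Because floats and "-" share a column, the resulting columns have object dtype. Any other non-numeric text would also be left unchanged by this sketch, not just "-".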

Tom Ron · Accepted Answer · 2020-04-27 14:45:42Z

0

You can use iloc for that. It selects columns by position, so duplicated column names are not a problem:

for idx in range(len(df.columns)):
    df.iloc[:, idx] = df.iloc[:, idx].apply(your_function)

your_function should handle a single cell value. For example:

def your_function(x):
    if x == '-':
        return x  # leave the "-" placeholder unchanged
    return float(x.replace(',', ''))  # strip thousands separators, then convert
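To illustrate why positional selection helps when names repeat, here is a small sketch; the frame and the "Amount" label are made up for illustration. Selecting a duplicated label returns every matching column, while iloc always returns exactly one.

```python
import pandas as pd

# Hypothetical frame in which both columns share the same name.
df = pd.DataFrame(
    [["1,000", "-"], ["2", "3,500"]],
    columns=["Amount", "Amount"],
)

by_label = df["Amount"]      # duplicated label: both columns come back as a DataFrame
by_position = df.iloc[:, 0]  # position is unambiguous: a single Series

print(type(by_label).__name__, type(by_position).__name__)
```

This is presumably why cell-by-cell code such as statement_df[col][row] breaks on duplicated names: statement_df[col] is then no longer a single column.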

