I have this code:

    for row in range(len(df[col])):
        df[col][row] = int(df[col][row].replace(',','')) 
    df[col] = df[col].astype(int)
    df[col] = np.round(df[col]/500)*500  #rounds the numbers to the closest 500 multiple.
    df[col] = df[col].astype(int) #round returns a float, this turns it back to int after rounding  

In the for loop, df[col][row].replace(',','') removes the commas from numbers that are stored as strings (object dtype), such as 1,430, and int() then converts the result to an integer such as 1430.

I then have to add df[col] = df[col].astype(int), because otherwise the following np.round() throws the error: 'float' object has no attribute 'rint'

After np.round() I have to add .astype(int) again, because np.round() returns floats and I want ints.

Execution takes considerably long, even though my dataframe is only 32 x 17.

Is there any way I could improve it?

2 Answers

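Would a more general replace using a lambda function, df[col].apply(lambda x: x.replace(',','')), be more suitable and time-efficient? (Series.apply passes each value to the lambda as a plain string, so it is x.replace rather than x.str.replace.)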

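And would a one-liner like this not yield what you are after? Note the .round() before the cast: astype(int) on its own truncates toward zero instead of rounding to the closest multiple.

    df[col] = (df[col] / 500).round().astype(int) * 500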
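The difference between casting directly and rounding before the cast can be checked on a few values. This is only a small sketch; the sample numbers are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample values, chosen only to show the difference.
s = pd.Series([1430, 1760, 2100])

# Casting to int drops the fractional part, so values are pushed down.
truncated = (s / 500).astype(int) * 500

# Rounding first picks the closest multiple of 500, then the cast is safe.
rounded = (s / 500).round().astype(int) * 500

print(truncated.tolist())
print(rounded.tolist())
```

Be aware that exact halves (for example 2250 / 500 = 4.5) are rounded to the nearest even number by pandas and NumPy, so they may not always round up.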

1 Comment

Using apply isn't necessary when you are working on a single column; you can just use df[col].str.replace(',','') ;)
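Putting the suggestions in this thread together, the whole loop can probably be replaced by two vectorized statements. The following is a minimal sketch; the DataFrame contents and the column name "price" are assumptions made for the example, since the question does not show the data.

```python
import pandas as pd

# Hypothetical data standing in for the question's DataFrame.
df = pd.DataFrame({"price": ["1,430", "1,760", "2,100"]})
col = "price"

# Strip the thousands separators from the whole column at once, then cast.
# regex=False treats "," as a literal string rather than a pattern.
df[col] = df[col].str.replace(",", "", regex=False).astype(int)

# Round to the closest multiple of 500 and cast back to int.
df[col] = (df[col] / 500).round().astype(int) * 500

print(df[col].tolist())
```

This also avoids the element-wise df[col][row] = ... assignment from the question, which is chained indexing and is likely a large part of the slowdown.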

Don't loop with for row in range(len(df[col])): iterate over the values directly with for row in df[col].

Or, instead of the for loop, use DataFrame.replace to replace one string with another. Pass regex=True so that it matches the comma inside each value rather than only cells that are exactly ','.

Or, better, use a lambda with DataFrame.apply.
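As a rough illustration of the DataFrame.replace route, the sketch below cleans several columns in one call. The column names and values are invented for the example, and it assumes every column holds comma-formatted integers.

```python
import pandas as pd

# Hypothetical frame with comma-formatted numbers stored as strings.
df = pd.DataFrame({"a": ["1,430", "2,000"], "b": ["12,500", "750"]})

# regex=True makes replace match the comma inside each string;
# without it, only cells equal to "," would be replaced.
cleaned = df.replace(",", "", regex=True).astype(int)

print(cleaned)
```

Because replace works on the whole frame, no per-column or per-row loop is needed in this case.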
