1

I have the following dataframe in pandas.

  order_id    no
  1           1,234,450,445.00 
  2           1,234,450,446.00
  3           1,234,450,447.00

I want to convert the no column to integer. The following is my desired dataframe.

  order_id    no
  1           1234450445 
  2           1234450446
  3           1234450447

When I check df.dtypes, the no column shows as float64.

I tried the following:

df['no'] = (pd.to_numeric(df['no'].str.replace(',',''), errors='coerce'))

How can I convert this to integer in pandas?

9
  • Try df.no.str.replace(",", "").str[:-3].astype(int). If you are reading the file in as a CSV, pd.read_csv has a thousands parameter. Commented Jul 5, 2020 at 4:56
  • @sammywemmy It gives me the following error: TypeError: only integer scalar arrays can be converted to a scalar index Commented Jul 5, 2020 at 4:58
  • What do you get when you use just df.no.str.replace(",", "").str[:-3]? Commented Jul 5, 2020 at 4:59
  • @sammywemmy Same error Commented Jul 5, 2020 at 5:00
  • @Neil try this, df.no.str.replace(",", "").astype(float).astype(int) Commented Jul 5, 2020 at 5:03
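
The thousands parameter mentioned in the first comment can be sketched as follows. This assumes the data is loaded with pd.read_csv; the inline CSV text is a made-up stand-in for the real file, which the question does not show.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for the real file.
csv_text = (
    "order_id,no\n"
    '1,"1,234,450,445.00"\n'
    '2,"1,234,450,446.00"\n'
    '3,"1,234,450,447.00"\n'
)

# thousands="," asks the parser to drop the separators while reading numbers.
df = pd.read_csv(io.StringIO(csv_text), thousands=",")

# The ".00" suffix makes the column parse as float, so cast it afterwards.
df["no"] = df["no"].astype("int64")
print(df.dtypes)
```

Handling the separators at read time avoids any string manipulation later, but it only applies if the column has not already been parsed.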

2 Answers 2

3

One way is to convert to float first, then to int:

df['no'].str.replace(',','').astype(float).astype(int)

Output:

0    1234450445
1    1234450446
2    1234450447
Name: no, dtype: int64

Or remove the trailing '.00' from the end of every row:

df['no'].str.replace(r'\.00$', '', regex=True).str.replace(',','').astype(int)

3 Comments

It still gives me this error: TypeError: only integer scalar arrays can be converted to a scalar index. Moreover, the no column is already in float64 format.
When I try to convert it to string with df['no'] = df['no'].astype(str), it gives the same error: TypeError: only integer scalar arrays can be converted to a scalar index
@Neil Can you run this on the first five rows, and does it succeed? You need to find the errant data. If it succeeds, try the next few rows until you find the row that is causing the problem. Post that row to this question if you are unsure how to fix it.
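
Rather than checking a few rows at a time, the errant values can be located in one pass. This is a minimal sketch, assuming the problem is a value that does not parse as a number; the sample frame and its malformed "n/a" entry are invented for illustration and do not come from the question.

```python
import pandas as pd

# Hypothetical sample; the third value is deliberately malformed.
df = pd.DataFrame({
    "order_id": [1, 2, 3],
    "no": ["1,234,450,445.00", "1,234,450,446.00", "n/a"],
})

# Remove the separators, then coerce: anything unparseable becomes NaN.
cleaned = pd.to_numeric(
    df["no"].astype(str).str.replace(",", "", regex=False),
    errors="coerce",
)

# Rows where coercion failed are the ones to inspect.
bad_rows = df[cleaned.isna()]
print(bad_rows)
```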
0

Hope this helps.

import pandas as pd
import numpy as np
df['no'] = df['no'].astype(str).apply(lambda x: "".join(x.split(","))).astype(np.float64).astype(np.int64)

or

df['new_no'] = df['no'].astype(str).apply(lambda x: int(float("".join(x.split(",")))))
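
Since the comments suggest the column may already be float64 rather than text, a dtype check before cleaning might help. The sketch below uses a hypothetical helper, to_int_column, that is not part of pandas; it assumes there are no missing values, because casting NaN to int64 raises an error.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def to_int_column(col):
    """Hypothetical helper: return col as int64, whether it holds strings or floats."""
    if not is_numeric_dtype(col):
        # Text column: drop the thousands separators and parse as numbers.
        col = pd.to_numeric(col.astype(str).str.replace(",", "", regex=False))
    # Numeric column (or freshly parsed floats): cast straight to int64.
    return col.astype("int64")


as_text = pd.Series(["1,234,450,445.00", "1,234,450,446.00"])
as_float = pd.Series([1234450445.0, 1234450446.0])

print(to_int_column(as_text))
print(to_int_column(as_float))
```

If the column is already float64, the .str accessor is not needed at all and df['no'].astype('int64') alone should be enough.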
