I am working with a pandas DataFrame in which one column, say col1, holds floating-point values. I am trying to divide each value by a predefined constant, say A, and save the results as integers.

A = 0.5

Following is the data in col1

df["col1"]

0     0.800000
1     0.883333
2     0.883333
3     1.000000
4     1.000000
5     1.300000
6     1.300000
7     1.500000
8     1.500000
9     2.000000
10    2.000000
11    2.500000
12    2.500000

After applying

df["new_col"] = (df["col1"] / A)

It gives

0     1.600000
1     1.766667
2     1.766667
3     2.000000
4     2.000000
5     2.600000
6     2.600000
7     3.000000
8     3.000000
9     4.000000
10    4.000000
11    5.000000
12    5.000000

which is fine, but as soon as I add .astype(int) to the above code, the values at index 9 and 10 become 3 and 3, whereas they should be 4 and 4.

df["new_col"] = (df["col1"] / A).astype(int)
df["new_col"]

0     1
1     1
2     1
3     2
4     2
5     2
6     2
7     3
8     3
9     3
10    3
11    5
12    5

The other approaches I have tried are

df["new_col"] = math.floor(df["col1"] / A)

and

df["new_col"] = int( df["col1"] / A)

Both raise a TypeError:

TypeError: cannot convert the series to class 'float' and TypeError: cannot convert the series to class 'int', respectively.

Please let me know how I should resolve these issues.
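For reference, math.floor and int expect a single number, which is why they fail on a whole column. A minimal sketch of an element-wise alternative, assuming NumPy is available; the short sample frame is a hypothetical subset of the values shown above:

```python
import numpy as np
import pandas as pd

# Hypothetical subset of the col1 values from the question
df = pd.DataFrame({"col1": [0.8, 0.883333, 1.0, 1.3, 1.5, 2.0, 2.5]})
A = 0.5

# np.floor operates on every element of the Series, unlike math.floor,
# which tries to convert the whole Series to one float
df["new_col"] = np.floor(df["col1"] / A).astype(int)
```

This removes the TypeError, but on its own it does not help when a stored value is slightly below the integer that is displayed.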

  • Weird. I am getting 4, 4 in both locations. Commented Aug 9, 2017 at 21:24
  • I guess, I should restart my system then and try again, otherwise I've tried the code 10 times and it shows me 3 and 3. Commented Aug 9, 2017 at 21:30
  • That's probably a floating point issue. If it's represented as 3.99999 astype will round it down to 3. Try round(0) maybe? Commented Aug 9, 2017 at 21:32
  • @ayhan I did what you suggested and used .round(0), but the problem is that it works like a ceiling function: it turns values like 2.6 and 2.87 into 3, whereas I want them to be 2. What else can I try? Commented Aug 9, 2017 at 22:01

1 Answer

You probably have a rounding issue. What is displayed as 4.000000 is probably stored as something like 3.9999999999, and astype(int) truncates that to 3. (Try df.col1 - 2 to check.)
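A small sketch of that check. The value 1.9999999 is a hypothetical stand-in for whatever is actually stored in the column; the point is only that six-decimal display can hide the difference:

```python
import pandas as pd

# Hypothetical data: the first value is slightly below 2
s = pd.Series([1.9999999, 2.0], name="col1")

# What a six-decimal display shows for each value
shown = s.map("{:.6f}".format)

# Full-precision view of the same values
full = s.map(repr)

# Subtracting the expected value exposes the hidden residual
residual = s - 2
```

Here both entries of shown read "2.000000", while residual is negative for the first row and exactly zero for the second.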

You can try (df.round(6)/.5).astype(int) to work with the digits you see, but that is only a workaround.
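A sketch of how that workaround behaves, again using a hypothetical stored value of 1.9999999 in place of the real data:

```python
import pandas as pd

# Hypothetical data: 1.9999999 displays as 2.000000
df = pd.DataFrame({"col1": [1.3, 1.9999999, 2.0, 2.5]})
A = 0.5

# Truncating the raw quotient: 3.9999998 becomes 3
naive = (df["col1"] / A).astype(int)

# Rounding to the six displayed decimals first: 2.0 / 0.5 becomes 4
rounded_first = (df["col1"].round(6) / A).astype(int)
```

The choice of six decimals is an assumption that anything closer than about 1e-6 to a round value is noise. If real data can differ by less than that, this will merge values that should stay distinct.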

3 Comments

What if all the values in the column are negative? In the case of -3.999999, will it be rounded to -4?
It's dangerous to continue this way. I think if your computations lead to 3.999 when 4 is the correct result, you must refactor this part.
I am still struggling with this. One of the values in the column is now -7.680000 and I am dividing it by 0.04000; the result should be -192, but I get -191, which affects my overall result. Please suggest the refactoring you mentioned. I am sorry, I haven't encountered this type of problem before, so I am running out of ideas. I have a column that contains only negative floating-point numbers, which I will divide by a positive floating-point number, keeping only the integer part of the result (like floor, not ceiling).
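One possible approach for the negative case, offered as a sketch rather than the refactoring the answerer had in mind: round the quotient to a tolerance, then take the floor. The six-decimal tolerance and the sample values other than -7.68 are assumptions.

```python
import numpy as np
import pandas as pd

# -7.68 is from the comment above; the other values are hypothetical
s = pd.Series([-7.68, -2.6, -0.1])
divisor = 0.04

# astype(int) truncates toward zero, which is not a floor for negatives
truncated = (s / divisor).astype(int)

# Round away float noise (assumed tolerance: 6 decimals), then floor
quotient = (s / divisor).round(6)
floored = np.floor(quotient).astype(int)
```

Rounding the quotient rather than the input matters here, because the division itself can introduce the error even when the stored input is exact to the displayed digits.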