7

I have a list of floats (actually it's a pandas Series object, if it changes anything) which looks like this:

mySeries:

...
22      16.0
23      14.0
24      12.0
25      10.0
26       3.1
...

(So elements of this Series are on the right, indices on the left.) Then I'm trying to assign the elements from this Series as keys in a dictionary, and indices as values, like this:

{ mySeries[i]: i for i in mySeries.index }

and I'm getting pretty much what I wanted, except that...

{ 6400.0: 0, 66.0: 13, 3.1000000000000001: 23, 133.0: 10, ... }

Why has 3.1 suddenly changed into 3.1000000000000001? I guess this has something to do with the way the floating point numbers are represented (?) but why does it happen now and how do I avoid/fix it?

EDIT: Please feel free to suggest a better title for this question if it's inaccurate.

EDIT2: Ok, so it seems that it's the exact same number, just printed differently. Still, if I assign mySeries[26] as a dictionary key and then I try to run:

myDict[mySeries[26]]

I get a KeyError. What's the best way to avoid it?

2
  • did you try MySeries.astype(float).to_dict() Commented Oct 6, 2016 at 17:09
  • @StevenG I'm trying to do this the other way around: to have indices as values. Anyway, I don't think it would solve this problem. Commented Oct 6, 2016 at 17:23

2 Answers

10

The dictionary isn't changing the floating-point representation of 3.1; it is displaying the stored value at full precision. When you print the Series, pandas truncates the display precision and shows a rounded approximation.

You can prove this:

pd.set_option('display.precision', 20)

Then view mySeries.

0    16.00000000000000000000
1    14.00000000000000000000
2    12.00000000000000000000
3    10.00000000000000000000
4     3.10000000000000008882
dtype: float64
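The same effect can be reproduced without pandas. The following is a minimal sketch using only the standard library; it assumes Python 3 (where repr prints the shortest string that round-trips), and the variable names are illustrative only:

```python
from decimal import Decimal

value = 3.1  # the literal is stored as the nearest binary double

# Several views of the same stored number; only the number of digits shown differs.
shortest = repr(value)           # shortest string that round-trips to the same double
seventeen = "%.17g" % value      # 17 significant digits, like the dictionary output
twenty = format(value, ".20f")   # 20 decimal places, like the pandas output above
exact = Decimal(value)           # exact decimal expansion of the stored double

print(shortest)
print(seventeen)
print(twenty)
print(exact)
```

All four lines describe one and the same double. Converting any of the printed strings back with float() gives a value equal to the original.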

EDIT:

"What Every Computer Scientist Should Know About Floating-Point Arithmetic" is always a good read.

EDIT:

Regarding the KeyError, I was not able to replicate the problem.

>> x = pd.Series([16,14,12,10,3.1])
>> a = {x[i]: i for i in x.index}
>> a[x[4]]
4
>> a.keys()
[16.0, 10.0, 3.1000000000000001, 12.0, 14.0]
>> hash(x[4])
2093862195
>> hash(a.keys()[2])
2093862195
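The matching hashes above are what makes the lookup succeed. As a small sketch in plain Python 3 (no pandas; the names a, b and lookup are made up for illustration), the short and long spellings parse to the same double, so they compare equal, hash equal, and find the same dictionary entry:

```python
a = 3.1
b = float("3.1000000000000001")  # the longer spelling seen in the dictionary

same_value = (a == b)             # both strings parse to the same double
same_hash = (hash(a) == hash(b))  # equal floats always have equal hashes

lookup = {a: 4}
found = lookup[b]                 # succeeds because b is the same key as a

print(same_value, same_hash, found)
```

The actual hash value depends on the Python version and platform, so it will generally not match the number shown in the session above.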

6

The value is already that way in the Series:

>>> x = pd.Series([16,14,12,10,3.1])
>>> x
0    16.0
1    14.0
2    12.0
3    10.0
4     3.1
dtype: float64
>>> x.iloc[4]
3.1000000000000001

This has to do with floating point precision:

>>> np.float64(3.1)
3.1000000000000001

See Floating point precision in Python array for more information about this.

Concerning the KeyError in your edit, I was not able to reproduce it. See below:

>>> d = {x[i]:i for i in x.index}
>>> d
{16.0: 0, 10.0: 3, 12.0: 2, 14.0: 1, 3.1000000000000001: 4}
>>> x[4]
3.1000000000000001
>>> d[x[4]]
4

My suspicion is that the KeyError is coming from the Series: what is mySeries[26] returning?
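For comparison, a float key lookup can genuinely fail when the value used for the lookup comes from a separate computation and is not bit-identical to the stored key. The sketch below illustrates that case; it is not a diagnosis of your code. The helper make_key and its ndigits default are hypothetical, and rounding is only one possible mitigation:

```python
def make_key(x, ndigits=9):
    """Hypothetical helper: normalize a float before using it as a dict key."""
    return round(x, ndigits)

computed = 0.1 + 0.2  # arithmetic result, not bit-identical to the literal 0.3

raw = {0.3: "found"}
raw_hit = computed in raw  # the computed value is a different double

rounded = {make_key(0.3): "found"}
rounded_hit = rounded[make_key(computed)]  # both sides round to the same double

print(raw_hit, rounded_hit)
```

Rounding is a trade-off: two values that sit on opposite sides of a rounding boundary still end up as different keys, so where possible it is safer to reuse the exact stored value (as d[x[4]] does above) than to recompute it.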

2 Comments

Thanks @brianpck, you're right, the problem was indeed caused by something else in my code (unrelated to floating point representation). Pandas Series works fine.
@machaerus If I were you, I'd edit the question to indicate that the KeyError turned out to be unrelated to the precision issue.
