
I know that Python maintains an internal cache of small-ish integers rather than creating them anew at runtime:

id(5)
4304101544

When I repeat this code after some time in the same kernel, the id is stable:

id(5)
4304101544
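
For example, CPython caches integers in a small range (an implementation detail, commonly -5 to 256), so equal small ints entered separately are the same object, while larger ones are not:

x = 256
y = 256
x is y
True

x = 257
y = 257
x is y
False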

I thought this wouldn't work for floating-point numbers, because Python can't possibly maintain a pre-calculated list of all floating-point numbers.
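
Indeed, two equal floats created in separate statements at the REPL are normally distinct objects:

x = 5.33
y = 5.33
x is y
False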

However, this code returns the same id twice:

id(4.33+1), id(5.33)
(5674699600, 5674699600)

After some time, repeating the same code returns a different location in memory:

id(4.33 + 1), id(5.33)
(4962564592, 4962564592)

What's going on here?

  • When you write, for example, id(A(4.33) + A(1)), id(A(5.33)), the result of A(4.33) + A(1) can be garbage collected before A(5.33) is constructed: they aren't the same object, they're two objects that occupy the same memory address at non-overlapping times (see the sketch after these comments). Commented Jul 5, 2023 at 20:07
  • See: stackoverflow.com/questions/50893267/… and stackoverflow.com/questions/24802740/… Commented Jul 5, 2023 at 20:10
  • I think the explanation for the floats is likely a different one, i.e. constant folding by the interpreter, so that the bytecode treats 4.33 + 1 and 5.33 as the same literal, though I haven't found a definitive statement of this. But that isn't what's happening with your class instances, since two different instances of a user-defined class can't be the same object. Commented Jul 5, 2023 at 20:15
  • reopening because the OP has some code that can trigger different optimizations, and the meaning of the id equality will vary. Also, the question itself differs from the question marked as a dupe. Commented Jul 5, 2023 at 20:45
  • It has nothing to do with the value that was calculated. In fact, it's only coincidental that the floating-point addition shown here "works" even numerically. For example, 0.1 + 0.2 == 0.3 gives a False result. Commented Jun 12, 2024 at 6:54
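
A minimal sketch of the first comment's scenario, using a hypothetical user-defined class A whose instances can never be cached or folded:

class A:
    def __init__(self, v):
        self.v = v
    def __add__(self, other):
        return A(self.v + other.v)

id(A(4.33) + A(1)), id(A(5.33))  # the ids often coincide, yet these are two distinct, short-lived objects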

2 Answers


The id mechanics in CPython are not only implementation dependent: they also depend on several runtime optimizations that may or may not be triggered by subtle code or context changes, along with the current interpreter state - and they should never, ever - NOT EVEN THIS ONCE - be relied upon.

That said, what you hit is a completely different mechanism from the small-integer caching - what you have is reuse of space in the interpreter's object memory pool.

In this case, you are hitting the cache of float constants within a single code block, along with a compile-time optimization (constant folding) which resolves constant expressions such as 4.33 + 1 at compile time (even if "compiling" happens instantly when you press Enter in the REPL):

In [39]: id(1 + 4.33), id(5.33)
Out[39]: (139665743642672, 139665743642672)

^ Here the two ids refer to the very same object - 1 + 4.33 is folded to 5.33 at compile time, and equal constants in one code block are merged - so even keeping a reference to the first float would not change the result: this is one kind of optimization.

What could also be happening with id(4.33+1), id(5.33), absent that folding, is this, under the hood: Python instantiates (or fetches from the code object's constants table) the 4.33, then the 1 (which will usually hit the optimization path for reusing small integers - but do not rely on that either), resolves the + and instantiates 5.33. It then uses this number in the call to id; when that call returns, there are no remaining references to 5.33 and the object is deleted. Then, after the comma, Python instantiates a new 5.33 - by coincidence, in the same memory location the previous 5.33 occupied - and the numbers happen to match.
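
That reuse can be seen in isolation by computing the same sum twice with a variable involved, so no folding is possible (a sketch, assuming one is bound to 1 first):

In [40]: one = 1
    ...: id(one + 4.33), id(one + 4.33)

Each addition creates a fresh float; the first is already freed when the second is allocated, so the two ids usually coincide - same address, never the same object.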

Just keep a reference to the first number around, and you will see different ids:

In [41]: id(old:=(one + 4.33)), id(5.33)
Out[41]: (139665742657488, 139665743643856)

With a reference kept to the first number, and no binary operation on literals (which is optimized away at text -> bytecode time): different objects.


5 Comments

For id(4.33+1), id(5.33) is it possible there is constant folding? So that 4.33+1 in the source is turned into 5.33 in the bytecode by the interpreter?
yes, that optimization does happen - and it even broke my initial demonstration, leading me to add a new paragraph mentioning it midway.
It does occur, and I showed it using dis: for def f(a): a = 4.33 + 1; return a, dis.dis(f) shows LOAD_CONST 1 (5.33). I feel insecure about reopening the question.
Feel free to detail that in an answer if you will. I reopened the question since this one is specifically about floats.
While the implementation happens to use a "cache for float objects", it would be perfectly possible to get the observed result without that - because each object can be garbage-collected after its id is calculated, and there's nothing to prevent the second one from being allocated in the same memory where the first was.

It's not just that the object is garbage collected and the new object is stored in the same location as the previous one after garbage collection.

Something different is at work here.

We can use the dis module to look at the bytecode generated:

import dis

def f():
    one, two = 4.3333333, 3.3333333 + 1.
    a, b = id(one), id(two)
    return one, two, a, b

dis.dis(f)
one, two, a, b = f()
print((one, two, a, b))

shows us the bytecode generated:

  1           0 RESUME                   0
 
  2           2 LOAD_CONST               1 ((4.3333333, 4.3333333))
              4 UNPACK_SEQUENCE          2
              8 STORE_FAST               0 (one)
             10 STORE_FAST               1 (two)

  3          12 LOAD_GLOBAL              1 (NULL + id)
             24 LOAD_FAST                0 (one)
             26 PRECALL                  1
             30 CALL                     1
             40 LOAD_GLOBAL              1 (NULL + id)
             52 LOAD_FAST                1 (two)
             54 PRECALL                  1
             58 CALL                     1
             68 STORE_FAST               3 (b)
             70 STORE_FAST               2 (a)

  4          72 LOAD_FAST                0 (one)
             74 LOAD_FAST                1 (two)
             76 LOAD_FAST                2 (a)
             78 LOAD_FAST                3 (b)
             80 BUILD_TUPLE              4
             82 RETURN_VALUE
(4.3333333, 4.3333333, 12424698960, 12424698960)

The ids of one and two are also stable over time:

>>> id(one), id(two)
(12424698960, 12424698960)

They are indeed the same object, because the interpreter folds the addition into a constant before the bytecode is generated.
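
The same folding can be checked on the question's original expression (a sketch; the exact bytecode varies across CPython versions):

import dis

dis.dis(compile("id(4.33 + 1), id(5.33)", "<demo>", "eval"))

On recent CPython versions, both id() calls end up loading the same constant 5.33: the addition is folded at compile time, and equal constants within one code object are merged into a single entry, so the two calls receive the very same object.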

