Detecting unexpected type conversion in Python

Question

I have a piece of complex Python code that uses 32-bit numerical values (to save memory and bandwidth). Later I discovered that many of these 32-bit numbers were implicitly converted to 64-bit in some high-level functions. For example, the sum function, by default, can turn a 32-bit array into a 64-bit number.

In [152]: x32
Out[152]:
array([  0.      ,   1.010101,   2.020202,   3.030303,   4.040404,
         5.050505,   6.060606,   7.070707,   8.080808,   9.090909,
        10.10101 ,  11.111111,  12.121212,  13.131313,  14.141414,
        15.151515,  16.161615,  17.171717,  18.181818,  19.19192 ,
        20.20202 ,  21.212122,  22.222221,  23.232323,  24.242424,
        25.252525,  26.262627,  27.272728,  28.282827,  29.292929,
        30.30303 ,  31.313131,  32.32323 ,  33.333332,  34.343433,
        35.353535,  36.363636,  37.373737,  38.38384 ,  39.39394 ,
        40.40404 ,  41.414143,  42.424244,  43.434345,  44.444443,
        45.454544,  46.464645,  47.474747,  48.484848,  49.49495 ,
        50.50505 ,  51.515152,  52.525253,  53.535355,  54.545456,
        55.555557,  56.565655,  57.575756,  58.585857,  59.59596 ,
        60.60606 ,  61.61616 ,  62.626263,  63.636364,  64.64646 ,
        65.65656 ,  66.666664,  67.676765,  68.68687 ,  69.69697 ,
        70.70707 ,  71.71717 ,  72.72727 ,  73.73737 ,  74.747475,
        75.757576,  76.76768 ,  77.77778 ,  78.78788 ,  79.79798 ,
        80.80808 ,  81.818184,  82.828285,  83.83839 ,  84.84849 ,
        85.85859 ,  86.86869 ,  87.878784,  88.888885,  89.89899 ,
        90.90909 ,  91.91919 ,  92.92929 ,  93.93939 ,  94.94949 ,
        95.959595,  96.969696,  97.9798  ,  98.9899  , 100.      ],
      dtype=float32)

In [153]: sum(x32)
Out[153]: 4999.999972701073

In [154]: type(sum(x32))
Out[154]: numpy.float64

The reason sum(x32) is 64-bit in this case appears to be the default accumulator of sum, 0, as shown here:

In [156]: type(sum(x32, start=np.float32(0)))
Out[156]: numpy.float32
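
The same effect can be reproduced in a standalone script. This is a minimal sketch, assuming NumPy is installed. The array is rebuilt with np.linspace as a stand-in for x32 (an assumption, since the original construction is not shown). The dtype returned by plain sum(x32) depends on the NumPy version, because the rules for mixing Python ints with NumPy scalars changed in NumPy 2, so it is printed rather than assumed.

```python
import numpy as np

# Stand-in for the x32 array from the question (assumed construction).
x32 = np.linspace(0, 100, 100, dtype=np.float32)

# Built-in sum starts from the Python int 0; whether that promotes the
# float32 elements to float64 depends on the NumPy version in use.
builtin_dtype = np.asarray(sum(x32)).dtype

# An explicit float32 start value keeps the accumulator in float32.
float32_start_dtype = sum(x32, np.float32(0)).dtype

# np.sum keeps the input dtype for floating-point arrays.
np_sum_dtype = np.sum(x32).dtype

# For small integer dtypes np.sum accumulates in the platform default
# integer (often 64-bit) unless dtype= is passed explicitly.
i32 = np.arange(10, dtype=np.int32)
int_default_dtype = np.sum(i32).dtype
int_forced_dtype = np.sum(i32, dtype=np.int32).dtype

print(builtin_dtype, float32_start_dtype, np_sum_dtype)
print(int_default_dtype, int_forced_dtype)
```

The integer case is worth noting because it is another place where a 32-bit input can come back as 64-bit without any explicit request.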

Above, I use the sum function as an example to show that type conversion is everywhere when I use 32-bit inputs. I have changed the sum part to avoid this implicit conversion, but I would like to know whether there are other unexpected 32-bit -> 64-bit conversions inside my library calls. Is there a general programming-language solution for monitoring any possible type conversion? For example, can I run my Python code with some special debugging tool so that any conversion from 32-bit to 64-bit triggers an alarm or is logged?

You could use np.sum instead, keeping in mind that numpy will not report overflow and will not give the right answer if you exceed int32.
– jwal, Commented Nov 21, 2022 at 8:51
Thanks. I used the sum function as an example to show that type conversion is everywhere when I use 32-bit inputs. So the question is: how can we monitor these type conversions systematically?
– zell, Commented Nov 21, 2022 at 10:15
Why not save your initial dtype, carry out the operation, save the resulting dtype, and then assert that the two are equal? You could make a decorator to assist too. Not sure of the best way to decorate stackoverflow.com/questions/22600365/…
– jtlz2, Commented Nov 21, 2022 at 10:42
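
The decorator idea from the last comment could look roughly like the sketch below. It assumes NumPy is installed and that the array of interest is the first positional argument. The names warn_on_dtype_change, mean_as_float64 and total are hypothetical and only serve the illustration.

```python
import functools

import numpy as np


def warn_on_dtype_change(func):
    """Hypothetical decorator: report when the result dtype differs from
    the dtype of the first positional argument."""

    @functools.wraps(func)
    def wrapper(first, *args, **kwargs):
        dtype_in = np.asarray(first).dtype
        result = func(first, *args, **kwargs)
        dtype_out = np.asarray(result).dtype
        if dtype_out != dtype_in:
            # Keep a record as well as printing, so tests can inspect it.
            wrapper.mismatches.append((func.__name__, dtype_in, dtype_out))
            print(f"{func.__name__}: {dtype_in} -> {dtype_out}")
        return result

    wrapper.mismatches = []
    return wrapper


@warn_on_dtype_change
def mean_as_float64(values):
    # Deliberately widens the result, to show the check firing.
    return values.mean(dtype=np.float64)


@warn_on_dtype_change
def total(values):
    # np.sum keeps float32 for a float32 input.
    return np.sum(values)


x32 = np.linspace(0, 100, 100, dtype=np.float32)
mean_as_float64(x32)  # reported: the dtype changes
total(x32)            # not reported: the dtype is preserved
```

This only covers functions you can decorate yourself, so it does not see conversions that happen inside third-party code.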

jtlz2 · Accepted Answer · 2022-11-22 09:01:51Z

I think you are nearly there, to be honest.

import numpy as np

original_dtype = x32.dtype
# An explicit float32 start value keeps the accumulator in float32
new_dtype = sum(x32, start=np.float32(0)).dtype
assert new_dtype == original_dtype, f"dtypes differ, {new_dtype=} != {original_dtype=}"

To use this method globally, you can write something like:

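def type_checker_func(func, input_array, *args, **kwargs):
    # Call func and report when the result dtype differs from the input dtype
    dtype_orig = input_array.dtype
    # Forward keyword arguments too, so that start=... below is accepted
    result = func(input_array, *args, **kwargs)
    dtype_new = result.dtype
    if dtype_new != dtype_orig:
        print(f"dtypes differ, {dtype_new=} != {dtype_orig=}")
    return result

my_answer = type_checker_func(sum, x32, start=np.float32(0))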

But I am not sure how you would best handle multiple return values (consider np.histogram), all sorts of args, etc. etc.
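
One possible way to cope with multiple return values is to walk the result and collect every dtype found in it. The sketch below assumes NumPy is installed. The helpers iter_dtypes and widened_dtypes are hypothetical names, and "wider than the input" is measured by dtype.itemsize, which is a simplification.

```python
import numpy as np


def iter_dtypes(value):
    """Yield the dtype of every array or NumPy scalar inside a result,
    descending into tuples, lists and dict values."""
    if isinstance(value, (np.ndarray, np.generic)):
        yield value.dtype
    elif isinstance(value, (tuple, list)):
        for item in value:
            yield from iter_dtypes(item)
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_dtypes(item)


def widened_dtypes(func, input_array, *args, **kwargs):
    """Hypothetical helper: call func and return (result, widened), where
    widened lists the result dtypes wider than the input's dtype."""
    result = func(input_array, *args, **kwargs)
    limit = input_array.dtype.itemsize
    widened = [dt for dt in iter_dtypes(result) if dt.itemsize > limit]
    return result, widened


x32 = np.linspace(0, 100, 100, dtype=np.float32)
(counts, edges), widened = widened_dtypes(np.histogram, x32, bins=10)
print(widened)
```

With np.histogram the integer counts are likely to show up in widened on a 64-bit platform, even though they were never float32. Whether that counts as "unexpected" is a policy decision the check cannot make for you.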

I am also not sure how to invoke the type_checker_func globally / implicitly (if only for numpy fns).
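
One coarse way to get an implicit, global check is Python's own profiling hook, sys.setprofile, which is called on every Python-level function return with the returned object. The sketch below assumes NumPy is installed. watch_wide_returns, library_step and careful_step are hypothetical names. It flags every float64 or int64 return without knowing what the inputs were, it only affects the current thread, and it cannot see values returned by C-implemented functions, so treat it as a tripwire rather than a complete monitor.

```python
import sys
from contextlib import contextmanager

import numpy as np

WIDE = (np.dtype(np.float64), np.dtype(np.int64))


@contextmanager
def watch_wide_returns(log):
    """Hypothetical tripwire: while active, append (function name, file,
    dtype) to log for every Python-level function that returns a 64-bit
    NumPy array or scalar."""

    def profiler(frame, event, arg):
        # For 'return' events, arg is the object being returned.
        if event == "return" and isinstance(arg, (np.ndarray, np.generic)):
            if arg.dtype in WIDE:
                code = frame.f_code
                log.append((code.co_name, code.co_filename, arg.dtype))

    previous = sys.getprofile()
    sys.setprofile(profiler)
    try:
        yield log
    finally:
        # Restore whatever hook was installed before.
        sys.setprofile(previous)


def library_step(values):
    # Stand-in for a library call that silently widens its result.
    return values.astype(np.float64).sum()


def careful_step(values):
    # Keeps the accumulator in float32.
    return values.sum(dtype=np.float32)


x32 = np.linspace(0, 100, 100, dtype=np.float32)
hits = []
with watch_wide_returns(hits):
    library_step(x32)
    careful_step(x32)

for name, filename, dtype in hits:
    print(f"{name} ({filename}) returned {dtype}")
```

Python-level helper functions inside NumPy may appear in the log as well, which is arguably the point when hunting for conversions inside library calls. Expect a noticeable slowdown while the hook is active.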

Update: I posted a github question asking about doing this for every function call using line_profiler - see https://github.com/pyutils/line_profiler/issues/188 - fingers crossed.

edited Nov 22, 2022 at 9:01

answered Nov 21, 2022 at 10:46

jtlz2

3 Comments

zell Over a year ago

Thanks. The problem is how to invoke the type_checker_func globally and implicitly for all functions. My code consists of high-level TensorFlow code. It does not look easy to force the use of type_checker_func everywhere internally.

jtlz2 Over a year ago

@zell I think you could try hacking pypi.org/project/line-profiler or pypi.org/project/memory-profiler [no longer maintained in the case of the latter :( ]. Maybe post some issues on their respective github pages and see if they can assist?

jtlz2 Over a year ago

@zell I opened a line_profiler github issue - see here github.com/pyutils/line_profiler/issues/188
