I have a fairly complex piece of Python code that uses 32-bit numerical values to save memory and bandwidth. I later discovered that many of these 32-bit numbers are implicitly converted to 64-bit inside some high-level functions. For example, the sum function, by default, can turn a 32-bit array into a 64-bit number.

In [152]: x32
Out[152]:
array([  0.      ,   1.010101,   2.020202,   3.030303,   4.040404,
         5.050505,   6.060606,   7.070707,   8.080808,   9.090909,
        10.10101 ,  11.111111,  12.121212,  13.131313,  14.141414,
        15.151515,  16.161615,  17.171717,  18.181818,  19.19192 ,
        20.20202 ,  21.212122,  22.222221,  23.232323,  24.242424,
        25.252525,  26.262627,  27.272728,  28.282827,  29.292929,
        30.30303 ,  31.313131,  32.32323 ,  33.333332,  34.343433,
        35.353535,  36.363636,  37.373737,  38.38384 ,  39.39394 ,
        40.40404 ,  41.414143,  42.424244,  43.434345,  44.444443,
        45.454544,  46.464645,  47.474747,  48.484848,  49.49495 ,
        50.50505 ,  51.515152,  52.525253,  53.535355,  54.545456,
        55.555557,  56.565655,  57.575756,  58.585857,  59.59596 ,
        60.60606 ,  61.61616 ,  62.626263,  63.636364,  64.64646 ,
        65.65656 ,  66.666664,  67.676765,  68.68687 ,  69.69697 ,
        70.70707 ,  71.71717 ,  72.72727 ,  73.73737 ,  74.747475,
        75.757576,  76.76768 ,  77.77778 ,  78.78788 ,  79.79798 ,
        80.80808 ,  81.818184,  82.828285,  83.83839 ,  84.84849 ,
        85.85859 ,  86.86869 ,  87.878784,  88.888885,  89.89899 ,
        90.90909 ,  91.91919 ,  92.92929 ,  93.93939 ,  94.94949 ,
        95.959595,  96.969696,  97.9798  ,  98.9899  , 100.      ],
      dtype=float32)

In [153]: sum(x32)
Out[153]: 4999.999972701073

In [154]: type(sum(x32))
Out[154]: numpy.float64

In this case, sum(x32) is probably 64-bit because of sum's default accumulator, 0, as shown here:

In [156]: type(sum(x32, start=np.float32(0)))
Out[156]: numpy.float32  
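For comparison, here is a minimal sketch of how NumPy's own reduction behaves on the same kind of input. It assumes x32 can be rebuilt with np.linspace (the array above looks like 100 evenly spaced values from 0 to 100); the names total32 and total64 are only for illustration.

```python
import numpy as np

# Assumption: x32 is rebuilt with np.linspace to resemble the array shown above.
x32 = np.linspace(0, 100, 100, dtype=np.float32)

# NumPy's reduction accumulates in the array's own dtype unless told otherwise.
total32 = np.sum(x32)
print(total32.dtype)   # float32

# Passing dtype explicitly makes the accumulator choice visible at the call site.
total64 = x32.sum(dtype=np.float64)
print(total64.dtype)   # float64
```

Note that accumulating in float32 trades some precision for the smaller type, so the two totals need not agree to the last digit.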

I use the sum function above only as an example, to show that type conversions can happen anywhere when the inputs are 32-bit. I have already changed the sum call to avoid this implicit conversion, but I would like to know whether any other unexpected 32-bit -> 64-bit conversions happen inside my library calls. Is there a general solution for monitoring every possible type conversion? For example, can I run my Python code under some special debugging tool so that any conversion from 32-bit to 64-bit triggers an alarm or gets logged?

  • You could use np.sum instead, keeping in mind that numpy will not report overflow and will not give the right answer if you exceed int32. Commented Nov 21, 2022 at 8:51
  • Thanks. I used the sum function as an example, to show that type conversions are everywhere if I use 32-bit inputs. So the question is: how can we monitor these type conversions systematically? Commented Nov 21, 2022 at 10:15
  • Why not save your initial dtype, carry out the operation, save that dtype and then assert them to be equal? You could make a decorator to assist too. Not sure of the best way to decorate stackoverflow.com/questions/22600365/… Commented Nov 21, 2022 at 10:42
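The decorator idea from the last comment could look roughly like the sketch below. The names preserves_dtype, total and total_upcast are hypothetical, and the check assumes the array of interest is the first positional argument and that the function returns a single array or scalar.

```python
import functools
import numpy as np

def preserves_dtype(func):
    """Hypothetical decorator: raise if the result dtype differs from the input dtype."""
    @functools.wraps(func)
    def wrapper(array, *args, **kwargs):
        dtype_in = array.dtype
        result = func(array, *args, **kwargs)
        # np.asarray also covers plain Python floats, which become float64.
        dtype_out = np.asarray(result).dtype
        if dtype_out != dtype_in:
            raise TypeError(
                f"{func.__name__}: dtype changed from {dtype_in} to {dtype_out}"
            )
        return result
    return wrapper

@preserves_dtype
def total(array):
    return np.sum(array)

@preserves_dtype
def total_upcast(array):
    # Deliberately accumulates in float64 so that the check fires.
    return np.sum(array, dtype=np.float64)

# Assumption: a float32 test array similar to the one in the question.
x32 = np.linspace(0, 100, 100, dtype=np.float32)

total(x32)  # passes silently
try:
    total_upcast(x32)
except TypeError as exc:
    print(exc)
```

This only guards the functions you decorate yourself; it does not reach into library internals.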

1 Answer

I think you are nearly there to be honest.

original_dtype = x32.dtype

new_dtype = sum(x32, start=np.float32(0)).dtype

assert new_dtype == original_dtype, f"dtypes differ, {new_dtype=} != {original_dtype=}"

To use this method globally, you can write something like:

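def type_checker_func(func, input_array, *args, **kwargs):
    # Remember the dtype going in so it can be compared with the result.
    dtype_orig = input_array.dtype

    # Forward keyword arguments too, so that calls such as start=... work.
    result = func(input_array, *args, **kwargs)

    dtype_new = result.dtype

    if dtype_new != dtype_orig:
        print(f"dtypes differ, {dtype_new=} != {dtype_orig=}")

    return result

my_answer = type_checker_func(sum, x32, start=np.float32(0))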

However, I am not sure how best to handle functions with multiple return values (consider np.histogram), arbitrary argument signatures, and so on.
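One possible way to cope with multiple return values is to walk the result and collect the dtype of every array-like item in it. This is only a sketch: result_dtypes is a hypothetical helper, and it handles tuples and lists but not dictionaries or custom containers.

```python
import numpy as np

def result_dtypes(result):
    """Hypothetical helper: list the dtypes of all array-like leaves in a result."""
    if isinstance(result, (tuple, list)):
        dtypes = []
        for item in result:
            dtypes.extend(result_dtypes(item))
        return dtypes
    if hasattr(result, "dtype"):
        return [result.dtype]
    return []  # plain Python objects carry no dtype

# Assumption: a float32 test array similar to the one in the question.
x32 = np.linspace(0, 100, 100, dtype=np.float32)

# np.histogram returns a (counts, bin_edges) tuple, so two dtypes are reported.
for dtype in result_dtypes(np.histogram(x32, bins=10)):
    print(dtype)
```

A checker could then compare each reported dtype against the input dtype, while allowing expected exceptions such as integer counts.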

I am also not sure how to invoke type_checker_func globally or implicitly, even if only for NumPy functions.
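One avenue that might be worth trying for the global case is Python's own profiling hook, sys.setprofile, which is called on every function return. The sketch below logs each Python-level function that returns a float64 value. It is an assumption-laden illustration, not a complete monitor: make_float64_logger and upcasting_mean are hypothetical names, the hook only sees return values of Python functions (for C functions the hook receives the function object, not its result), and it will not see conversions that stay inside compiled code or a TensorFlow graph.

```python
import sys
import numpy as np

def make_float64_logger(log):
    """Hypothetical hook factory: record Python functions that return float64 data."""
    def hook(frame, event, arg):
        # For "return" events, arg is the value a Python function is returning.
        if event == "return" and getattr(arg, "dtype", None) == np.float64:
            code = frame.f_code
            log.append((code.co_filename, code.co_name, frame.f_lineno))
    return hook

def upcasting_mean(values):
    # Stand-in for library code that silently widens to 64-bit.
    return values.astype(np.float64).mean()

# Assumption: a float32 test array similar to the one in the question.
x32 = np.linspace(0, 100, 100, dtype=np.float32)

log = []
sys.setprofile(make_float64_logger(log))
try:
    upcasting_mean(x32)
finally:
    sys.setprofile(None)  # always remove the hook again

for filename, name, lineno in log:
    print(f"{name} returned float64 ({filename}:{lineno})")
```

The hook adds overhead to every call, so it is probably best enabled only around the section of code under investigation.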

Update: I posted a github question asking about doing this for every function call using line_profiler - see https://github.com/pyutils/line_profiler/issues/188 - fingers crossed.

3 Comments

Thanks. The problem is how to invoke type_checker_func globally and implicitly for all functions. My code consists of high-level TensorFlow code. It does not look easy to force the use of type_checker_func everywhere internally.
@zell I think you could try hacking pypi.org/project/line-profiler or pypi.org/project/memory-profiler [no longer maintained in the case of the latter :( ]. Maybe post some issues on their respective github pages and see if they can assist?
@zell I opened a line_profiler github issue - see here github.com/pyutils/line_profiler/issues/188
