This numpy algorithm operating on integers occasionally returns floats, why?

Question

This is part of an algorithm to generate concentric rings of points on a hexagonal lattice that I'm in the process of rewriting.

I'd thought it's all integer math, but I discovered that in some cases arrays are unexpectedly created as floats!

In the sequence below, p0 is float64 for n=1 but int64 for n>1, and I simply cannot figure out why this is happening.

I'm running numpy version 1.17.3, Anaconda installation of Python 3.7.3 on macOS.

import numpy as np
n_max = 3
for n in range(1, n_max+1):
    seq = np.arange(n, -n-1, -1, dtype=int)
    p0  = np.hstack((seq, (n-1)*[-n], seq[::-1], (n-1)*[n]))
    print('n: ', n)
    print('seq: ', seq)
    print('p0: ', p0.dtype, p0)
    print('')

returns

n:  1
seq:  [ 1  0 -1]
p0:  float64 [ 1.  0. -1. -1.  0.  1.]

n:  2
seq:  [ 2  1  0 -1 -2]
p0:  int64 [ 2  1  0 -1 -2 -2 -2 -1  0  1  2  2]

n:  3
seq:  [ 3  2  1  0 -1 -2 -3]
p0:  int64 [ 3  2  1  0 -1 -2 -3 -3 -3 -3 -2 -1  0  1  2  3  3  3]

Is that expected behavior?

update 1: Okay, np.hstack(([1, 0, -1], 1*[7])) returns int64 but np.hstack(([1, 0, -1], 0*[7])) returns float64, so it is the empty list produced by 0*[n] in the tuple passed to np.hstack that triggers the upcast to float64.

update 2: Just asked in Code Review: Is there a better, cleaner or otherwise “less tricky” way to get these hexagonal arrays of dots arranged in this spiral pattern?

My own expectations would be that empty arrays don't have a contents type at all (unless we forced em to). Concatenating an empty array should not change the type.
– aka.nice, Commented Apr 24, 2020 at 11:16
@aka.nice agreed, that's what I'd expected, this was a real surprise!
– uhoh, Commented Apr 24, 2020 at 11:22

yatu · Accepted Answer · 2020-04-23 08:52:00Z

2

What's triggering the entire array to be cast to np.float64 is the empty list obtained when n=1 from (n-1)*[n] and (n-1)*[-n]:

n = 1
print((n-1)*[n])
# []

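np.hstack converts each of its inputs to an array before concatenating, and the result takes the common dtype of all the pieces, so a single float64 piece, even an empty one, promotes the whole result to float64. Each input goes through np.atleast_1d, which turns an empty list into an empty array with the default dtype, np.float64: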

np.atleast_1d([])
# array([], dtype=float64)
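
The promotion step can be checked directly. The following is a minimal sketch, not taken from the answer: the variable names are illustrative, and the integer input is given an explicit int64 dtype so the result does not depend on the platform's default integer size.

```python
import numpy as np

# An empty list carries no type information, so NumPy falls back to float64.
empty = np.atleast_1d([])

# Explicit int64 so the example behaves the same on every platform.
ints = np.array([1, 0, -1], dtype=np.int64)

# 0 * [7] is [], which becomes an empty float64 array inside hstack.
with_empty = np.hstack((ints, 0 * [7]))

# 1 * [7] is [7], an integer list, so no float is involved.
with_one = np.hstack((ints, 1 * [7]))

print(empty.dtype)       # float64
print(with_empty.dtype)  # float64
print(with_one.dtype)    # int64

# The same promotion rule, asked of NumPy directly.
print(np.result_type(np.int64, np.float64))  # float64
```

The values are unaffected by the empty piece; only the dtype of the result changes.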


johannesack · Accepted Answer · 2020-04-23 08:48:51Z

1

The reason for this is that NumPy creates ndarrays from all inputs before concatenating them.

When n=1, (n-1)*[-n] evaluates to [], an empty list with no numeric type. When it is converted to an array it therefore becomes an empty array with the default data type, which is float64.

You can avoid this by casting the inputs to ndarrays yourself and specifying the data type as int, for example like so:

import numpy as np
n_max = 3
for n in range(1, n_max+1):
    seq = np.arange(n, -n-1, -1, dtype=int)
    p0  = np.hstack((seq, np.array((n-1)*[-n], dtype=int), seq[::-1], np.array((n-1)*[n], dtype=int)))
    print('n: ', n)
    print('seq: ', seq)
    print('p0: ', p0.dtype, p0)
    print('')

I can't really speak to whether it is expected behavior or not, but it does make some sense intrinsically.
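
Another way to sidestep the empty list, offered as an alternative sketch rather than as the method from this answer, is to build the constant runs with np.full, which keeps the requested integer dtype even when the length is 0. The helper name ring_p0 is made up for illustration.

```python
import numpy as np

def ring_p0(n):
    """Build the p0 sequence for ring n using integer arrays only."""
    seq = np.arange(n, -n - 1, -1, dtype=int)
    # np.full(0, ...) returns an empty array that still has dtype int,
    # so the n=1 case no longer introduces a float64 piece.
    neg_run = np.full(n - 1, -n, dtype=int)
    pos_run = np.full(n - 1, n, dtype=int)
    return np.hstack((seq, neg_run, seq[::-1], pos_run))

for n in range(1, 4):
    p0 = ring_p0(n)
    print('n: ', n)
    print('p0: ', p0.dtype, p0)
```

With every piece created as an integer array, the n=1 result stays an integer array without special-casing it.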


uhoh Over a year ago

Thanks for your answer! pre-casting the arrays as int or (even worse) trapping the n<1 case(s) both work.
