How to check if float pandas column contains only integer numbers?

Question

I have a dataframe

df = pd.DataFrame(data=np.arange(10),columns=['v']).astype(float)

How can I make sure that the numbers in v are whole numbers? I am very concerned about rounding, truncation, and floating-point representation errors.

How will testing for integers allay concerns about floating-point errors? Do the values come from integers, and you are concerned they have changed? Or are they the results of calculations whose mathematical properties are such that exact results would be integers?
– Eric Postpischil, Commented Mar 13, 2018 at 10:21
These values come from integers. However, during processing they are often cast to float64.
– 00__00__00, Commented Mar 13, 2018 at 12:18
The only errors that can occur in handling integers in floating point are rounding and overflow errors when converting from one format to another. When converting integer to floating-point, if the precision does not suffice to represent the value exactly, it will be rounded. However, the value it will be rounded to will be another integer, due to the nature of floating-point. Therefore, testing whether all values in an array are integers will provide no information about whether any rounding errors have occurred.
– Eric Postpischil, Commented Mar 13, 2018 at 12:51
If the task is to ensure that values converted from integer to floating-point do not incur any rounding error, then it suffices if no integer exceeds the precision of the significand of the floating-point format. For example, IEEE 754 basic 64-bit binary has a 53-bit significand, so conversion of any integer up to 2^53 in magnitude will not incur any rounding error.
– Eric Postpischil, Commented Mar 13, 2018 at 12:54
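
The point made in these comments can be illustrated with a small sketch (not part of the original thread; the variable names and sample values are illustrative assumptions). An integer just above 2^53 is rounded when converted to float64, yet the rounded value still passes an is-integer test. If the original integers are still available, a round-trip comparison detects the loss:

```python
import numpy as np

# 2**53 + 1 needs 54 significant bits, one more than float64 provides.
big = 2**53 + 1
as_float = float(big)  # rounded to a neighbouring representable value

looks_whole = as_float.is_integer()  # the rounded value is still a whole number
exact = int(as_float) == big         # but it no longer equals the original

# Round-trip check on an array of original integers (illustrative data).
ints = np.array([0, 10, 2**53, 2**53 + 1], dtype=np.int64)
survived = ints.astype(np.float64).astype(np.int64) == ints

print(looks_whole, exact)
print(survived)
```

Only the last element fails the round trip, which is why a whole-number test on the float column alone cannot reveal earlier rounding.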

Community · Accepted Answer · 2020-06-20 09:12:55Z

54

Comparison with `astype(int)`

Tentatively convert your column to int and test with np.array_equal:

np.array_equal(df.v, df.v.astype(int))
True

`float.is_integer`

You can use this python function in conjunction with an apply:

df.v.apply(float.is_integer).all()
True

Or, using python's all in a generator comprehension, for space efficiency:

all(x.is_integer() for x in df.v)
True
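
As a hedged sketch of how these two checks behave when the column contains missing values (the sample Series below is an assumption, not data from the question): `float.is_integer` returns False for NaN, and `astype(int)` refuses to cast NaN at all, so missing values usually have to be dropped or filled first.

```python
import numpy as np
import pandas as pd

# Illustrative column: whole numbers plus one missing value.
with_nan = pd.Series([1.0, 2.0, np.nan])

# NaN is not an integer, so the plain check fails.
naive = bool(with_nan.apply(float.is_integer).all())

# Dropping missing values first checks only the real entries.
ignoring_na = bool(with_nan.dropna().apply(float.is_integer).all())

# astype(int) raises a ValueError subclass when NaN is present.
try:
    with_nan.astype(int)
    cast_failed = False
except ValueError:
    cast_failed = True

print(naive, ignoring_na, cast_failed)
```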

edited Jun 20, 2020 at 9:12

CommunityBot

answered Mar 13, 2018 at 6:50

cs95


7 Comments

00__00__00 Over a year ago

What is the tolerance of allclose compared to is_integer? Are they a call to the same function?

cs95 Over a year ago

@ErroriSalvo No, the mechanisms are slightly different. With allclose, the tolerance is very small to account for floating point inaccuracies. With is_integer, the function actually checks for whole numbers. The mechanism is slightly different but the end result is the same.

Eric Postpischil Over a year ago

allclose is incapable of determining that a number is an integer unless the tolerance is set to 0, at which point it becomes a test for equality. Furthermore, as stated in my comment to the question, testing for integer values does not accomplish the OP’s actual goal.

cs95 Over a year ago

@EricPostpischil okay, I've changed that to array_equal. By the way, this may be an XY problem, but it is still useful to know how to do this with numpy/pandas, so I've gone ahead and answered anyway. I appreciate the criticism (and the downvote).

Joe Over a year ago

df.v.apply: not sure if this works, after df.v it is a numpy ndarray, which does not have the method apply. Do you mean apply_along_axis?


scott · Accepted Answer · 2020-01-28 17:36:03Z

22

Here's a simpler, and probably faster, approach:

(df[col] % 1 == 0).all()

To ignore nulls:

(df[col].fillna(-9999) % 1 == 0).all()
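
A minimal sketch of the same modulo check wrapped in a helper that drops missing values instead of filling them with a sentinel. The function name `all_whole`, its `ignore_na` parameter, and the sample frame are hypothetical, introduced only for illustration:

```python
import numpy as np
import pandas as pd

def all_whole(series: pd.Series, ignore_na: bool = True) -> bool:
    """Return True if every (non-missing) value has no fractional part."""
    values = series.dropna() if ignore_na else series
    # NaN % 1 is NaN and NaN == 0 is False, so NaN fails unless it was dropped.
    return bool((values % 1 == 0).all())

# Illustrative data: "v" holds whole numbers and a NaN, "w" has a fraction.
df = pd.DataFrame({"v": [1.0, -2.0, np.nan, 4.0], "w": [1.0, 2.5, 3.0, 4.0]})

print(all_whole(df["v"]))                   # missing value ignored
print(all_whole(df["v"], ignore_na=False))  # missing value counts as a failure
print(all_whole(df["w"]))                   # 2.5 is not whole
```

Dropping NaN avoids having to pick a sentinel such as -9999, which only works because the sentinel itself is a whole number.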

answered Jan 28, 2020 at 17:36

scott


lxs602 · Accepted Answer · 2024-08-01 08:24:27Z

10

For completeness, pandas v1.0+ offers the convert_dtypes() utility, which (among 3 other conversions) converts every DataFrame column (or Series) that contains only whole numbers to the nullable Int64 dtype. Note that it converts the column rather than just testing it.

If you wanted to limit the conversion to a single column only, you could do the following:

>>> df.dtypes          # inspect previous dtypes
v                      float64

>>> df["v"] = df["v"].convert_dtypes()
>>> df.dtypes          # inspect converted dtypes
v                      Int64
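
If only a yes/no answer is wanted, the resulting dtype can be inspected instead of keeping the converted column. This is a sketch under the assumption of a reasonably recent pandas; the column names and data are made up for illustration:

```python
import numpy as np
import pandas as pd

# Illustrative frame with three float64 columns.
df = pd.DataFrame({
    "whole": [1.0, 2.0, 3.0],
    "whole_with_nan": [1.0, np.nan, 3.0],
    "fractional": [1.0, 2.5, 3.0],
})

converted = df.convert_dtypes()

# A float column that became Int64 contained only whole numbers (and NaN).
is_whole = {col: str(dtype) == "Int64" for col, dtype in converted.dtypes.items()}

print(converted.dtypes)
print(is_whole)
```

Missing values do not block the conversion here, because Int64 is a nullable dtype that stores them as `<NA>`.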

edited Aug 1, 2024 at 8:24

lxs602


answered Oct 13, 2020 at 9:41

ankostis


mgoldwasser · Accepted Answer · 2019-01-22 21:44:51Z

If you want to check multiple float columns in your dataframe, you can do the following:

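col_should_be_int = df.select_dtypes(include=['float']).applymap(float.is_integer).all()
float_to_int_cols = col_should_be_int[col_should_be_int].index
# Assign with df[...] rather than df.loc[:, ...]: in pandas 2.x, .loc writes
# into the existing float columns, so the dtype would stay float64.
df[float_to_int_cols] = df[float_to_int_cols].astype(int)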

Keep in mind that a float column containing only whole numbers will not be selected if it also has NaN values, because float.is_integer returns False for NaN. To cast float columns with missing values to integer, you need to fill or remove the missing values first, for example with median imputation:

float_cols = df.select_dtypes(include=['float'])
float_cols = float_cols.fillna(float_cols.median().round()) # median imputation
col_should_be_int = float_cols.applymap(float.is_integer).all()
float_to_int_cols = col_should_be_int[col_should_be_int].index
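# df[...] replaces the columns, so the int dtype is kept (in pandas 2.x, .loc would set in place and stay float64)
df[float_to_int_cols] = float_cols[float_to_int_cols].astype(int)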
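
An alternative that avoids imputation altogether, sketched under the assumption that pandas' nullable `Int64` dtype is acceptable downstream. This swaps in a different technique from the answer above: the whole-number check ignores missing values, and the cast keeps them as `<NA>`. The sample frame and variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, 2.0, np.nan],  # whole numbers plus a missing value
    "b": [0.5, 1.0, 2.0],     # contains a fractional value
    "c": ["x", "y", "z"],     # non-float column, left alone
})

floats = df.select_dtypes(include=["float"])

# Column-wise whole-number check that ignores missing values.
whole = floats.apply(lambda s: (s.dropna() % 1 == 0).all())
int_cols = whole[whole].index

# Nullable Int64 keeps missing entries as <NA> instead of imputing them.
df = df.astype({col: "Int64" for col in int_cols})

print(df.dtypes)
```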

Nicoolasens · Accepted Answer · 2022-02-01 21:01:39Z

1

On 27,331,625 rows the following works well and took 1.3 s:

df['is_float'] = df[field_fact_qty]!=df[field_fact_qty].astype(int)

This approach took 4.9 s:

df[field_fact_qty].apply(lambda x : (x.is_integer()))
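
To reproduce a comparison like this on other data, a small timing harness along the following lines could be used. It is only a sketch: the synthetic Series, its size, and the function names are assumptions, and the timings will depend on the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic float column of whole numbers (much smaller than the real data).
rng = np.random.default_rng(0)
s = pd.Series(rng.integers(0, 1000, size=200_000).astype(float))

def by_cast(series):
    # Vectorized: compare against the integer-cast copy.
    return bool((series == series.astype(int)).all())

def by_apply(series):
    # Python-level call per element.
    return bool(series.apply(float.is_integer).all())

def by_modulo(series):
    # Vectorized: fractional part must be zero.
    return bool((series % 1 == 0).all())

timings = {
    f.__name__: timeit.timeit(lambda f=f: f(s), number=3)
    for f in (by_cast, by_apply, by_modulo)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.3f} s for 3 runs")
```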

edited Feb 1, 2022 at 21:01

answered Jan 31, 2022 at 15:46

Nicoolasens

