
I wonder why pandas / NumPy do not implement 3-valued logic (so-called Łukasiewicz logic) with true, false and NA, as R does for instance. I've read (https://www.oreilly.com/learning/handling-missing-data) that this is partly because pandas uses many more basic data types than R. However, it is not clear to me why this makes the odd behaviour of logical operations with missing values unavoidable.

Example.

import numpy as np
np.nan and False   # so far so good, we have False
np.nan or False    # again, good, we have nan
False and np.nan   # False, good
False or np.nan    # gives nan, so again, it is correct
np.nan and True    # weird, this gives True, while it should give nan
True and np.nan    # nan, so it is correct, but switching order should not affect the result
np.nan or True     # gives nan, which is not correct, should be True
True or np.nan     # True so it is correct, again switching the arguments changes the result

So the example shows that something very weird happens in logical operations that combine np.nan and True. What is going on here?

EDIT: Thanks for the comments; now I see that np.nan is considered a "truthy" value. Can anybody explain what exactly this means and what the rationale behind this approach is?

  • Pandas 2.0 has a lot of changes, including how nulls are handled for non-float types. Commented May 11, 2017 at 21:28
  • @aryamccarthy the above won't change with pandas 2.0, though. This is basic Commented May 11, 2017 at 21:34
  • For the record, very few languages make a distinction between true, false and some third "NA" value. Typically, either strong typing means only special constants have boolean meaning, or if many objects have boolean meaning, they all ultimately get treated as truthy or falsy. R having an NA value is unusual; general purpose programming languages almost never have such a value (you can write your own logic to simulate it, but ultimately the language only supports truthiness or falsiness). Commented May 11, 2017 at 21:39
  • Yes, I understand that logical operations in R are quite special in this regard. However, both pandas and numpy are designed to solve problems similar to those R solves, so I wonder why 3-valued logic has not been built into these two modules. Is it due to technical constraints, or is it a deliberate design decision by the authors? Commented May 11, 2017 at 21:45
  • @sztal note, you aren't using pandas in the above code. All of that is pure python, except you are using an attribute of the numpy module, np.nan, but that is the same as float('nan'), which is just vanilla Python, so you aren't really even using numpy. Commented May 11, 2017 at 21:52
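As one of the comments above notes, a third truth value can be simulated in user code. The following is only a minimal sketch of three-valued `and`/`or`, not anything pandas or numpy provides: the names `NA`, `to3`, `and3` and `or3` are hypothetical, `None` is an assumed stand-in for "unknown", and inputs are assumed to be limited to True, False, NaN or None.

```python
import math

NA = None  # hypothetical stand-in for an unknown truth value


def to3(x):
    """Map NaN to NA; leave True, False and NA unchanged."""
    if isinstance(x, float) and math.isnan(x):
        return NA
    return x


def and3(a, b):
    """Three-valued AND: False wins, then NA, otherwise True."""
    a, b = to3(a), to3(b)
    if a is False or b is False:
        return False
    if a is NA or b is NA:
        return NA
    return True


def or3(a, b):
    """Three-valued OR: True wins, then NA, otherwise False."""
    a, b = to3(a), to3(b)
    if a is True or b is True:
        return True
    if a is NA or b is NA:
        return NA
    return False


nan = float("nan")
print(or3(nan, True), or3(True, nan))    # the order of arguments no longer matters
print(and3(nan, True), and3(True, nan))  # both unknown
```

Unlike the built-in operators, these functions inspect both arguments before deciding, which is what makes the result independent of argument order.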

2 Answers


This is NumPy behaviour and, at least partially, inherited from Python:

In [11]: bool(float('nan'))
Out[11]: True

In [12]: bool(np.nan)
Out[12]: True

(NaN is "truthy".)
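A short sketch of why NaN comes out truthy, using only plain Python floats: a float is falsy only when it equals zero, and NaN compares unequal to everything, zero included. The helper `is_missing` is a hypothetical name added here to show an explicit check that does not rely on truthiness.

```python
import math

nan = float("nan")  # the same kind of value as np.nan

# A float is falsy only when it equals zero. NaN is not equal to
# anything, so it is not equal to zero and counts as truthy.
print(bool(0.0))   # False
print(bool(nan))   # True
print(nan != 0.0)  # True
print(nan == nan)  # False: NaN does not even equal itself


def is_missing(x):
    """Explicit NaN test (hypothetical helper); truthiness cannot detect NaN."""
    return isinstance(x, float) and math.isnan(x)


print(is_missing(nan), is_missing(True))
```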


3 Comments

Note: The truthiness dictates the behavior of and and or.
It's not even a part of numpy really, since np.nan is essentially float('nan')
@juanpa.arrivillaga true!

You have misjudged how the or and and operators work.

or checks whether the first value is true in the sense of bool(value). If it is, or returns the first value; if it is False, or returns the second value.

and, on the other hand, returns the first value if bool(value1) is False and otherwise returns the second value, so the result is truthy only when bool(value1) and bool(value2) are both True.
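The rules described above can be sketched as two ordinary functions. This is an illustration only: `py_or` and `py_and` are hypothetical names, and unlike the real operators they evaluate both arguments up front instead of short-circuiting.

```python
import math


def py_or(a, b):
    """Mimics `a or b`: return a if it is truthy, otherwise b."""
    return a if bool(a) else b


def py_and(a, b):
    """Mimics `a and b`: return a if it is falsy, otherwise b."""
    return b if bool(a) else a


nan = float("nan")  # truthy, like np.nan

# The sketch returns the very same operand object as the built-in operators.
for a, b in [(nan, True), (True, nan), (nan, False), (False, nan)]:
    assert py_or(a, b) is (a or b)
    assert py_and(a, b) is (a and b)

print(py_or(nan, True))   # nan: the first operand is truthy, so it is returned
print(py_and(nan, True))  # True: the first operand is truthy, so the second is returned
```

Because the result is always one of the two operands, chosen by the truthiness of the first, swapping the arguments can change the result whenever both operands are truthy.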

9 Comments

So how come np.nan or True gives nan? If one of the arguments is True, then the logical or has to produce True regardless of the other argument. So in this case the result should be True, but it is not. This shows that np.nan is not consistent with 3-valued logic.
@sztal it's not. np.nan is considered "truthy"
It seems that in this case Python checks the first argument, sees that it is np.nan and declares (prematurely) that the result is undecidable. But it is very much decidable: since one of the arguments is True, the logical disjunction must be True as well.
@juanpa.arrivillaga thanks for backup, wanted to put that in answer, but wanted to try it first, but lack of python interpreter on mobile phone:)
@juanpa.arrivillaga, what exactly does "truthy" mean? It seems to me that this kind of behaviour may be quite dangerous in some situations.