
In Python, it's easy to test whether two variables have the same top-level type:

In [1]: s1 = 'bob'
In [2]: s2 = 'tom'
In [3]: type(s1) == type(s2)
Out[3]: True

But in the case where types are nested, it's not so easy:

In [4]: strlist = ['bob', 'tom']
In [5]: intlist = [5, 6, 7]
In [6]: type(strlist) == type(intlist)
Out[6]: True

Is there a general way to "deeply" compare two variables such that the following hold?

deepcompare(['a', 'b'], [1, 2]) == False
deepcompare([42, 43], [1, 2]) == True

EDIT:

To define the question more precisely, the comparison should take into account both list length and heterogeneous element types:

deepcompare([1, 2, 3], [1, 2]) == False
deepcompare([1, 3], [2, 'b']) == False
deepcompare([1, 'a'], [2, 'b']) == True
  • Do you care about equal sizes as well? Or is it just making sure that regardless of how many are in there, they just have to have the same types inside the structure? Commented Mar 16, 2016 at 17:16
  • For there to be a general way implies that this is a well-defined problem, but it isn't. Is the length of a list relevant to its "type"? What do you do with non-homogeneous container types? Commented Mar 16, 2016 at 17:17
  • Are we talking only one level of nesting? Commented Mar 16, 2016 at 17:17
  • You could recursively create two structures just containing the types of the objects at each position, then compare them directly - [str, str] != [int, int], for example. Can you assume that the structures contain homogeneous types, or might you end up looking at e.g. deepcompare(['a', 1], ['b', 2]) (and what should the result be)? Commented Mar 16, 2016 at 17:19
  • @tzaman point well taken, see edit Commented Mar 16, 2016 at 17:24

2 Answers


To expand on my comment, you could create what I've called a "type map" recursively:

def typemap(lst_or_obj):
    if not isinstance(lst_or_obj, list):
        return type(lst_or_obj)
    return [typemap(obj) for obj in lst_or_obj]

Then use this to get the types within your structures:

a = [1, 2, ['three', 4]]
b = [5, 6, ['seven', 8]]
c = [9, 10, [11, 'twelve']]

ta = typemap(a)
tb = typemap(b)
tc = typemap(c)

print(ta)
print(tb)
print(tc)

print(ta == tb)
print(ta == tc)

Output:

[<class 'int'>, <class 'int'>, [<class 'str'>, <class 'int'>]]
[<class 'int'>, <class 'int'>, [<class 'str'>, <class 'int'>]] 
[<class 'int'>, <class 'int'>, [<class 'int'>, <class 'str'>]]
True
False

Then your function is simply:

def deepcompare(a, b):
    return typemap(a) == typemap(b)
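
As a quick check, the sketch below runs deepcompare against the five examples from the question, including the length and heterogeneous cases from the edit. The two definitions are repeated only so the snippet runs on its own; the results list is just an illustrative name.

```python
def typemap(lst_or_obj):
    # Anything that is not a list is a leaf: record its type.
    if not isinstance(lst_or_obj, list):
        return type(lst_or_obj)
    # Lists are mapped element by element, so length and order are preserved.
    return [typemap(obj) for obj in lst_or_obj]


def deepcompare(a, b):
    return typemap(a) == typemap(b)


# The cases listed in the question, in order.
results = [
    deepcompare(['a', 'b'], [1, 2]),
    deepcompare([42, 43], [1, 2]),
    deepcompare([1, 2, 3], [1, 2]),
    deepcompare([1, 3], [2, 'b']),
    deepcompare([1, 'a'], [2, 'b']),
]
print(results)  # [False, True, False, False, True]
```

Because the type map keeps one entry per element, a length mismatch or a different ordering of types shows up as an ordinary list inequality.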

If you need to deal with things other than lists, you can trivially expand the isinstance check to (list, tuple), but you quickly run into issues with other iterables. Recursively iterating over a str is a problem because a single-character string is an iterable that yields itself, so the recursion never terminates (in practice Python raises RecursionError). With dict you have to decide how to treat ordering and whether to compare keys, values, or both.
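
One possible way to handle those cases is sketched below. This is an illustration, not a definitive implementation: typemap_general and deepcompare_general are hypothetical names, and the design choices (recording the container type, requiring dicts to have identical keys, treating str as a leaf) are assumptions you may want to change.

```python
def typemap_general(obj):
    """Hypothetical extension of typemap to tuples and dicts."""
    # Lists and tuples: record the container type as well, so that
    # (1, 2) and [1, 2] do not compare as equal.
    if isinstance(obj, (list, tuple)):
        return (type(obj), [typemap_general(item) for item in obj])
    # Dicts (assumption): both dicts must have the same keys, and the
    # values under each key must have matching type maps. Dict equality
    # ignores insertion order, which avoids the ordering question.
    if isinstance(obj, dict):
        return (type(obj), {key: typemap_general(val) for key, val in obj.items()})
    # Everything else, including str, is a leaf and is never iterated,
    # which avoids the infinite recursion on single-character strings.
    return type(obj)


def deepcompare_general(a, b):
    return typemap_general(a) == typemap_general(b)


print(deepcompare_general(('a', 1), ('b', 2)))        # True
print(deepcompare_general((1, 2), [1, 2]))            # False
print(deepcompare_general({'x': [1]}, {'x': [2]}))    # True
print(deepcompare_general({'x': 1}, {'y': 1}))        # False
print(deepcompare_general('abc', 'de'))               # True
```

Checking for specific container types rather than "any iterable" keeps the recursion finite, at the cost of listing the containers you care about explicitly.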


6 Comments

Nice! Although for this to be a bit more general, should the first line in typemap check whether lst_or_obj is iterable, instead of whether it is a list?
@spiffman is that a requirement? The issue with iterability is that e.g. strings and dictionaries are iterable, but you likely want to handle those differently to lists and tuples.
Not necessarily, I was more just wondering whether that would provide a more general solution, but you're right, the strings case is especially problematic there.
@jonrsharpe This is the way to go, and it's clear. You can also use collections.abc.Iterable for type checking in order to make it more comprehensive.
@Kasramvd please read my comment above, where I point out why simply checking for iterables isn't a silver bullet

The way that I do this is by using this function:

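def getDeepTypes(items):
    # Type of each top-level element; nested contents are not inspected.
    types = [type(x) for x in items]
    # Shared type, or None if the list is empty or the types differ.
    return types[0] if types and all(x == types[0] for x in types) else None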

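This uses a list comprehension to collect the type of each top-level element, then returns that type if all elements share it. If they aren't all the same, None is returned. Note that only one level is inspected, so nested lists are all reported as list.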

>>> getDeepTypes([1, 2, 3])
<class 'int'>
>>> getDeepTypes(["foo", "bar"])
<class 'str'>
>>> print(getDeepTypes([1, "foo"]))
None

So you could do:

getDeepTypes(['a', 'b']) == getDeepTypes([1, 2]) # False
getDeepTypes([42, 43]) == getDeepTypes([1, 2]) # True
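
One caveat, sketched below: because getDeepTypes collapses a whole list to a single type (or None), comparing its results does not meet the stricter requirements from the question's edit. The function is repeated only so the snippet runs on its own, and the three variable names are illustrative.

```python
def getDeepTypes(items):
    types = [type(x) for x in items]
    return (types[0] if all(x == types[0] for x in types) else None)


# Length is ignored: both sides reduce to int (the question wants False here).
same_despite_length = getDeepTypes([1, 2, 3]) == getDeepTypes([1, 2])

# Any two mixed lists both reduce to None, whatever the order of their types.
same_despite_order = getDeepTypes([1, 'a']) == getDeepTypes(['b', 2])

# Only the top level is inspected: both sides reduce to list.
same_despite_nesting = getDeepTypes([[1], [2]]) == getDeepTypes([['a'], ['b']])

print(same_despite_length, same_despite_order, same_despite_nesting)  # True True True
```

So this approach fits the homogeneous, single-level case from the original question, while the length-sensitive and nested cases need an element-by-element comparison.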

2 Comments

I would use any here instead. It would short circuit the moment it breaks your condition. Also, this only handles a single level
Would getDeepTypes(items) be more efficient if you created a set of all types and just checked whether the length of the set is 1? That is, types = {type(x) for x in items}, then return types.pop() if len(types) == 1 else None.
