9

I have gone through "Find intersection of two lists?", "Intersection of Two Lists Of Strings", and "Getting intersection of two lists in python". However, I still could not work out how to find the intersection of two lists of strings in Python.

I have two variables.

A = [['11@N3'], ['23@N0'], ['62@N0'], ['99@N0'], ['47@N7']]

B = [['23@N0'], ['12@N1']]

How can I find that '23@N0' is part of both A and B?

I tried using intersect(a, b) as described in http://www.saltycrane.com/blog/2008/01/how-to-find-intersection-and-union-of/, but when I try to convert A into a set, it throws an error:

File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

To work around this, I used the method from "TypeError: unhashable type: 'list' when using built-in set function", which converts each inner list into a tuple so that the result can be turned into a set:

result = sorted(set(map(tuple, A)), reverse=True)

However, this gives an empty set as the intersection.

Can you help me find the intersection?

6
  • 1
    The fastest way to intersect a big bunch of data is to use Python sets. Python sets are hash maps, therefore they require hashing. Your problem comes from wrapping strings into lists. Lists are mutable objects, that's why they can't be hashed, while strings, being immutable, can be. Commented Feb 24, 2015 at 8:54
  • 1
    Is there a reason you have a single string in each list? Commented Feb 24, 2015 at 8:58
  • This is the dataset I have, I did not generate it, borrowed it from someone. Commented Feb 24, 2015 at 10:02
  • @SharathChandra: what does "borrowed" mean? Have you read it from a file? What format? Commented Feb 24, 2015 at 10:20
  • related: Flattening a shallow list in Python Commented Feb 24, 2015 at 10:40
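
To illustrate the hashing point made in the comments above, here is a minimal sketch using the A and B from the question. It assumes the tuple conversion is applied to both lists before intersecting; the names list_error, tuple_common and string_common are made up for this example.

```python
A = [['11@N3'], ['23@N0'], ['62@N0'], ['99@N0'], ['47@N7']]
B = [['23@N0'], ['12@N1']]

# Lists are mutable and define no hash, so they cannot be set members.
try:
    set(A)
    list_error = None
except TypeError as exc:
    list_error = str(exc)

# Tuples of strings are hashable. Converting BOTH inputs keeps the
# element types comparable, so the common tuple is found.
tuple_common = set(map(tuple, A)) & set(map(tuple, B))

# Unwrapping each one-element sublist yields plain (hashable) strings.
string_common = {sub[0] for sub in A} & {sub[0] for sub in B}

print(list_error)
print(tuple_common)
print(string_common)
```

Note that the tuple version returns one-element tuples rather than bare strings, so comparing it against a set of plain strings would give an empty result.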

7 Answers

8

You can use the flatten function from the compiler.ast module (Python 2 only; the module was removed in Python 3) to flatten your sublists and then apply set intersection, like this:

from compiler.ast import flatten

A=[['11@N3'], ['23@N0'], ['62@N0'], ['99@N0'], ['47@N7']]
B=[['23@N0'], ['12@N1']]

a = flatten(A)
b = flatten(B)
common_elements = list(set(a).intersection(set(b)))
print(common_elements)  # ['23@N0']

5 Comments

compiler.ast is python 2 only; suggest using itertools.chain
Agree with you but if the input list is something like this A= ['11@N3', ['23@N0']] then applying itertools.chain will not truly flatten the list. Resulting list after list(itertools.chain(*A)) would be ['1', '1', '@', 'N', '3', '23@N0'].
True too; what we really would need is a flatten itertool that would understand strings, bytes.
@AnttiHaapala: perhaps, you are looking for Flatten (an irregular) list of lists in Python.
Neither was needed to answer this question, except many of the answers there wouldn't work in 3 either. I mean, it should be in core written in C.
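
For the string-aware flatten discussed in these comments, a rough Python 3 sketch could look like the following. The flatten helper here is hypothetical (it is not a standard-library function), and it assumes that str and bytes should be treated as leaves rather than iterated character by character.

```python
from collections.abc import Iterable


def flatten(items):
    """Yield leaf values from a nested iterable, keeping str/bytes whole."""
    for item in items:
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            # Recurse into nested lists, tuples, and so on.
            yield from flatten(item)
        else:
            yield item


# Irregular input from the comment above: a bare string next to a sublist.
A = ['11@N3', ['23@N0']]
B = [['23@N0'], ['12@N1']]

flat_a = list(flatten(A))
common = set(flatten(A)) & set(flatten(B))

print(flat_a)
print(common)
```

Because it recurses, this handles arbitrary nesting depth, at the cost of hitting Python's recursion limit on extremely deep structures.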
3

In case you have to fit it on a fortune cookie:

set(i[0] for i in A).intersection(set(i[0] for i in B))

Comments

2

The problem is that your lists contain sublists so they cannot be converted to sets. Try this:

A=[['11@N3'], ['23@N0'], ['62@N0'], ['99@N0'], ['47@N7']]
B=[['23@N0'], ['12@N1']]

C = [item for sublist in A for item in sublist]
D = [item for sublist in B for item in sublist]

print(set(C).intersection(set(D)))

Comments

2

Your data structure is a bit strange: it is a list of one-element lists of strings. Reduce it to a flat sequence of strings first, and then you can apply the previous solutions.

A list like:

B = [['23@N0'], ['12@N1']]

can be converted to an iterator over '23@N0', '12@N1' with itertools.chain(*B), which gives a simple one-liner:

>>> from itertools import chain
>>> set(chain(*A)).intersection(chain(*B))
{'23@N0'}

1 Comment

This does not seem to work if A and B are reversed in the last statement. That is, if we try set(B).intersection(A), the result is an empty set.
0

You have two lists of lists with one item each. In order to convert that to a set you have to make it a list of strings:

set_a = set([i[0] for i in A])
set_b = set([i[0] for i in B])

Now you can get the intersection:

set_a.intersection(set_b)

1 Comment

you don't need [] inside (): set(x[0] for x in A) & set(x[0] for x in B)
0
A=[['11@N3'], ['23@N0'], ['62@N0'], ['99@N0'], ['47@N7']]
A=[a[0] for a in A]
B=[['23@N0'], ['12@N1']]
B=[b[0] for b in B]
print(set.intersection(set(A), set(B)))

Output (Python 3): {'23@N0'}

If each of your lists contains only one-element sublists, you can try this.
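
To show why the one-element condition matters, here is a small sketch with hypothetical data in which some sublists hold two strings (this is not the data from the question). Indexing with [0] only sees the first item of each sublist, while itertools.chain.from_iterable looks at every item.

```python
from itertools import chain

# Hypothetical data: some sublists contain more than one string.
A = [['11@N3', '23@N0'], ['62@N0']]
B = [['12@N1', '23@N0']]

# Only the first item of each sublist takes part in the comparison.
first_only = {sub[0] for sub in A} & {sub[0] for sub in B}

# Every item of every sublist takes part in the comparison.
all_items = set(chain.from_iterable(A)) & set(chain.from_iterable(B))

print(first_only)
print(all_items)
```

With one-element sublists both expressions give the same result, so the [0] version is fine for the data in the question.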

Comments

0

My preference is to use itertools.chain from the standard library:

from itertools import chain

A = [['11@N3'], ['23@N0'], ['62@N0'], ['99@N0'], ['47@N7']]

B = [['23@N0'], ['12@N1']]

set(chain(*A)) & set(chain(*B))

# {'23@N0'}

Comments
