I'm following the post "Search a list of list of strings for a list of strings in python efficiently" and trying to search a list of lists of strings for a list of substrings. That post finds the index of each sublist that contains a matching string. In my code, I split each L1 string on ':', keep the part after it, and flatten the result so it can be matched against the L2 strings. How do I get a list of all the L1 strings that contain an L2 string as a substring? Right now, I only get the indices of the L1 sublists that match each L2 string.
This is how far I got, adapting the code from that post:
from bisect import bisect_left, bisect_right
from itertools import chain

L1 = [["animal:cat", "pet:dog", "fruit:apple"], ["fruit:orange", "color:green", "color:red", "fruit:apple"]]
L2 = ["apple", "cat", "red"]

# M1[k] is the index of the L1 sublist that the k-th flattened string came from
M1 = [[i] * len(j) for i, j in enumerate(L1)]
M1 = list(chain(*M1))
L1flat = list(chain(*L1))
# I orders the flattened strings by their full text ("animal:cat", "color:green", ...)
I = sorted(range(len(L1flat)), key=L1flat.__getitem__)
# Keep only the part after ':' and sort again
L1flat = sorted([L1flat[i].split(':')[1] for i in I])
print(L1flat)
M1 = [M1[i] for i in I]
for item in L2:
    s = bisect_left(L1flat, item)
    e = bisect_right(L1flat, item)
    print(item, M1[s:e])
    #print(L1flat[s:e])
    sub = M1[s:e]
    for y in sub:
        print('%s found in %s' % (item, str(L1[y])))
Edit: I just realized I'm getting errors in my search for the second and third items.
3 things:
I created M1 by enumerating the split elements of L1:
L1Splitted = [i[0].split(':')[1] for i in L1]
M1 = [[i]*len(j) for i, j in enumerate(L1Splitted)]
I reversed the elements in L1flat and split them:
L1flatReversed = []
for j, x in enumerate(L1flat):
    L1flatReversed.append(reverseString(x, ':'))
Then I made another list from the reversed strings, split on ':':
L1flatReversedSplit = [L1flatReversed[i].split(':')[0] for i in I]
Now my s and e are bisecting on L1flatReversedSplit.
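The wrong results for the second and third items appear to come from two different sort orders. I is computed from the full strings ("animal:cat", "color:green", ...), while L1flat is then re-sorted by the part after the colon, so M1 and L1flat no longer line up. Below is a minimal sketch of one way to keep them aligned by sorting a single list of records. It assumes every string contains a ':' and that an L2 item must equal the whole value after the colon. The names records, keys and find_exact are illustrative and not from the original post.

```python
from bisect import bisect_left, bisect_right

L1 = [["animal:cat", "pet:dog", "fruit:apple"],
      ["fruit:orange", "color:green", "color:red", "fruit:apple"]]
L2 = ["apple", "cat", "red"]

# One record per string: (value after ':', sublist index, original string).
# Sorting the records once keeps all three fields aligned.
# Assumption: every string contains ':'; otherwise split(...)[1] raises IndexError.
records = sorted(
    (s.split(":", 1)[1], i, s)
    for i, sub in enumerate(L1)
    for s in sub
)
keys = [r[0] for r in records]  # sorted search keys, same order as records


def find_exact(item):
    """Return (sublist index, original string) pairs whose value equals item."""
    lo = bisect_left(keys, item)
    hi = bisect_right(keys, item)
    return [(i, s) for _, i, s in records[lo:hi]]


for item in L2:
    for i, s in find_exact(item):
        print("%s found in %s (L1[%d])" % (item, s, i))
```

Because bisect only locates equal keys in a sorted list, this handles exact matches on the value after the colon. It would not find "app" inside "fruit:apple".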
For L2 --> "cat", do you want L1 --> ["animal:cat", "pet:dog", "fruit:apple"], or L1 --> "animal:cat", or something else?
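Both readings of that question can be sketched with a plain substring test. This is a minimal illustration, not the bisect-based method from the linked post: it scans every string for every L2 item, which is simple but linear in the size of L1. The function names matching_strings and matching_sublists are made up for this example.

```python
L1 = [["animal:cat", "pet:dog", "fruit:apple"],
      ["fruit:orange", "color:green", "color:red", "fruit:apple"]]
L2 = ["apple", "cat", "red"]


def matching_strings(item, lists):
    """All individual strings that contain item as a substring."""
    return [s for sub in lists for s in sub if item in s]


def matching_sublists(item, lists):
    """All sublists in which at least one string contains item."""
    return [sub for sub in lists if any(item in s for s in sub)]


for item in L2:
    print(item, "-> strings: ", matching_strings(item, L1))
    print(item, "-> sublists:", matching_sublists(item, L1))
```

For "cat", matching_strings gives ["animal:cat"] and matching_sublists gives the whole first sublist, which corresponds to the two options asked about above.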