2

I have a list:

["toaster", "oven", "door"]  

I need to get ALL the possible sequential words that can be created. The output should look like this:

["toaster", "toaster oven", "toaster oven door", "oven", "oven door", "door"]

What is the most efficient way to get this list? I've looked at itertools.combinations() and a few other suggestions on Stack Overflow, but found nothing that produces this exact result.

For example, the above list is not a powerset, because only words adjacent to each other in the input list should be used. A powerset would combine toaster and door into toaster door, but those two words are not adjacent.

  • What about a single 'door'? Commented May 21, 2018 at 11:58
  • @AzatIbrakov Thanks, just corrected. "door" would be needed as well. Commented May 21, 2018 at 12:00
  • And toaster door? Commented May 21, 2018 at 12:01
  • @DeepSpace, I'm ONLY looking for strings in sequential order N+1. Commented May 21, 2018 at 12:02
  • "Sequential words" is not a thing. What you want are called substrings, which are subsequences made of all consecutive elements from the initial sequence. So you want to generate all substrings of that sequence. Commented May 21, 2018 at 12:10

3 Answers

10

You can do it like this:

words = ["toaster", "oven", "door"]  

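length = len(words)
out = []
# start is the index of the first word in the run
for start in range(length):
    # end is exclusive, so it goes one past the last word in the run
    for end in range(start+1, length+1):
        out.append(' '.join(words[start:end]))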

print(out)

# ['toaster', 'toaster oven', 'toaster oven door', 'oven', 'oven door', 'door']

You just need to choose the first and last word of each run; the slice words[start:end] then picks out everything in between.

You could also use a list comprehension:

[' '.join(words[start:end]) for start in range(length) for end in range(start+1, length+1)]

# ['toaster', 'toaster oven', 'toaster oven door', 'oven', 'oven door', 'door']
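
The same start/end idea can be wrapped in a reusable generator. This is only a sketch: the name contiguous_phrases and the sep parameter are made up for illustration and do not come from the answer above. For n words it yields n*(n+1)/2 strings, so 6 for the three-word list.

```python
def contiguous_phrases(words, sep=" "):
    """Yield every run of adjacent words, joined with sep (hypothetical helper)."""
    words = list(words)  # accept any iterable, since slicing needs a sequence
    n = len(words)
    for start in range(n):
        # end is exclusive, so it runs one past the last word of the run
        for end in range(start + 1, n + 1):
            yield sep.join(words[start:end])


phrases = list(contiguous_phrases(["toaster", "oven", "door"]))
print(phrases)
# ['toaster', 'toaster oven', 'toaster oven door', 'oven', 'oven door', 'door']
```

Because it is a generator, the phrases are produced one at a time, which may help if the input list is long and the full result does not need to be held in memory.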

4 Comments

Note: you can use for start in range(L-1) with L = len(words) because: 1) using lowercase l is evil and 2) the OP does not want the empty string and you are doing an unnecessary iteration anyway.
Not sure why you range start all the way through to length + 1, you can safely remove the +1 there. It's not a problem because range(3, 3) is empty, but you make Python do that extra step without reason.
@GiacomoAlzetta: no, not L-1. Then start would stop at 1, so you'd never get door on its own.
@GiacomoAlzetta: and no to using L either. Just use length.
3

You want to create sliding windows of increasing length. Use the window() function from the answer linked in the code comment below, inside a range() loop that steps through the window lengths:

from itertools import islice, chain

# window definition from https://stackoverflow.com/a/6822773

def increasing_slices(seq):
    seq = list(seq)
    return chain.from_iterable(window(seq, n=i) for i in range(1, len(seq) + 1))

for combo in increasing_slices(["toaster", "oven", "door"]):
    print(' '.join(combo))

This outputs:

toaster
oven
door
toaster oven
oven door
toaster oven door
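
The snippet above leaves out the body of window(), so it does not run on its own. Below is a self-contained sketch. The window() shown here is an assumption: it is one common way to write a sliding window with islice, and may not be identical to the linked answer.

```python
from itertools import chain, islice


def window(seq, n=2):
    """Yield tuples of n consecutive items from seq (assumed sliding-window helper)."""
    it = iter(seq)
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result
    for elem in it:
        # drop the oldest item and append the newest one
        result = result[1:] + (elem,)
        yield result


def increasing_slices(seq):
    seq = list(seq)
    # one pass per window length: 1, 2, ..., len(seq)
    return chain.from_iterable(window(seq, n=i) for i in range(1, len(seq) + 1))


by_length = [" ".join(combo) for combo in increasing_slices(["toaster", "oven", "door"])]
print(by_length)
# ['toaster', 'oven', 'door', 'toaster oven', 'oven door', 'toaster oven door']
```

Note that the results come out grouped by length rather than by starting word, so the order differs from the list in the question even though the contents are the same.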

0
import itertools

a = ['toaster', 'oven', 'door']

result = []
for i in [itertools.combinations(a, x + 1) for x in range(len(a))]:
    result += [' '.join(e) for e in list(i)]

print(result)

What do you think about this solution? The result is:

['toaster', 'oven', 'door', 'toaster oven', 'toaster door', 'oven door', 'toaster oven door']

1 Comment

That includes non-sequential combos.
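
To see exactly which results are the non-sequential ones, one option is to build the combinations over positions instead of words and keep only those without gaps. This is a rough sketch; the variable names are illustrative and not from the answer above.

```python
from itertools import combinations

words = ["toaster", "oven", "door"]

# every combination of positions, of every size (the powerset minus the empty set)
all_combos = [c for size in range(1, len(words) + 1)
              for c in combinations(range(len(words)), size)]

# positions come out in increasing order, so a combination has no gaps
# exactly when last - first + 1 equals the number of positions chosen
contiguous = [c for c in all_combos if c[-1] - c[0] + 1 == len(c)]

extra = [" ".join(words[i] for i in c) for c in all_combos if c not in contiguous]
print(extra)
# ['toaster door']
```

Filtering works, but it generates all 2**n - 1 combinations only to discard most of them, so for longer lists slicing by start and end is likely the cheaper route.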
