Remove string in list that is substring of another string if both are in the list

Question

Imagine you have a list of lists as follows:

list = [['Hello','Hello World','something else'],
 ['Hello','something'],
 ['somethings'],
 ['Hello World','something else'],
 ['Hello','blabla','Hello World']]

I would like to remove the 'Hello' in the list if and only if 'Hello World' is in it.

What I have tried:

new_list = [elem.remove('Hello') for elem in lista if 'Hello' and 'Hello World' in elem]

However, I get the following error:

list.remove(x): x not in list

And my list becomes this:

[['Hello World', 'something else'],
 ['Hello', 'something'],
 ['somethings'],
 ['Hello World', 'something else'],
 ['Hello', 'blabla', 'Hello World']]

So it worked for the first row, but then it broke.

Extra points for computational efficiency!

Ajax1234 · Accepted Answer · 2019-07-24 18:07:46Z

You can use an inner list comprehension to filter out the 'Hello' values:

l = [['Hello','Hello World','something else'], ['Hello','something'], ['somethings'], ['Hello World','something else'],['Hello','blabla','Hello World']]
new_l = [[c for c in i if c != 'Hello'] if 'Hello World' in i else i for i in l]

Output:

[['Hello World', 'something else'], ['Hello', 'something'], ['somethings'], ['Hello World', 'something else'], ['blabla', 'Hello World']]

3 Comments

Antonio López Ruiz Over a year ago

Wouldn't this make it O(nm^2) instead of O(nm), where m is the size of each inner list and n is the size of the outer list? We have O(m) for the in check and O(n) for the loop, and we would be doing a second O(m) loop.

Ajax1234 Over a year ago

@AntonioLópezRuiz I believe it is just O(mn): 'Hello World' in i is O(m), but [c for c in i if c != 'Hello'] is not run on every step of the internal loop performed by list.__contains__. It runs at most once per sub-list, after the membership test has finished. So the cost is simply O(n)*(O(m)+O(m)) => O(mn).

Antonio López Ruiz Over a year ago

You are right, the filter doesn't run on every step of the membership check. It only adds one extra pass for the sub-lists that contain 'Hello World', which honestly is small. It would only be O(nm^2) if the filter ran again on every step of that check.
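
To make the count discussed in these comments concrete, here is a rough sketch that mirrors the comprehension from the answer but counts how many elements get visited. The helper filter_rows and its counter are hypothetical and only for illustration. Each row is scanned at most once for the membership test and at most once more for the filter, so the total stays within twice the number of elements, which is O(nm).

```python
def filter_rows(rows, target='Hello', marker='Hello World'):
    """Drop `target` from each row containing `marker`; also count visited elements."""
    visited = 0
    result = []
    for row in rows:
        # Membership test: at most one pass over the row, stops at the first match.
        found = False
        for item in row:
            visited += 1
            if item == marker:
                found = True
                break
        if found:
            # Filtering pass: exactly one more full pass over the row.
            visited += len(row)
            result.append([item for item in row if item != target])
        else:
            result.append(row)
    return result, visited


rows = [['Hello', 'Hello World', 'something else'],
        ['Hello', 'something'],
        ['somethings'],
        ['Hello World', 'something else'],
        ['Hello', 'blabla', 'Hello World']]

new_rows, visited = filter_rows(rows)
total = sum(len(row) for row in rows)
print(visited, total)  # 17 11
```

Here 17 visits for 11 elements is below the 2 * 11 upper bound, even though three of the five rows go through the filtering branch.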

tzaman · Accepted Answer · 2019-07-24 18:24:53Z

The problem lies here:

if 'Hello' and 'Hello World' in elem

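This does not work the way you think it does. 'Hello' on its own is a separate operand of and, and it is always truthy because it is a non-empty string, so the whole condition reduces to 'Hello World' in elem. That is why the fourth row, which contains 'Hello World' but no 'Hello', raises the error. You need to write out the full test both times: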

if 'Hello' in elem and 'Hello World' in elem
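
A quick way to see the difference, as a minimal sketch using the fourth row from the question (the names row, buggy and fixed are just for illustration):

```python
# Fourth row from the question: contains 'Hello World' but not 'Hello'.
row = ['Hello World', 'something else']

# Parsed as: 'Hello' and ('Hello World' in row).
# 'Hello' is a non-empty string, so it is truthy and the result is
# whatever the membership test on the right returns.
buggy = 'Hello' and 'Hello World' in row

# Both membership tests are spelled out.
fixed = 'Hello' in row and 'Hello World' in row

print(buggy)  # True
print(fixed)  # False
```

With the buggy condition this row passes the filter, and calling remove('Hello') on it then raises the ValueError shown in the question.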

Separately, writing this as a list comprehension doesn't quite make sense since list.remove modifies the original list, and doesn't return anything. Your new_list will just be full of None. Just use a for loop:

for sub_list in my_list:  # also note, you should not use `list` as a variable name. 
    if 'Hello' in sub_list and 'Hello World' in sub_list:
        sub_list.remove('Hello')
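
As a small illustration of the point about list.remove (the sample row below is made up for this sketch): it mutates the list in place, returns None, and removes only the first matching element, so a sub-list with repeated 'Hello' entries would keep the later ones.

```python
# Hypothetical row with a duplicate 'Hello', not taken from the question.
sub_list = ['Hello', 'blabla', 'Hello World', 'Hello']

# remove() changes sub_list itself and has no useful return value.
result = sub_list.remove('Hello')

print(result)    # None
print(sub_list)  # ['blabla', 'Hello World', 'Hello']
```

If duplicates are possible in your data, the filtering approach with elem != 'Hello' drops all of them, which may be closer to what you want.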

If you actually don't want to modify the original list / sub-lists, you'll need to create new lists explicitly instead of using remove:

new_list = []
for sub_list in my_list:
    if 'Hello World' in sub_list:
        new_sub_list = [elem for elem in sub_list if elem != 'Hello']
    else:
        new_sub_list = sub_list[:] # make a copy to avoid referencing the original
    new_list.append(new_sub_list)

This whole thing can also be written as a nested list-comprehension if you want:

new_list = [sub_list[:] if 'Hello World' not in sub_list else 
            [elem for elem in sub_list if elem != 'Hello']
            for sub_list in my_list]

But in either case, I'd probably prefer the explicit for loop construction just for clarity.
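
If the goal is the more general one in the title (drop any string that is a substring of a different string in the same sub-list, not just 'Hello'), one possible sketch follows. The helper name drop_substrings is made up, and the pairwise check is quadratic in the sub-list length, so treat it as an illustration rather than an optimized solution.

```python
def drop_substrings(strings):
    """Return a new list without strings that occur inside another, different string."""
    return [s for s in strings
            if not any(s != other and s in other for other in strings)]


my_list = [['Hello', 'Hello World', 'something else'],
           ['Hello', 'something'],
           ['somethings'],
           ['Hello World', 'something else'],
           ['Hello', 'blabla', 'Hello World']]

# Builds new sub-lists; my_list itself is left untouched.
new_list = [drop_substrings(sub_list) for sub_list in my_list]
print(new_list)
```

On the data from the question this gives the same result as the 'Hello'-specific versions above, because 'Hello' inside 'Hello World' is the only substring pair within a single sub-list.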
