Replace identical elements in a list without loop

Question

I'm trying to replace all identical elements in a list with a new string, and also trying to move away from using loops for everything.

# My aim is to turn:
list = ["A", "", "", "D"]
# into:
list = ["A", "???", "???", "D"]
# but without using a for-loop

I started off with variations of comprehensions:

# e.g. 1
['' = "???"(i) for i in list]
# e.g. 2
list = [list[i] .replace '???' if ''(i) for i in range(len(lst))]

Then I tried to employ Python's map function as seen here:

list[:] = map(lambda i: "???", list)
# I couldn't work out where to add the '""' to be replaced.

Finally I butchered a third solution:

list[:] = ["???" if ''(i) else i for i in list]

I feel like I'm moving further from a sensible line of attack, I just want a tidy way to complete a simple task.

Does this answer your question? In-place replacement of all occurrences of an element in a list in python
– Julien Sorin, Commented Aug 20, 2021 at 13:30
Yes, thank you. However, I also got plenty of novel solutions to my question, including one that uses Python's map function correctly.
– Solebay Sharp, Commented Aug 20, 2021 at 13:35
@PierreD is it faster, or just more concise for a human to read?
– Solebay Sharp, Commented Aug 20, 2021 at 13:37

user15801675 · Accepted Answer · 2021-08-20 13:31:54Z

3

You can try this:

list1 = ["A", "", "", "D"]

list2 = list(map(lambda x: "???" if not x else x, list1))

print(list2)

Here is a longer version of the above one:

list1 = ["A", "", "", "D"]
def check_string(string):
    if not string:
        return "???"
    return string

list2 = list(map(check_string, list1))
print(list2)

Both versions take advantage of the fact that an empty string "" is falsy: `not x` is True only for the empty strings, so those are replaced with "???" and every other element is returned unchanged. Output:

['A', '???', '???', 'D']
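
The question's own map attempt used slice assignment (`list[:] = ...`) to update the list in place, and the same falsy check fits there too. The following is a minimal sketch under that assumption; the names `items` and `alias` are illustrative and do not come from the question.

```python
# Illustrative name; avoids shadowing the built-in `list` as the question's code does.
items = ["A", "", "", "D"]
alias = items  # a second reference to the same list object

# Slice assignment replaces the contents of the existing list object,
# so every reference to that list sees the change.
items[:] = map(lambda x: "???" if not x else x, items)

print(items)           # ['A', '???', '???', 'D']
print(alias is items)  # True
```

Plain assignment (`items = list(map(...))`) would instead bind the name to a new list and leave `alias` pointing at the old contents.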

answered Aug 20, 2021 at 13:31

user15801675


Pierre D · Accepted Answer · 2021-08-20 13:49:04Z

2

For concision, a list comprehension works (if we allow list comprehensions, which are themselves a form of loop). As correctly noted by @ComteHerappait, this version replaces empty strings with '???', consistent with the examples in the question.

>>> l = ["A", "", "", "D"]
>>> [e or '???' for e in l]
['A', '???', '???', 'D']

If instead we focus on replacing duplicate elements (keeping the first occurrence of each value and replacing later repeats), then:

>>> seen = set()
>>> newl = ['???' if e in seen or seen.add(e) else e for e in l]
>>> newl
['A', '', '???', 'D']

Finally, the following replaces every element whose value occurs more than once in the list, including its first occurrence:

>>> from collections import Counter
>>> c = Counter(l)
>>> newl = [e if c[e] < 2 else '???' for e in l]
>>> newl
['A', '???', '???', 'D']
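
On the question's four-element list, the first and third variants give the same output, which hides how they differ. As a rough illustration, here are the three readings applied to a list that also contains a repeated non-empty value. The input list and the names `data`, `empties`, `repeats` and `all_dupes` are made up for this sketch.

```python
from collections import Counter

data = ["A", "B", "B", "", "C", ""]

# 1. Replace falsy elements (here: empty strings) only.
empties = [e or "???" for e in data]

# 2. Keep the first occurrence of each value, replace later repeats.
#    set.add() returns None, so `seen.add(e)` is always falsy; it only records e.
seen = set()
repeats = ["???" if e in seen or seen.add(e) else e for e in data]

# 3. Replace every element whose value occurs more than once.
counts = Counter(data)
all_dupes = [e if counts[e] < 2 else "???" for e in data]

print(empties)    # ['A', 'B', 'B', '???', 'C', '???']
print(repeats)    # ['A', 'B', '???', '', 'C', '???']
print(all_dupes)  # ['A', '???', '???', '???', 'C', '???']
```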

edited Aug 20, 2021 at 13:49

answered Aug 20, 2021 at 13:34

Pierre D


3 Comments

ComteHerappait Over a year ago

this works very well for removing empty strings, but I think the question is about duplicates.

Pierre D Over a year ago

you are correct; the question is ambiguous, see my comment.

Pierre D Over a year ago

Just FWIW, this updated answer responds to all the cases of the OP's question: replacement of empty strings, replacement of duplicates (starting from the first dupe), or replacement of all duplicates. The list comprehension (first code snippet) is also the fastest solution so far, both for short lists and long lists.

Cory Kramer · Accepted Answer · 2021-08-20 13:31:15Z

1

You could use a list comprehension: compare each element, and if it's a match, replace it with a different string; otherwise just keep the original element.

>>> data = ["A", "", "", "D"]
>>> ['???' if i == '' else i for i in data]
['A', '???', '???', 'D']
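
The explicit comparison can be wrapped in a small helper that works for any target value. This is only a sketch: `replace_all` is a hypothetical name, not a built-in or library function. The second print shows where it differs from a truthiness test such as `e or '???'`, which also replaces other falsy values like 0 and None.

```python
def replace_all(items, old, new):
    """Return a new list with every element equal to `old` replaced by `new`."""
    return [new if item == old else item for item in items]


data = ["A", "", 0, None, "D"]

# Equality check: only the empty string is replaced.
print(replace_all(data, "", "???"))  # ['A', '???', 0, None, 'D']

# Truthiness check: every falsy element is replaced.
print([e or "???" for e in data])    # ['A', '???', '???', '???', 'D']
```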

answered Aug 20, 2021 at 13:31

Cory Kramer


2 Comments

user2668284 Over a year ago

That works but contains an explicit 'for' loop which is what the OP wanted to avoid

Cory Kramer Over a year ago

@DarkKnight What do you think map does under the hood ;) there is no solution to this problem that does not involve explicit or implicit looping

user2668284 · Accepted Answer · 2021-08-20 13:36:08Z

1

How about this:-

myList = ['A', '', '', 'D']
myMap = map(lambda i: '???' if i == '' else i, myList)
print(list(myMap))

...will result in:-

['A', '???', '???', 'D']

answered Aug 20, 2021 at 13:36

user2668284

2 Comments

user15801675 Over a year ago

That looks a lot like my solution

user2668284 Over a year ago

You're right. We were obviously writing code coincidentally

Rm4n · Accepted Answer · 2021-08-21 08:14:03Z

-1

If you want to avoid using loops, as the title suggests, you can use np.where instead of a list comprehension, and it's faster for large arrays:

import numpy as np

data = np.array(["A", "", "", "D"], dtype='object')
index = np.where(data == '')[0]  # positions of the empty strings
data[index] = "???"
data.tolist()

and the result:

['A', '???', '???', 'D']

Speed test

for rep in [1, 10, 100, 1000, 10000]:
    data = ["A", "", "", "D"] * rep
    print(f'array of length {4 * rep}')
    print('np.where:')
    %timeit data2 = np.array(data, dtype='object'); index = np.where(data2 == '')[0]; data2[index] = "???"; data2.tolist()
    print('list-comprehension:')
    %timeit ['???' if i == '' else i for i in data]

and the result:

array of length 4
np.where:
The slowest run took 11.79 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 5: 10.7 µs per loop
list-comprehension:
The slowest run took 5.75 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 5: 487 ns per loop
array of length 40
np.where:
The slowest run took 7.08 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 5: 13 µs per loop
list-comprehension:
100000 loops, best of 5: 2.99 µs per loop
array of length 400
np.where:
The slowest run took 4.83 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 5: 31 µs per loop
list-comprehension:
10000 loops, best of 5: 26 µs per loop
array of length 4000
np.where:
1000 loops, best of 5: 225 µs per loop
list-comprehension:
1000 loops, best of 5: 244 µs per loop
array of length 40000
np.where:
100 loops, best of 5: 2.27 ms per loop
list-comprehension:
100 loops, best of 5: 2.63 ms per loop

In this test, np.where becomes faster than this list comprehension at around 4000 elements and above.
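
The `%timeit` magic used in the test is only available in IPython/Jupyter. For a plain Python script, a comparable measurement can be sketched with the standard-library `timeit` module. The function names, the list size and the repetition counts below are arbitrary choices for illustration, and no timings are shown because they depend on the machine.

```python
import timeit

data = ["A", "", "", "D"] * 1000  # 4000 elements, matching one row of the test


def with_comparison():
    return ["???" if i == "" else i for i in data]


def with_truthiness():
    return [e or "???" for e in data]


# Both must give the same result on this input before timing is meaningful.
assert with_comparison() == with_truthiness()

for func in (with_comparison, with_truthiness):
    # repeat() returns one total time per run; the minimum is the least noisy.
    best = min(timeit.repeat(func, number=200, repeat=5))
    print(f"{func.__name__}: {best / 200 * 1e6:.1f} us per call")
```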

edited Aug 21, 2021 at 8:14

answered Aug 20, 2021 at 14:06

Rm4n


8 Comments

Pierre D Over a year ago

This is one of the slowest methods for short lists. For the four-element list of the OP's question, it takes 7.89 µs ± 237 ns per loop, which is 23.8x slower than a simple list comprehension. For large lists (that are not already an np.array), the relative difference decreases; it asymptotically stabilizes at around 1.9x slower.

Rm4n Over a year ago

@PierreD check out the updated post; for large arrays this method is faster

Pierre D Over a year ago

you used the wrong list comprehension. The one I proposed is [e or '???' for e in data]. That ends up at 1.9x faster than np.where in your loop of %timing: np.where: 1.83 ms ± 1.43 µs; list comprehension: 959 µs ± 735 ns. Before writing my comment, I had tested up to 100 million random elements. That's why I asserted 1.9x asymptotic speedup against np.where.

Rm4n Over a year ago

what do you mean by wrong? The list-comprehension I compared with is the solution to identical elements as the title of OP suggests (and as can be seen in other answers). Yours just works for empty elements.

Pierre D Over a year ago

You used %timeit ['???' if i == '' else i for i in data]. That replaces only empty elements, just like most of the answers here. For the case of empty elements, I suggested [e or '???' for e in data], which is between 28x and 1.9x faster than np.array and np.where. That's why I say you used the wrong list comprehension. As for replacing duplicates, the other parts of my answer address that. I note that it seems to be the only answer so far that does it.
