
I am new to Haskell and I am trying the code below to remove duplicates from a list. However, it does not seem to work.

compress []     = []
compress (x:xs) = x : (compress $ dropWhile (== x) xs)

I have searched around; all the suggestions use foldr, map, or head. Is there any implementation with basic constructs?

  • Hint: you have exactly the correct idea, except dropWhile is the wrong function. dropWhile removes every element of the list until it hits one that == x; you need something that removes every element like that instead. Commented Jun 30, 2015 at 23:49
  • There's a good discussion on this subject here. Commented Jul 1, 2015 at 0:07
  • Note that there is a built-in named nub from the Data.List module which does that. Commented Jul 1, 2015 at 0:39
  • @TikhonJelvis Are you sure? GHCi gives dropWhile (==1) [1,1,1,2,3,1,4,5,6] ==> [2,3,1,4,5,6]. Commented Jul 1, 2015 at 8:18
  • Oh yeah, totally got that backwards, sorry! Either way, dropWhile is not the right function. Commented Jul 1, 2015 at 17:16

3 Answers


I think the issue you are referring to is that your current implementation only gets rid of adjacent duplicates. As noted in a comment, the built-in function nub eliminates every duplicate, even non-adjacent ones, keeping only the first occurrence of each element. But since you asked how to implement such a function with basic constructs, how about this?

myNub :: (Eq a) => [a] -> [a]
myNub (x:xs) = x : myNub (filter (/= x) xs)
myNub [] = []

The only new function I introduced there is filter, which keeps the elements of a list that satisfy a predicate (here, every element of the rest of the list that is not equal to the current one).
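To see it in action, here is a self-contained sketch (repeating the definition from the answer) that you can load into GHCi:

```haskell
-- Sketch of myNub: keep the head, then recurse on the tail
-- with all copies of the head filtered out.
myNub :: Eq a => [a] -> [a]
myNub (x:xs) = x : myNub (filter (/= x) xs)
myNub []     = []

-- myNub [1,1,2,2,3,3,1] gives [1,2,3]: the trailing,
-- non-adjacent 1 is removed as well.
```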

I hope this helps.




First of all, never simply state "does not work" in a question. This leaves the reader to guess whether it's a compile-time error, a run-time error, or a wrong result.

In this case, I am guessing it's a wrong result, like this:

> compress [1,1,2,2,3,3,1]
[1,2,3,1]

The problem with your code is that it only removes successive duplicates. The first pair of 1s gets compressed, but the final lone 1 is not removed, because it is not adjacent to the others.

If you can, sort the list in advance. That will make equal elements close, and then compress does the right job. The output will be in a different order, of course. There are ways to keep the order too if needed (start with zip [0..] xs, then sort, then ...).
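A minimal sketch of the sort-then-compress idea, using sort from Data.List (the order-preserving zip [0..] variant is left as described):

```haskell
import Data.List (sort)

-- Adjacent-duplicate removal, as in the question.
compress :: Eq a => [a] -> [a]
compress []     = []
compress (x:xs) = x : compress (dropWhile (== x) xs)

-- Sorting first makes equal elements adjacent, so compress
-- then removes every duplicate. The output comes out sorted.
compressSorted :: Ord a => [a] -> [a]
compressSorted = compress . sort
```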

If you can not sort because there is really no practical way to define a comparison, but only an equality, then use nub. Be aware that this is much less efficient than sorting and compressing. This loss of performance is intrinsic: with only an equality test, you can only use an inefficient quadratic algorithm.

1 Comment

The OP's question was about implementing nub, not finding it in the library. The discussion of the problems is interesting, but it does not address that, or the OP's confusion about folds and map.

foldr and map are very basic FP constructs. However, they are very general and I found them a bit mind-bending to understand for a long time. Tony Morris' talk Explain List Folds to Yourself helped me a lot.

In your case an existing list function like filter :: (a -> Bool) -> [a] -> [a], with a predicate that excludes your duplicates, could be used in lieu of dropWhile.

5 Comments

The base library included with most Haskell implementations (certainly GHC) includes efficient Set data structures that deduplicate as you add to them. Combining Data.Set.fromList and Data.Set.toList would let you do this, but I don't think library use is your intent here.
However, converting the list into a set will order the elements and requires an Ord constraint. nub only requires an Eq constraint and preserves the ordering. Of course, the set option is more efficient at O(n log n) vs O(n^2). The reordering and/or the additional constraint may not be desirable in this case.
@fgv what do you think about my answer? I kept the Set suggestion as a comment because it really didn't seem to match the OP's intent.
I think your answer steers the OP in the right direction. The idea of using set or map is good too from a performance standpoint, only that it will reorder elements and put that additional constraint, which the OP may not want...
I completely agree. I was apprehensive about posting the comment suggesting Data.Set.fromList based on library use, but your identification of output ordering and the Ord constraint on input are even more important considerations since they change the type of the OP's function.
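The Set-based approach discussed in these comments can be sketched as follows; note the Ord constraint and that the simple version returns the elements in ascending order, while a slightly longer variant keeps the original order:

```haskell
import qualified Data.Set as Set

-- O(n log n) deduplication, but requires Ord and
-- returns the elements sorted, not in input order.
setNub :: Ord a => [a] -> [a]
setNub = Set.toList . Set.fromList

-- Order-preserving variant: walk the list, keeping the
-- first occurrence of each element and tracking what has
-- been seen in a Set.
ordNub :: Ord a => [a] -> [a]
ordNub = go Set.empty
  where
    go _ [] = []
    go seen (x:xs)
      | x `Set.member` seen = go seen xs
      | otherwise           = x : go (Set.insert x seen) xs
```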
