
I am new to Haskell and I am trying the code below to remove duplicates from a list. However, it does not seem to work.

compress []     = []
compress (x:xs) = x : (compress $ dropWhile (== x) xs)

I have searched around; all the suggestions use foldr, map, or head. Is there any implementation with basic constructs?

  • Hint: you have exactly the correct idea, except dropWhile is the wrong function. dropWhile removes every element of the list until it hits one that == x; you need something that removes every element like that instead. Commented Jun 30, 2015 at 23:49
  • There's a good discussion on this subject here. Commented Jul 1, 2015 at 0:07
  • Note that there is a built-in named nub from the Data.List module which does that. Commented Jul 1, 2015 at 0:39
  • @TikhonJelvis Are you sure? GHCi gives dropWhile (==1) [1,1,1,2,3,1,4,5,6] ==> [2,3,1,4,5,6]. Commented Jul 1, 2015 at 8:18
  • Oh yeah, totally got that backwards, sorry! Either way, dropWhile is not the right function. Commented Jul 1, 2015 at 17:16

3 Answers


I think the issue you are referring to is that your current implementation only gets rid of adjacent duplicates. As noted in a comment, the built-in function nub eliminates every duplicate, even non-adjacent ones, keeping only the first occurrence of each element. But since you asked how to implement such a function with basic constructs, how about this?

myNub :: (Eq a) => [a] -> [a]
myNub (x:xs) = x : myNub (filter (/= x) xs)
myNub [] = []

The only new function I introduced there is filter, which keeps the elements of a list that satisfy a predicate (here, every element of the rest of the list that is not equal to the current one).
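To see it in action, here is a self-contained sketch (repeating the definition from the answer) that you can load into GHCi:

```haskell
-- Sketch of myNub: keep the head, then recurse on the tail
-- with all copies of the head filtered out.
myNub :: Eq a => [a] -> [a]
myNub (x:xs) = x : myNub (filter (/= x) xs)
myNub []     = []

-- myNub [1,1,2,2,3,3,1] gives [1,2,3]: the trailing,
-- non-adjacent 1 is removed as well.
```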

I hope this helps.




First of all, never simply state "does not work" in a question. This leaves the reader to guess whether it's a compile-time error, a run-time error, or a wrong result.

In this case, I am guessing it's a wrong result, like this:

> compress [1,1,2,2,3,3,1]
[1,2,3,1]

The problem with your code is that it only removes successive duplicates. The first pair of 1s gets compressed, but the final lone 1 is not removed, because it is not adjacent to the others.

If you can, sort the list in advance. That will make equal elements close, and then compress does the right job. The output will be in a different order, of course. There are ways to keep the order too if needed (start with zip [0..] xs, then sort, then ...).
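A minimal sketch of the sort-then-compress idea, using sort from Data.List (the order-preserving zip [0..] variant is left as described):

```haskell
import Data.List (sort)

-- Adjacent-duplicate removal, as in the question.
compress :: Eq a => [a] -> [a]
compress []     = []
compress (x:xs) = x : compress (dropWhile (== x) xs)

-- Sorting first makes equal elements adjacent, so compress
-- then removes every duplicate. The output comes out sorted.
compressSorted :: Ord a => [a] -> [a]
compressSorted = compress . sort
```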

If you can not sort because there is really no practical way to define a comparison, but only an equality, then use nub. Be aware that this is much less efficient than sorting and compressing. This loss of performance is intrinsic: with only an equality test, you can only use an inefficient quadratic algorithm.

1 Comment

The OP's question was about implementing nub, not finding it in the library. The discussion of the problems is interesting, but it does not address that, or the OP's confusion about folds and map.

foldr and map are very basic FP constructs. However, they are very general and I found them a bit mind-bending to understand for a long time. Tony Morris' talk Explain List Folds to Yourself helped me a lot.

In your case an existing list function like filter :: (a -> Bool) -> [a] -> [a], with a predicate that excludes your duplicates, could be used in lieu of dropWhile.

5 Comments

The base library included with most Haskell implementations (certainly GHC) includes efficient Set data structures that deduplicate as you add to them. Combining Data.Set.fromList and Data.Set.toList would let you do this, but I don't think library use is your intent here.
However, converting the list into a set will order the elements and requires an Ord constraint. nub only requires an Eq constraint and preserves the ordering. Of course, the set option is more efficient at O(n log n) vs O(n^2). The reordering and/or the additional constraint may not be desirable in this case.
@fgv what do you think about my answer? I kept the Set suggestion as a comment because it really didn't seem to match the OP's intent.
I think your answer steers the OP in the right direction. The idea of using set or map is good too from a performance standpoint, only that it will reorder elements and put that additional constraint, which the OP may not want...
I completely agree. I was apprehensive about posting the comment suggesting Data.Set.fromList based on library use, but your identification of output ordering and the Ord constraint on input are even more important considerations since they change the type of the OP's function.
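The Set-based approach discussed in these comments can be sketched as follows; note the Ord constraint and that the simple version returns the elements in ascending order, while a slightly longer variant keeps the original order:

```haskell
import qualified Data.Set as Set

-- O(n log n) deduplication, but requires Ord and
-- returns the elements sorted, not in input order.
setNub :: Ord a => [a] -> [a]
setNub = Set.toList . Set.fromList

-- Order-preserving variant: walk the list, keeping the
-- first occurrence of each element and tracking what has
-- been seen in a Set.
ordNub :: Ord a => [a] -> [a]
ordNub = go Set.empty
  where
    go _ [] = []
    go seen (x:xs)
      | x `Set.member` seen = go seen xs
      | otherwise           = x : go (Set.insert x seen) xs
```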
