0

Does anyone know of a source for performance characteristics on the "contains" method within the different flavors of Scala Lists? The scala language docs cover the primary operations like head, tail, append, and so forth, but don't seem to cover the performance of 'contains.' (Or at least I didn't find anything as such.)

FWIW, I need the fastest structure that will effectively tell me whether an element exists within its listing. That listing, once initially compiled, will not undergo any further a/m/d operations.

This is for Scala version 2.10.0

EDIT: in case it should make any difference, this is a listing of text segments (~16 to 48 characters each.) And, to clarify, the docs did contain one small table that showed look-up performance - but for only a small set of the list/map implementations.

3
  • 2
    It'd have to be a pretty crazy kinda list for contains not to be O(n). If you want fast contains, use a Set Commented Jun 8, 2013 at 0:07
  • This table is all over the map when it comes to look-ups. Granted, most of these are hash (per earlier edit), but there's enough variance there that i'd like to make sure: scala-lang.org/docu/files/collections-api/collections_40.html Commented Jun 8, 2013 at 0:13
  • For most purposes, there is only one "flavor" of list in Scala, List which is a classic, functional (Lisp-like) cons-cell-based list. Scala's List is a concrete type while Java's List is abstract. What Java calls List Scala calls Seq, an abstract type for all collections that maintain a specific order of their entries identical to or the opposite of the order in which entries were added. As others have pointed out, what you want is a Set, whose precise purpose is to support fast tests for the presence or absence of a particular value. Commented Jun 8, 2013 at 16:03

1 Answer 1

1

This seems the right job for a tree, a RB tree in this case, where the search executed by contains performs logarithmically in the number of fragments.

Since you only need to check containment, you should use a set to further reduce lookup time.

The solution is TreeSet

Sign up to request clarification or add additional context in comments.

8 Comments

Out of curiosity, I'm seem several of the scala list implementations seemingly combine parallel functionality between lists, arrays, and so fort...do any of the basic constructs use a tree for implementation of contains - or is it done via brute-force iteration?
Iteration. Anyway, note that RB trees need additional memory for storing the data structure (at least a colour bit and two pointers per node). Choosing one or the other actually depends on the situation: RB trees are heavier and you will perceive the benefit only with a big number of stored elements (I did not actually benchmark this). Also note that Lists can preserve insertion order (like arrays), while Trees do not by design.
Why not a HashMap? It will provide O(1)/
Best case. Worst case could be O(n). A RB tree is consistently O(log(n))
Stefano, do you have any gut feeling for the number of elements required before you start to see that extra component weight start to pay off in performance. I understand there's a lot of factors involved, but I'm just try to get a rough ballpark. For my situation I'm only dealing with less than ~500, or so, elements, but the look-up will be done very frequently. (As compared to sifting through a quadrillion elements, but only doing it infrequently.)
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.