76

I am doing a project in Scala, but I am fairly new to the language, despite this I do have a Java background. I see that Scala doesn't have support for Java's ArrayList class, therefore I am wondering what is Scala's equivalent of it; and, if there is such an equivalent, I was wondering if there are any important differences between the Java and Scala versions of the class.

EDIT: I'm not looking for a specific behaviour so much as an internal representation that being: data stored in an array, but in such a way that the whole array isn't visible, except for the part you use.

2
  • What would be the difference between the whole array being visible versus the part that you use? Commented Nov 27, 2011 at 18:42
  • @Daniel, the main difference to me is that ArrayList handles the size for me (i.e. if I don't know what it will be ahead of time or know it will change, I don't have to worry about allocating extra space and keeping track of how much space I actually use). Commented Nov 27, 2011 at 21:21

3 Answers 3

116

I can think of 3 more specific questions to address yours:

  • What is Scala's default collection?
  • What Scala collection has characteristics similar to ArrayList?
  • What's a good replacement for Array in Scala?

So here are the answers for these:

What is Scala's default collection?

Scala's equivalent of Java's List interface is the Seq. A more general interface exists as well, which is the GenSeq -- the main difference being that a GenSeq may have operations processed serially or in parallel, depending on the implementation.

Because Scala allows programmers to use Seq as a factory, they don't often bother with defining a particular implementation unless they care about it. When they do, they'll usually pick either Scala's List or Vector. They are both immutable, and Vector has good indexed access performance. On the other hand, List does very well the operations it does well.

What Scala collection has characteristics similar to ArrayList?

That would be scala.collection.mutable.ArrayBuffer.

What's a good replacement for Array in Scala?

Well, the good news is, you can just use Array in Scala! In Java, Array is often avoided because of its general incompatibility with generics. It is a co-variant collection, whereas generics is invariant, it is mutable -- which makes its co-variance a danger, it accepts primitives where generics don't, and it has a pretty limited set of methods.

In Scala, Array -- which is still the same Array as in Java -- is invariant, which makes most problems go away. Scala accepts AnyVal (the equivalent of primitives) as types for its "generics", even though it will do auto-boxing. And through the "enrich my library" pattern, ALL of Seq methods are available to Array.

So, if you want a more powerful Array, just use an Array.

What about a collection that shrinks and grows?

The default methods available to all collections all produce new collections. For example, if I do this:

val ys = xs filter (x => x % 2 == 0)

Then ys will be a new collection, while xs will still be the same as before this command. This is true no matter what xs was: Array, List, etc.

Naturally, this has a cost -- after all, you are producing a new collection. Scala's immutable collections are much better at handling this cost because they are persistent, but it depends on what operation is executed.

No collection can do much about filter, but a List has excellent performance on generating a new collection by prepending an element or removing the head -- the basic operations of a stack, as a matter of fact. Vector has good performance on a bunch of operations, but it only pays if the collection isn't small. For collections of, say, up to a hundred elements, the overall cost might exceed the gains.

So you can actually add or remove elements to an Array, and Scala will produce a new Array for you, but you'll pay the cost of a full copy when you do that.

Scala mutable collections add a few other methods. In particular, the collections that can increase or decrease size -- without producing a new collection -- implement the Growable and Shrinkable traits. They don't guarantee good performance on these operations, though, but they'll point you to the collections you want to check out.

Sign up to request clarification or add additional context in comments.

27 Comments

Here is an online test that demonstrates ArrayList being about 3 times faster than LinkedList, despite the necessary reallocations. I had to reduce N to make it work online. With N = 50000000 on my own machine, ArrayList is about 10 times faster than LinkedList. What results do you get on your machine? I would be really surprised if LinkedList was faster, because it needs more assignments per insertion (time), and each element needs an additional Node object (space).
@DanielC.Sobral: Here is a followup online test showing ArrayList being 12 times faster than LinkedList for inserts in the middle. LinkedList only appears faster when inserting near the beginning.
@DanielC.Sobral: growing an ArrayList which doubles in size each time it reallocates means that at most you do twice as many copies as there are elements -- or in other words, on average, each element is copied twice. No matter how much you grow the array, that is. And in return, you get a more efficient data structure with much better locality and cache characteristics. A linked list is virtually never faster. If you never need to look up elements, and you never need to iterate over the data structure, then a linked list may be faster. Otherwise? No.
@DanielC.Sobral: in those cases, ArrayList will usually still be faster, because it is fewer allocations, less indirection, better memory locality. Keep in mind that inserting into the middle of a list requires you to iterate to find the middle. And that is much slower for a linked list. The rule of thumb is that "for some things, it's obvious that ArrayList is faster. For other things, it is surprising that ArrayList is faster". :)
This wasn't always the case, but the gap between CPU and memory performance has grown extremely wide. A random memory access is painfully slow, and linked lists are all random accesses. ArrayList (or any other dynamic array data structure) has virtually no random accesses. From a hardware point of view, that is a huge advantage, almost no matter which operations you perform on your data structure
|
26

It's ArrayBuffer from scala.collection.mutable. You can find the scaladocs here.

Comments

0

It's hard to say exactly what you should do because you haven't said what behavior of ArrayList you're interested in using. It's more useful to think in terms of which scala traits you want to take advantage of. Here's a good explanation: http://grahamhackingscala.blogspot.com/2010/02/how-to-convert-java-list-to-scala-list.html.

That said, you probably want some sort of IndexedSeq.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.