If I need an Array with multiple degrees, I can't use a Vector. But let's consider the simple case of having only one degree: When to use Scala Vector, when Scala Array?
3 Answers
When it comes to time and space complexity, arrays are surprisingly versatile. You might expect arrays to be slow for inserts and deletes, until you consider modern memory architectures. CPUs can prefetch and stream arrays straight from memory while performing linear operations on them, such as the copying needed for an insert or delete. Most other data structures require expensive indirections, which defeat cache prefetching.
Immutability
Since linear access to arrays is very fast, I often (for smaller arrays) consider them as immutable and copy them on write.
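As a minimal sketch of what copy-on-write over a plain array might look like: the object name CopyOnWrite and the helpers insertAt and removeAt are hypothetical names chosen for illustration, not part of any library. Each call leaves the input untouched and returns a fresh array built with bulk copies.

```scala
object CopyOnWrite {
  /** Returns a new array with `elem` inserted at `index`; `src` is left untouched. */
  def insertAt(src: Array[Int], index: Int, elem: Int): Array[Int] = {
    require(index >= 0 && index <= src.length, s"index $index out of bounds")
    val dst = new Array[Int](src.length + 1)
    // Two bulk copies: the prefix before `index`, then the suffix shifted right by one.
    System.arraycopy(src, 0, dst, 0, index)
    dst(index) = elem
    System.arraycopy(src, index, dst, index + 1, src.length - index)
    dst
  }

  /** Returns a new array without the element at `index`; `src` is left untouched. */
  def removeAt(src: Array[Int], index: Int): Array[Int] = {
    require(index >= 0 && index < src.length, s"index $index out of bounds")
    val dst = new Array[Int](src.length - 1)
    // Copy everything before `index`, then everything after it shifted left by one.
    System.arraycopy(src, 0, dst, 0, index)
    System.arraycopy(src, index + 1, dst, index, src.length - index - 1)
    dst
  }
}
```

Callers that still hold the original array never observe a change, which is what makes it reasonable to treat small arrays as if they were immutable.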
How to choose
When I consider a data structure for a certain task, I start by analyzing how the task would perform when implemented as a simple array. Only after this first step do I weigh the benefits and penalties of existing abstractions, such as vectors. Possible benefits of other data structures might be readability, code complexity, performance at scale, opportunities for garbage collection, ease of serialization and cache coherence. Readability and code complexity are at the top of my list, and this often weighs in favor of abstract data structures such as Vectors, Lists, Streams and Maps.
Consider GPU acceleration
When starting with arrays, I always consider the possibility of GPU execution. For example, machine learning relies heavily on vector (not to be confused with Scala's Vector) and matrix operations (linear algebra), which are accelerated on GPU hardware and are often less memory intensive.
Choosing a data structure is, as always, a matter of context.
First of all, you have to take into account the issue at hand, the access pattern you expect to have and the performance characteristics. The Scala documentation includes a great comparison. Both collections share the common trait of being indexed, allowing fast random access, but you'll notice some differences between the two.
A key difference between the two, as suggested in the comments, is that a Vector is an immutable collection, while Arrays are mutable.
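The difference can be sketched roughly as follows. The object MutabilityDemo and its two helpers are hypothetical names used only for illustration; the calls they rely on (indexed assignment on Array, updated on Vector) are standard library operations.

```scala
object MutabilityDemo {
  /** Updates an Array in place and returns the very same instance. */
  def bumpFirstInPlace(arr: Array[Int]): Array[Int] = {
    arr(0) = arr(0) + 1 // in-place write, visible to every holder of `arr`
    arr
  }

  /** "Updates" a Vector by building a new one; the input is unchanged. */
  def bumpFirstCopy(vec: Vector[Int]): Vector[Int] =
    vec.updated(0, vec(0) + 1)
}
```

With the Array, every reference to it sees the new first element. With the Vector, the original value stays as it was and only the returned Vector carries the change.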
Furthermore, Arrays in Scala are effectively mapped over Java native arrays, making it quite easy to write idiomatic Scala code that can be used by just as idiomatic Java code elsewhere.
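To illustrate the Java mapping, here is a small sketch, assuming a JVM target. The object JavaInterop and its method names are hypothetical; java.util.Arrays.sort is the JDK's own method and accepts a Scala Array[Int] directly.

```scala
import java.util.Arrays

object JavaInterop {
  /** Sorts in place with the JDK routine; no conversion or wrapper is needed. */
  def sortWithJdk(values: Array[Int]): Array[Int] = {
    Arrays.sort(values) // resolves to java.util.Arrays.sort(int[])
    values
  }

  /** Simple name of the runtime class backing a Scala Array[Int]. */
  def runtimeTypeName(values: Array[Int]): String =
    values.getClass.getSimpleName
}
```

On the JVM, runtimeTypeName reports int[], the same type that Java code sees, so an Array[Int] can be passed to and from Java APIs without copying.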
For further details, both the Array and Vector pages of the official Scala documentation include a good description. You can learn even more in the documentation section dedicated to collections.
7 Comments
With Array, you have to be very careful about what you do with it, because at the first implicit conversion to a WrappedArray you end up manipulating something very different from a native Array. Idiomatic Scala code will seldom take advantage of the optimizations you cited.
WrappedArray is a dangerous thing. We need some macro support here I guess.
The Scala 3 official collections documentation doesn't even show or mention the Array type. It seems like an omission; I've created a ticket to get it fixed.
The Array API docs say:
Arrays are mutable, indexed collections of values.
Array[T] is Scala's representation for Java's T[].
Vectors, on the other hand, are the immutable indexed collections.
However, there's the perhaps more popular ArrayBuffer, which does have a place in the official docs. So, if you're looking for mutability, should you use the Array or the ArrayBuffer? The short answer is, as always, it depends. ArrayBuffer is resizable, Array isn't. Arrays are specialized for built-in value types (except Unit), so Array[Int] is going to be more efficient than ArrayBuffer[Int] because the values won't have to be boxed.
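One common way to combine the two, sketched here under the assumption that the final size is not known up front, is to grow an ArrayBuffer and then freeze it into an Array. The object ResizeDemo and its method names are hypothetical.

```scala
import scala.collection.mutable.ArrayBuffer

object ResizeDemo {
  /** Grows a buffer element by element, then copies it into a fixed-size Array[Int]. */
  def collectEvens(upTo: Int): Array[Int] = {
    val buf = ArrayBuffer.empty[Int]
    for (i <- 0 to upTo if i % 2 == 0) buf += i // ArrayBuffer resizes itself as needed
    buf.toArray // copies into a primitive int[] holding exactly buf.length elements
  }

  /** An Array cannot grow: "appending" allocates and returns a new array. */
  def appended(arr: Array[Int], elem: Int): Array[Int] =
    arr :+ elem
}
```

The buffer absorbs the appends, while the resulting Array[Int] stores unboxed values for whatever tight loop comes next.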
See this SO answer for more details on the differences between ArrayBuffer and Array.
3 Comments
ArrayBuffer is like a list in Python, and Array is like a tuple in Python?
Array isn’t.
Array ~ np.array and Vector ~ tuple?
Vector when you want immutability. Array when you want mutability.