What is the most efficient implementation of finding counts of each distinct element in a Scala array?
1 Answer
You can do it like this:
val xs = Array("a", "b", "c", "c", "a", "b", "c", "b", "b", "a")
xs.groupBy(identity).mapValues(_.length) // on Scala 2.13+, Map.mapValues is deprecated; use .view.mapValues(_.length).toMap instead
Or like this:
xs.foldLeft(Map[String, Int]().withDefaultValue(0))((acc, x) => acc + (x -> (acc(x) + 1)))
Or, if you want to be more efficient, you could use a mutable map internally so that no new map is created on each step:
def countElems[A](xs: Array[A]): Map[A, Int] = {
  val result = collection.mutable.Map[A, Int]().withDefaultValue(0)
  xs foreach { x => result += (x -> (result(x) + 1)) }
  result.toMap // this copies and makes it immutable, O(number of distinct elements)
}
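If you are on Scala 2.13 or later (an assumption, since the question does not state a version), the group-then-count idea from the first snippet can probably be written in a single pass with groupMapReduce, without building the intermediate groups. A minimal sketch, where the name counts is only for illustration:

```scala
// Sketch only: assumes Scala 2.13+, where groupMapReduce is available.
val xs = Array("a", "b", "c", "c", "a", "b", "c", "b", "b", "a")

// Key each element by itself, map every occurrence to 1,
// and sum the ones that share a key.
val counts: Map[String, Int] = xs.groupMapReduce(identity)(_ => 1)(_ + _)
```

This returns an immutable Map directly, so no final toMap copy is needed. Whether it is actually faster than the mutable version above would have to be measured on your data.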
1 Comment
Mustafa Orkun Acar
Could you write it with the xs parameter being a DataFrame (one column), without using collect()? I tried using foreach on a DataFrame to update the result Map, but it did not work.