4

I need a maxBy that returns all max values in case of equality.

Here is the signature and a first implementation :

def maxBy[A, B](as: Seq[A])(f: A => B)(implicit cmp: Ordering[B]) : Seq[A] = 
  as.groupBy(f).toList.maxBy(_._1)._2

Example :

maxBy(Seq(("a", "a1"),("a", "a2"),("b", "b1"),("b", "b2")))(_._1)
res6: Seq[(String, String)] = List(("b", "b1"), ("b", "b2"))

Updated with @thearchetypepaul comment

  def maxBy[A, B](l: Seq[A])(f: A => B)(implicit cmp: Ordering[B]) : Seq[A] = {
    l.foldLeft(Seq.empty[A])((b, a) =>
      b.headOption match {
        case None => Seq(a)
        case Some(v) => cmp.compare(f(a), f(v)) match {
          case -1 => b
          case 0 => b.+:(a)
          case 1 => Seq(a)
        }
      }
    )
  }

Is there a better way ?

5
  • 2
    Since you have a working answer already, and you haven't identified any shortcomings in your solution that make it unsuitable for you, this is a better fit for CodeReview. Commented Jan 14, 2016 at 21:24
  • Yes. foldLeft. Add the element to the acc if equal, replace acc with Seq(element) if element greater otherwise leave acc unchanged. Linear,one traverse, doesn't build groups of all the non-maximal elements. Not in front of a computer with Scala so not tested. Commented Jan 14, 2016 at 21:55
  • With my head still spinning from reading Functional Programming in Scala, is there any value in representing the result as a Monad - say a trait Contenders with implementations Winner(winner: A) and Tie(winners: Seq[A])? Commented Jan 14, 2016 at 22:50
  • I'd try to cache the result of f(v), so to not repeatedly call f on the same value many times. Commented Jan 15, 2016 at 13:33
  • And b can be None only the first time through. So check for l.isEmpty once outside the foldLeft then l.tail.foldLeft(l.head){...} and the outer match isn't needed. Then make the accumulator a pair of the list so far, and f(v) so you don't have to recalculate f(v) each time. Commented Jan 15, 2016 at 18:39

2 Answers 2

6

(1) Ordering#compare promises to denote the three possible results by a negative, positive, or zero number, not -1, 1, or 0 necessarily.

(2) Option#fold is generally (though not universally) considered to be more idiomatic than pattern matching.

(3) You are calling f possibly multiple times per element. TraversableOnce#maxBy used to do this before it was fixed in 2.11.

(4) You only accept Seq. The Scala library works hard to use CanBuildFrom to generalize the algorithms; you might want to as well.

(5) You can use the syntactic sugar B : Ordering if you would like.

(6) You prepend to the Seq. This is faster than appending, since prepending is O(1) for List and appending is O(n). But you wind up with the results in reverse order. foldRight will correct this. (Or you can call reverse on the final result.)

If you want to allow the use of CanBuildFrom,

def maxBy[A, Repr, That, B : Ordering](elements: TraversableLike[A, Repr])(f: A => B)(implicit bf: CanBuildFrom[Repr, A, That]): That = {
  val b = bf()
  elements.foldLeft(Option.empty[B]) { (best, element) =>
    val current = f(element)
    val result = best.fold(0)(implicitly[Ordering[B]].compare(current, _))
    if (result > 0) {
      b.clear()
    }
    if (result >= 0) {
      b += element
      Some(current)
    } else {
      best
    }
  }
  b.result
}

If you want to work with TraversableOnce,

def maxBy[A, B : Ordering](elements: TraversableOnce[A])(f: A => B): Seq[A] = {
  elements.foldRight((Option.empty[B], List.empty[A])) { case (element, (best, elements)) =>
    val current = f(element)
    val result = best.fold(0)(implicitly[Ordering[B]].compare(current, _))
    if (result > 0) {
      (Some(current), List(element))
    } else {
      (best, if (result == 0) element +: elements else elements)
    }
  }._2
}
Sign up to request clarification or add additional context in comments.

2 Comments

The bf parameter should be implicit, right ? And TraversableOnce#maxBy doesn't seem to call f multiple times : github.com/scala/scala/blob/v2.11.7/src/library/scala/…. BTW, thanks for such a complete answer !
@YannMoisan, yes implicit. And, yes, maxBy appears to have been changed in 2.11. github.com/scala/scala/commit/…. I will edit the answer.
0

if the dataset is small the performance shouldn't concern you that much
and you can just sort, reverse, and takeWhile.

def maxBy[A,B:Ordering](l:List[A], f: A => B): List[A] = {
   l.sortBy(f).reverse match {
      case Nil    => Nil
      case h :: t => h :: t.takeWhile(x => f(x) == f(h))
   }
}

where the f should be an isomorphism on A.
and you can also save f(h) before comparison

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.