Refactoring / layout of functional Scala

Question

This one liner...

 Console.println(io.Source.fromFile("names.txt").getLines.mkString.split(",").map{x:String => x.slice(1, x.length -1)}.sortBy { x => x}.zipWithIndex.map{t =>{ (t._2 +1)*(t._1.map{_.toChar - "A"(0).toChar + 1}.sum)}}.sum);

... is my solution to Project Euler problem 22. It seems to work, and it's written in (my attempt at) functional style.

This example is a bit extreme, but my question is a bit more general - how do you prefer to write/format/comment functional style code? The functional approach seems to encourage a sequence of method calls, which I find can get unreadable, and also leaves nowhere obvious to put comments.

Also, when I write procedural code, I find I write small methods each with one purpose and with meaningful names. When I write functional code, I seem to be developing a habit that produces lines a little like the one above, where (to me) the meaning is difficult to decipher - and also the individual computations are difficult to re-use elsewhere. Quite a lot of the functional code examples I see on the web are similarly terse (to me) obscure.

What should I be doing? Writing little functions for each part of the computatation with names meaningful in the current context? (even if they're little more than a wrapper for map, say?)

For the example I've given, what are better ways of writing it, and presenting it?

(Like all style questions, this one is subjective. There's no reason it should get argumentative, though!)

I don't have enough experience with Scala to give you good advice, but it might be helpful if you were to break each map onto a different line? I'm interested to find out what real Scala-ites have to say about this. — Jonathan Sterling
– Jonathan Sterling, Commented Nov 21, 2010 at 16:33
A very important and language-/paradigma-agnostic rule is not to make lines longer than 80 (or another sensible value) characters. 190 chars is just insane (and if you're code golfing, at least get it down to 130 or you'll never beat perl! ;) ). — user395760
– user395760, Commented Nov 21, 2010 at 16:37
Agreed, it's that long for pedagogical reasons. But my question is partly about how best to split it into smaller chunks, other than choosing breaks just to make the lines < 80 characters. I'm not code golfing, but I am trying to improve my functional programming... — The Archetypal Paul
– The Archetypal Paul, Commented Nov 21, 2010 at 16:42

Kevin Wright · Accepted Answer · 2010-11-21 18:57:57Z

19

A trivial first-attempt at tidying it up is to just remove the leading Console. the trailing ; and the explicit :String type - all of which are unnecessary - add some indentation and import io.Source:

import io.Source
println(
  Source.fromFile("names.txt").getLines.mkString.split(",").map{
    x => x.slice(1, x.length -1)
  }.sortBy {x => x}.zipWithIndex.map{
    t =>{ (t._2 +1)*(t._1.map{_.toChar - "A"(0).toChar + 1}.sum)}
  }.sum
)

The next step is to clean it up a little, use pattern matching when mapping over list of tuples and identity instead of x=>x. toChar is also unnecessary for characters, and single quotes can be used to represent character literals.

import io.Source
println(
  Source.fromFile("names.txt").getLines.mkString.split(",").map {
    x => x.slice(1, x.length -1)
  }.sortBy(identity).zipWithIndex.map {
    case (v, idx) =>{ (idx+1)*(v.map{_ - 'A' + 1}.sum)}
  }.sum
)

A few more changes also help make the intent of the code far clearer:

import io.Source
println(
  Source.fromFile("names.txt").getLines.mkString.split(",")
  .map { _.stripPrefix("\"").stripSuffix("\"") }
  .sortBy(identity)
  .map { _.map{_ - 'A' + 1}.sum }
  .zipWithIndex
  .map { case (v, idx) => (idx+1) * v }
  .sum
)

The next step, to make it more "functional", is to break it into "functions" (sneaky, huh?). Ideally each function will have a name clearly expressing its purpose, and it will be short (aim for it to be a single expression, so braces aren't required):

import io.Source

def unquote(s:String) = s.stripPrefix("\"").stripSuffix("\"")

def wordsFrom(fname:String) =
  Source.fromFile(fname).getLines.mkString.split(",").map(unquote)

def letterPos(c:Char) = c - 'A' + 1

println(
  wordsFrom("names.txt")
  .sortBy(identity)
  .map { _.map(letterPos).sum }
  .zipWithIndex
  .map { case (v, idx) => (idx+1) * v }
  .sum
)

wordsFrom is an obvious 1-liner, but I split it for easier formatting on stackOverflow

edited Nov 21, 2010 at 18:57

answered Nov 21, 2010 at 18:44

Kevin Wright

49.7k9 gold badges107 silver badges156 bronze badges

Sign up to request clarification or add additional context in comments.

8 Comments

The Archetypal Paul Over a year ago

Thanks. I'm still a neophyte at Scala so this is really useful stuff. The refactoring is pretty much what I would have done. I've somehow gained the (probably erroneous) view that this small, single-expression function approach wasn't the way to do things, though.

Kevin Wright Over a year ago

I had to roll it back a couple of notches after seeing the code full-screen, it didn't look much better than the original after I broke down everything into functions :)

Aaron Novstrup Over a year ago

+1 This is pretty close to what I'd consider a perfect answer to this question. Minor nitpick: wouldn't sorted be preferable to sortBy(identity)?

pedrofurla Over a year ago

@Aaron Novstrup, yeah sorted seens better.

Jörg W Mittag Over a year ago

Minor nitpick: I don't like fname. Remember, as of Scala 2.8, parameter names are part of the public contract of a method, because of named arguments. What about path?

|

Brian · Accepted Answer · 2010-11-21 18:19:48Z

Here is what I think might be a better way to lay it out:

Console.println(
    io.Source.fromFile("names.txt")
    .getLines.mkString.split(",")
    .map{x:String => x.slice(1, x.length -1)}
    .sortBy { x => x}
    .zipWithIndex
    .map{t =>{ (t._2 +1)*(t._1.map{_.toChar - "A"(0).toChar + 1}.sum)}}
    .sum);

I feel like deep in the recesses of my brain there's some algorithm that makes decisions about code-layout trade-offs between horizontal and vertical space, but I don't seem to have direct access to enable me to articulate it. :)

Regarding introducing names rather than using lambdas, I think what I typically do is, if I am tempted to put a short comment describing the intent of the code, but a good identifier name may do the same, then I may factor a one-time lambda into a named function in order to get the readability benefit of the identifier name. The line with the toChar calls is the only one above that looks like a candidate to me. (To be clear, I'd factor (part of) the lambda inside the map, but the map call itself.) Alternatively, the introduction of vertical whitespace gives you a place to hang a //comment which is an alternative to introducing an identifier name.

(Disclaimer: I don't write Scala, so if anything I say conflicts with style conventions, then ignore me :), but I imagine a lot of this advice is mostly-language-agnostic.)

Daniel C. Sobral · Accepted Answer · 2010-11-23 14:21:24Z

Speaking strictly to how to format that code, without making any structural changes, I'd do it like this:

Console println (
  (
    io.Source
    fromFile "names.txt"
    getLines ()
    mkString ""
    split ","
    map (x => x.slice(1, x.length - 1))
    sortBy (x => x)
    zipWithIndex
  )
  map (t => (t._2 + 1) * (t._1 map (_.toChar - "A"(0).toChar + 1) sum) )
  sum
)

Or, perhaps, to get around the parameterless methods, I'd do this:

Console println (
  io.Source
  .fromFile("names.txt")
  .getLines
  .mkString
  .split(",")
  .map(x => x.slice(1, x.length - 1))
  .sortBy(x => x)
  .zipWithIndex
  .map(t => (t._2 + 1) * (t._1 map (_.toChar - "A"(0).toChar + 1) sum) )
  .sum
)

Note that there is plenty of space for comments, but, generally speaking, what is being done is usually clear. People not used to it might sometimes get lost halfway through, without variables to keep track of the meaning/type of the transformed value.

Now, some things I'd do differently are:

println ( // instead of Console println
  Source // put import elsewhere
  .fromFile("names.txt")
  .mkString // Unless you want to get rid of /n, which is unnecessary here
  .split(",")
  .map(x => x.slice(1, x.length - 1))
  .sorted // instead of sortBy
  .zipWithIndex
  .map { // use { only for multiple statements and, as in this case, pattern matching
    case (c, index) => (index + 1) * (c map (_ - 'A' + 1) sum) // chars are chars
  }
  .sum
)

I'd also not do sum and multiplying in the same step, so:

  .sorted
  .map(_ map (_ - 'A' + 1) sum)
  .zipWithIndex
  .map { case (av, index) => av * (index + 1) }
  .sum

Finally, I don't much like that string resizing, so I might resort to regex instead. Add a little refactoring, and this is something I'd probably write:

  import scala.io.Source
  def names = Source fromFile "names.txt" mkString

  def wordExtractor = """"(.*?)"""".r
  def words = for {
    m <- wordExtractor findAllIn names matchData
  } yield m group 1

  def alphabeticValue(s: String) = s map (_ - 'A' + 1) sum
  def wordsAV = words.toList.sorted map alphabeticValue

  def multByIndex(t: (Int, Int)) = t match {
    case (av, index) => av * (index + 1)
  }
  def wordsAVByIndex = wordsAV.zipWithIndex map multByIndex

  println(wordsAVByIndex.sum)

EDIT

The next step would be a renaming refactoring -- choosing names that better convey what each part of the code is doing. Ken suggested better names in the comments, and I'd appropriate them for one more variant (it also showcases nicely how much better naming improves readability).

import scala.io.Source
def rawData = Source fromFile "names.txt" mkString

// I'd rather write "match" than "m" in the next snippet, but
// the former is a keyword in Scala, so "m" has become more
// common in my code than "i". Also, make the return type of
// getWordsOf clear, because iterators can be tricky.
// Returning a list, however, makes a much less cleaner
// definition.

def wordExtractor = """"(.*?)"""".r
def getWordsOf(input: String): Iterator[String] = for {
  m <- wordExtractor findAllIn input matchData
} yield m group 1
def wordList = getWordsOf(rawData).toList

// I stole letterPosition from Kevin's solution. There, I said it. :-)

def letterPosition(c: Char) = c.toUpper - 'A' + 1 // .toUpper isn't necessary
def alphabeticValueOfWord(word: String) = word map letterPosition sum
def alphabeticValues = wordList.sorted map alphabeticValueOfWord

// I don't like multAVByIndex, but I haven't decided on a better
// option yet. I'm not very fond of declaring methods that return
// functions either, but I find that better than explicitly handling
// tuples (oh, for the day tuples/arguments are unified!).

def multAVByIndex = (alphabeticValue: Int, index: Int) => 
  alphabeticValue * (index + 1)
def scores = alphabeticValues.zipWithIndex map multAVByIndex.tupled

println(scores sum)

I like this the best. I'd rename some variables to make things a little clearer: names -> rawData (reflect that it's one big string), wordsAV -> alphabeticalValues (avoid the acronym), wordsAVByIndex -> scores (avoid the acronym, call them scores because that's what Project Euler calls them)
@Ken I didn' get to rename refactoring. :-) Indeed, I like all of your suggestions better than what the variables are presently named. Maybe I should even change the code, to make it more readable?
yes, please change it (by adding another variant to your answer) so that future readers of the page can benefit from your views on this.

Shiva Wu · Accepted Answer · 2010-11-22 09:43:26Z

1

In addition to Kevin's solution,

the key is to split the functionality clearly and neatly, taking in consideration of reusability and readability.

To make the code even shorter and cleaner, it seems that the for expression can be used.


def inputString(inputFile: String) = io.Source.fromFile(inputFile).getLines.mkString

def inputWords(input: String) = input.split("[,\"]").filter("" != )

Console.println {
    (for { (s, idx) <- inputWords(inputString("names.txt")).sorted.zipWithIndex }
        yield s.map(_ - 'A' + 1).sum * (idx + 1)).sum
}

The s.map(_-'A'+1) part can be further put into a function, say wordSum, if you want it to be more distinct.

answered Nov 22, 2010 at 9:43

Shiva Wu

1,0921 gold badge9 silver badges21 bronze badges

1 Comment

The Archetypal Paul Over a year ago

Thanks. It's all about personal preference, of course, but I find Kevin's reworking clearer.

Collectives™ on Stack Overflow

Refactoring / layout of functional Scala

4 Answers 4

8 Comments

Comments

3 Comments

1 Comment

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

8 Comments

Comments

3 Comments

1 Comment

Your Answer

Sign up or log in

Post as a guest

Linked

Related