
The following code example crashes due to stack overflow when parsing an expression deeply nested in brackets.

Parser combinators are part of the standard library. Is there a way to use the library while avoiding this?

(I'm not asking why it crashes, but rather for the right way to deal with the standard library.)

parsing: ((((((((... 1 + 1 ...)))))))))

code:

import scala.util.parsing.combinator.syntactical.StandardTokenParsers

object ArithmeticParser1 extends StandardTokenParsers {
  lexical.delimiters ++= List("(", ")", "+", "-", "*", "/")

  // Folds a parsed head value and a list of (operator, operand) pairs
  // into a single Int, applying the operators left to right.
  val reduceList: Int ~ List[String ~ Int] => Int = {
    case i ~ ps => ps.foldLeft(i)(reduce)
  }

  def reduce(x: Int, r: String ~ Int): Int = (r: @unchecked) match {
    case "+" ~ y => x + y
    case "-" ~ y => x - y
    case "*" ~ y => x * y
    case "/" ~ y => x / y
  }

  // Classic recursive-descent grammar: expr calls term, term calls factor,
  // and factor recurses back into expr for parenthesized expressions.
  // This mutual recursion is what consumes stack on deeply nested input.
  def expr  : Parser[Int] = term ~ rep("+" ~ term | "-" ~ term) ^^ reduceList
  def term  : Parser[Int] = factor ~ rep("*" ~ factor | "/" ~ factor) ^^ reduceList
  def factor: Parser[Int] = "(" ~> expr <~ ")" | numericLit ^^ (_.toInt)

  def main(args: Array[String]): Unit = {
    val s = scala.io.Source.fromFile(args(0)).mkString
    val tokens = new lexical.Scanner(s)
    println(s)
    println(phrase(expr)(tokens))
  }
}
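
For reference, the failing input can be generated programmatically. This is a small sketch (the helper name `nestedExpr` is my own, not part of the question's code) showing how stack depth scales with nesting:

```scala
// Hypothetical helper that builds the input described above:
// `depth` pairs of brackets around "1 + 1".
def nestedExpr(depth: Int): String =
  "(" * depth + "1 + 1" + ")" * depth

// Every '(' sends the recursive-descent parser through another
// factor -> expr cycle, so JVM stack usage grows linearly with depth;
// at some depth (around 3500 with the default -Xss, per the comments
// below) the thread stack overflows.
println(nestedExpr(3)) // prints (((1 + 1)))
```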
  • I cannot reproduce the crash. How many levels of nested brackets make the program throw an exception? Commented Oct 13, 2012 at 21:47
  • I tried too, and (with the default JVM stack size) had to go up to level 3500 ( = number of pairs of brackets) before I encountered a stack overflow! This leaves quite some room for real-world expressions... @buerger: I'd be interested to know if you had a stack overflow at a more reasonable level, and otherwise which use case requires that much nesting. Commented Oct 14, 2012 at 8:14
  • One way to improve the parsers library is make it use an explicit stack. Any level of nesting is reasonable. But not any hypothetical way of modifying the library is reasonable. Commented Oct 14, 2012 at 10:07
  • In theory, yes, there should not be a limit on the complexity of expressions that can be parsed. In practice, everything is bounded in computing, either by time (at some point something takes too long to compute to be of any practical use) or by memory (which is definitely bounded). And to be clear, the Scala compiler itself will have a stack overflow if you try to make it compile the expression ((((((((... 1 + 1 ...))))))))) with a nesting level of 3500. Just try it. Commented Oct 14, 2012 at 10:39
  • And using an explicit stack will certainly help things tremendously, but again at some point your stack will take the whole memory and you will hit a wall. I know I will probably look pedantic, but all I'm trying to say is that you necessarily need limits. So really the only real question is: is a level of 3500 an unacceptable limit? Commented Oct 14, 2012 at 10:43

1 Answer


I'm not sure how you would deal with it with Scala parser combinators. My first thought was trampolining, but a quick Google search seems to say that the standard library doesn't support this. Hence I think the main way around this would be to use the JVM's -Xss option to increase the thread stack size, which is less than ideal.
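
A variant of the -Xss approach that doesn't raise the stack size for every thread is to run just the parse on a freshly created thread with an explicit stack size, via `java.lang.Thread`'s four-argument constructor. The `withBigStack` helper below is my own sketch, not part of the answer or the library:

```scala
// Sketch: run `body` on a dedicated thread with `stackSize` bytes of
// stack. Note the JVM treats the stack size as a hint; it may round it.
def withBigStack[T](stackSize: Long)(body: => T): T = {
  var result: Either[Throwable, T] = Left(new IllegalStateException("no result"))
  val t = new Thread(
    null,                                                              // default thread group
    () => result = try Right(body) catch { case e: Throwable => Left(e) },
    "big-stack-parser",
    stackSize
  )
  t.start()
  t.join()
  result.fold(throw _, identity)                                       // rethrow on failure
}

// e.g. withBigStack(256L * 1024 * 1024) { phrase(expr)(tokens) }
```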

However, https://github.com/djspiewak/gll-combinators supports trampolining, and it seems to have an API similar to the standard library's.
