Stack Overflow in GHCI when attempting to display a number

Question

In trying to learn Haskell, I have implemented a pi calculation in order to understand functions and recursion properly.

Using the Leibniz Formula for calculating pi, I came up with the following, which prints pi to the tolerance of the given parameter, and the number of recursive function calls in order to get that value:

reverseSign :: (Fractional a, Ord a) => a -> a 
reverseSign num = ((if num > 0
                        then -1
                        else 1) * (abs(num) + 2))

piCalc :: (Fractional a, Integral b, Ord a) => a -> (a, b)
piCalc tolerance = piCalc' 1 0.0 tolerance 0

piCalc' :: (Ord a, Fractional a, Integral b) => a -> a -> a -> b -> (a, b)
piCalc' denom prevPi tolerance count = if abs(newPi - prevPi) < tolerance
                                        then (newPi, count)
                                        else piCalc' (reverseSign denom) newPi tolerance (count + 1)
                                        where newPi = prevPi + (4 / denom)

So when I run this in GHCI, it seems to work as expected:

*Main> piCalc 0.001
(3.1420924036835256,2000)

But if I set my tolerance too fine, this happens:

*Main> piCalc 0.0000001
(3.1415927035898146,*** Exception: stack overflow

This seems wholly counter-intuitive to me: the actual calculation works fine, but merely printing the number of recursive calls fails.

Why is this so?

In case you don't know what a thunk is (I didn't when I started Haskell!), it's basically an unevaluated computation. In your first example, before you print count, it doesn't hold the value 2000; it holds something like ...+1)+1)+1)+1)+1) (I omitted the 2000 left parentheses at the start :P). Only when you print it is it actually added up.
– Daniel Buckmaster, Commented Jan 30, 2013 at 9:54
I'll just add to what @DanielBuckmaster said: the important point is that the thunks keep building up, taking more and more memory, while one naively expects count to be something like an Int (constant in space). You'll get used to this, but it's definitely something that can bite you.
– gspr, Commented Jan 30, 2013 at 9:56

gspr · Accepted Answer · 2013-01-30 10:00:30Z

10

The count is never evaluated during the computation, so it is left as a huge chain of thunks until the very end. Forcing that chain when the result is printed is what overflows the stack.

You can force its evaluation during the computation by enabling the BangPatterns extension (for example with {-# LANGUAGE BangPatterns #-} at the top of the file) and writing piCalc' denom prevPi tolerance !count = ...

So why do we only need to force the evaluation of count? Well, all the other arguments are evaluated in the if. We actually need to inspect them all before calling piCalc' again, so thunks aren't building up; we need the actual values, not just "promises that they can be computed"! count, on the other hand, is never needed during the computation, and can remain as a series of thunks until the very end.
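As a minimal sketch of that fix, here is the question's code with the bang pattern applied. The guard syntax, the module layout, and the tolerances used in main are choices made for this illustration; the arithmetic is unchanged from the question.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

-- Same helper as in the question: steps the denominator 1, -3, 5, -7, ...
reverseSign :: (Fractional a, Ord a) => a -> a
reverseSign num = (if num > 0 then -1 else 1) * (abs num + 2)

piCalc :: (Fractional a, Integral b, Ord a) => a -> (a, b)
piCalc tolerance = piCalc' 1 0.0 tolerance 0

-- The bang on count forces it on every call, so the (count + 1)
-- thunk from the previous step is evaluated before the next one is built.
piCalc' :: (Ord a, Fractional a, Integral b) => a -> a -> a -> b -> (a, b)
piCalc' denom prevPi tolerance !count
  | abs (newPi - prevPi) < tolerance = (newPi, count)
  | otherwise = piCalc' (reverseSign denom) newPi tolerance (count + 1)
  where
    newPi = prevPi + 4 / denom

main :: IO ()
main = do
  -- The question reports (3.1420924036835256,2000) for this tolerance.
  print (piCalc 0.001 :: (Double, Int))
  -- A finer tolerance (chosen here for illustration): the counter stays
  -- a plain number throughout instead of a chain of pending additions.
  print (piCalc 0.000001 :: (Double, Int))
```

Only count needs the bang; the other arguments are already inspected by the tolerance check on each call.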

edited Jan 30, 2013 at 10:00

answered Jan 30, 2013 at 9:50


Carl · Accepted Answer · 2013-01-30 18:52:51Z

8

This is a variant of the traditional foldl (+) 0 [1..1000000] stack overflow. The problem is that the count value is never evaluated during the evaluation of piCalc'. This means that it just carries an ever-growing set of thunks representing the addition to be done if needed. When it is needed, the fact that evaluating it requires stack depth proportional to the number of thunks causes the overflow.
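For comparison, a small sketch of the foldl case mentioned above, next to the strict foldl' from Data.List. The names lazySum and strictSum are made up for this example. Whether the lazy version actually overflows depends on the GHC version, the optimization flags, and the stack limit, so treat it as an illustration of how the accumulator is built rather than a guaranteed crash.

```haskell
module Main where

import Data.List (foldl')

-- Lazy left fold: the accumulator becomes (((0 + 1) + 2) + ...) as
-- nested thunks, which are only collapsed when the result is demanded.
lazySum :: Integer
lazySum = foldl (+) 0 [1 .. 1000000]

-- Strict left fold: the accumulator is forced at every step, so it is
-- always a plain number and no chain of thunks builds up.
strictSum :: Integer
strictSum = foldl' (+) 0 [1 .. 1000000]

main :: IO ()
main = do
  print strictSum
  -- May overflow the stack on older or unoptimized setups.
  print lazySum
```

The count argument of piCalc' plays the same role as the accumulator in the lazy fold.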

The simplest solution makes use of the BangPatterns extension, changing the start of piCalc' to

piCalc' denom prevPi tolerance !count = ...

This forces the value of count to be evaluated when the pattern is matched, which means that it will never grow a giant chain of thunks.

Equivalently, and without the use of an extension, you could write it as

piCalc' denom prevPi tolerance count = count `seq` ...

This is exactly equivalent semantically to the above solution, but it uses seq explicitly instead of implicitly via a language extension. This makes it more portable, but a bit more verbose.
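A sketch of the seq variant written out in full, assuming the rest of the question's code is kept as it was. No language extension is needed here.

```haskell
module Main where

-- Same helper as in the question.
reverseSign :: (Fractional a, Ord a) => a -> a
reverseSign num = (if num > 0 then -1 else 1) * (abs num + 2)

piCalc :: (Fractional a, Integral b, Ord a) => a -> (a, b)
piCalc tolerance = piCalc' 1 0.0 tolerance 0

-- count `seq` body evaluates count to weak head normal form before
-- the body's value is returned, so count never accumulates pending additions.
piCalc' :: (Ord a, Fractional a, Integral b) => a -> a -> a -> b -> (a, b)
piCalc' denom prevPi tolerance count =
  count `seq`
    if abs (newPi - prevPi) < tolerance
      then (newPi, count)
      else piCalc' (reverseSign denom) newPi tolerance (count + 1)
  where
    newPi = prevPi + 4 / denom

main :: IO ()
main = print (piCalc 0.001 :: (Double, Int))
```

For a numeric type such as Int or Integer, weak head normal form is the fully evaluated number, which is why a single seq is enough.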

As for why the approximation of pi is not a long sequence of nested thunks, but count is: piCalc' branches on the result of a computation that requires the values of newPi, prevPi, and tolerance. It must examine those values before it decides if it's done or if it needs to run another iteration. It's that branch that causes the evaluation to be performed (when the function application is performed, which usually means something is pattern-matching on the result of the function.) On the other hand, nothing in the calculation of piCalc' depends on the value of count, so it isn't evaluated during the calculation.
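One way to see that nothing inside piCalc' looks at count is to pass undefined as the initial count to the original, lazy definition. This is only a sketch for illustration: the use of undefined, try, and evaluate is not part of the question's code.

```haskell
module Main where

import Control.Exception (SomeException, evaluate, try)

-- Original (lazy) definitions from the question.
reverseSign :: (Fractional a, Ord a) => a -> a
reverseSign num = (if num > 0 then -1 else 1) * (abs num + 2)

piCalc' :: (Ord a, Fractional a, Integral b) => a -> a -> a -> b -> (a, b)
piCalc' denom prevPi tolerance count =
  if abs (newPi - prevPi) < tolerance
    then (newPi, count)
    else piCalc' (reverseSign denom) newPi tolerance (count + 1)
  where
    newPi = prevPi + 4 / denom

main :: IO ()
main = do
  -- The starting count is undefined, yet the loop still terminates,
  -- because the branch only inspects newPi, prevPi, and tolerance.
  let (approx, count) = piCalc' 1 0.0 0.001 undefined :: (Double, Integer)
  print approx
  -- Only now is the chain ((undefined + 1) + 1) + ... forced, and it fails.
  result <- try (evaluate count) :: IO (Either SomeException Integer)
  putStrLn (either (const "count was not evaluated until now") show result)
```

With the bang pattern or seq in place, the same call would fail on the first step instead, because count would be forced immediately.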

edited Jan 30, 2013 at 18:52

answered Jan 30, 2013 at 9:51


3 Comments

Daniel Buckmaster Over a year ago

Can you explain why the thunking is not happening to the computed value of pi in this example, but only to the count?

Carl Over a year ago

@DanielBuckmaster That is because piCalc' branches on the result of a computation that requires the values of newPi, prevPi, and tolerance. It must examine those values before it decides if it's done or if it needs to run another iteration. It's that branch that causes the evaluation to be performed (when the function application is performed, which usually means something is pattern-matching on the result of the function.)

Daniel Buckmaster Over a year ago

Thanks! I think that'd be very valuable to have in an answer. It's the reason why count causes a stack overflow and not the actual calculation.
