Using list generator for memory efficient code in Haskell

Question

I would like to get a handle on writing memory efficient haskell code. One thing I ran across is that there is no dead easy way to make python style list generators/iterators (that I could find).

Small example:

Finding the sum of the integers from 1 to 100000000 without using the closed form formula.

Python that can be done quickly with minimal use of memory as sum(xrange(100000000). In Haskell the analogue would be sum [1..100000000]. However this uses up a lot of memory. I thought using foldl or foldr would be fine but even that uses a lot of memory and is slower than python. Any suggestions?

Haskell lists are lazy, which makes them closer to Python generators/iterators than Python lists, except that their interface looks like simple lists (because they are; Haskell is just a lazy language). — Alexis King
– Alexis King, Commented May 15, 2016 at 6:36
Compiled with -O2. Sum is implemented using folds in the source code. This sum allocates around 3 gb of memory and takes 1.8 seconds on my machine which is about 3x as long as long as python's print sum(xrange(100000001)) which is why I was surprised. I thought laziness would fix the memory issue and that speed should be around equal. Thanks for the help. — Helix
– Helix, Commented May 15, 2016 at 7:00
@Helix No lazyness makes it harder to understand space usage. Note that python has a very limited "lazyness". Sure generators produce one element at a time, however when doing a function call arguments are still evaluated before the call is performed. In Haskell arguments are not evaluated before the function call but only when, during the call, the value is needed. This means that chains of function call create big unevaluated structures. — Bakuriu
– Bakuriu, Commented May 15, 2016 at 7:14
btw: if compiled with -O2 while sum [...] will allocate around 3gb in total yes - but not at the same time (it's 44,312 bytes maximum residency on my system) — Random Dev
– Random Dev, Commented May 15, 2016 at 7:36

epsilonhalbe · Accepted Answer · 2016-05-15 14:16:32Z

TL;DR - I think the culprit in this case is - defaulting of GHC to `Integer`.

Admittedly I do not know enough about python, but my first guess would be that python switches to "bigint" only if necessary - therefore all calculations are done with Int a.k.a. 64-bit integer on my machine.

A first check with

$> ghci
GHCi, version 7.10.3: http://www.haskell.org/ghc/  :? for help
Prelude> maxBound :: Int
9223372036854775807

reveals that the result of the sum (5000000050000000) is less than that number so we have no fear of Int overflow.

I have guessed your example programs to look roughly like this

`sum.py`

print(sum(xrange(100000000)))

`sum.hs`

main :: IO ()
main = print $ sum [1..100000000]

To make things explicit I added the type annotation (100000000 :: Integer), compiling it with

$ > stack build --ghc-options="-O2 -with-rtsopts=-sstderr"

and ran your example,

$ > stack exec -- time sum
5000000050000000
   3,200,051,872 bytes allocated in the heap
         208,896 bytes copied during GC
          44,312 bytes maximum residency (2 sample(s))
          21,224 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      6102 colls,     0 par    0.013s   0.012s     0.0000s    0.0000s
  Gen  1         2 colls,     0 par    0.000s   0.000s     0.0001s    0.0001s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    1.725s  (  1.724s elapsed)
  GC      time    0.013s  (  0.012s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    1.739s  (  1.736s elapsed)

  %GC     time       0.7%  (0.7% elapsed)

  Alloc rate    1,855,603,449 bytes per MUT second

  Productivity  99.3% of total user, 99.4% of total elapsed

1.72user 0.00system 0:01.73elapsed 99%CPU (0avgtext+0avgdata 4112maxresident)k

and indeed the ~3GB of memory consumption is reproduced.

Changing the annotation to (100000000 :: Int) - altered the behaviour drastically

$ > stack build
$ > stack exec -- time sum
5000000050000000
          51,872 bytes allocated in the heap
           3,408 bytes copied during GC
          44,312 bytes maximum residency (1 sample(s))
          17,128 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0         0 colls,     0 par    0.000s   0.000s     0.0000s    0.0000s
  Gen  1         1 colls,     0 par    0.000s   0.000s     0.0001s    0.0001s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    0.034s  (  0.034s elapsed)
  GC      time    0.000s  (  0.000s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    0.036s  (  0.035s elapsed)

  %GC     time       0.2%  (0.2% elapsed)

  Alloc rate    1,514,680 bytes per MUT second

  Productivity  99.4% of total user, 102.3% of total elapsed

0.03user 0.00system 0:00.03elapsed 91%CPU (0avgtext+0avgdata 3496maxresident)k
0inputs+0outputs (0major+176minor)pagefaults 0swaps

For the interested

The behaviour of the haskell version does not change a lot if you use libraries like conduit or vector (both boxed and unboxed).

sample programs

sumC.hs

import Data.Conduit
import Data.Conduit.List as CL

main :: IO ()
main = do res <- CL.enumFromTo 1 100000000 $$ CL.fold (+) (0 :: Int)
          print res

sumV.hs

import           Data.Vector.Unboxed as V
{-import           Data.Vector as V-}

main :: IO ()
main = print $ V.sum $ V.enumFromTo (1::Int) 100000000

funny enough the version using

main = print $ V.sum $ V.enumFromN (1::Int) 100000000

is doing worse than the above - even though the documentation says otherwise.

enumFromN :: (Unbox a, Num a) => a -> Int -> Vector a
O(n) Yield a vector of the given length containing the values x, x+1 etc. This operation is usually more efficient than enumFromTo.

Update

@Carsten's comment made me curious - so I had a look into the sources for integer - well integer-simple to be precise, because for Integer there exists other versions integer-gmp and integer-gmp2 using libgmp.

data Integer = Positive !Positive | Negative !Positive | Naught

-------------------------------------------------------------------
-- The hard work is done on positive numbers

-- Least significant bit is first

-- Positive's have the property that they contain at least one Bit,
-- and their last Bit is One.
type Positive = Digits
type Positives = List Positive

data Digits = Some !Digit !Digits
            | None
type Digit = Word#

data List a = Nil | Cons a (List a)

so when using Integer there is quite a bit of memory overhead compared to Int or rather unboxed Int# - I guess as this should be optimized, (though I have not confirmed that).

So Integer is (if I calculate correctly)

1 x Word for the sum-type-tag (here Positive
n x (Word + Word) for Some and the Digit part
1 x Word for the last None

a memory overhead of (2 + floor(log_10(n)) for each Integer in that calculation + a bit more for the accumulator.

wow interesting - I can understand the difference in time but the difference in memory consumption is puzzling me
@Carsten I tried to answer your question too - but I am not sure if everything is correct, I am by no means expert on that topic
@epsilonhalbe: IIRC, integer-gmp uses something along data Integer = Positive !ByteStuff# | Negative !ByteStuff# | SmallInt !Int#. So for small integers, it should perform the same as Int, but the checks whether (+) needs more space than Int# provides probably take their additional toll (and I'm not sure of the conditions that switch to Positive from SmallInt).

Collectives™ on Stack Overflow

Using list generator for memory efficient code in Haskell

1 Answer 1

TL;DR - I think the culprit in this case is - defaulting of GHC to `Integer`.

`sum.py`

`sum.hs`

For the interested

Update

3 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

TL;DR - I think the culprit in this case is - defaulting of GHC to Integer.

sum.py

sum.hs

For the interested

Update

3 Comments

Your Answer

Sign up or log in

Post as a guest

Related

TL;DR - I think the culprit in this case is - defaulting of GHC to `Integer`.

`sum.py`

`sum.hs`