Data Parallel Haskell Prefix Sum

Question

I'm playing with some Data Parallel Haskell code and found myself in need of a prefix sum. However I didn't see any basic operator in the dph package for prefix sum.

I rolled my own, but, since I'm new to dph, I'm not sure if it's properly taking advantage of parallelization:

{-# LANGUAGE ParallelArrays #-}
{-# OPTIONS_GHC -fvectorise #-}

module PrefixSum ( scanP ) where
import Data.Array.Parallel (lengthP, indexedP, mapP, zipWithP, concatP, filterP, singletonP, sliceP, (+:+), (!:))
import Data.Array.Parallel.Prelude.Int ((<=), (-), (==), Int, mod)
-- hide prelude
import qualified Prelude 

-- assuming zipWithP (a -> b -> c) given 
-- [:a:] of length n and
-- [:b:] of length m, n /= m
-- will return
-- [:c:] of length min n m

scanP :: (a -> a -> a) -> [:a:] -> [:a:]
scanP f xs = if lengthP xs <= 1
                then xs
                else head +:+ tail
  where -- [: x_0, x_2, ..., x_2n :]
        evens = mapP snd . filterP (even . fst) $ indexedP xs
        -- [: x_1, x_3 ... :]
        odds = mapP snd . filterP (odd . fst)  $ indexedP xs
        lenEvens = lengthP evens
        lenOdds = lengthP odds
        -- calculate the prefix sums [:w:] of the pair sums [:z:]
        psums = scanP f $ zipWithP f evens odds
        -- calculate the total prefix sums as 
        -- [: x_0, w_0, f w_0 x_2, w_1, f w_1 x_4, ..., 
        head = singletonP (evens !: 0)
        body = concatP . zipWithP (\p e -> [: p, f p e :]) psums $ sliceP 1 lenOdds evens
        -- ending at either
        --    ... w_{n-1}, f w_{n-1} x_2n :]
        -- or
        --    ... w_{n-1}, f w_{n-1} x_2n, w_n :]
        -- depending on whether the length of [:x:] is 2n+1 or 2n+2
        tail = if lenEvens == lenOdds then body +:+ singletonP (psums !: (lenEvens - 1)) else body

-- reimplement some of Prelude so it can be vectorised
f $ x = f x
infixr 0 $
(.) f g y = f (g y)

snd (a,b) = b
fst (a,b) = a

even n = n `mod` 2 == 0
odd n = n `mod` 2 == 1

Hmm, is it even parallelizable? Seems like a pretty serial idea, but maybe I am missing something. — luqui
– luqui, Commented Jan 31, 2012 at 15:16
@luqui: The parallel prefix sum algorithm for an array of size n takes O(log n) parallel rounds of computation. There are two phases. In the forward phase, given {a_i | i \in [0,2n-1] } you calculate { a_2i + a_{2i+1} | i \in [0,n-1] } using n/2 parallel additions. In the backward phase, given { \sum_0^{2i+1} a_j | i \in [0,n-1] } and { a_i | i \in [0,2n-1] }, you calculate { \sum_0^i a_j | i \in [0, 2n-1 } using n/2 parallel additions. — rampion
– rampion, Commented Jan 31, 2012 at 16:24
@luqui: naturally, this only works properly for associative +, since there's the inherent assumption that (a_0 + a_1) + (a_2 + a_3) == ((a_0 + a_1) + a_2) + a_3 — rampion
– rampion, Commented Jan 31, 2012 at 16:27
For a normal array, this would be scanl, but I'm not seeing that in dph-par. — Louis Wasserman
– Louis Wasserman, Commented Jan 31, 2012 at 23:23
right. scanl is O(n) work in O(n) time, scanP is meant to be O(n) work in O(log n) time. — rampion
– rampion, Commented Feb 1, 2012 at 10:03

Don Stewart · Accepted Answer · 2012-05-03 17:21:54Z

2

Parallel prefix scans are supported, in fact, they're rather fundamental. So just pass (+) as your associative operator.

answered May 3, 2012 at 17:21

Don Stewart

138k36 gold badges372 silver badges472 bronze badges

Sign up to request clarification or add additional context in comments.

Collectives™ on Stack Overflow

Data Parallel Haskell Prefix Sum

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related