why is the map function inherently parallel?

Question

I was reading the following presentation:

http://www.idt.mdh.se/kurser/DVA201/slides/parallel-4up.pdf

and the author claims that the map function is built very well for parallelism (specifically he supports his claim on page 3 or slides 9 and 10).

If one were given the problem of increasing each value of a list by +1, I can see how looping through the list imperatively would require a index value to change and hence cause potential race condition problems. But I'm curious how the map function better allows a programmer to successfully code in parallel.

Is it due to the way map is recursively defined? So each function call can be thrown to a different thread?

I hoping someone can provide some specifics, thanks!

because each application of the function f to an element of the input list is independent from any other application to any other element, so they can all be done independently from each other, i.e. in parallel. a hypothetical par_map would allocate the storage to back the resulting list, and spark execution of a new thread for each element e in the list, providing it the reference to the place which will need to be updated with the result of f e. When there are no more active threads, the map has finished. Of course you could make each thread work on a block of say 1000 es, too. — Will Ness
– Will Ness, Commented Nov 21, 2017 at 14:56

john_omalley · Accepted Answer · 2017-11-18 18:01:08Z

3

The map function applies the same pure function to n elements in a collection and aggregates the results. It doesn't matter the order in which you apply the function to the members of the collection because by definition the return value of the function is purely dependent upon the input.

answered Nov 18, 2017 at 18:01

john_omalley

1,3986 silver badges13 bronze badges

Sign up to request clarification or add additional context in comments.

8 Comments

H_1317 Over a year ago

right, but could one not just split up the list into different chunks and process each chunk on its own thread imperatively with a loop? how would that differ from the above?

john_omalley Over a year ago

Perhaps nothing functionally... I think it's just a matter of complexity. The algorithm you're describing would be more complex... whereas you could just leverage a library to perform a parallel map. At least that's how I interpret his point.

H_1317 Over a year ago

I suppose i was more interested in the internals of map. if map is defined recursively then we would have: map(f(x), list) = if list is empty return the empty list else return f(head_of_list) concat with map(f(x),list.tail) .... if map is defined recursively, how does that allow map to avoid sequentially computing f(head_of_list) one by one? (like a loop would)

Jasper-M Over a year ago

@H_1317 The point is that if the API only provides tools for looping imperatively over a list, the programmer will have to implement parallelism himself. While the API of map allows the programmer to just supply the function he wants to apply to every element and then internally map can do whatever it wants with it, and handle the parallelism on behalf of the programmer.

H_1317 Over a year ago

@Jasper-M even if the API did not define map, could i not just define my own function called map that internally used a loop? what benefit does this particular recursive map function, if any, have over the loop version ?

|

Alexey Romanov · Accepted Answer · 2017-11-18 23:02:29Z

3

The others already explained that the standard map implementation isn't parallel.

But in Scala, since you tagged it, you can get the parallel version as simply as

val list = ... // some list
list.par.map(x => ...) // instead of list.map(x => ...)

See also Parallel Collections Overview and documentation for ParIterable and other types in the scala.collection.parallel package.

You can find the implementation of the parallel map in https://github.com/scala/scala/blob/v2.12.1/src/library/scala/collection/parallel/ParIterableLike.scala, if you want (look for def map and class Map). It requires very non-trivial infrastructure and certainly isn't just taking the recursive definition of sequential map and parallelizing it.

If one had defined map via a loop how would that break down?

The slides give F# parallel arrays as the example at the end and at https://github.com/fsharp/fsharp/blob/master/src/fsharp/FSharp.Core/array.fs#L266 you can see the non-parallel implementation there is a loop:

let inline map (mapping: 'T -> 'U) (array:'T[]) = 
     checkNonNull "array" array             
     let res : 'U[] = Microsoft.FSharp.Primitives.Basics.Array.zeroCreateUnchecked array.Length 
     for i = 0 to res.Length-1 do  
         res.[i] <- mapping array.[i] 
     res

edited Nov 18, 2017 at 23:02

answered Nov 18, 2017 at 20:45

Alexey Romanov

172k38 gold badges329 silver badges509 bronze badges

2 Comments

H_1317 Over a year ago

Thanks Alexey. Would you call the standard recursive definition of map, though still sequential, essential to creating easy to use parallelism? If one had defined map via a loop how would that break down? (I suspect the answers to these questions are maybe larger than a single comment, but any high level overview would be great!)

Alexey Romanov Over a year ago

No, it isn't essential or particularly relevant. The slides aren't talking about the recursive implementation at all, only about specification of map being possible to implement in parallel.

Collectives™ on Stack Overflow

why is the map function inherently parallel?

2 Answers 2

8 Comments

2 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

8 Comments

2 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related