1

I was reading the following presentation:

http://www.idt.mdh.se/kurser/DVA201/slides/parallel-4up.pdf

and the author claims that the map function is built very well for parallelism (specifically he supports his claim on page 3 or slides 9 and 10).

If one were given the problem of increasing each value of a list by +1, I can see how looping through the list imperatively would require a index value to change and hence cause potential race condition problems. But I'm curious how the map function better allows a programmer to successfully code in parallel.

Is it due to the way map is recursively defined? So each function call can be thrown to a different thread?

I hoping someone can provide some specifics, thanks!

1
  • because each application of the function f to an element of the input list is independent from any other application to any other element, so they can all be done independently from each other, i.e. in parallel. a hypothetical par_map would allocate the storage to back the resulting list, and spark execution of a new thread for each element e in the list, providing it the reference to the place which will need to be updated with the result of f e. When there are no more active threads, the map has finished. Of course you could make each thread work on a block of say 1000 es, too. Commented Nov 21, 2017 at 14:56

2 Answers 2

3

The map function applies the same pure function to n elements in a collection and aggregates the results. It doesn't matter the order in which you apply the function to the members of the collection because by definition the return value of the function is purely dependent upon the input.

Sign up to request clarification or add additional context in comments.

8 Comments

right, but could one not just split up the list into different chunks and process each chunk on its own thread imperatively with a loop? how would that differ from the above?
Perhaps nothing functionally... I think it's just a matter of complexity. The algorithm you're describing would be more complex... whereas you could just leverage a library to perform a parallel map. At least that's how I interpret his point.
I suppose i was more interested in the internals of map. if map is defined recursively then we would have: map(f(x), list) = if list is empty return the empty list else return f(head_of_list) concat with map(f(x),list.tail) .... if map is defined recursively, how does that allow map to avoid sequentially computing f(head_of_list) one by one? (like a loop would)
@H_1317 The point is that if the API only provides tools for looping imperatively over a list, the programmer will have to implement parallelism himself. While the API of map allows the programmer to just supply the function he wants to apply to every element and then internally map can do whatever it wants with it, and handle the parallelism on behalf of the programmer.
@Jasper-M even if the API did not define map, could i not just define my own function called map that internally used a loop? what benefit does this particular recursive map function, if any, have over the loop version ?
|
3

The others already explained that the standard map implementation isn't parallel.

But in Scala, since you tagged it, you can get the parallel version as simply as

val list = ... // some list
list.par.map(x => ...) // instead of list.map(x => ...)

See also Parallel Collections Overview and documentation for ParIterable and other types in the scala.collection.parallel package.

You can find the implementation of the parallel map in https://github.com/scala/scala/blob/v2.12.1/src/library/scala/collection/parallel/ParIterableLike.scala, if you want (look for def map and class Map). It requires very non-trivial infrastructure and certainly isn't just taking the recursive definition of sequential map and parallelizing it.

If one had defined map via a loop how would that break down?

The slides give F# parallel arrays as the example at the end and at https://github.com/fsharp/fsharp/blob/master/src/fsharp/FSharp.Core/array.fs#L266 you can see the non-parallel implementation there is a loop:

let inline map (mapping: 'T -> 'U) (array:'T[]) = 
     checkNonNull "array" array             
     let res : 'U[] = Microsoft.FSharp.Primitives.Basics.Array.zeroCreateUnchecked array.Length 
     for i = 0 to res.Length-1 do  
         res.[i] <- mapping array.[i] 
     res 

2 Comments

Thanks Alexey. Would you call the standard recursive definition of map, though still sequential, essential to creating easy to use parallelism? If one had defined map via a loop how would that break down? (I suspect the answers to these questions are maybe larger than a single comment, but any high level overview would be great!)
No, it isn't essential or particularly relevant. The slides aren't talking about the recursive implementation at all, only about specification of map being possible to implement in parallel.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.