1

I am planning on building a Merge Sorting algorithm that uses multiple threads in Java, and I've looked around the Internet and SO (Multi-threading a merge sorting algorithm for example) but I can't seem to settle on an answer to some of my questions.

First of all, would the optimal number of threads created be the same as the number of cores of the CPU? Should I even consider logical cores when considering number of threads?

Second, what is the best way of implementing multi-threading in such an algorithm? I've heard there is more than one way of doing it (like inheriting from the "Thread" class or using implements Runnable, etc.).

Also, would using ArrayLists or LinkedLists be a better choice in this case, in terms of optimisation?

Any other notes/suggestions concerning the implementation are appreciated.

Cheers.

7
  • If you want to parallel the sorting on one machine where all threads share the memory I would not expect much in terms of speedup because I'm sure the memory-access is the limiting factor as long as your compare methods is not incredible complicated. I'm very interested in hearing form your findings. Commented Dec 21, 2015 at 14:55
  • 1
    If this is a learning project, then start simple. Write a mergesort that - at the top level - splits the data N ways, sorts in separate threads, then merges the N lists in pairs, pairs of pairs, etc. until complete. Then experiment with values of N to answer your own questions. If this is a serious production effort, then use Arrays.parallelSort(). It's about as good as you'll do for a general purpose implementation. I agree with @MrSmith42 that you'll not see huge speedups unless comparison is compute bound, but I've observed a factor of 1.5 sorting integers. Commented Dec 21, 2015 at 15:06
  • @MrSmith42 I've found this one that compares the sorting speed to different processors Comparing speeds if anyone is interested. It is an educational project. It's not something I will implement in a bigger project. and ->Gene Thank you. That was what I was planning to do :) It is a learning project nothing serious. Commented Dec 21, 2015 at 15:17
  • This seems an odd choice of algorithm for a concurrent implementation. Merge sort does all of its work on the Join side, i.e. the part about splitting the work up into smaller chunks doesn't do any work, per se, it just makes small chunks. The part about putting it all back together is where the work happens, and I doubt that would benefit from concurrency. Can you use a different sort algorithm? One that does its work on the Fork side? Quicksort, perhaps? Commented Dec 21, 2015 at 15:28
  • 1
    @ErickG.Hagstrom you don't have to spawn threads on every split in mergesort - 4 threads mergesorting can easily be faster than 1 thread mergesorting - that's still 4 threads doing merging. It's is one of the simplest sort algos to parallelize and makes sense for an assignment. Commented Dec 21, 2015 at 16:17

2 Answers 2

3

In Java 8, there is Arrays.parallelSort() which is also used by the Stream API if you request parallelism with parallelStream. The source to parallelSort should be pretty informative if you're looking into this for educational purposes.

Sign up to request clarification or add additional context in comments.

4 Comments

Thank you! Yes this is an educational project :)
@DefinitelyNotMe ah in that case a couple of other things - design your algo first, without worrying too much about (essentially irrelevant) lowlevel details like Runnable v subclassing Thread. Take a look at some of the tutorials for java.util.concurrent which will give you an idea for some of the basic patterns and implementation strategies, even if you don't (or are not allowed to) use those classes. Finally, other than implementing it for homework, LinkedList is almost never the answer for anything :)
Thank you. I already have the algorithm on paper and also coded using C++, so that's the easy part. I was just trying to imagine what implementing it would look like with multi-threading.
@DefinitelyNotMe oh i meant the parallel algo. Parallelizing an algo is still design, right, and it's easier to think about it in abstractly before getting too bogged down in 'what's a runnable'. java.util.concurrent also provides implementations for many such higher-level abstractions - tasks, futures, queues, you name it so it's useful to look at.
1

...would the optimal number of threads created be the same as the number of cores of the CPU?

I would assume so. A merge sort should be memory bandwidth limited, not cpu bandwidth limited. The main gain from multi-threading early on would take advantage of each core's local cache, typically level 1 and level 2 cache. Usually level 3 cache is shared between cores, so the only gain there is if the merge process is relatively CPU bound compared to the speed of the level 3 cache. Once run sizes get large enough to exceed cache limits, then I'm not sure there's much to be gained from multi-threading.

Microsoft's stable_sort begins by using insertion sort to create sorted groups of 32 elements, probably to take advantage of local cache. I'm not sure if that really helps or not on current processors, since it's based on code written in 1994.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.