Multithreading in .Net

Question

I have a configuration XML file that is used by a batch module in my .NET 3.5 Windows application. Each node in the XML is mapped to a .NET class. Each class does processing such as mathematical calculations, database calls, and so on.

The batch module loads the XML, identifies the class associated with each node, and then processes it.

Now, we have the following requirements:

1. Let's say there are 3 classes (3 nodes in the XML): A, B, and C. Class A can be dependent on class B, i.e. we need to execute class B before processing class A. Class C processing should be done on a separate thread.

2. If a thread is running, we should be able to cancel it in the middle of its processing.

We need to implement this whole module using .NET multithreading.

My questions are:

1. Is it possible to implement requirement #1 above? If yes, how?

2. Given these requirements, is .NET 3.5 a good idea, or would .NET 4.0 be a better choice? I would like to know the advantages and disadvantages.

Thanks for reading.

Ade Miller · Accepted Answer · 2011-04-19 16:49:58Z

4

You'd be better off using the Task Parallel Library (TPL) in .NET 4.0. It'll give you lots of nice features for abstracting the actual business of creating threads in the thread pool. You could use the parallel tasks pattern to create a Task for each of the jobs defined in the XML and the TPL will handle the scheduling of those tasks regardless of the hardware. In other words if you move to a machine with more cores the TPL will schedule more threads.

1) The TPL supports continuation tasks. You can use these to enforce task ordering and to pass the result of the antecedent task (a future) to its continuation. This is the futures pattern.

        // The antecedent task. Can also be created with Task.Factory.StartNew.
        Task<DayOfWeek> taskA = new Task<DayOfWeek>(() => DateTime.Today.DayOfWeek);

        // The continuation. Its delegate takes the antecedent task
        // as an argument and can return a different type.
        Task<string> continuation = taskA.ContinueWith((antecedent) =>
            {
                return String.Format("Today is {0}.",
                                    antecedent.Result);
            });

        // Start the antecedent.
        taskA.Start();

        // Use the continuation's result.
        Console.WriteLine(continuation.Result);
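Applied to the question's scenario, the same continuation idea might look like the sketch below. It assumes .NET 4.0; `ProcessA`, `ProcessB` and `ProcessC` are hypothetical stand-ins for the classes mapped from the XML nodes, not anything from the original post.

```csharp
using System;
using System.Threading.Tasks;

public static class BatchSketch
{
    // Hypothetical stand-ins for the work done by the XML-mapped classes.
    public static int ProcessB() { return 21; }
    public static int ProcessA(int resultOfB) { return resultOfB * 2; }
    public static string ProcessC() { return "C done"; }

    public static void Main()
    {
        // B has no dependencies, so it is started immediately.
        Task<int> taskB = Task.Factory.StartNew(() => ProcessB());

        // A is a continuation of B: it is scheduled only after B completes
        // and receives B's result through the antecedent.
        Task<int> taskA = taskB.ContinueWith(antecedent => ProcessA(antecedent.Result));

        // C does not depend on A or B, so it is scheduled independently.
        Task<string> taskC = Task.Factory.StartNew(() => ProcessC());

        // Block until both chains have finished.
        Task.WaitAll(taskA, taskC);
        Console.WriteLine("A = {0}, C = {1}", taskA.Result, taskC.Result);
    }
}
```

Tasks run on thread-pool threads by default, so C is not guaranteed a dedicated thread. If that matters, passing `TaskCreationOptions.LongRunning` to `StartNew` is a hint to the scheduler that the work may deserve its own thread.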

2) Cancellation is supported by the TPL, but it is cooperative cancellation. In other words, the code running in the Task must periodically check whether it has been cancelled and shut down cleanly. The TPL has good support for this. Note that if you were to use threads directly you would run into the same limitation: Thread.Abort is not a viable solution in almost all cases.
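A minimal sketch of cooperative cancellation, assuming .NET 4.0. `DoWork` is a hypothetical job and the sleep intervals are arbitrary; the point is only that the job itself polls the token.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationSketch
{
    // Hypothetical long-running job that checks for cancellation between steps.
    public static int DoWork(CancellationToken token)
    {
        int steps = 0;
        while (true)
        {
            // Cooperative check: throws OperationCanceledException
            // once cancellation has been requested.
            token.ThrowIfCancellationRequested();
            Thread.Sleep(10); // stand-in for one unit of real work
            steps++;
        }
    }

    // Starts the job, cancels it shortly afterwards, and returns its final status.
    public static TaskStatus RunAndCancel()
    {
        var cts = new CancellationTokenSource();
        Task<int> task = Task.Factory.StartNew(() => DoWork(cts.Token), cts.Token);

        Thread.Sleep(100); // let the job run briefly
        cts.Cancel();      // request cancellation

        try
        {
            task.Wait();
        }
        catch (AggregateException ex)
        {
            // Wait surfaces the cancellation wrapped in an AggregateException.
            ex.Handle(inner => inner is OperationCanceledException);
        }
        return task.Status;
    }

    public static void Main()
    {
        Console.WriteLine("Status: {0}", RunAndCancel());
    }
}
```

Because the same token is passed to both `StartNew` and the job, the task should finish in the `Canceled` state rather than `Faulted`.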

While you're at it you might want to look at a dependency injection container like Unity for generating configured objects from your XML configuration.

Answer to comment (below)

Jimmy: I'm not sure I understand holtavolt's comment. What is true is that using parallelism only pays off if the amount of work being done is significant; otherwise your program may spend more time managing parallelism than doing useful work. The actual datasets don't have to be large, but the work needs to be significant.

For example, if your inputs were large numbers and you were checking to see whether they were prime, then the dataset would be very small but parallelism would still pay off, because the computation is costly for each number or block of numbers. Conversely, you might have a very large dataset of numbers that you were testing for evenness. The dataset is very large, but the calculation for each item is so cheap that a parallel implementation might still not be more efficient.

The canonical example is using Parallel.For instead of for to iterate over a dataset (large or small) but only perform a simple numerical operation like addition. In this case the expected performance improvement of utilizing multiple cores is outweighed by the overhead of creating parallel tasks and scheduling and managing them.
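One way to see the trade-off for yourself is to time the same loop both ways. The sketch below assumes .NET 4.0 and uses a deliberately costly per-item operation (a trial-division primality test). The range in `Main` is an arbitrary choice, and the timings will vary by machine.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class GranularitySketch
{
    // Costly per-item work: trial-division primality test.
    public static bool IsPrime(long n)
    {
        if (n < 2) return false;
        for (long d = 2; d * d <= n; d++)
        {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Counts primes in [start, start + count) with an ordinary loop.
    public static int CountPrimesSequential(long start, int count)
    {
        int primes = 0;
        for (int i = 0; i < count; i++)
        {
            if (IsPrime(start + i)) primes++;
        }
        return primes;
    }

    // Same count using Parallel.For; the shared counter is updated atomically.
    public static int CountPrimesParallel(long start, int count)
    {
        int primes = 0;
        Parallel.For(0, count, i =>
        {
            if (IsPrime(start + i)) Interlocked.Increment(ref primes);
        });
        return primes;
    }

    public static void Main()
    {
        const long start = 1000000000L; // arbitrary range of large numbers
        const int count = 20000;

        Stopwatch sw = Stopwatch.StartNew();
        int sequential = CountPrimesSequential(start, count);
        Console.WriteLine("for:          {0} primes in {1} ms", sequential, sw.ElapsedMilliseconds);

        sw = Stopwatch.StartNew();
        int parallel = CountPrimesParallel(start, count);
        Console.WriteLine("Parallel.For: {0} primes in {1} ms", parallel, sw.ElapsedMilliseconds);
    }
}
```

Swapping `IsPrime` for a trivial check such as `n % 2 == 0` makes each item cheap, so the scheduling overhead described above is more likely to dominate.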

edited Apr 19, 2011 at 16:49

answered Apr 19, 2011 at 14:49

Ade Miller

6 Comments

alex.b Over a year ago

Absolutely agreed, the Task Parallel Library in .NET 4 is the best way to go. A nice tutorial is here, "Task Parallelism" section. I'm reading it right now :)

alex.b Over a year ago

And what I can add is that if you're a novice in .NET multithreading, I'd advise you to see how to work with the Thread class here. Good for understanding, IMHO.

Jimmy Over a year ago

Hi, thanks all for your valuable input. As "holtavolt" points out below, would using the TPL be overkill if the datasets in my app are not too large?

Ade Miller Over a year ago

Jimmy: I added to the answer. See above.

user74042 Over a year ago

Thanks a ton, Ade, for the detailed explanation of the TPL! Great stuff indeed. The futures pattern fits our app requirements exactly. Just one question: if we end up sticking with the .NET 3.5 framework, would we be able to implement futures-pattern-like functionality? And if yes, how complex would it be to implement?

Matt Davis · Accepted Answer · 2011-04-19 14:46:55Z

1

Of course it can be done.

Assuming you're new to multithreading and you want one thread per class, I would look into the BackgroundWorker class and use it in the different classes to do the processing.

Which version of .NET you use also depends on whether this is going to run on client machines. But I would go for .NET 4 simply because it's the newest, and if you want to split a single task across multiple threads it has built-in classes for this.

edited Apr 19, 2011 at 14:46

Matt Davis

answered Apr 19, 2011 at 14:44

EKS

holtavolt · Accepted Answer · 2011-04-19 14:53:44Z

Given your use case, the Thread and BackgroundWorker classes should be sufficient. As you'll discover in reading the MSDN information regarding these classes, you will want to support cancellation as your means of shutting down a running thread before it's complete. (Thread "killing" is something to be avoided if at all possible.)
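As a rough sketch of how that could look on .NET 3.5 with plain Thread objects: A waits for B with `Join`, and C polls a flag so that it can be stopped cooperatively. The class name, the fields and the sleep intervals are illustrative assumptions.

```csharp
using System;
using System.Threading;

public static class ThreadSketch
{
    // Set to true to ask C to stop; volatile so the worker thread sees the update.
    private static volatile bool cancelRequested;

    public static string Order = "";     // records the order in which B and A ran
    public static int StepsCompletedByC; // progress made by C before it was stopped

    public static void Run()
    {
        Thread threadB = new Thread(() => { Order += "B"; });

        Thread threadA = new Thread(() =>
        {
            threadB.Join(); // A blocks until B has finished
            Order += "A";
        });

        Thread threadC = new Thread(() =>
        {
            // C checks the flag between units of work and exits cleanly.
            while (!cancelRequested)
            {
                Thread.Sleep(10); // stand-in for one unit of real work
                StepsCompletedByC++;
            }
        });

        threadB.Start(); // B must be started before A calls Join on it
        threadA.Start();
        threadC.Start();

        Thread.Sleep(100);      // let C run briefly
        cancelRequested = true; // cooperative cancellation request

        threadA.Join();
        threadC.Join();
    }

    public static void Main()
    {
        Run();
        Console.WriteLine("Order: {0}, C steps: {1}", Order, StepsCompletedByC);
    }
}
```

BackgroundWorker offers the same cooperative model through `WorkerSupportsCancellation`, `CancelAsync` and `CancellationPending`, which may be more convenient in a Windows application because its completion event is raised back on the UI thread.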

.NET 4.0 has added some advanced items in the Task Parallel Library (TPL), where Tasks are defined and managed with some smarter affinity for their most recently used core (to provide better cache behavior, etc.). However, this seems like overkill for your use case, unless you expect to be processing very large datasets. See these sites for more information:

http://msdn.microsoft.com/en-us/library/dd460717.aspx

http://archive.msdn.microsoft.com/ParExtSamples
