
In chapter 4.4 Dynamic Parallelism, in Stephen Cleary's book Concurrency in C# Cookbook, it says the following:

Parallel tasks may use blocking members, such as Task.Wait, Task.Result, Task.WaitAll, and Task.WaitAny. In contrast, asynchronous tasks should avoid blocking members, and prefer await, Task.WhenAll, and Task.WhenAny.

I was always told that Task.Wait etc. are bad because they block the current thread, and that it's much better to use await instead, so that the calling thread is not blocked.

Why is it OK to use Task.Wait etc. for a parallel (which I think means CPU-bound) task?

Example: In the example below, isn't Test1() better because the thread that calls Test1() is able to continue doing something else while it waits for the for loop to complete?

Whereas the thread that calls Test() is stuck waiting for the for loop to complete.

    private static void Test()
    {
        Task.Run(() =>
        {
            for (int i = 0; i < 100; i++)
            {
                //do something.
            }
        }).Wait();
    }

    private static async Task Test1()
    {
        await Task.Run(() =>
        {
            for (int i = 0; i < 100; i++)
            {
                //do something.
            }
        });
    }

EDIT:

This is the rest of the paragraph which I'm adding based on Peter Csala's comment:

Parallel tasks also commonly use AttachedToParent to create parent/child relationships between tasks. Parallel tasks should be created with Task.Run or Task.Factory.StartNew.

  • IMHO, if you quoted the whole section, not just the 2 sentences, then Stephen's statement would make more sense. Commented Jan 17, 2023 at 12:21
  • @PeterCsala I added the rest of the quote. It doesn't help make anything clearer for me. Could you please elaborate? Commented Jan 18, 2023 at 0:41

6 Answers


You've already got some great answers here, but just to chime in (sorry if this is repetitive at all):

Task was introduced in the TPL before async/await existed. When async came along, the Task type was reused instead of creating a separate "Promise" type.

In the TPL, pretty much all tasks were Delegate Tasks - i.e., they wrap a delegate (code) which is executed on a TaskScheduler. It was also possible - though rare - to have Promise Tasks in the TPL, which were created by TaskCompletionSource<T>.
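To make the distinction concrete, here is a minimal sketch (names invented for illustration) of a Promise Task built with TaskCompletionSource&lt;T&gt;: no delegate runs anywhere, and the task completes only when external code signals it.

```csharp
using System;
using System.Threading.Tasks;

class PromiseTaskSketch
{
    // Returns a Promise Task plus a callback that completes it.
    static Task<string> WaitForSignalAsync(out Action<string> signal)
    {
        var tcs = new TaskCompletionSource<string>();
        signal = result => tcs.SetResult(result);
        return tcs.Task; // represents a future result, not running code
    }

    static void Main()
    {
        Task<string> task = WaitForSignalAsync(out Action<string> signal);
        Console.WriteLine(task.IsCompleted); // False: nothing is "executing"
        signal("done");                      // complete the promise externally
        Console.WriteLine(task.Result);      // "done"
    }
}
```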

The higher-level TPL APIs (Parallel and PLINQ) hide the Delegate Tasks from you; they are higher-level abstractions that create multiple Delegate Tasks and execute them on multiple threads, complete with all the complexity of partitioning, work stealing, and all that stuff.

However, the one drawback to the higher-level APIs is that you need to know how much work you are going to do before you start. It's not possible, e.g., for the processing of one data item to add more data items back into the parallel work. That's where Dynamic Parallelism comes in.

Dynamic Parallelism uses the Task type directly. There are many APIs on the Task type that were designed for Dynamic Parallelism and should be avoided in async code unless you really know what you're doing (i.e., either your name is Stephen Toub or you're writing a high-performance .NET runtime). These APIs include StartNew, ContinueWith, Wait, Result, WaitAll, WaitAny, Id, CurrentId, RunSynchronously, and parent/child tasks. And then there's the Task constructor itself and Start which should never be used in any code at all.
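As a small illustration of the parent/child APIs mentioned above (a sketch, not a recommendation for async code): children created with AttachedToParent are implicitly waited on by Wait() on the parent.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class ParentChildSketch
{
    static void Main()
    {
        int done = 0;
        // Task.Factory.StartNew (unlike Task.Run) allows child attachment.
        Task parent = Task.Factory.StartNew(() =>
        {
            for (int i = 0; i < 3; i++)
            {
                Task.Factory.StartNew(
                    () => Interlocked.Increment(ref done),
                    TaskCreationOptions.AttachedToParent);
            }
        });
        parent.Wait(); // blocks until the parent AND all attached children finish
        Console.WriteLine(done); // 3
    }
}
```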

In the particular case of Wait, yes, it does block the thread. And that is not ideal (even in parallel programming), because it blocks a literal thread. However, the alternative may be worse.

Consider the case where task A reaches a point where it has to be sure task B completes before it continues. This is the general Dynamic Parallelism case, so assume no parent/child relationship.

The old-school way to avoid this kind of blocking is to split method A into a continuation and use ContinueWith. That works fine, but it complicates the code rather considerably in the case of loops. You end up writing a state machine, which is essentially what async does for you. In modern code you may be able to use await, but that has its own dangers: parallel code does not work out of the box with async, and combining the two can be tricky.
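For a flavor of the difference, here is a toy sketch (hypothetical names) of the same "run B, then continue A" logic written both ways: the ContinueWith version splits A at the dependency point by hand, while await lets the compiler build the state machine.

```csharp
using System;
using System.Threading.Tasks;

class ContinuationSketch
{
    static Task<int> TaskB() => Task.Run(() => 21);

    // Continuation style: method A is manually split at the dependency point.
    static Task<int> A_ContinueWith() =>
        TaskB().ContinueWith(b => b.Result * 2,
            TaskContinuationOptions.ExecuteSynchronously);

    // await style: the compiler generates the equivalent state machine.
    static async Task<int> A_Await() => await TaskB() * 2;

    static void Main()
    {
        // Console app, no SynchronizationContext: .Result is safe here.
        Console.WriteLine(A_ContinueWith().Result); // 42
        Console.WriteLine(A_Await().Result);        // 42
    }
}
```

With a loop around the dependency, the ContinueWith version degenerates into a hand-rolled state machine, which is the maintenance burden the answer describes.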

So it really comes down to a tradeoff between code complexity vs runtime efficiency. And when you consider the following points, you'll see why blocking was common:

  • Parallelism is normally done on Desktop applications; it's not common (or recommended) for web servers.
  • Desktop machines tend to have plenty of threads to spare. I remember Mark Russinovich (long before he joined Microsoft) demoing how showing a File Open dialog on Windows spawned some crazy number of threads (over 20, IIRC). And yet the user wouldn't even notice 20 threads being spawned (and presumably blocked).
  • Parallel code is difficult to maintain in the first place; Dynamic Parallelism using continuations is exceptionally difficult to maintain.

Given these points, it's pretty easy to see why a lot of parallel code blocks thread pool threads: the user experience is degraded by an unnoticeable amount, but the developer experience is enhanced significantly.


1 Comment

"parallel code does not work out of the box with async, and combining the two can be tricky." - "Parallel code is difficult to maintain in the first place; Dynamic Parallelism using continuations is exceptionally difficult to maintain." - Do you have any examples of this? I've read about fake async in your articles and I understand why it's bad, but it's not complicated or tricky. Would you be able to come up with an example that shows why it's difficult to maintain?

The thing is, if you are using tasks to parallelize CPU-bound work, your method is likely not asynchronous, because the main benefit of async is asynchronous I/O, and there is no I/O in this case. Since your method is synchronous, you can't await anything, including the tasks you use to parallelize the computation; nor do you need to.

The valid concern you mentioned is that you would waste the current thread if you just blocked it waiting for the parallel tasks to complete. However, you should not waste it like this: it can be used as one participant in the parallel computation. Say you want to perform a parallel computation on 4 threads. Use the current thread plus 3 other threads, instead of using 4 other threads and wasting the current one blocked waiting for them.

That's what, for example, Parallel LINQ does: it uses the current thread together with thread pool threads. Note also that its methods are not async (and should not be), but they do use Tasks internally and do block waiting on them.
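A minimal sketch of that pattern: AsParallel() partitions the range across workers, and the calling thread joins in on the CPU-bound work rather than sitting blocked while Sum() completes.

```csharp
using System;
using System.Linq;

class PlinqSketch
{
    static void Main()
    {
        long sum = Enumerable.Range(1, 1_000_000)
            .AsParallel()
            .Select(n => (long)n * n)  // CPU-bound work per element
            .Sum();                    // blocks, but this thread also computes
        Console.WriteLine(sum);
    }
}
```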

Update: about your examples.

This one:

private static void Test()
{
    Task.Run(() =>
    {
        for (int i = 0; i < 100; i++)
        {
            //do something.
        }
    }).Wait();
}

Is always useless: you offload some computation to a separate thread while the current thread is blocked waiting, so one thread is wasted for nothing useful. Instead you should just do:

private static void Test()
{
    for (int i = 0; i < 100; i++)
    {
        //do something.
    }
}

This one:

private static async Task Test1()
{
    await Task.Run(() =>
    {
        for (int i = 0; i < 100; i++)
        {
            //do something.
        }
    });
}

Is useful sometimes, when for some reason you need to perform a computation but don't want to block the current thread. For example, if the current thread is the UI thread and you don't want the user interface to freeze while the computation runs. However, if you are not in such an environment, for example if you are writing a general-purpose library, then it's useless too and you should stick to the synchronous version above. If the user of your library happens to be on the UI thread, they can wrap the call in Task.Run themselves. I would say that even if you are writing a UI application rather than a library, you should move all such logic (the for loop in this case) into a separate synchronous method and then wrap the call to that method in Task.Run where necessary. Like this:

private static async Task Test2()
{
    // we are on UI thread here, don't want to block it
    await Task.Run(() => {
        OurSynchronousVersionAbove();
    });
    // back on UI thread
    // do something else
}

Now say you have that synchronous method and want to parallelize the computation. You may try something like this:

static void Test1() {
    var task1 = Task.Run(() => {
        for (int i = 0; i < 50;i++) {
            // do something
        }
    });
    var task2 = Task.Run(() => {
        for (int i = 50; i < 100;i++) {
            // do something
        }
    });
    Task.WaitAll(task1, task2);
}

That will work, but it wastes the current thread, which sits blocked for no reason waiting for the two tasks to complete. Instead, you should do it like this:

static void Test1() {
    var task = Task.Run(() => {
        for (int i = 0; i < 50; i++) {
            // do something
        }
    });

    for (int i = 50; i < 100; i++) {
        // do something
    }

    task.Wait();
}

Now you perform the computation in parallel using 2 threads: one thread pool thread (from Task.Run) and the current thread. And here is your legitimate use of task.Wait(). Of course, usually you should stick to existing solutions like Parallel LINQ, which does the same for you, but better.
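For comparison, the same kind of split can also be expressed with Parallel.For from the TPL, which likewise partitions the range and uses the calling thread as one of the workers; a minimal sketch:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class ParallelForSketch
{
    static void Main()
    {
        int counter = 0;
        // Parallel.For blocks until all iterations finish, with the calling
        // thread participating in the work instead of merely waiting.
        Parallel.For(0, 100, i =>
        {
            // do something; here we just count iterations thread-safely
            Interlocked.Increment(ref counter);
        });
        Console.WriteLine(counter); // 100
    }
}
```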

3 Comments

Can you think of a use case where Task.Wait() can be used for CPU bound work?
I've updated my question with an example.
@DavidKlempfner I've expanded my answer to make it more clear.

As I mentioned in the comments section, if you look at the recipe as a whole it might make more sense. Let me quote the relevant part here as well.

The Task type serves two purposes in concurrent programming: it can be a parallel task or an asynchronous task. Parallel tasks may use blocking members, such as Task.Wait, Task.Result, Task.WaitAll, and Task.WaitAny. In contrast, asynchronous tasks should avoid blocking members, and prefer await, Task.WhenAll, and Task.WhenAny. Parallel tasks also commonly use AttachedToParent to create parent/child relationships between tasks. Parallel tasks should be created with Task.Run or Task.Factory.StartNew.

Asynchronous tasks do not use AttachedToParent, but they can form an implicit kind of parent/child relationship by awaiting another task.

IMHO, it clearly articulates that a Task (or future) can represent a job that takes advantage of async I/O, or it can represent a CPU-bound job that could run in parallel with other CPU-bound jobs.

Awaiting the former is the suggested way, because otherwise you can't really take advantage of the underlying I/O driver's async capability. The latter does not require awaiting, since it is not an async I/O job.


UPDATE: an example

As Theodor Zoulias asked in the comments section, here is a made-up example of parallel tasks where Task.WaitAll is being used.

Let's suppose we have this naive is-prime-number implementation. It is not efficient, but it demonstrates performing something that can be considered computationally heavy. (Please also bear in mind that, for the sake of simplicity, I did not add any error handling logic.)

static (int, bool) NaiveIsPrime(int number)
{
    int numberOfDividers = 0;
    for (int divider = 1; divider <= number; divider++)
    {
        if (number % divider == 0)
        {
            numberOfDividers++;
        }
    }
    return (number, numberOfDividers == 2);
}

And here is a sample use case which runs a couple of is-prime calculations in parallel and waits for the results in a blocking way.

List<Task<(int, bool)>> jobs = new();
for (int number = 1_010; number < 1_020; number++)
{
    var x = number;
    jobs.Add(Task.Run(() => NaiveIsPrime(x)));
}

Task.WaitAll(jobs.ToArray());
foreach (var job in jobs)
{
    (int number, bool isPrime) = job.Result;
    var isPrimeInText = isPrime ? "a prime" : "not a prime";
    Console.WriteLine($"{number} is {isPrimeInText}");
}

As you can see I haven't used any await keyword anywhere.

Here is a dotnet fiddle link and here is a link for the prime numbers under 10 000.

10 Comments

Could you showcase a scenario in parallel programming where using the Task.Wait method would be a good fit, without raising eyebrows? I think that this is the point that the OP wants to be clarified.
I understand what you're saying, however this doesn't answer why it's OK to call Task.Wait() when it's CPU-bound but not when it's I/O-bound. "The latter does not require awaiting since it is not an async I/O job" - you can await any Task, it doesn't matter what it does behind the scenes.
Wouldn't it be better to use await Task.WhenAll(jobs.ToArray()); instead of Task.WaitAll(jobs.ToArray()); so that the calling thread can continue doing other things instead of just waiting for the jobs to finish?
@TheodorZoulias I've updated my post with an example, please check it again.
@DavidKlempfner Yes, you can await any Tasks, since the structure exposes a GetAwaiter method. You can await anything if you define such a method. But as Stephen has pointed out you don't have to. It is absolutely fine to call Task.WaitAll for parallel tasks.

One of the risks of Task.Wait is deadlock. If you call .Wait on the UI thread, you will deadlock if the task needs the UI thread in order to complete. If you call an async method on the UI thread, such deadlocks are very likely.

If you are 100% sure the task is running on a background thread, is guaranteed to complete no matter what, and that this will never change, it is fine to wait on it.

Since this is fairly difficult to guarantee, it is usually a good idea to avoid waiting on tasks at all.

4 Comments

I would argue that deadlocks are not a problem, because they appear consistently, and they completely prevent the application from functioning. So you'll find them during development and fix them. They won't even reach the testers. The big problem is blocking the UI, and causing non-responsiveness for the whole duration of the parallel operation. That's the kind of flaw that can slip to production, and cause major problems in the long run, because the freezing duration will get longer and longer as the database grows.
@TheodorZoulias Deadlocks might be easy to find and diagnose in the trivial case, but things are not always trivial. There is no guarantee that a task will consistently need the UI thread, or any other kind of resource. So I really disagree that such problems are always found in dev.
@TheodorZoulias Blocking the UI can also be a problem, but this really depend on how slow the operation is expected to be. If I'm reading a single byte from a local file I would not worry all that much about blocking the UI.
Jonas, I am not in a position to back my arguments with statistics. All I have is my personal experience of using .NET applications that occasionally deadlock (zero experience), and on the other hand my experience of seeing frustrated .NET developers who ask for help on StackOverflow with async+deadlock related problems (very, very many).

I believe the point of this passage is to not use blocking operations like Task.Wait in asynchronous code.

The main point isn't that Task.Wait is preferred in parallel code; it's that you can get away with it there, while in asynchronous code it can have a really serious effect.

This is because the success of async code depends on the tasks 'letting go' (with await) so that the thread(s) can do other work. In explicitly parallel code, a blocking Wait may be OK, because the other streams of work will keep going, since they have dedicated thread(s).



I recommend using await instead of Task.Wait() for asynchronous methods/tasks, because this way the thread can be used for something else while the task is running.

However, for parallel tasks that are CPU-bound, most of the available CPU should be used. It makes sense to use Task.Wait() to block the current thread until the task is complete. This way, the CPU-bound task can make full use of the CPU resources.

Update with supplementary statement.

Parallel tasks can use blocking members such as Task.Wait(), Task.Result, Task.WaitAll, and Task.WaitAny, as they should consume all available CPU resources. When working with parallel tasks, it can be beneficial to block the current thread until the task is complete, since the thread is not being used for anything else. This way, the software can fully utilize all available CPU resources instead of keeping the thread idle while it is blocked.

