
In an ASP.NET MVC application I have a project with a graph of activities to track.

A project doesn't have a single root but multiple ones. Every tree can be complex and deep, and every node depends on the others for things like dates and fine-grained user permissions.

I need to process the whole project graph every time I perform an operation on a node, because even different branches depend on each other.

The structure is stored flat in a SQL Server DB.

To create the tree I have a recursive function that does a lot of work to create some data for every node (in the context of the current user).

For example, I have a project with 3000 nodes that takes more than 2 seconds to process with a single call that creates the entire graph.

public static List<Nodes> GetProject(...) {
  var list = new List<Nodes>();
  CreateTreeRecursive(...);
  return list;
}

Remember that I have multiple roots. This lets me parallelize the work and process every branch independently.

If I parallelize the execution with Task.Run or Parallel.ForEach, the time to create the entire graph drops to between 15 and 50 ms, roughly 50 times faster.

public static List<Nodes> GetProject2(...) {

  var list = new List<Nodes>();

  Parallel.ForEach(...,
    (root) => {
      ...
    });

  return list;
}
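
A side note on the parallel snippet above, offered as a sketch under assumptions rather than the actual implementation: List<T> is not safe for concurrent writes, so if the loop body adds to the shared list directly, the result can be corrupted. One common way around this is the Parallel.ForEach overload that takes thread-local state, where each worker fills its own list and the lists are merged under a lock at the end. The names Node, ProjectBuilder, and CollectBranch are hypothetical stand-ins for the elided types and for CreateTreeRecursive.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical node type; the real one is not shown in the question.
public sealed class Node
{
    public int Id { get; set; }
    public List<Node> Children { get; } = new List<Node>();
}

public static class ProjectBuilder
{
    // Hypothetical stand-in for CreateTreeRecursive: walks one branch
    // depth-first and appends every node to the given list.
    private static void CollectBranch(Node node, List<Node> output)
    {
        output.Add(node);
        foreach (var child in node.Children)
            CollectBranch(child, output);
    }

    public static List<Node> GetProject(IEnumerable<Node> roots)
    {
        var result = new List<Node>();
        var gate = new object();

        Parallel.ForEach(
            roots,
            // localInit: each worker starts with its own private list.
            () => new List<Node>(),
            // body: process one root into the worker's private list.
            (root, state, local) =>
            {
                CollectBranch(root, local);
                return local;
            },
            // localFinally: merge each private list once, under a lock.
            local =>
            {
                lock (gate)
                {
                    result.AddRange(local);
                }
            });

        return result;
    }
}
```

The order of nodes in the merged list depends on scheduling, so sort it afterwards if the order matters.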

The bad news is that you shouldn't create threads in ASP.NET.

In my specific case I don't have many concurrent users, but with an audience of ~200 users you can't really know for sure.

The other issue is that a project can have many roots, up to 100, so many threads would be created.

This solution would be so simple, but it is inapplicable.

Is there a simple way to do this, or is my only option to offload the work to some external service that can spawn multiple threads while I wait asynchronously?

If that is the case, I would appreciate some suggestions.

To be clear, this operation is performed on every user interaction with the project. I can't cache the result; it is too volatile. I can't enqueue the work somewhere and get the result eventually.

Thanks

  • Usually problems like this are handled by caching. Is that feasible with your data source? You may not be able to cache an entire tree, but subsets of the tree may be cacheable and not need to be recalculated. Commented Jun 29, 2016 at 14:19
  • Do the users really need to know the state of 3000 individual nodes? If they're different branches and they depend on each other, they don't really feel like different branches to me. Based on what you're saying, it's impossible for us to solve your problem, and at best we can give broad general advice, which isn't really useful for solving problems because we don't know what you know. Commented Jun 29, 2016 at 14:19
  • @ScottChamberlain Yes, I could cache the branches, but if the edits are frequent I risk having very short-lived cache entries. Every minimal change to the data means invalidating the branch. Commented Jun 29, 2016 at 14:42
  • @GeorgeStocker A branch can depend on another only for the starting date. Most of the time users don't need the entire picture, but sometimes it is necessary. Commented Jun 29, 2016 at 14:42
  • I am starting to admit (I had already thought about it) that I need to cache something, but I am really interested in the parallelization problem in ASP.NET; it would be the simplest solution. Commented Jun 29, 2016 at 14:45

1 Answer


The bad news is that you shouldn't create threads in ASP.NET.

This is not true, and this wrong assumption is blocking the right solution.

You can create threads. The risk you probably have in mind is exhausting the capacity of the thread pool. That is not easy to do in general.

Your threads are CPU-bound. This means that your server would be totally overloaded long before the pool is exhausted. Pool capacity is not your limiting factor.

With some assumptions we can make up a concrete scenario: an 8-core server is saturated at 8 runnable threads (as is the case here), but the thread pool would not be considered overloaded with fewer than 100 threads. (The actual number varies; 100 should be safe in a wide range of cases.)

Further, Parallel.ForEach uses pool threads. It does not create a meaningful number of threads, and it does not occupy one thread per input item.
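
To make this concrete, here is a minimal sketch (not the asker's actual code) that processes many roots with Parallel.ForEach, caps the loop at one concurrent worker per core through ParallelOptions.MaxDegreeOfParallelism, and records which managed threads did the work. Node, ProjectGraph, and CollectBranch are hypothetical names standing in for the types and the recursive function elided in the question; the per-root work is assumed to be purely CPU-bound.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical node type; the real one is not shown in the question.
public sealed class Node
{
    public int Id { get; set; }
    public List<Node> Children { get; } = new List<Node>();
}

public static class ProjectGraph
{
    // Hypothetical stand-in for CreateTreeRecursive.
    private static void CollectBranch(Node node, List<Node> output)
    {
        output.Add(node);
        foreach (var child in node.Children)
            CollectBranch(child, output);
    }

    public static List<Node> GetProject(IReadOnlyList<Node> roots, out int threadsUsed)
    {
        // One slot per root: each slot is written by exactly one iteration,
        // so no locking is needed and the root order is preserved.
        var perRoot = new List<Node>[roots.Count];
        var threadIds = new ConcurrentDictionary<int, bool>();

        // Limit the loop to at most one concurrent worker per core.
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };

        Parallel.ForEach(roots, options, (root, state, index) =>
        {
            threadIds.TryAdd(Environment.CurrentManagedThreadId, true);
            var branch = new List<Node>();
            CollectBranch(root, branch);
            perRoot[index] = branch;
        });

        threadsUsed = threadIds.Count;
        return perRoot.SelectMany(branch => branch).ToList();
    }
}
```

Calling GetProject with 100 roots and inspecting threadsUsed on your own server is a quick way to check how many threads the loop really touches; the exact number depends on the machine and on the thread pool's state.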

I don't see anything here to worry about.


11 Comments

This would be fantastic. So if I have 100 branches Parallel.ForEach doesn't create 100 threads? And Task.Run? And what is the impact of the number of concurrent users?
For the first two questions I think you should research that a bit. A lot more than what I can add here has been documented. Regarding concurrent users, this metric is meaningless. Let's talk about concurrent requests. I don't see why parallelism would have an impact here assuming your server is not overloaded on CPU. If you overload it all bets are off but that is, I think, intuitively clear.
Since your computation is done after 50 ms, there is no risk of crowding out other users.
@sevenmy is the question fully answered for you?
I'm doing some research and tests. I think I will accept the answer because in my case the solution seems acceptable: the parallelization really improves performance, and I use it only when the graph is big (uncommon), more than 500 nodes, and in conjunction with caching for read-only scenarios, so it shouldn't run for every request. For the sake of the conversation, what do you suggest as an alternative approach: an external Windows service or an Azure WebJob?