I'm using the Dask distributed scheduler, running a scheduler and 5 workers locally. I submit a list of delayed() tasks to compute().

When the number of tasks is, say, 20 (much larger than the number of workers) and each task takes at least 15 seconds, the scheduler starts rerunning some of the tasks (or executes them in parallel more than once).

This is a problem since the tasks modify a SQL db and if they run again they end up raising an Exception (due to DB uniqueness constraints). I'm not setting pure=True anywhere (and I believe the default is False). Other than that, the Dask graph is trivial (no dependencies between the tasks).

I'm still not sure if this is a feature or a bug in Dask. I have a gut feeling that this might be related to work stealing...

1 Answer

Correct. If a task is allocated to one worker and another worker becomes free, the free worker may choose to steal excess tasks from its peers. There is a chance that it will steal a task that has just started to run, in which case the task will run twice.

The clean way to handle this problem is to ensure that your tasks are idempotent, meaning that they produce the same result even if run twice. This might mean handling your database error within your task.
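As a minimal sketch of what handling the database error inside the task could look like, the example below uses Python's built-in sqlite3 module. The function name `record_result`, the `results` table, and the `task-7` key are hypothetical and not from the question; the real schema and database driver will differ. The idea is that a uniqueness violation is treated as "another run already did this work", and the function returns the stored value either way.

```python
import os
import sqlite3
import tempfile


def record_result(db_path, key, value):
    """Hypothetical idempotent task: insert (key, value) at most once.

    A duplicate key is treated as "already done" rather than as an error,
    and the stored value is returned in both cases.
    """
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on an uncaught error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS results "
                "(key TEXT PRIMARY KEY, value REAL)"
            )
            try:
                conn.execute(
                    "INSERT INTO results (key, value) VALUES (?, ?)",
                    (key, value),
                )
            except sqlite3.IntegrityError:
                # An earlier (or stolen) run of this task already wrote the row.
                pass
        row = conn.execute(
            "SELECT value FROM results WHERE key = ?", (key,)
        ).fetchone()
        return row[0]
    finally:
        conn.close()


# Running the same task twice leaves one row and returns the same result.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
first = record_result(db_path, "task-7", 1.5)
second = record_result(db_path, "task-7", 1.5)
```

A function like this could be wrapped with dask.delayed in the same way as the original tasks; a second execution then becomes a harmless no-op instead of raising an exception.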

This is one of those policies that are great for data intensive computing workloads but terrible for data engineering workloads. It's tricky to design a system that satisfies both needs simultaneously.

4 Comments

Thanks. At least now I can stop trying to debug this. So it's a feature. Maybe you can update the online documentation to highlight that this is a possibility? Also, I'm not sure what the point of the "pure" argument is if a task can run multiple times either way.
I also raised a github issue asking for a way to switch off work stealing for tasks that should run only once: github.com/dask/distributed/issues/847
If this happens and a task is run twice, for example, which task result is used in Dask's DAG? Is it just a straight-up race?
It's not a race, but it is a random-ish choice.
