I am studying Yahoo's Hadoop tutorial (https://developer.yahoo.com/hadoop/tutorial/module4.html) and reading the Speculative Execution part. My question is: where do the abandoned tasks and discarded outputs go as a result of speculative execution?

According to the module:
"If other copies were executing speculatively, Hadoop tells the TaskTrackers to abandon the tasks and discard their outputs"

1 Answer

Even though the tutorial does not say so explicitly, discarding an abandoned task means releasing the resources (memory and CPU cores) held by the killed task and freeing its disk space (erasing its output on disk). If you are using YARN, the NodeManager will release the containers.

Either the original task or the speculative task is killed, depending on which one completes first. If the speculative task completes first, the original task is killed; if the original task completes first, the speculative task is killed.

What happens when you kill a normal Java process? The resources used by that process are released. The same thing happens in this case; the only difference is that the task is killed gracefully.
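The race between the original and the speculative attempt can be sketched in plain Java. This is only an analogy using `java.util.concurrent`, not Hadoop's actual scheduler code: the class `SpeculativeDemo`, the helper `attempt`, the attempt names, and the delays are all made up for illustration. `ExecutorService.invokeAny` returns the result of one attempt that completed and cancels the rest, which mirrors "first to finish wins, the other is killed".

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SpeculativeDemo {

    // Hypothetical stand-in for one task attempt: "works" for delayMs, then returns its output.
    static Callable<String> attempt(String attemptId, long delayMs) {
        return () -> {
            try {
                Thread.sleep(delayMs);
                return "output of " + attemptId;
            } catch (InterruptedException e) {
                // The losing attempt is interrupted; whatever it produced so far is dropped.
                System.out.println(attemptId + " killed, partial output discarded");
                throw e;
            }
        };
    }

    // Runs all attempts of the same task concurrently and keeps the first result.
    static String runSpeculatively(List<Callable<String>> attempts) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(attempts.size());
        try {
            // invokeAny cancels the attempts that have not completed once one succeeds.
            return pool.invokeAny(attempts);
        } finally {
            pool.shutdownNow();                          // release the worker threads
            pool.awaitTermination(5, TimeUnit.SECONDS);  // wait for the cleanup to finish
        }
    }

    public static void main(String[] args) throws Exception {
        String kept = runSpeculatively(List.of(
                attempt("attempt_0 (original)", 2000),
                attempt("attempt_1 (speculative)", 50)));
        System.out.println("kept: " + kept);
    }
}
```

Nothing from the cancelled attempt is stored anywhere afterwards: its thread ends and its result is never returned, which is the in-process equivalent of the TaskTracker discarding the output.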

Looking at TaskAttemptKillEvent in the Hadoop source will provide more insight on this topic.

5 Comments

Released? Does that mean data loss?
The data is already available from the successful task. Data from the killed task is discarded; there is no use for it.
So if the data is discarded, will the available data be incomplete, or will other tasks pick up and process that data again?
Data is available from the successful task, whether Mapper or Reducer. If a job consists of 10 Mappers and 1 Reducer, each task will have the complete data for that task. Assume that for these 11 tasks, 13 task attempts have been launched, including 2 speculative ones. Only the data from 11 of them will be considered.
Note that speculative tasks are duplicate tasks, so no data is lost.
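
The 10 Mappers plus 1 Reducer example above can be checked with a small sketch. It assumes, purely for illustration, that the 2 speculative attempts belong to tasks 3 and 7 and that they happen to finish first. `AttemptCommitDemo`, `Attempt`, and the attempt names are hypothetical and not Hadoop classes; in a real cluster the losing attempts are killed rather than left to finish.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AttemptCommitDemo {

    // Hypothetical record of one task attempt (not a Hadoop class).
    static final class Attempt {
        final int taskId;
        final String attemptId;

        Attempt(int taskId, String attemptId) {
            this.taskId = taskId;
            this.attemptId = attemptId;
        }
    }

    // Keep the first attempt to finish for each task; later duplicates are discarded.
    static Map<Integer, String> commitFirstPerTask(List<Attempt> inFinishOrder) {
        Map<Integer, String> committed = new LinkedHashMap<>();
        for (Attempt a : inFinishOrder) {
            committed.putIfAbsent(a.taskId, a.attemptId);
        }
        return committed;
    }

    public static void main(String[] args) {
        List<Attempt> finished = new ArrayList<>();
        // Assumed: the speculative attempts for tasks 3 and 7 finish before the originals.
        finished.add(new Attempt(3, "task3_attempt1"));
        finished.add(new Attempt(7, "task7_attempt1"));
        // 11 original attempts: tasks 0-9 are the mappers, task 10 is the reducer.
        for (int t = 0; t <= 10; t++) {
            finished.add(new Attempt(t, "task" + t + "_attempt0"));
        }
        Map<Integer, String> committed = commitFirstPerTask(finished);
        System.out.println(finished.size() + " attempts, " + committed.size() + " outputs kept");
    }
}
```

Every task ID still ends up with exactly one output, so the two discarded attempts remove duplicates only, not data.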
