4

I am pretty new to web workers and multi-threaded design. What I need to design is a simple query task scheduler (using web workers right now), like:

var taskScheduler = {};
taskScheduler.pool = [];
function init(num){
    for(var i=0; i<num; i++){
        var workUnit = {};
        var worker = new Worker("worker.js");
        workUnit.busy = false;
        workUnit.worker = worker;
        taskScheduler.pool.push( workUnit );
    }
}
init(4);

The code below should loop over the pool, query each workUnit's availability, and start new work. I am not quite sure how to implement this; I thought it should be something like:

taskScheduler.startWork = function(task){
    for(var i=0; i<taskScheduler.pool.length; i++){
        if(!taskScheduler.pool[i].busy){
            // fire job, make unit busy then break;
        }
    }
}

Currently the main challenge is:

How do I keep checking worker availability while still being able to accept new task calls? (For example: if no worker is available, it will keep polling, which blocks taskScheduler.startWork from accepting new jobs.)

4 Answers

1

Checking worker availability is the wrong concept from the beginning. JavaScript is an event-driven language: you do not poll for changes, you listen for them using events.

What you should do is the following:

var tasksToProcess = [/*{task: this is sent to worker, onfinish: onfinish callback}*/];
var processedTask = null;
var worker = new Worker("taskProcessor.js");
// The worker dispatches a message when task is completed
worker.addEventListener("message", function(event) {
     //Usually I give messages names to simulate Javascript event style
     if(event.data.name == "task_success") {
         if(processedTask!=null) {
             if(processedTask.onfinish instanceof Function)
                 // If onfinish throws error, it will not crash this function
                 setTimeout(processedTask.onfinish,0);
             processedTask = null;
         }
         else
             console.error("Weird, task successfully completed when no task was running.");
         // No matter what, try to start next task
         runNextTask();
     }
});
function runNextTask() {
    // Only run when no task is being processed
    if(processedTask==null && tasksToProcess.length>0) {
        //Shift removes and returns first value of array
        processedTask = tasksToProcess.shift();
        worker.postMessage({name:"task", task: processedTask.task});
    }
}
/// Task in format {task: task data, onfinish: finish event callback}
function addTask(task) {
    // Add task to the end
    tasksToProcess.push(task);
    runNextTask();
}

This will work as long as the worker posts a message after finishing each task. You should also implement the Worker's onerror callback so that you can recover from errors.
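As a rough sketch of the error recovery mentioned above, the same queue can listen for the worker's "error" event and move on to the next task. The names createTaskRunner and settle, and the per-task onerror callback, are hypothetical and not part of the code above; the runner accepts any object with addEventListener and postMessage, such as a Worker.

```javascript
// Sketch: a single-worker task runner that also recovers from worker errors.
// "worker" is any object with addEventListener and postMessage (e.g. a Worker).
// Tasks have the shape {task: data, onfinish: fn, onerror: fn}; onerror is an
// assumption added for this example.
function createTaskRunner(worker) {
    var tasksToProcess = [];
    var processedTask = null;

    function runNextTask() {
        // Only run when no task is being processed
        if (processedTask === null && tasksToProcess.length > 0) {
            processedTask = tasksToProcess.shift();
            worker.postMessage({name: "task", task: processedTask.task});
        }
    }

    // Shared by the success and error paths: release the current task,
    // notify its owner, then try to start the next queued task.
    function settle(callbackName, argument) {
        var finished = processedTask;
        processedTask = null;
        if (finished !== null && typeof finished[callbackName] === "function") {
            try {
                finished[callbackName](argument);
            } catch (e) {
                // A throwing callback must not stop the queue
                console.error(e);
            }
        }
        runNextTask();
    }

    worker.addEventListener("message", function(event) {
        if (event.data.name === "task_success") {
            settle("onfinish", event.data);
        }
    });
    worker.addEventListener("error", function(event) {
        settle("onerror", event);
    });

    return {
        addTask: function(task) {
            tasksToProcess.push(task);
            runNextTask();
        }
    };
}
```

In a browser this could be used as createTaskRunner(new Worker("taskProcessor.js")). Whether a failed task should be retried or dropped depends on your use case; this sketch simply drops it after calling onerror.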

0

One way of doing it would be to call startWork again after a delay if all the workers are busy:

taskScheduler.startWork = function(task){
    for(var i=0; i<taskScheduler.pool.length; i++){
        if(!taskScheduler.pool[i].busy) {
          // fire job, make unit busy then return;
        }
    }
    // No free worker found: try again in 500 ms
    setTimeout(function() {
      taskScheduler.startWork(task);
    }, 500);
}

This approach has some problems:

  • Jobs aren't guaranteed to start in the order requested
  • There could be times when free workers sit idle, waiting for the next retry
  • It might be tricky to manage the tasks waiting to go to the workers (say you need to stop them going to the workers)

But

  • It is simpler compared to a queue

A better option would be to implement a queue of tasks in the main thread. I'll leave this to another answer :-)

0

If you know each task will take approximately the same amount of time, then depending on your use case you could forgo any sort of "busy" status and just fire off the tasks to the workers in turn. Each message is queued by the browser until the receiving worker's call stack has cleared:

taskScheduler.nextWorkerIndex = 0;
taskScheduler.startWork = function(task){
  var worker = taskScheduler.pool[taskScheduler.nextWorkerIndex];
  taskScheduler.nextWorkerIndex = (taskScheduler.nextWorkerIndex + 1) % taskScheduler.pool.length;
  // Fire job
}

This has

  • Simple logic

but

  • If tasks don't take the same amount of time, then a task could have been sent to a worker that is busy while others are free

  • It is difficult to manage the queue of tasks, since once a message is sent, it is hidden from JavaScript code until the worker receives it.
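To make the round-robin idea concrete, here is a minimal sketch with the "fire job" step filled in as a plain postMessage. The name createRoundRobin is hypothetical, and the workers array is assumed to hold objects with a postMessage method, such as Worker instances.

```javascript
// Sketch: round-robin dispatch over a pool of worker-like objects.
// No "busy" tracking: each task simply goes to the next worker in turn.
function createRoundRobin(workers) {
  var nextWorkerIndex = 0;
  return {
    startWork: function(task) {
      var index = nextWorkerIndex;
      // Advance and wrap around to the first worker after the last one
      nextWorkerIndex = (nextWorkerIndex + 1) % workers.length;
      workers[index].postMessage(task);
      // Return the chosen index so callers can see where the task went
      return index;
    }
  };
}
```

With three workers, four consecutive calls send the tasks to workers 0, 1, 2 and then 0 again, regardless of whether worker 0 has finished its first task.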

0

One way of doing this would be to implement a queue of tasks, where:

  • When there is a task to be done, it is added to the queue
  • An attempt is made to process the oldest item in the queue in 2 cases

    • When a task is added to the queue
    • When a worker has finished its work

A simple partial-implementation is below

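taskScheduler.queue = [];
taskScheduler.addToQueue = function(task) {
  taskScheduler.queue.push(task);
  taskScheduler.processQueue();
};

taskScheduler.processQueue = function() {
  if (!taskScheduler.queue.length) return;
  for (var i=0; i<taskScheduler.pool.length; i++) {
    if (!taskScheduler.pool[i].busy) {
      // Only dequeue once a free worker is found, so no task is lost
      var task = taskScheduler.queue.shift();
      // fire job, make unit busy then break
      taskScheduler.pool[i].busy = true;
      taskScheduler.pool[i].worker.postMessage(task);
      break;
    }
  }
};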

function init(num) {
  ...
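  // Note: workUnit must be scoped per worker (e.g. declared with let, or
  // created in a helper function); a var inside the for loop is shared
  // by every handler.
  worker.onmessage = function(e) {
    // Assuming a worker is finished if it passes a message back to main thread
    workUnit.busy = false;
    taskScheduler.processQueue();
  }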
  ...
}

This way

  • Tasks are guaranteed to be started in the order added to the queue

  • The workers spend very little time not doing anything when there are tasks waiting

  • You can add logic to control the queue if you need to. i.e. clear un-started tasks, re-arrange order, etc.

But

  • The logic is a little bit more complex than other options.
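Putting the pieces together, a fuller sketch of the queue approach might look like the following. It assumes each worker posts exactly one message per finished task. The names createScheduler and pending, and the optional onfinish callback, are hypothetical additions for illustration; the workers can be any objects with postMessage and an assignable onmessage, such as Worker instances.

```javascript
// Sketch: a FIFO task queue feeding a pool of worker-like objects.
// Assumes each worker posts exactly one message per finished task.
function createScheduler(workers) {
  var queue = [];
  // map() gives every workUnit its own scope, so each onmessage handler
  // refers to the right unit.
  var pool = workers.map(function(worker) {
    var workUnit = {worker: worker, busy: false, onfinish: null};
    worker.onmessage = function(e) {
      var onfinish = workUnit.onfinish;
      workUnit.busy = false;
      workUnit.onfinish = null;
      // Hand out waiting work first, so a throwing callback cannot stall the queue
      processQueue();
      if (typeof onfinish === "function") onfinish(e.data);
    };
    return workUnit;
  });

  function processQueue() {
    // Give the oldest queued tasks to free workers, in order
    for (var i = 0; i < pool.length && queue.length > 0; i++) {
      if (!pool[i].busy) {
        var job = queue.shift();
        pool[i].busy = true;
        pool[i].onfinish = job.onfinish;
        pool[i].worker.postMessage(job.task);
      }
    }
  }

  return {
    addToQueue: function(task, onfinish) {
      queue.push({task: task, onfinish: onfinish});
      processQueue();
    },
    // Number of tasks not yet handed to a worker
    pending: function() { return queue.length; }
  };
}
```

Because un-started tasks stay in a plain array on the main thread, adding extra controls such as clearing or re-ordering the queue only needs more methods on the returned object.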
