2

I have a function that uses the cheerio module to get data from a website.

Now I'd like to run this function over an array of keywords, collect the intermediate results in an array named stats, and finally print the stats array to the console via console.log().

Whenever I run this script it triggers the async function quickly and prints an empty stats array.

Now my question: how can I wait for the async functions to complete, so that I can print the array to the console once it is populated?

I have googled a lot and searched Stack Overflow. There seem to be many ways to accomplish my goal, but what is the most idiomatic way to do this in Node?

Here is the way I solved it:

var request = require("request"),
    cheerio = require("cheerio"),
    base_url = "http://de.indeed.com/Jobs?q=";  // after equal sign for instance:   sinatra&l=

/* search syntax:
   - http://de.indeed.com/Jobs?q=node&l=berlin&radius=100
   - 
   - 
*/ 

// //
var search_words = ["django", "python", "flask",
                    "rails", "ruby",
                    "node", "javascript", "angularjs", "react",
                    "java", "grails", "groovy",
                    "php", "symfony", "laravel"
                    ];

var counter = 0;
var stats = [];


function getStats(keyword) {
    var url = base_url + keyword + "&l=";
    request(url, function(err, resp, body) {
        if (!err) {
            var $ = cheerio.load(body);
            // the last word of the #searchCount text is kept as the result
            var data = $("#searchCount")[0].children[0].data.split(" ").reverse()[0];

            stats.push([keyword, data]);
            counter++;
        }
        // list complete?
        if (counter === search_words.length) {
            console.log(stats);
        }
    });
}

// use < rather than <=, otherwise the last call gets search_words[length], which is undefined
for (var j = 0; j < search_words.length; j++) {
    getStats(search_words[j]);
}
1
  • I just ran your code and it works fine. Sure it's not a nice solution but works well. Commented Apr 4, 2016 at 15:42

3 Answers

3

Promises are the best solution for handling asynchronous operations.

Promise.all(search_words.map(function(keyword) {
  return new Promise(function(resolve, reject) {
    request(base_url + keyword + "&l=", function(err, resp, body) {
      if (err) {
        return reject(err);
      }
      var $ = cheerio.load(body);
      // each promise resolves with one [keyword, count] pair
      resolve([keyword, $("#searchCount")[0].children[0].data.split(" ").reverse()[0]]);
    });
  });
})).then(function(stats) {
  // stats holds the resolved pairs, in the same order as search_words
  console.log(stats);
});


3 Comments

This I like a lot! Thank you.
By the way: What exactly are the arguments resolve and reject? I see they are being called as functions. "return reject(err);" and "resolve( [keyword..." . But if these two arguments are functions themselves: where are they defined? I mean where is the appending to the array stats done actually?
@Ugur I suggest reading this article for more information about Promises: html5rocks.com/en/tutorials/es6/promises
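To illustrate the question about resolve and reject: both are functions that the Promise constructor creates and hands to your callback, and Promise.all builds the result array from whatever each promise was resolved with. Here is a minimal sketch, assuming a hypothetical fakeFetch function that stands in for request() so it runs without network access:

```javascript
// fakeFetch is a hypothetical stand-in for request(): it calls back
// asynchronously with either an error or a made-up "count".
function fakeFetch(keyword, callback) {
  setTimeout(function () {
    if (keyword === "") {
      callback(new Error("empty keyword"));
    } else {
      callback(null, keyword.length);
    }
  }, 10);
}

function countFor(keyword) {
  // resolve and reject are not defined by us: the Promise constructor
  // creates them and passes them into this function.
  return new Promise(function (resolve, reject) {
    fakeFetch(keyword, function (err, count) {
      if (err) {
        return reject(err); // marks this promise as failed
      }
      resolve([keyword, count]); // this value ends up in the final array
    });
  });
}

// Promise.all gathers the resolved values, in input order, into one array.
var allStats = Promise.all(["node", "ruby", "php"].map(countFor));

allStats.then(function (stats) {
  console.log(stats);
});
```

So nothing is pushed onto an array by hand: the array passed to then() is assembled by Promise.all from the values given to resolve.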
1

The most common way I can think of is using a promise library like Q.

npm install --save q

Then use it in your code:

var Q = require('q');
var requestFn = Q.denodeify(request);

Then you iterate over your values:

var promises = search_words.map(function(keyword) {
   var url = base_url + keyword + "&l=";
   return requestFn(url);
});

Q.all(promises).then(function(values) {
    //values will contain the returned values from all requests (in array form)
}, function(rejects) {
   //rejected promises (with errors, for example) land here
});

The denodeify function from Q basically turns a callback-based function into one that returns a promise (a stand-in for the future value, which becomes available as soon as the request completes). That function is requestFn (find a better name for it!). All these promises are collected in one array, which is passed to Q.all. The promise that Q.all returns is fulfilled only once all of the promises are fulfilled; if any one of them is rejected, the combined promise is rejected too, although the other requests still run to completion.
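As a rough illustration of what such a conversion does, here is a hand-written sketch built on native Promises. It is not Q's actual implementation; toPromiseFn and fakeRequest are hypothetical names invented for this example, and fakeRequest stands in for request() so that the snippet runs without network access:

```javascript
// Hypothetical helper: wraps a function that takes a Node-style
// callback(err, value) so that it returns a promise instead.
function toPromiseFn(nodeStyleFn) {
  return function () {
    var args = Array.prototype.slice.call(arguments);
    return new Promise(function (resolve, reject) {
      // Append our own callback as the last argument.
      nodeStyleFn.apply(null, args.concat(function (err, value) {
        if (err) {
          return reject(err);
        }
        resolve(value);
      }));
    });
  };
}

// Hypothetical stand-in for request(url, callback).
function fakeRequest(url, callback) {
  setTimeout(function () {
    if (url.indexOf("fail") !== -1) {
      return callback(new Error("request failed: " + url));
    }
    callback(null, "body of " + url);
  }, 10);
}

var fakeRequestFn = toPromiseFn(fakeRequest);

// One promise per URL, combined into a single promise for all results.
var bodies = Promise.all(["a", "b"].map(function (url) {
  return fakeRequestFn(url);
}));

bodies.then(function (values) {
  console.log(values);
}, function (err) {
  console.error(err);
});
```

The explicit wrapper inside map() matters here: passing fakeRequestFn directly would also forward the index and the array that map() supplies, and those would be handed to the wrapped function as extra arguments.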

If that is not your intended behavior: There are loads of ways to play with the excellent Q library. See the documentation: https://github.com/kriskowal/q

I have not thoroughly tested this code. You might need to play around with it a bit, but it should give you a good idea of how to do things like this soundly. Incrementing a counter is usually a very unreliable way of handling asynchronous code.

2 Comments

Upvoting would be appreciated if you like it ;)
Would love to. But stackoverflow says I need at least 15 reputation points to do so
0

Besides the usual (and correct) way you used to solve the problem, there are some modules that let you write synchronous code, if you really want to.

Googling "nodejs synchronous" turns up links to Node.js modules and techniques for writing synchronous code in Node.js, but I suppose they are useful only for some specific problems (I have never used them myself).

4 Comments

Thank you for your answer. I am aware they are asynchronous. - I would like to know what the most idiomatic way is to solve this issue. - Do I have to nest two functions?
Basically yes. Or you can use some Node.js modules (like synchronize.js or node-sync) that allow you to serialize the calls. But as I said, I have never used any of these modules myself, so I don't really know if they fit your problem.
Now how would I nest the iteration and the asynchronous function? Which kind of iteration lets me wait for the result of a callback?
The modules I mentioned have their own syntax for writing the code, so you should look at the documentation of whichever module you decide to use.
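For the question of how to nest the iteration and the asynchronous function without any extra module, one possible pattern is to start the next call from inside the previous callback. This is only a sketch of that idea: fetchInSeries and fakeFetch are hypothetical names, and fakeFetch stands in for the real request so that the snippet runs without network access.

```javascript
// Hypothetical stand-in for an asynchronous, callback-based lookup.
function fakeFetch(keyword, callback) {
  setTimeout(function () {
    callback(null, keyword.toUpperCase());
  }, 5);
}

// Processes the keywords one at a time: each call is started from inside
// the callback of the previous one, so results arrive in order.
function fetchInSeries(keywords, done) {
  var results = [];

  function next(index) {
    if (index === keywords.length) {
      return done(null, results); // every keyword has been handled
    }
    fakeFetch(keywords[index], function (err, value) {
      if (err) {
        return done(err); // stop at the first error
      }
      results.push([keywords[index], value]);
      next(index + 1);
    });
  }

  next(0);
}

fetchInSeries(["node", "ruby"], function (err, results) {
  if (err) {
    return console.error(err);
  }
  console.log(results);
});
```

Running the calls in series is slower than firing them all at once, so this fits best when the order or the load on the remote site matters.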
