I have this simple code to collect some video URLs so that I can apply another scraping function to them afterward. My problem is that I can't seem to return the URL-filled array. I know it's a scope problem, but I'm not that familiar with JavaScript and this is as far as my knowledge got me.

Here is the code:

var request = require('request');
var cheerio = require('cheerio');

var startUrl = 'http://www.somewebsite.com/mostviewed';

var getVideoIds = function(url) {

    var urls = [];

    request(url, function(err, resp, body){
        if (err)
            throw err;
        $ = cheerio.load(body);


        var videoUrls = [];
        $('.videoTitle a').each(function() {
            videoUrls.push($(this).attr('href'));
        });
    });

   return urls;
}


var urlsToScrap = getVideoIds(startUrl);
console.log(urlsToScrap);

PS: the current code returns an empty array.

  • Let me check that out ! Commented Apr 30, 2014 at 16:19

1 Answer

You have two issues. First, you push values onto videoUrls but return urls, which is never filled, so the returned array is always empty. Second, request is an asynchronous function: getVideoIds returns before the callback has run, so even returning the right array would give you an empty result at that point. You need to use the video URLs from inside the callback, or from a function the callback calls, once the scraped data has come back.
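To make the timing issue concrete, here is a minimal sketch that needs neither the network nor any packages. fakeRequest is a hypothetical stand-in for request, and the two hrefs are made-up sample values; the only assumption is that, like request, it delivers its result on a later turn of the event loop.

```javascript
// fakeRequest is a hypothetical stand-in for request(): it hands its result
// to the callback on a later turn of the event loop, never synchronously.
function fakeRequest(url, callback) {
    setTimeout(function () {
        callback(null, ['/video/1', '/video/2']); // made-up sample hrefs
    }, 0);
}

function getVideoIds(url) {
    var urls = [];
    fakeRequest(url, function (err, found) {
        found.forEach(function (href) {
            urls.push(href);
        });
    });
    return urls; // runs before the callback above has fired
}

var returned = getVideoIds('http://www.somewebsite.com/mostviewed');
var lengthAtReturn = returned.length; // nothing has been pushed yet
console.log(lengthAtReturn); // prints 0
```

The array does get filled eventually, but only after the calling code has already looked at it, which is why the work has to move into the callback.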

So:

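var request = require('request');
var cheerio = require('cheerio');

var startUrl = 'http://www.somewebsite.com/mostviewed';
var urls = [];

request(startUrl, function(err, resp, body){
    if (err)
        throw err;
    // Declare $ locally instead of leaking an implicit global.
    var $ = cheerio.load(body);

    // Collect the href of every video link into the outer urls array.
    $('.videoTitle a').each(function() {
        urls.push($(this).attr('href'));
    });

    // The array is only complete here, inside the callback.
    onVideosScraped();
});

function onVideosScraped() {
    console.log(urls);
}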

This should work, though it is a rudimentary way to do it. You can of course wrap any of it in functions to make it more reusable, but I hope this answers your question.
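As one possible way to do that wrapping, here is a sketch of a reusable function that takes an error-first callback. The names scrapeLinks, stubFetchPage and stubExtractLinks are hypothetical. The two stubs stand in for request and the cheerio loop so the sketch runs on its own; in real code you would presumably pass request itself as fetchPage and do the extraction with cheerio.

```javascript
// scrapeLinks is a hypothetical wrapper: fetchPage must accept
// (url, callback) and call back with (err, resp, body), as request does.
function scrapeLinks(url, fetchPage, extractLinks, done) {
    fetchPage(url, function (err, resp, body) {
        if (err) {
            return done(err); // report the error to the caller instead of throwing
        }
        done(null, extractLinks(body));
    });
}

// Stub fetcher: answers asynchronously with a made-up HTML snippet.
function stubFetchPage(url, callback) {
    setTimeout(function () {
        var body = '<a href="/video/1">A</a><a href="/video/2">B</a>';
        callback(null, { statusCode: 200 }, body);
    }, 0);
}

// Stub extractor: a crude regex standing in for $('.videoTitle a').each(...).
function stubExtractLinks(body) {
    var links = [];
    var pattern = /href="([^"]+)"/g;
    var match;
    while ((match = pattern.exec(body)) !== null) {
        links.push(match[1]);
    }
    return links;
}

var scraped = null;
scrapeLinks('http://www.somewebsite.com/mostviewed', stubFetchPage, stubExtractLinks,
    function (err, links) {
        if (err) {
            return console.error(err);
        }
        scraped = links; // the next scraping step would start from here
        console.log(links);
    });
```

The caller never relies on a return value. Everything that needs the URLs lives in the final callback, which only runs once the data has arrived.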

1 Comment

Yes it does, I was about to post almost the same answer ! Thank you !
