Why does my offset variable's value remain zero?

Question

I'm trying to make a simple Tumblr scraper using Node.js:

var request = require('request');
var fs = require('fs');
var apiKey = 'my-key-here';
var offset = 0;

for (var i=0; i<5; i++) {
  console.log('request #' + i + '...');

  var requestURL = 'http://api.tumblr.com/v2/blog/blog.tumblr.com/posts/text?api_key='
    + apiKey
    + '&offset='
    + offset;

  console.log(requestURL);

  request(requestURL, function(error, response, body) {
    if (!error && response.statusCode == 200) {
      var resultAsJSON = JSON.parse(body);
      resultAsJSON.response.posts.forEach(function(obj) {
        fs.appendFile('content.txt', offset + ' ' + obj.title + '\n', function (err) {
          if (err) return console.log(err);
        });   
        offset++;  
      });       
    }
  }); 
}

By default, the API returns at most the 20 latest posts. I want to grab all the posts instead. As a test, I want to get the latest 100 first, hence the i<5 in the loop declaration.

The trick is to use the offset parameter. Given an offset value of 20, for example, the API skips the latest 20 posts and returns posts starting from the 21st from the top.
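
To illustrate how the offset parameter pages through a blog, here is a minimal sketch. It assumes every page holds exactly 20 posts, which the API may not guarantee; PAGE_SIZE and buildURL are hypothetical names used only for this example.

```javascript
// Assumption: each API call returns a full page of 20 posts.
var PAGE_SIZE = 20;

// Hypothetical helper: builds the request URL for a given offset.
function buildURL(apiKey, offset) {
  return 'http://api.tumblr.com/v2/blog/blog.tumblr.com/posts/text?api_key='
    + apiKey
    + '&offset='
    + offset;
}

// Offsets for five consecutive pages.
var offsets = [];
for (var page = 0; page < 5; page++) {
  offsets.push(page * PAGE_SIZE);
}

console.log(offsets); // [ 0, 20, 40, 60, 80 ]
console.log(buildURL('my-key-here', offsets[1]));
```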

As I can't be sure that the API will always return 20 posts, I am using offset++ to get the correct offset number.

The code above runs, but console.log(requestURL) prints http://api.tumblr.com/v2/blog/blog.tumblr.com/posts/text?api_key=my-key-here&offset=0 five times.

So my question is: why does the offset value in my requestURL remain 0, even though I have added offset++?

Not this again. You fire off a request and expect it to complete before the loop goes to the next iteration. The request callbacks don't run until after the loop completes, which is why offset is zero for all of them. You need an asynchronous for-each loop. — Dan D.
– Dan D., Commented Jan 29, 2014 at 8:58
The thing is I'm writing the offset variable in appendFile, and they showed up correctly in the text file from 0 to 99. — hfz
– hfz, Commented Jan 29, 2014 at 9:01
That's just due to the callbacks to the requests occurring in the same sequence they were fired in but that is not guaranteed and you should not depend on it. — Dan D.
– Dan D., Commented Jan 29, 2014 at 9:04
I understand what you mean. I thought this is some sort of variable scoping gotcha that I'm unaware of, but now I'm not so sure. — hfz
– hfz, Commented Jan 29, 2014 at 9:08

pawel · Accepted Answer · 2014-01-29 10:33:21Z

You should increment the offset in the loop, not in the callbacks. A callback fires only after its request has completed, which means all five requests are made with offset = 0, and offset is incremented only once the responses arrive.

  var requestURL = 'http://api.tumblr.com/v2/blog/blog.tumblr.com/posts/text?api_key='
    + apiKey
    + '&offset='
    + (offset++); // increment here, before passing URL to request();
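
The ordering can be seen without touching the network. The sketch below is only an illustration: fakeRequest is a hypothetical stand-in for request() that calls back asynchronously, and the URLs are shortened. Every URL is built before any callback has had a chance to run.

```javascript
// Hypothetical stand-in for request(): it never calls back synchronously.
function fakeRequest(url, callback) {
  setImmediate(function () {
    callback(null, url);
  });
}

var offset = 0;
var urlsSeen = [];

for (var i = 0; i < 3; i++) {
  var url = 'offset=' + offset; // offset is still 0 here on every iteration
  urlsSeen.push(url);

  fakeRequest(url, function () {
    offset++; // runs only after the whole loop has finished
  });
}

console.log(urlsSeen); // [ 'offset=0', 'offset=0', 'offset=0' ]
```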

Edit: To offset by 20 in each iteration, and use the offset in the callback:

for (var i=0; i<5; i++) {
    var offset = i * 20,
        requestURL = 'http://api.tumblr.com/v2/blog/blog.tumblr.com/posts/text?api_key='
            + apiKey
            + '&offset='
            + offset;

    (function(off){
        request(requestURL, function(error, response, body) {
            if (!error && response.statusCode == 200) {
                var resultAsJSON = JSON.parse(body);
                resultAsJSON.response.posts.forEach(function(obj) {
                    fs.appendFile('content.txt', off + ' ' + obj.title + '\n', function (err) {
                        if (err) return console.log(err);
                    });
                    off++;
                });
            }
        });
    }(offset)); // pass the offset from loop to a closure
}
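
If the API cannot be relied on to return exactly 20 posts per call, one alternative (not part of the code above) is to issue the requests one at a time and advance the offset by the number of posts actually received. The following is a rough sketch under stated assumptions: fetchPage is a hypothetical stand-in for the real request() call that serves 45 fake posts, and fetchAll is an illustrative name.

```javascript
// Hypothetical stand-in for the Tumblr API: 45 fake posts, at most 20 per call.
var allPosts = [];
for (var n = 0; n < 45; n++) {
  allPosts.push({ title: 'post ' + n });
}

function fetchPage(offset, callback) {
  // Calls back asynchronously, as request() would.
  setImmediate(function () {
    callback(null, allPosts.slice(offset, offset + 20));
  });
}

// Fetch pages sequentially: the next request starts only inside the
// callback of the previous one, so offset is always up to date.
function fetchAll(offset, maxCalls, collected, done) {
  if (maxCalls === 0) return done(null, collected);

  fetchPage(offset, function (err, posts) {
    if (err) return done(err);
    if (posts.length === 0) return done(null, collected); // no more posts

    // Advance by the number of posts actually returned, not a fixed 20.
    fetchAll(offset + posts.length, maxCalls - 1, collected.concat(posts), done);
  });
}

fetchAll(0, 5, [], function (err, posts) {
  if (err) return console.log(err);
  console.log(posts.length); // 45
});
```

The trade-off is that the requests no longer run in parallel, so the whole run takes longer, but the order and the offsets no longer depend on how the callbacks happen to be scheduled.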

edited Jan 29, 2014 at 10:33

answered Jan 29, 2014 at 9:02

pawel


5 Comments

hfz Over a year ago

There's a difference between i and offset, though. offset needs to be the number of posts to be skipped, while i is the number of times I want to use the API. 1 API call gives up to 20 posts at once.

pawel Over a year ago

But in your question you increment offset by one in each iteration, so I've assumed offset == i. Use offset = i * 20 then.

hfz Over a year ago

offset++ is inside the forEach scope, which iterates all of the posts returned by the API. So I'm counting the posts there.

hfz Over a year ago

Tried running it. Got an error: }(offset)); // pass the offset from loop to a closure SyntaxError: Unexpected token }

pawel Over a year ago

forgot ); in the line before }(offset));
