
I have this piece of code:

var pg = require('pg');
var QueryStream = require('pg-query-stream');
var constr = 'postgres://devel:[email protected]/tcc';
var JSONStream = require('JSONStream');
var http = require('http');

pg.connect(constr, function(err, client, done) {
    if (err) {
        console.log('Erro ao conectar cliente.', err);
        process.exit(1);
    }

    var sql = 'SELECT \
          pessoa.cod, \
          pessoa.nome, \
          pessoa.nasc, \
          cidade.nome AS cidade \
          FROM pessoa, cidade \
          WHERE cidade.cod IN (1, 2, 3);';

    http.createServer(function (req, resp) {
        resp.writeHead(200, { 'Content-Type': 'text/html; Charset=UTF-8' });
        var query = new QueryStream(sql);
        var stream = client.query(query);

        //stream.on('data', console.log);
        stream.on('end', function() {
            //done();
            resp.end();
        });
        stream.pipe(JSONStream.stringify()).pipe(resp);
    }).listen(8080, 'localhost');
});

When I run apache bench against it, I get only about four requests per second. If I run the same query in php/apache or java/tomcat, I get results ten times faster. The database has 1000 rows. If I limit the query to about ten rows, then node is twice as fast as php/java.

What am I doing wrong?

EDIT: Some time ago I opened an issue here: https://github.com/brianc/node-postgres/issues/653

I'm providing this link because I posted there some other variations on the code I have tried. Even with the comments and hints so far, I have not been able to get a decent speed.

  • Are all of your clients sharing the same connection to the database? If that is the case, your requests may be serialized. I am not sure this code properly utilizes the connection pool. Commented Oct 1, 2014 at 1:57
  • @Brandon, yes, they are. I also tried this example from the node-postgres documentation, but the results were about the same. Commented Oct 1, 2014 at 11:38
  • @FernandoBasso, first of all, I don't really use any kind of SQL database, but from what I read about querystream, it only keeps a low number of rows in memory. So I think the bottleneck is querystream; you could try doing it without querystream. Commented Nov 8, 2014 at 21:41
  • Yes, you should do a benchmark without streaming, because with PHP/apache you do not stream the query result, I guess. Commented Nov 12, 2014 at 23:13
  • Just out of curiosity - you are not joining the 2 tables. Is that intentional? Commented Nov 14, 2014 at 13:47

4 Answers

6
  • pg-query-stream uses cursors.
  • it uses cursors (bold for emphasis).
  • you can read the code and change batchSize to better fit your needs.

For those who don't know what cursors are: in short, they are a trade-off for keeping the memory footprint small by not reading a whole result set into memory. But if you fetch 100 rows at a time and you have 1000 results, that's 1000 / 100 = 10 round-trips to the server; so probably 10x slower than a solution not using cursors.

If you know how many rows you need, add a LIMIT to your query, and increase the number of rows returned per batch to minimize the number of round-trips.
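As a sketch of that advice, here is how the batch size might be raised when constructing the stream. This assumes a pg-query-stream version that accepts a batchSize option (as older releases did; newer releases use a highWaterMark option instead), and the query/connection details are the asker's:

```javascript
// Sketch: raising batchSize so the cursor fetches more rows per round-trip.
// ASSUMPTION: a pg-query-stream release that supports the batchSize option.
var QueryStream = require('pg-query-stream');

var sql = 'SELECT pessoa.cod, pessoa.nome FROM pessoa';

// Fetch up to 1000 rows per round-trip instead of the small default,
// so a 1000-row result can come back in a single batch.
var query = new QueryStream(sql, [], { batchSize: 1000 });

// Used exactly as before: client.query(query) returns a readable stream.
```

With a batch size at least as large as the result set, the cursor behaves much like a plain query in terms of round-trips, while still exposing a stream interface.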


3 Comments

@giladmayani, are you saying that part of the answer is rude? Can you please detail what part? I will fix it.
I find the part where you repeat "it uses cursors" 2 extra times to be rude, sounds sarcastic.
You are mistaking emphasis for rudeness / sarcasm; but I tweaked the answer.
1

As far as I can tell from this code, you create a single connection to PostgreSQL and everything gets queued through it.

The pg module allows for this, it's described here: https://github.com/brianc/node-postgres/wiki/Queryqueue

If you want real performance, then for each HTTP request you should fetch a connection from the pool, use it, and release it - and make 101% sure you always release it (e.g. with proper exception handling), or your server will die once the pool gets completely exhausted.

Once you are there you can tweak the connection pool parameters and measure performance.
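A minimal sketch of that per-request checkout pattern, using the callback-style pg API from the question (the connection string and table names are the asker's; the plain non-streaming query is an assumption for brevity):

```javascript
// Sketch: check a client out of the pool per request, release it when done.
var pg = require('pg');
var http = require('http');
var constr = 'postgres://devel:[email protected]/tcc';

http.createServer(function (req, resp) {
    // Each request gets its own pooled client instead of sharing one.
    pg.connect(constr, function (err, client, done) {
        if (err) {
            resp.writeHead(500);
            return resp.end('connection error');
        }
        client.query('SELECT pessoa.cod, pessoa.nome FROM pessoa',
            function (err, result) {
                done(); // ALWAYS release the client back to the pool
                if (err) {
                    resp.writeHead(500);
                    return resp.end('query error');
                }
                resp.writeHead(200, { 'Content-Type': 'application/json' });
                resp.end(JSON.stringify(result.rows));
            });
    });
}).listen(8080, 'localhost');
```

Because each request holds its client only for the duration of one query, concurrent requests can run on separate connections up to the pool size instead of being serialized on one.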

Comments

0

Looks like you're waiting for the server to be created before the request gets relayed. Try moving http.createServer outside of the call. If you only want to use the http server in the request, you should try making the calls async.

3 Comments

nodejs is asynchronous iirc; also, moving the http.createServer outside the pg.connect would break it, because it uses the variable client, which comes from pg.connect
@freeforalltousez It's async by default, but code inside of a call needs to use the async package. As of right now the code reads 1) Connect to pg, 2) query, 3) create server, pipe stream.
and here i thought the code reads 1) connect to pg, 2) create server and listen to port, 3) query when client (browser) connect to server, pipe stream
0

Maybe you should set the http.globalAgent.maxSockets value; try this:

var http = require('http');
http.globalAgent.maxSockets = {{number}};

The default maxSockets is 5.

Comments
