1

I would like to store the ids of the connected sockets (socket.io), along with some other information about them, in node.js. The reason is that I need to list the clients and their information, or find one by its id. I can use only the socket object that is created when a client connects.

My idea is that when a client connects, I put its id and the additional information into the clients variable.

var clients = /* this is the question */;

io.on('connection', function(socket) {
    // I can't use `io` just the `socket`
})

I have two ideas for this problem, but I don't know which structure is better, or which will be faster and use less memory when a lot of clients are connected.

Object:

The unique id is the key, and the data is stored in its value.

{
    '01234': {
        // ...
    },
    '56789': {
        // ...
    }
}

Array of objects:

The objects are stored in an array, and each object holds its unique id along with the data.

[
    {
        id: '01234',
        // ...
    },
    {
        id: '56789',
        // ...
    }
]

Which is faster, or better for performance and memory? Or is there another solution for this?

3
  • Both have their tradeoffs. Using an object will be faster due to key indexing but takes more memory in most cases. An array could take less memory (depending on whether the compiler/interpreter can optimize it) but requires searching the list every time you want to find a particular socket. So you have to pick your tradeoff and, if performance/memory is a big deal, run some profiling to see which one is better for you. Commented Dec 22, 2015 at 22:18
  • Directly accessed elements are generally faster to get to than having to search through a list to find an object. Commented Dec 22, 2015 at 22:18
  • If you go with an object (I did in our large-scale socket implementation), I would suggest storing the id field in the object as well as using it as the key. It takes a bit more space, but it will help make the code more manageable in the long run. Commented Dec 22, 2015 at 23:42

2 Answers

1

Both approaches will be almost exactly the same in terms of memory. Storing the data in an object or in an array of objects will not noticeably affect memory consumption.

Performance-wise, though, if you often access objects by their id, it's a good idea to store the id as a key. You won't have to loop through each element of the collection to find an object by its id.
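As a rough illustration of keyed storage, here is a minimal sketch that uses a built-in Map instead of a plain object. The function names (addClient, removeClient, findClient, listClients) and the info fields are made up for this example, and the socket objects are plain stand-ins that carry only an id.

```javascript
// Hypothetical registry: socket id -> extra information about that client.
const clients = new Map();

function addClient(socket, info) {
  // Keep the id inside the record too, so a listing yields self-describing rows.
  clients.set(socket.id, Object.assign({ id: socket.id }, info));
}

function removeClient(socket) {
  clients.delete(socket.id);
}

function findClient(id) {
  return clients.get(id); // undefined when the id is unknown
}

function listClients() {
  return Array.from(clients.values());
}

// Stand-ins for real socket objects; only the id property is used here.
addClient({ id: '01234' }, { name: 'alice' });
addClient({ id: '56789' }, { name: 'bob' });
removeClient({ id: '01234' });
```

In a real server, addClient would presumably be called inside the 'connection' handler and removeClient inside the socket's 'disconnect' handler, both of which only need the socket object.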

As @Josh said, though, you are creating a non-standard collection structure which might be difficult to work with.

If this is a concern to you, you could create an external index.

sockets : [ {socket1}, {socket2}, {socket3} ]
indexes : { socket1 : 0, socket2 : 1, socket3 : 2 }

This way, to access a socket by its id, you look up its position in the array through the indexes object. You will, however, have to keep the sockets array and the indexes object in sync.

Adding sockets is easy. You add the socket to the array and its position to the index.

socket.on('add', function(socket){
    // push() returns the new length, so the new element sits at len - 1
    var len = sockets.push(socket);
    indexes[socket.id] = len - 1;
});

Deleting is trickier. When you splice an element out of an array, the positions of all items after it are decremented. You would then have to decrement all the corresponding entries in your index as well, which costs performance.
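To see the drift concretely, here is a small sketch with made-up ids, in which the index is deliberately not updated after the splice:

```javascript
const sockets = [{ id: 'a' }, { id: 'b' }, { id: 'c' }];
const indexes = { a: 0, b: 1, c: 2 };

// Remove 'a' from the array and from the index, but leave the other entries alone.
sockets.splice(indexes.a, 1);
delete indexes.a;

// 'b' and 'c' each moved one slot to the left, yet the index still holds
// their old positions.
const stale = sockets[indexes.b];   // the 'c' entry, not 'b'
const missing = sockets[indexes.c]; // undefined, past the end of the array
```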

A better approach would be to not splice the array, but to set the sockets to "undefined" when deleting them. This way, even when deleting a socket, you don't have to update your index.

socket.on('delete', function(socket){
    // leave a hole instead of splicing, so no other position changes
    sockets[indexes[socket.id]] = undefined;
    delete indexes[socket.id];
});

If your application is long-running, you will have to rebuild the index periodically (every 3000 requests or so, say), because the undefined entries will start to bloat the sockets array.

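function rebuildIndex(){
    // Drop the holes first, so every position recorded below is final.
    // (Splicing while iterating would shift entries that were already indexed.)
    sockets = _.reject(sockets, _.isUndefined);
    indexes = {};
    _.forEach(sockets, function(socket, index){
        indexes[socket.id] = index;
    });
}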
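If the order of the sockets array does not matter, a different technique avoids both the holes and the periodic rebuild: move the last element into the freed slot and pop the array ("swap and pop"). This is not what the code above does; it is a minimal sketch of an alternative, with addSocket and removeSocket as hypothetical helper names.

```javascript
const sockets = []; // dense array of socket records
const indexes = {}; // socket id -> position in `sockets`

function addSocket(socket) {
  indexes[socket.id] = sockets.push(socket) - 1;
}

function removeSocket(id) {
  const pos = indexes[id];
  if (pos === undefined) return false; // unknown id, nothing to do
  const last = sockets.pop();
  if (last.id !== id) {
    // Move the former last element into the freed slot and fix its index.
    sockets[pos] = last;
    indexes[last.id] = pos;
  }
  delete indexes[id];
  return true;
}

addSocket({ id: 'a' });
addSocket({ id: 'b' });
addSocket({ id: 'c' });
removeSocket('a'); // 'c' takes over position 0
```

Each removal touches at most one other index entry, at the cost of no longer keeping the array in insertion order.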

Also, you could use a library I wrote (affinity), which is a relational algebra library. It allows the creation of indexes on a collection of objects (much like in a database), so you can still have a "normal" collection while having index-based access to it.

Check here for a working example

var sockets = new affinity.Relation([
    {id : { type : affinity.Integer}}, 
    {socket : {type : affinity.Object}}
],[],{
    pk : 'id'
});

sockets.add(socket1);
sockets.add(socket2);

// then to have only the sockets array (to interact with db maybe)

var socketObjs = sockets.project(['socket']).elements()

This is the simple way of defining your sockets in a relation. However, you are using twice the memory for the id field (as it is duplicated in the socket and in the id column). If you wanted, you could also create a column for each of the socket's properties, much like a database table, to prevent the duplication of the id field:

var sockets = new affinity.Relation([
    {id : { type : affinity.Integer}}, 
    {userId : {type : affinity.Integer}},
    {openedDate : {type : affinity.Date}},
    {token : {type : affinity.String}}
    // ...
],[],{
    pk : 'id'
});

// Each socket is a row in the relation. Access the properties like :

sockets.restrict(sockets.get('id').eq('29823')).first()

// ...

1

There are advantages and disadvantages of both.

Using an object:

{
    '01234': {
        // ...
    },
    '56789': {
        // ...
    }
}

You can do very simple lookups just by calling sockets[socketId] or whatever.

If you have a collection ([{},{},{}]), you have to iterate over the collection every time you want to find the object:

var socketIWant = sockets.filter(socket => socket.id === '0123')[0];
// or whatever

However, the "collection" pattern is pretty common, and it may be worth it to keep your data in that structure to make it more intuitive for later development.

Also if you ever want to use a database to store your socket information, you can usually directly iterate over a collection serverside and store it one-to-one into, say, a NoSQL database.

If you go for the "object of objects" approach, you might have to do a bit of data manipulation when you query or save objects to your database:

var sockets = { '0123': {} };

Object.keys(sockets).forEach(function(key) {
  MyDB.save(_.assign(sockets[key], { _id: key }));
});

Or something like the above. Some possibly extraneous "data munging". The collection approach would be a bit simpler to iterate over and save/query from the database (if one were to exist).
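The conversion between the two shapes can be kept in a pair of small helpers. The sketch below assumes the records are plain objects and uses made-up names (toCollection, toKeyed, userId); it copies each record rather than mutating the stored one.

```javascript
const sockets = {
  '0123': { userId: 7 },
  '4567': { userId: 9 }
};

// Build a fresh record per entry; the originals are left unmodified.
function toCollection(byId) {
  return Object.keys(byId).map(function (key) {
    return Object.assign({ _id: key }, byId[key]);
  });
}

// The reverse direction, e.g. for records loaded back from a store.
function toKeyed(collection) {
  const byId = {};
  collection.forEach(function (record) {
    byId[record._id] = record;
  });
  return byId;
}

const rows = toCollection(sockets);
```

With helpers like these, the in-memory structure can stay keyed by id for fast lookups while anything that expects a collection receives an array.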
