I’m trying to decide which of the following schemas is more efficient to implement with MongoDB. I need to keep track of friend ids and mutual friend counts for each user in a system (user_id is unique across the collection). A user may have up to 100,000 friends.
Schema 1
{
  "_id" : "…",
  "user_id" : "1",
  "friends" : {
    "2" : {
      "id" : "2",
      "mutuals" : 3
    },
    "3" : {
      "id" : "3",
      "mutuals" : 1
    },
    "4" : {
      "id" : "4",
      "mutuals" : 5
    }
  }
}
Schema 2
{
  "_id" : "…",
  "user_id" : "1",
  "friends" : [
    {
      "id" : "2",
      "mutuals" : 3
    },
    {
      "id" : "3",
      "mutuals" : 1
    },
    {
      "id" : "4",
      "mutuals" : 5
    }
  ]
}
Requirements:
- Given a user_id and a friend id, update the document so that if the friend exists, mutuals is incremented by 1; otherwise add the friend with mutuals set to 1
- Given a user_id and a friend id, update the document so that if the friend exists and its mutual count is greater than 1, mutuals is decremented by 1; otherwise remove the friend from the document
- Given a list of ids, determine which of them exist as friend ids in the document (I know this can be done client side, but I am interested in a server-side solution)
- What indexes should be used to speed up the above?
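For the first two requirements, one possible approach under schema 1 can be sketched in JavaScript as follows. It only builds the filter and update documents as plain objects. The helper names are hypothetical, and the sketch assumes mutuals is always stored as a number. It relies on $inc creating a missing field with the increment amount, so a single update covers both the "friend exists" and "new friend" cases.

```javascript
// Sketch only: helper names are hypothetical, and `mutuals` is assumed numeric.
// Each helper returns plain { filter, update } objects for schema 1.

// Requirement 1: increment mutuals if the friend exists, otherwise create
// the friend with mutuals = 1. $inc sets a missing field to the increment
// amount, so one update handles both cases.
function buildAddMutual(userId, friendId) {
  return {
    filter: { user_id: userId },
    update: {
      $set: { [`friends.${friendId}.id`]: friendId },
      $inc: { [`friends.${friendId}.mutuals`]: 1 },
    },
  };
}

// Requirement 2, step A: decrement only when the current count is above 1.
function buildDecrementMutual(userId, friendId) {
  return {
    filter: { user_id: userId, [`friends.${friendId}.mutuals`]: { $gt: 1 } },
    update: { $inc: { [`friends.${friendId}.mutuals`]: -1 } },
  };
}

// Requirement 2, step B: remove the friend. Intended to run only when
// step A modified nothing; the $lte guard avoids removing a friend whose
// count was raised by a concurrent update in between.
function buildRemoveFriend(userId, friendId) {
  return {
    filter: { user_id: userId, [`friends.${friendId}.mutuals`]: { $lte: 1 } },
    update: { $unset: { [`friends.${friendId}`]: "" } },
  };
}
```

Each result could be passed to the shell or a driver, e.g. db.users.updateOne(spec.filter, spec.update), where the users collection name is an assumption. For the second requirement, the removal would only be attempted when the decrement reports a modifiedCount of 0, which costs two round trips but keeps each step atomic on its own.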
In my work in progress I have implemented much of this with schema 1, but I am now starting to realise it may not be as suitable as schema 2. However, I am having trouble finding the most efficient way to meet the requirements above.
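For the lookup requirement, a server-side version under schema 2 might look roughly like the sketch below. It builds an aggregation pipeline that intersects the stored friend ids with a candidate list using $setIntersection. The helper name and the output field name existing are hypothetical, and the result order is not guaranteed because the operator treats both arrays as sets.

```javascript
// Sketch only: the helper name and the `existing` output field are hypothetical.
// Builds an aggregation pipeline for schema 2 that returns which of the
// candidate ids are present among the user's friends.
function buildExistingFriendsPipeline(userId, candidateIds) {
  return [
    // Select the single user document.
    { $match: { user_id: userId } },
    // "$friends.id" resolves to the array of ids inside the friends array;
    // $setIntersection keeps only ids that also appear in candidateIds.
    {
      $project: {
        _id: 0,
        existing: { $setIntersection: ["$friends.id", candidateIds] },
      },
    },
  ];
}
```

The pipeline could be run with db.users.aggregate(pipeline), where the users collection name is an assumption. A unique index on user_id, e.g. db.users.createIndex({ user_id: 1 }, { unique: true }), would let the $match stage locate the document without a collection scan.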