I have a MongoDB collection like this:

{
 id: "213",
 sales : {
     '2014-05-23': {
        sum: 23
     },
     '2014-05-22': {
        sum: 22    
     }
 }
},

{
 id: "299",
 sales : {
     '2014-05-23': {
        sum: 44
     },
     '2014-05-22': {
        sum: 19    
     }
 }
},

I'm looking for a query to get all documents in my collection sorted by sum (document with the largest sum on top...).

For the example data it should return something like this:

{
 id: "299",
 sales : {
     '2014-05-23': {
        sum: 44
     },
     '2014-05-22': {
        sum: 19    
     }
 }
},
 {
 id: "213",
 sales : {
     '2014-05-23': {
        sum: 23
     },
     '2014-05-22': {
        sum: 22    
     }
 }
},

Because 44 is the largest sum, that document should come first.

Is that possible (and fast enough)? Otherwise I can redesign the database. Maybe someone has a suggestion for that?

  • What do you mean by "sort by sum"? Do you want to sort all documents by the largest "sum" value? Or do you just want to sort the "sales" of each document by "sum"? In either case, your schema really needs to change here. Commented Jul 18, 2014 at 12:08
  • Oh, thank you for pointing this out. I edited my question to make this clear. Commented Jul 18, 2014 at 12:09
  • Just again. "document" or "sub-document" is what was asked. So are we sorting all of the documents in your collection by the largest "sales.sum" or do you just want each "document" to have the largest "sales.sum" listed first? And do you possibly mean the "sum of all sales" per document in the latter case? Answering those makes this a clear question. Commented Jul 18, 2014 at 12:17
  • Thank you. I updated my question again. Is it understandable now? Commented Jul 18, 2014 at 12:27
  • Perfect. Very understandable. +1 Commented Jul 18, 2014 at 12:31

1 Answer

The performance of this is terrible, because you are throwing away your best option, which is the aggregation framework.

The big rule you are breaking here is "Don't use data as keys".

So when you name "sub-documents" with "keys" that are actually data points, there is no easy way to process them. General MongoDB notation does not like this, because the path to each value differs from document to document, and you are forced into JavaScript evaluation. This is still "sprightly", but really slow by comparison to native methods.

Change your schema, and then you can use the aggregation framework. Here is the change first:

{
    "_id": 213,
    "sales": [
        { "date": ISODate("2014-05-23T00:00:00Z"), "sum": 23 },
        { "date": ISODate("2014-05-22T00:00:00Z"), "sum": 22 }
    ]
},
{
    "_id": 299,
    "sales": [
        { "date": ISODate("2014-05-22T00:00:00Z"), "sum": 19 },
        { "date": ISODate("2014-05-23T00:00:00Z"), "sum": 44 }
    ]
}
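
One way to get existing documents into this shape, as a minimal sketch in plain JavaScript. The helper name `toArraySchema` is hypothetical, and it assumes every key under `sales` is a `YYYY-MM-DD` string holding a `sum` field, as in the question's example:

```javascript
// Hypothetical helper: converts one document from the keyed layout
// (dates as keys under "sales") to the array layout.
// Assumes each key is a "YYYY-MM-DD" string, as in the question's example.
function toArraySchema(doc) {
    const sales = Object.keys(doc.sales).map(function (day) {
        return {
            // The key becomes a real date value instead of a field name
            date: new Date(day + "T00:00:00Z"),
            sum: doc.sales[day].sum
        };
    });
    // The original "id" is carried over unchanged as "_id"
    return { _id: doc.id, sales: sales };
}

const oldDoc = {
    id: "213",
    sales: {
        "2014-05-23": { sum: 23 },
        "2014-05-22": { sum: 22 }
    }
};

const newDoc = toArraySchema(oldDoc);
console.log(JSON.stringify(newDoc, null, 4));
```

This keeps the question's string `id` as the new `_id`; whether to convert it to a number is a separate decision.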

Now your data is in an array and the "paths" are consistent, which means you can sort things very easily:

db.collection.find().sort({ "sales.sum": -1 })

So the document with the "largest" "sales.sum" value will be first.
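
To illustrate the ordering: for a descending sort on a field inside an array, MongoDB compares documents by the largest element of that array. The plain JavaScript sketch below imitates that comparison in memory for the example data. The names `docs` and `maxSum` are hypothetical, and this is only an illustration of the ordering, not how the server implements it:

```javascript
// In-memory imitation of sort({ "sales.sum": -1 }) on the example data.
// Dates are omitted for brevity; only "sum" matters for the ordering.
const docs = [
    { _id: 213, sales: [{ sum: 23 }, { sum: 22 }] },
    { _id: 299, sales: [{ sum: 19 }, { sum: 44 }] }
];

// Hypothetical helper: the largest "sum" in a document's sales array.
function maxSum(doc) {
    return Math.max.apply(null, doc.sales.map(function (s) { return s.sum; }));
}

// Descending by the largest element, so 299 (44) sorts before 213 (23).
const sorted = docs.slice().sort(function (a, b) {
    return maxSum(b) - maxSum(a);
});

console.log(sorted.map(function (d) { return d._id; }));
```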

If, given the above example, you also wanted to "sort" the inner array elements by the largest "sales.sum", then you can use the aggregation framework:

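db.collection.aggregate([
    // One output document per array element
    { "$unwind": "$sales" },
    // Order the elements within each document, largest sum first
    { "$sort": { "_id": 1, "sales.sum": -1 } },
    // Rebuild the array in that order
    { "$group": {
        "_id": "$_id",
        "sales": { "$push": "$sales" }
    }},
    // Order the documents themselves by their largest sum
    { "$sort": { "sales.sum": -1 } }
])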
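
As a rough illustration of what each stage does, here is an in-memory imitation of the pipeline in plain JavaScript, run on the example data with dates omitted. The variable names are hypothetical and this is a sketch of the stages' effect, not of the server's implementation:

```javascript
const docs = [
    { _id: 213, sales: [{ sum: 23 }, { sum: 22 }] },
    { _id: 299, sales: [{ sum: 19 }, { sum: 44 }] }
];

// $unwind: one row per array element
const unwound = [];
docs.forEach(function (doc) {
    doc.sales.forEach(function (sale) {
        unwound.push({ _id: doc._id, sales: sale });
    });
});

// $sort: by _id ascending, then sales.sum descending
unwound.sort(function (a, b) {
    return (a._id - b._id) || (b.sales.sum - a.sales.sum);
});

// $group: rebuild each array, pushing elements in the sorted order
const groups = new Map();
unwound.forEach(function (row) {
    if (!groups.has(row._id)) {
        groups.set(row._id, { _id: row._id, sales: [] });
    }
    groups.get(row._id).sales.push(row.sales);
});

// Final $sort: documents by their largest sum, descending.
// After the inner sort, the largest sum is the first array element.
const result = Array.from(groups.values()).sort(function (a, b) {
    return b.sales[0].sum - a.sales[0].sum;
});

console.log(JSON.stringify(result));
```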

Your current documents can be treated this way by using JavaScript evaluation with mapReduce, but don't do it that way as it will be much slower.
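
To see why the keyed layout pushes you toward per-document JavaScript, here is a plain JavaScript sketch of the work such code has to do: visit every key under `sales` to find the largest `sum`. The helper name `largestSum` is hypothetical, and this runs in memory rather than through mapReduce:

```javascript
// Documents in the question's original layout (dates as keys).
const docs = [
    { id: "213", sales: { "2014-05-23": { sum: 23 }, "2014-05-22": { sum: 22 } } },
    { id: "299", sales: { "2014-05-23": { sum: 44 }, "2014-05-22": { sum: 19 } } }
];

// Hypothetical helper: the key names are not known in advance, so every
// key has to be visited to find the largest "sum".
function largestSum(doc) {
    let max = -Infinity;
    Object.keys(doc.sales).forEach(function (day) {
        max = Math.max(max, doc.sales[day].sum);
    });
    return max;
}

// Sort a copy of the documents by that computed value, largest first.
const sorted = docs.slice().sort(function (a, b) {
    return largestSum(b) - largestSum(a);
});

console.log(sorted.map(function (d) { return d.id; }));
```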

1 Comment

Incredible, thank you for that detailed answer. I will change the schema; that's a good suggestion. There is only one place where we need the data from the keys, and I can rewrite that.
