I have a database of students and their contact details. I'm trying to find out the postcode that houses the most students. The documents for the students look something like this...

{studentcode: 'smi0001', firstname: 'bob', surname: 'smith', postcode: 2001}

I thought I could use the aggregation framework to find out the postcode with the most students by doing something like...

db.students.aggregate({$project: { postcode: 1 }, $group: {_id: '$postcode', students: {$sum: 1}}})

This works as expected (it returns postcodes as _id and the number of students in each postcode as 'students'), but if I add $sort to the pipeline, it seems to try sorting the whole student collection instead of the results of the $group operation.

What I'm trying looks like...

db.students.aggregate({$project: { postcode: 1 }, $group: {_id: '$postcode', students: {$sum: 1}}, $sort: {_id: -1}})

but it returns the whole collection and disregards the $project and $group... Am I missing something? I thought I'd just be able to sort by descending number of students and return the first item. Thanks in advance for any help.

2 Answers

You almost had it...

db.test.aggregate(
  {$group: {_id: '$postcode', students: {$sum: 1}}}, 
  {$sort: {_id: -1}}
);

gives (I added some test data matching your sample):

{
  "result" : [
    {
        "_id" : 2003,
        "students" : 3
    },
    {
        "_id" : 2002,
        "students" : 1
    },
    {
        "_id" : 2001,
        "students" : 2
    }
  ],
  "ok" : 1
}

You had an outer {} around everything, so $project, $group and $sort were all keys of a single document rather than separate operations in the pipeline. Each operation needs to be its own document.
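To make the difference concrete, here is a minimal plain-JavaScript sketch that runs without a database. It only inspects the shape of the arguments; `stageNames` is a hypothetical helper written for this illustration, not part of MongoDB.

```javascript
// What the question passed: ONE document whose keys happen to be operator names.
const singleDocument = {
  $project: { postcode: 1 },
  $group: { _id: '$postcode', students: { $sum: 1 } },
  $sort: { _id: -1 }
};

// What the pipeline expects: one document per operation, in order.
const stages = [
  { $group: { _id: '$postcode', students: { $sum: 1 } } },
  { $sort: { _id: -1 } }
];

// Hypothetical helper (not a MongoDB API): lists the operator keys in each stage.
function stageNames(pipeline) {
  return pipeline.map(stage => Object.keys(stage));
}

console.log(stageNames([singleDocument])); // one stage carrying three operator keys
console.log(stageNames(stages));           // two stages, one operator each
```

The first call shows a single stage holding all three operators, which is the shape that caused the trouble; the second shows the operations separated out.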

You didn't really need the project for this case.

Update: You probably want to sort by "students", like so, to get the postcodes with the most students first:

db.test.aggregate(
  {$group: {_id: '$postcode', students: {$sum: 1}}}, 
  {$sort: {students: -1}}
);
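
Since the goal was to return only the top postcode, a trailing {$limit: 1} stage after the $sort should do that in the shell. As a rough illustration of what the group, sort and "take the first" steps compute, here is a plain-JavaScript sketch that needs no database. The sample documents and student codes other than 'smi0001' are made up, with counts chosen to mirror the output shown above, and the function names are hypothetical.

```javascript
// Hypothetical sample data: three students in 2003, one in 2002, two in 2001.
const students = [
  { studentcode: 'smi0001', postcode: 2001 },
  { studentcode: 'jon0001', postcode: 2003 },
  { studentcode: 'lee0001', postcode: 2003 },
  { studentcode: 'kim0001', postcode: 2002 },
  { studentcode: 'ray0001', postcode: 2001 },
  { studentcode: 'fox0001', postcode: 2003 }
];

// Rough equivalent of {$group: {_id: '$postcode', students: {$sum: 1}}}
function countByPostcode(docs) {
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc.postcode, (counts.get(doc.postcode) || 0) + 1);
  }
  return Array.from(counts, ([postcode, n]) => ({ _id: postcode, students: n }));
}

// Rough equivalent of {$sort: {students: -1}}; copies the array before sorting.
function sortByStudentsDesc(groups) {
  return [...groups].sort((a, b) => b.students - a.students);
}

const ranked = sortByStudentsDesc(countByPostcode(students));
const top = ranked[0]; // rough equivalent of a trailing {$limit: 1}
console.log(top);
```

This is only a model of the logic, not how the server executes the pipeline.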

3 Comments

Thanks heaps for the advice. I can't believe it was just a misplaced bracket problem. These are the sort of problems I used to have learning SQL 15 years ago. Moving to MongoDB has meant leaving behind so much prior knowledge, but I reckon it'll be worth it. Cheers,
It works for your use case, but this approach doesn't always guarantee the results you expect to see. For example results will be incorrect when you need to group by field1 but keep it sorted by field2.
I have postcodes POST1, POST2, POST3, and each postcode has a different number of students. What should my sort stage be to get the count for each postcode? db.test.aggregate( {$group: {_id: {'postcodes':'$postcodes'}, students: {$sum: 1}}}, /* what should the sort stage be here? */ );

I think your syntax is slightly wrong. Each aggregation operation in the pipeline should be its own document.

db.students.aggregate( {$project: ...}, {$group: ...}, {$sort: ...} )

In your case, it should be:

db.students.aggregate(
    {$project: { postcode: 1 }}, 
    {$group: {_id: '$postcode', students: {$sum: 1}}}, 
    {$sort: {students: -1}}
)

I've tested it on a sample collection based on your schema and it works for me, sorting the grouped post codes by number of students, descending.

4 Comments

Does $project do anything for you in this case?
@WesFreeman You're right, the $project could be omitted. I guess if you had really large documents, trimming them down to only the necessary information for further processing in the pipeline might be an advantage, but in this case, not much is gained.
Yeah, my 'student' documents actually have a heap more fields, so I'm using $project to cut out the unneeded fields.
Thanks for the advice. My new command is > db.students.aggregate({$project: { postcode: 1 }}, {$group: {_id: '$postcode', students: {$sum: 1}}}, {$sort: {students: -1}}) and it works perfectly.
