I have a list of books stored as JSON documents, as follows:

{
    "_id" : ObjectId("1"),
    "author" : [
        "Mary",
        "Tony"
    ],
    "booktitle" : "Book1",
    "Category" : "Children"
}
{
    "_id" : ObjectId("2"),
    "author" : [
        "Joe",
        "Tony"
    ],
    "booktitle" : "Book2",
    "Category" : "Children"
}
{
    "_id" : ObjectId("3"),
    "author" : [
        "Joe"
    ],
    "booktitle" : "Book3",
    "Category" : "comedy"
}
.......

I want to get the top 10 authors who have written books in the "Children" category. Because some books have more than one author, I don't know how to do it.

Given the example above, Book1 and Book2 belong to "Children"; Tony wrote 2 of them, and Mary and Joe wrote 1 each. Thus, the top 3 writers are, in order, Tony, Mary, Joe.

So far I have written:

db.table.find({Category: "Children"}).aggregate({$group:{_id: '', count : {$sum : 1}}}, {$sort: {count : -1}})

but I don't know how to write the $group part. Thanks.

1 Answer

tl;dr

Actually, it is pretty easy.

db.books.aggregate([
    { $match:{ Category:"Children" }},
    { $project:{ _id:0, author:1 }},
    { $unwind:"$author" },
    { $group:{ _id:"$author", count:{ $sum: 1} }},
    { $sort:{ count:-1 }},
    { $limit: 10 } 
])

In detail

Sample data

{ "_id" : ObjectId("570ec0a29307105e18d42bae"), "author" : [ "Mary", "Tony" ], "booktitle" : "Book1", "Category" : "Children" }
{ "_id" : ObjectId("570ec0cf9307105e18d42baf"), "author" : [ "Joe", "Tony" ], "booktitle" : "Book2", "Category" : "Children" }
{ "_id" : ObjectId("570ec0f59307105e18d42bb0"), "author" : [ "Joe" ], "booktitle" : "Book3", "Category" : "comedy" }

The aggregation explained

  1. { $match:{ Category:"Children" }} First, we find all books in the category in question.
  2. { $project:{ _id:0, author:1 }} We only want the author array, so we drop every other field, including _id.
  3. { $unwind:"$author" } Next, we unwind the author array. That means that for each array element we create a new document, which retains the other fields of the original document. In our case there are none left after the $project stage, so at this stage the result of our aggregation looks like this:

    { "author" : "Mary" }
    { "author" : "Tony" }
    { "author" : "Joe" }
    { "author" : "Tony" }
    
  4. { $group:{ _id:"$author", count:{ $sum: 1} }} Next, we group by author and add 1 to a counter for every document that carries that author's name.
  5. { $sort:{ count:-1 }} We sort by the field we just created, count, in descending order.
  6. { $limit: 10 } Finally, we keep only the ten authors with the highest count.
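
To make the stages more concrete, the same steps can be imitated in plain JavaScript on an in-memory array. This is only a sketch for illustration, not how MongoDB executes the pipeline; the `books` array and the `topAuthors` helper are hypothetical names introduced here.

```javascript
// In-memory imitation of the pipeline stages; illustrative only.
const books = [
  { author: ["Mary", "Tony"], booktitle: "Book1", Category: "Children" },
  { author: ["Joe", "Tony"], booktitle: "Book2", Category: "Children" },
  { author: ["Joe"], booktitle: "Book3", Category: "comedy" },
];

// Hypothetical helper: count authors per category and keep the top `limit`.
function topAuthors(docs, category, limit) {
  const counts = new Map();
  docs
    .filter((doc) => doc.Category === category) // like $match
    .flatMap((doc) => doc.author)               // like $project + $unwind
    .forEach((name) => {
      // like $group with count: { $sum: 1 }
      counts.set(name, (counts.get(name) || 0) + 1);
    });
  return [...counts.entries()]
    .map(([name, count]) => ({ _id: name, count }))
    .sort((a, b) => b.count - a.count)          // like $sort: { count: -1 }
    .slice(0, limit);                           // like $limit
}

const result = topAuthors(books, "Children", 10);
console.log(result);
```

Tony comes first with a count of 2, followed by the two authors who each have a count of 1.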

For the sample data, the aggregation gives the following.

Result

{ "_id" : "Tony", "count" : 2 }
{ "_id" : "Joe", "count" : 1 }
{ "_id" : "Mary", "count" : 1 }
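
Joe and Mary both have a count of 1, and sorting on count alone does not fix their relative order. If a repeatable order matters, one option (an assumption about what is wanted, not part of the original question) is to add _id as a secondary sort key, so that ties are ordered by author name. A sketch, with the pipeline held in a hypothetical `pipeline` variable:

```javascript
// Same stages as before; only the $sort stage gains a tie-breaker on _id.
const pipeline = [
  { $match: { Category: "Children" } },
  { $project: { _id: 0, author: 1 } },
  { $unwind: "$author" },
  { $group: { _id: "$author", count: { $sum: 1 } } },
  { $sort: { count: -1, _id: 1 } }, // count descending, then author name ascending
  { $limit: 10 },
];

// In the mongo shell this would be run as: db.books.aggregate(pipeline)
```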