62

I've got a list of articles, and each of them has an array property which lists various individuals mentioned in them:

_id: {
    $oid: "52b632a9e4f2ba13c82ccd23"
},
providerName: "The Guardian",
url: "http://feeds.theguardian.com/c/34708/f/663860/s/3516cebc/sc/38/l/0L0Stheguardian0N0Cmusic0C20A130Cdec0C220Cwaterboys0Efishermans0Eblues0Etour0Ehammersmith/story01.htm",
subject: "The Waterboys – review",
class_artist: [
    "paul mccartney"
]

I've been trying (unsuccessfully) to get a list of all the individual artists (class_artist), based on the number of articles they've been tagged in within the past 7 days.

I've gotten as far as:

var date = new Date();
date.setDate(date.getDate() - 7);

db.articles.group({
    key: { class_artist: 1 },
    cond: { class_date: { $gt: date } },
    reduce: function ( curr, result ) { result.cnt++; },
    initial: { cnt : 0 }
}).sort({cnt: -1});

But unfortunately, it doesn't count them based on the individual array values, but by array compositions (that is, lists of artists).

I tried using the $unwind function, but have not been able to make it work.

1 Answer 1

167

What framework are you using? This is not MongoDB shell and looks like some weird wrapper around MapReduce. In that case $unwind would not be available, and you need it for user in the aggregation framework. Here's what you want in the mongo shell:

db.articles.aggregate([
  {$match: { class_date: { $gte: date } } },
  {$project: { _id: 0, class_artist: 1 } },
  {$unwind: "$class_artist" },
  {$group: { _id: "$class_artist", tags: { $sum: 1 } }},
  {$project: { _id: 0,class_artist: "$_id", tags: 1 } },
  {$sort: { tags: -1 } }
])

So efficiently:

  1. Filter by date because you already set a var for the last 7 days
  2. Project only the field(s) we need { We need only one! }
  3. Unwind the array so we now have a record for every array element in every document
  4. Group on the Artist from the expanded documents
  5. Project into a document format you can use as group messed around with _id
  6. Sort the results in reverse order to see the top tagged first

And the great thing about aggregation is you can gradually build up those stages to see what is going on.

Shake and bake into your own driver implmentation or ODM framework as required.

Sign up to request clarification or add additional context in comments.

1 Comment

FYI, the "weird wrapper" format is the group() command which is implemented in JavaScript and predates the Aggregation Framework. See also: MongoDB aggregation comparison: group(), $group and MapReduce.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.