
I have documents with the following structure:

{
...,
trials:[ {...,
          ref:[{a:1,b:2},{a:2,b:2},...]
         },
         {...,
          ref:[{a:1,b:2}]
         },
         ...,
       ]
}

Here ref is an array guaranteed to have at least one element.

If I want to count the occurrences of each element across all of the ref arrays, I use the following aggregation, which works fine:

db.cl.aggregate([
   {$unwind:"$trials"},
   {$unwind:"$trials.ref"},
   {$group:{_id:"$trials.ref", count:{$sum:1}}}
])

Now I want to do the same thing, but only with the last element in each ref array. I need a way to only select the last element of each array in the aggregation pipeline.

I first thought I could add an intermediate step that collects just the elements I want to group by, doing something like this:

db.cl.aggregate([
   {$unwind:"$trials"},
   {$group:{_id:null,arr:{$push:"$trials.ref.-1"}}},...
])

I've also tried using a position operator with $match.

db.cl.aggregate([
    {$unwind:"$trials"},
    {$match:{"trials.ref.$":-1}},...
])

I also tried projecting the last element:

db.cl.aggregate([
    {$unwind:"$trials"},
    {$project:{ref:"$trials.ref.1"}}
])

None of these gets me anywhere. The $pop operator is not valid in the aggregation pipeline, and the $last operator isn't really useful here.

Any ideas on how to use only the last element of each ref array? I'd rather stay with the aggregation framework and NOT use Map Reduce.

  • Much tougher question than you might think. See here Commented Jun 17, 2014 at 14:24
  • I'm fully aware that this is an interesting question. I've only shown some of my attempts here. I've spent quite some time thinking about this one. I could just query all the ref arrays and do the computation locally, but I'd rather not. Commented Jun 17, 2014 at 14:28
  • I referenced the other question because the problem you are facing is a "top-n" one. You cannot $slice in the aggregation framework. And this is a problem given what you want to do. You can emulate it with layers of $first and $match as shown. But there is no easy way to get the "n" results of an array per grouping. This is basically what you are asking. Commented Jun 17, 2014 at 14:35

1 Answer


The aggregation framework currently has no way of dealing with this. Aside from lacking any "slice" type operator, the real problem is that nothing marks where each inner array ends, and no other form of document re-shaping can supply that marker.

For now at least, the mapReduce approach is very simple, and does not even require a reducer:

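db.cl.mapReduce(
    function() {
        // Keep only the last element of each trial's ref array
        this.trials.forEach(function(trial) {
            trial.ref = trial.ref.slice(-1);
        });

        // Emit the trimmed document keyed by its original _id
        var id = this._id;
        delete this._id;

        emit( id, this );
    },
    // Each _id is emitted exactly once, so the reducer is never invoked
    function(){},
    { "out": { "inline": 1 } }
)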
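To see what the map function does to a single document without a running server, the same per-trial step can be sketched in plain JavaScript. This is only an illustration: keepLastRef and sampleDoc are hypothetical names, and the sample data is modelled on the structure in the question.

```javascript
// Plain-JavaScript sketch of the per-document work done by the map function.
// `keepLastRef` and `sampleDoc` are hypothetical names used for illustration only.
function keepLastRef(doc) {
    // Copy each trial, replacing ref with a one-element array holding its last entry.
    var trials = doc.trials.map(function (trial) {
        var copy = Object.assign({}, trial);
        copy.ref = trial.ref.slice(-1);
        return copy;
    });
    return Object.assign({}, doc, { trials: trials });
}

var sampleDoc = {
    _id: 1,
    trials: [
        { ref: [{ a: 1, b: 2 }, { a: 2, b: 2 }] },
        { ref: [{ a: 1, b: 2 }] }
    ]
};

var trimmed = keepLastRef(sampleDoc);
console.log(JSON.stringify(trimmed.trials));
// prints [{"ref":[{"a":2,"b":2}]},{"ref":[{"a":1,"b":2}]}]
```

Unlike the map function, this sketch copies the document rather than modifying it in place, so the input stays usable afterwards.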

In the future there might be some hope. Some form of $slice has been sought after for some time. I did notice this interesting snippet inside the $map operator code, which is worth listing here:

    output.reserve(input.size());
    for (size_t i=0; i < input.size(); i++) {
        vars->setValue(_varId, input[i]);

        Value toInsert = _each->evaluateInternal(vars);
        if (toInsert.missing())
            toInsert = Value(BSONNULL); // can't insert missing values into array

        output.push_back(toInsert);
    }

Note the for loop and the index value. I for one would vote to have this exposed as a variable within the $map operator, because once you know the current position and the length of the array you can effectively do "slicing".
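As a rough analogy in plain JavaScript (not the aggregation $map operator), Array.prototype.map already passes the index and the whole array to its callback, which is enough to pick out the end of the array. The helper name lastViaIndex is hypothetical, and the sketch assumes the array holds no null elements.

```javascript
// Plain JavaScript, not the aggregation $map operator: Array.prototype.map passes
// the element, its index, and the whole array to the callback.
// `lastViaIndex` is a hypothetical helper name.
function lastViaIndex(arr) {
    return arr
        .map(function (item, i, all) {
            // Knowing both the position and the length identifies the end of the array.
            return i === all.length - 1 ? item : null;
        })
        // Drop the placeholders; assumes the input contains no null elements.
        .filter(function (item) { return item !== null; });
}

console.log(JSON.stringify(lastViaIndex([{ a: 1, b: 2 }, { a: 2, b: 2 }])));
// prints [{"a":2,"b":2}]
```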

But for now, there is no way to tell where you are in the array using $map, and if you $unwind both of your arrays, you lose the end-points of the inner arrays. So the aggregation framework has no solution for this right now.


1 Comment

I ended up doing this on the application side. I just retrieve the full array from each trial.
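A minimal sketch of what such application-side counting might look like, assuming the documents have already been fetched into an array. The names countLastRefs and docs are hypothetical, and keying the counts by JSON.stringify assumes the ref elements always list their fields in the same order.

```javascript
// `countLastRefs` and `docs` are hypothetical; `docs` stands in for the
// documents returned by a plain find() on the collection.
function countLastRefs(docs) {
    var counts = {};
    docs.forEach(function (doc) {
        doc.trials.forEach(function (trial) {
            // ref is guaranteed to have at least one element.
            var last = trial.ref[trial.ref.length - 1];
            // Key by the serialized value; assumes consistent field order in ref elements.
            var key = JSON.stringify(last);
            counts[key] = (counts[key] || 0) + 1;
        });
    });
    return counts;
}

var docs = [
    { trials: [{ ref: [{ a: 1, b: 2 }, { a: 2, b: 2 }] }, { ref: [{ a: 1, b: 2 }] }] },
    { trials: [{ ref: [{ a: 2, b: 2 }] }] }
];
var counts = countLastRefs(docs);
console.log(counts);
```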
