
I have the following documents in my collection:

{
    "attr_a" : {
        "_id" : ObjectId("51877b939a9cd88730d6030e"),
        "key" : "a-key-1",
        "name" : "a name 1"
    },
    "attr_b" : {
        "_id" : ObjectId("51877b9392lfd88730d602sw"),
        "key" : "b-key-1",
        "name" : "b name 2"
    }
}


{
    "attr_a" : {
        "_id" : ObjectId("51877b939a9cd88730d6030e"),
        "key" : "a-key-2",
        "name" : "a name 2"
    },
    "attr_b" : {
        "_id" : ObjectId("51877b9392lfd88730d602sw"),
        "key" : "b-key-2",
        "name" : "b name 2"
    }
}

And I want to write a query that produces a result like this:

{
    "attr_a" : [{
        "_id" : ObjectId("51877b939a9cd88730d6030e"),
        "key" : "a-key-1",
        "name" : "a name 1"
    },
    {
        "_id" : ObjectId("51877b939a9cd88730d6030e"),
        "key" : "a-key-2",
        "name" : "a name 2"
    }],

    "attr_b" : [{
        "_id" : ObjectId("51877b9392lfd88730d602sw"),
        "key" : "b-key-1",
        "name" : "b name 2"
    },
    {
        "_id" : ObjectId("51877b9392lfd88730d602sw"),
        "key" : "b-key-2",
        "name" : "b name 2"
    }]
}

I'm using MongoDB 2.6.x and Spring Data 1.6.5, and I'm trying to figure out whether this can be done with the aggregation framework or with map-reduce (in both cases it needs to run in real time), or in some other way.

Or should I consider introducing Solr/Elasticsearch instead?

Any suggestion?

Thanks in advance, Alexio Cassani

  • What's the relation between the data output and the tool to use? In my experience, you decide to use Mongo, Elasticsearch or any other NoSQL store based on performance for the usage you'll be giving it: Mongo for precise queries, aggregation and map-reduce, and Elasticsearch... well, to search, not always precise, but fast as hell. Even if there were no way to do this, you should evaluate the option of doing it programmatically, handling the data after you get it from the database, rather than changing tools because you couldn't find a way to get a very specific data structure. Commented Nov 12, 2014 at 20:12
  • As said, I need the data in real time, but what I omitted is that performance is key for my use case, even when the result set contains thousands of entries and requests/sec are of the same order of magnitude. I know I can work on the result set programmatically, but I think that using MongoDB's native features, or if that's not possible an external component like Solr, is a more scalable choice than doing it in Java myself. It will probably also cost less in money and time. Commented Nov 13, 2014 at 12:34

1 Answer


You can do that by grouping the entire collection into a single document, using $push to assemble the arrays:

db.test.aggregate([{$group: {
    _id: null, 
    attr_a: {$push: '$attr_a'},
    attr_b: {$push: '$attr_b'}
}}])
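To see what that $group/$push stage does outside the shell, here is a plain-JavaScript sketch of the same grouping over simplified sample data (illustration only, no MongoDB driver, ObjectIds omitted):

```javascript
// Sample documents, simplified from the question.
const docs = [
  { attr_a: { key: 'a-key-1', name: 'a name 1' },
    attr_b: { key: 'b-key-1', name: 'b name 2' } },
  { attr_a: { key: 'a-key-2', name: 'a name 2' },
    attr_b: { key: 'b-key-2', name: 'b name 2' } },
];

// Collapse the whole collection into one document, pushing each
// sub-document onto the matching array -- the same shape the
// aggregation pipeline returns.
const grouped = docs.reduce(
  (acc, doc) => {
    acc.attr_a.push(doc.attr_a);
    acc.attr_b.push(doc.attr_b);
    return acc;
  },
  { _id: null, attr_a: [], attr_b: [] }
);

console.log(JSON.stringify(grouped, null, 2));
```

Note that, like the pipeline, this collapses everything into a single document, so it is only suitable when the combined result stays reasonably small.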
