4

Is there a way to convert a nested document structure into an array? Below is an example:

Input

"experience" : {
        "0" : {
            "duration" : "3 months",
            "end" : "August 2012",
            "organization" : {
                "0" : {
                    "name" : "Bank of China",
                    "profile_url" : "http://www.linkedin.com/company/13801"
                }
            },
            "start" : "June 2012",
            "title" : "Intern Analyst"
        }
    },

Expected Output:

"experience" : [
           {
            "duration" : "3 months",
            "end" : "August 2012",
            "organization" : {
                "0" : {
                    "name" : "Bank of China",
                    "profile_url" : "http://www.linkedin.com/company/13801"
                }
            },
            "start" : "June 2012",
            "title" : "Intern Analyst"
        }
    ],

Currently I am using a script that iterates over each element, converts it to an array, and finally updates the document. This takes a lot of time. Is there a better way of doing this?

1
  • Please mention your MongoDB version. Also, do you want to transform your documents permanently? Commented Mar 22, 2016 at 20:53

4 Answers

6

You still need to iterate over the content, but you should write the changes back using bulk operations.

For MongoDB 2.6 and greater:

var bulk = db.collection.initializeUnorderedBulkOp(),
    count = 0;

db.collection.find({ 
   "$where": "return !Array.isArray(this.experience)"
}).forEach(function(doc) {
    bulk.find({ "_id": doc._id }).updateOne({
        "$set": { "experience": [doc.experience["0"]] }
    });
    count++;

    // Write once in 1000 entries
    if ( count % 1000 == 0 ) {
        bulk.execute();    
        bulk = db.collection.initializeUnorderedBulkOp();
    }
})

// Write the remaining
if ( count % 1000 != 0 )
    bulk.execute();

Or, in MongoDB 3.2 and greater, the bulkWrite() method is preferred:

var ops = [];

db.collection.find({ 
   "$where": "return !Array.isArray(this.experience)"
}).forEach(function(doc) {
   ops.push({
       "updateOne": {
           "filter": { "_id": doc._id },
           "update": { "$set": { "experience": [doc.experience["0"]] } }
       }
   });

   if ( ops.length == 1000 ) {
       db.collection.bulkWrite(ops,{ "ordered": false })
       ops = [];
   }
})

if ( ops.length > 0 )
    db.collection.bulkWrite(ops,{ "ordered": false });

So when writing back to the database while iterating a cursor, bulk write operations with "unordered" set are the way to go. Each batch of 1000 requests costs only one write and one response, which removes a lot of overhead, and "unordered" means the writes can be applied in parallel rather than serially. Both make the conversion faster.
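Both snippets above copy only the "0" key into the new array. If some documents hold several numbered keys ("0", "1", ...), a small helper could collect all of them and build the bulkWrite() operations in batches. This is only a sketch in plain JavaScript: objectToArray, buildOps and chunk are hypothetical names, not part of the MongoDB API, and it assumes every key under experience is a non-negative integer string.

```javascript
// Hypothetical helper: turn { "0": a, "1": b } into [a, b], ordered numerically.
function objectToArray(obj) {
  return Object.keys(obj)
    .sort(function (a, b) { return Number(a) - Number(b); })
    .map(function (key) { return obj[key]; });
}

// Build one updateOne operation per document that still needs converting.
function buildOps(docs) {
  return docs
    .filter(function (doc) {
      return typeof doc.experience === "object" &&
             doc.experience !== null &&
             !Array.isArray(doc.experience);
    })
    .map(function (doc) {
      return {
        updateOne: {
          filter: { _id: doc._id },
          update: { $set: { experience: objectToArray(doc.experience) } }
        }
      };
    });
}

// Split the operations into batches (1000 per batch in the answer above).
function chunk(items, size) {
  var batches = [];
  for (var i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Possible shell usage (toArray() loads everything into memory, so this
// form only suits collections that fit comfortably in RAM):
// chunk(buildOps(db.collection.find().toArray()), 1000).forEach(function (batch) {
//   db.collection.bulkWrite(batch, { ordered: false });
// });
```

For large collections the same helpers could be called inside the forEach loop shown above instead of materializing the cursor first.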


1

For MongoDB versions later than 4.2:

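db.doc.aggregate([
    // Match documents whose "experience" is still an embedded document, not an array
    { $match: { experience: { $type: "object", $not: { $type: "array" } } } },
    { $project: { experience: ["$experience.0"] } },
    { $merge: { into: "doc", on: "_id" } }
])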

Note: this merges the projected field into the existing document instead of replacing the entire document, because the default whenMatched behavior of $merge is "merge". You can pass other options, such as "replace" or "keepExisting".
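To make those $merge options visible, the pipeline can be written with the defaults spelled out. This is a sketch, assuming a MongoDB release in which $merge may write to the collection being aggregated; buildMergePipeline is a hypothetical helper name, and the $match filter and the whenNotMatched: "discard" choice are assumptions rather than part of the answer.

```javascript
// Hypothetical helper: build the pipeline for a given collection name.
function buildMergePipeline(collectionName) {
  return [
    // Assumed filter: "experience" is an embedded document, not yet an array.
    { $match: { experience: { $type: "object", $not: { $type: "array" } } } },
    // _id is kept by default; only the rebuilt field is projected next to it.
    { $project: { experience: ["$experience.0"] } },
    {
      $merge: {
        into: collectionName,
        on: "_id",
        whenMatched: "merge",      // the default, written out explicitly
        whenNotMatched: "discard"  // assumption: never insert new documents
      }
    }
  ];
}

// Possible shell usage: db.doc.aggregate(buildMergePipeline("doc"));
```

Switching whenMatched to "replace" would overwrite each matched document with just _id and experience, so "merge" is the safer choice when other fields must survive.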

Ref: $merge

1 Comment

This answer is based on @Saleem's answer. My edit to his answer, which added these newer MongoDB features, was rejected with the suggestion to post it as a new answer.
1

I am not sure why there aren't any good answers yet.

It's easy with the aggregation "$set" stage, which adds a new field. Here you can add a field with the same name whose value is an array, so it overrides the existing field.

Refer below example:

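db.collectionName.aggregate([
   // match/other aggregation stages
   // Note: this wraps the whole "experience" sub-document, keeping its "0" key;
   // use "$experience.0" to get the shape asked for in the question
   { $set: { "experience": ["$experience"] } }
]);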
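An aggregate() call on its own only returns the transformed documents; it does not write them back. One way to persist the change, assuming MongoDB 4.2 or later (where update commands accept an aggregation pipeline), is updateMany(). The sketch below builds the filter and pipeline as plain objects; buildUpdate is a hypothetical helper name, and $objectToArray is used on the assumption that every key under the field should become one array element.

```javascript
// Hypothetical helper: build the filter and update pipeline for one field.
function buildUpdate(field) {
  var filter = {};
  // Only touch documents where the field is an embedded document, not an array.
  filter[field] = { $type: "object", $not: { $type: "array" } };

  var setStage = {};
  // $objectToArray yields [{ k: "0", v: {...} }, ...] in stored field order;
  // $map keeps only each value.
  setStage[field] = {
    $map: { input: { $objectToArray: "$" + field }, in: "$$this.v" }
  };

  return { filter: filter, pipeline: [{ $set: setStage }] };
}

var update = buildUpdate("experience");

// Possible shell usage:
// db.collectionName.updateMany(update.filter, update.pipeline);
```

Because the filter skips documents that already hold an array, running the update a second time should leave converted documents unchanged.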

1 Comment

This should be the accepted answer. Dead simple yet effective.
0

See if one of these queries works with your MongoDB version.

For MongoDB version 3.2+:

db.doc.aggregate([
    {$project:{experience:["$experience.0"]}}
])

MongoDB < 3.2:

db.doc.aggregate([
    {$group: {_id:"$_id", experience:{$push:"$experience.0"}}}
])

It should transform your document into:

{ 
    "_id" : ObjectId("56f1b046a65ea8a72c34839c"), 
    "experience" : [
        {
            "duration" : "3 months", 
            "end" : "August 2012", 
            "organization" : {
                "0" : {
                    "name" : "Bank of China", 
                    "profile_url" : "http://www.linkedin.com/company/13801"
                }
            }, 
            "start" : "June 2012", 
            "title" : "Intern Analyst"
        }
    ]
}

A better approach, if you want to alter the documents in the collection permanently, is to use the aggregation framework with $out.

Let's assume your collection name is doc:

db.doc.aggregate([
    {$group: {_id:"$_id", experience:{$push:"$experience.0"}}},
    {$out: "doc"}
])

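The query above replaces the contents of doc with the pipeline output. Note that the $group stage emits only _id and experience, so any other fields are not carried over.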
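One way to keep every other field while still writing back with $out is to rewrite the field with $addFields instead of $group. This is a sketch, assuming MongoDB 3.4 or later (for $addFields and the $type expression); buildOutPipeline is a hypothetical helper name. The $cond guard is there because $out rewrites the whole collection, so documents cannot simply be filtered out with $match.

```javascript
// Hypothetical helper: build a pipeline that rewrites "experience" in every
// document and keeps all other fields untouched.
function buildOutPipeline(collectionName) {
  return [
    {
      $addFields: {
        experience: {
          $cond: [
            // Convert only when the field is an embedded document...
            { $eq: [{ $type: "$experience" }, "object"] },
            ["$experience.0"],
            // ...otherwise pass the existing value through unchanged.
            "$experience"
          ]
        }
      }
    },
    // Replace the collection with the pipeline output.
    { $out: collectionName }
  ];
}

// Possible shell usage: db.doc.aggregate(buildOutPipeline("doc"));
```

Since $out replaces the target collection, trying this on a copy of the data first is the cautious route.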

1 Comment

It's still an incorrect approach regardless of the version. You keep repeating the same thing over and over without understanding what is wrong with it.
