1

I have been working on this for about a week, have learnt a lot about pymongo but still can't crack it.

I have the following JSON in mongo

{ 
    "_id" : ObjectId("5845e25xxxxxxxxxx"), 
    "timestamp" : ISODate("2016-08-24T14:59:04.000+0000"), 
    "new_key" : "cambiar", 
    "Records" : [
        {
            "RecordType" : "WFD", 
            "Properties" : [
                {
                    "Property" : {
                        "IsReadOnly" : "False", 
                        "ValueType" : "System", 
                        "Name" : "LastWrite"
                    }, 
                    "value" : "42"
                }, 
                {
                    "Property" : {
                        "IsReadOnly" : "True", 
                        "ValueType" : "String", 
                        "Name" : "Time_as_String"
                    }, 
                    "value" : "24-08-2016 14:59:08"
                }, 
                {
                    "Property" : {
                        "IsReadOnly" : "False", 
                        "ValueType" : "32", 
                        "Name" : "YES"
                    }, 
                    "value" : "1472065148"

there are many more properties below. I am trying to return just the "value" : 1472065148 and nothing else. I have written this query

x = dataset.find_one({"Records.Properties.Property.Name":'YES'},{"Records.Properties.Property.value":1})

but beacuase the 'value' is on the same level as all the other values, I get all of the value results, not just the one I am hoping for.

Is there a way to print only the result object after the object that matches the query???? "Name" : "YES" being the object i'm querying and "value" : "1472065148" being the object I want to print.

------------------------ ADDITIONAL PART ---------------

Above is one document which has 'name' : 'Yes' values that I want to retreive. However every Document has the same 'Name : YES' inside it. What i would like to do, is first select the document based on a different 'Name' : 'xxxx' by its value. For example above - look up 'name' : 'lastwrite' check that its value is 42 (thus selection this document and not the others) before retrieving the 'name' : 'YES' value (as in the answer you have given me).

(if this counts as a new question please let me know and I will remove it and post a new question)

1 Answer 1

3

The only option you have with the existing structure is to use aggregation.

db.dataset.aggregate([{
    $unwind: "$Records"
}, {
    $unwind: "$Records.Properties"
}, {
    $match: {
        "Records.Properties.Property.Name": 'YES'
    }
}, {
    $project: {
        "_id": 0,
        "value": "$Records.Properties.value"
    }
}]).pretty()

Sample Response

{ "value" : "1472065148" }

Assume you were able to update your structure as follows (Removed some fields for brevity). The change here is the Records is no longer an array but just an embedded document.

db.dataset.insertMany([{
  "timestamp": ISODate("2016-08-24T14:59:04.000+0000"),
  "Records": {
      "RecordType": "WFD",
      "Properties": [{
          "Property": {
              "Name": "LastWrite"
          },
          "value": "42"
      }, {
          "Property": {
              "Name": "YES"
          },
          "value": "1472065148"
      }]
  }
}])

Query

db.dataset.find({"Records.Properties.Property.Name":'YES'},{"Records.Properties.$":1}).pretty()

Response

{
    "_id": ObjectId("5859e80591c255c059a3da50"),
    "Records": {
        "Properties": [{
            "Property": {
                "Name": "YES"
            },
            "value": "1472065148"
        }]
    }
}

Update for the additional part:

There are couple of ways you can take care of this. Things can be optimized a bit, but I'll leave that up to you.

Option 1 :

$map applies an equals comparison between the criteria passed and fields in each result element and generates an array with true and false values. $anyElementTrue inspects this array and returns true only if there is at least one true value in the array. Match stage to include only elements with matched value of true.

data field to keep the data using system variable $$ROOT.

Complete Query

db.dataset.aggregate([{
    $unwind: "$Records"
}, {
    $project: {
        "_id": 0,
        "matched": {
            "$anyElementTrue": {
                "$map": {
                    "input": "$Records.Properties",
                    "as": "result",
                    "in": {
                        "$and": [{
                            $eq: ["$$result.Property.Name", "LastWrite"]
                        }, {
                            $eq: ["$$result.value", "42"]
                        }]
                    }
                }
            }
        },
        "data": "$$ROOT"
    }
}, {
    "$match": {
        "matched": true
    }
}, {
    $unwind: "$data.Records"
}, {
    $unwind: "$data.Records.Properties"
}, {
    $match: {
        "data.Records.Properties.Property.Name": 'YES'
    }
}, {
    $project: {
        "_id": 0,
        "value": "$data.Records.Properties.value"
    }
}]).pretty()

Option 2:

Better option (superior in performance) so use this if you the driver that supports $redact.

Similar to the above version but this one combines both the project and match stage into one. The $cond with $redact accounts for match and when match is found it keeps the complete tree or else discards it.

Complete Query

db.dataset.aggregate([{
    $unwind: "$Records"
}, {
    "$redact": {
        "$cond": [{
                "$anyElementTrue": {
                    "$map": {
                        "input": "$Records.Properties",
                        "as": "result",
                        "in": {
                            "$and": [{
                                $eq: ["$$result.Property.Name", "LastWrite"]
                            }, {
                                $eq: ["$$result.value", "42"]
                            }]
                        }
                    }
                }
            },
            "$$KEEP",
            "$$PRUNE"
        ]
    }
}, {
    $unwind: "$Records.Properties"
}, {
    $match: {
        "Records.Properties.Property.Name": 'YES'
    }
}, {
    $project: {
        "_id": 0,
        "value": "$Records.Properties.value"
    }
}]).pretty()
Sign up to request clarification or add additional context in comments.

12 Comments

Thankyou :) extremely helpful.
I have multiple json documents in the db, I'm trying to extract 3 properties with respective values per json. I tried the first method, and it gives me good results, but I can't print them individually like before using 'for x in find_restuls' ' print x' where i can set the range. I tried the second method, changing the script that coverts the XML´s to json and imports them into Mongo in the first place, but when i change from array to object the 'records.append(records)' function stops working, and when i use 'update' it misses information.
I don't have any knowledge of python language. This looks like a problem belonging there. You can ask another question with the problem you are facing.
I have one final question, I hope you can help. I have had success with the first answer you provided, thanks again. I am hoping to use this method to do something slightly different.
I have one final question, I hope not to push my luck. I have had success with the first answer you provided, thanks again. I am hoping to use this method to do something slightly different. -still need the value associated with Name:Yes (as above) -I am hoping to first filter the document based on the Value associated with Name : Parents (as below)
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.