I have a collection with the following documents (for example):

{
  "_id": {
    "$oid": "61acefe999e03b9324czzzzz"
  },
  "matchId": {
    "$oid": "61a392cc54e3752cc71zzzzz"
  },
  "logs": [
    {
      "actionType": "CREATE",
      "data": {
        "talent": {
          "talentId": "qq",
          "talentVersion": "2.10",
          "firstName": "Joelle",
          "lastName": "Doe",
          "socialLinks": [
            {
              "type": "FACEBOOK",
              "url": "https://www.facebook.com"
            },
            {
              "type": "LINKEDIN",
              "url": "https://www.linkedin.com"
            }
          ],
          "webResults": [
            {
              "type": "VIDEO",
              "date": "2021-11-28T14:31:40.728Z",
              "link": "http://placeimg.com/640/480",
              "title": "Et necessitatibus",
              "platform": "Repellendus"
            }
          ]
        },
        "createdBy": "DEVELOPER"
      }
    },
    {
      "actionType": "UPDATE",
      "data": {
        "talent": {
          "firstName": "Joelle new",
          "webResults": [
            {
              "type": "VIDEO",
              "date": "2021-11-28T14:31:40.728Z",
              "link": "http://placeimg.com/640/480",
              "title": "Et necessitatibus",
              "platform": "Repellendus"
            }
          ]
        }
      }
    }
  ]
},
{
  "_id": {
    "$oid": "61acefe999e03b9324caaaaa"
  },
  "matchId": {
    "$oid": "61a392cc54e3752cc71zzzzz"
  },
  "logs": [....]
}

A brief breakdown: I have many objects like this one in the collection. They are a kind of audit log for actions taken on other documents, 'Match(es)': for example CREATE + the data, UPDATE + the data, etc.

As you can see, the logs field of the document is an array of objects, each describing one of these actions. The data for each action may or may not contain specific fields, which in turn can also be arrays of objects: socialLinks and webResults.

I'm trying to remove sensitive data from all of these documents with specified Match ids. For each document, I want to go over the logs array field and change the value of specific fields only if they exist. For example: change firstName to *****, and the same for lastName, if those appear. Also, go over the socialLinks array if it exists and, for each element inside it, if a field url exists, change it to ***** as well.

What I've tried so far are many minor variations of this query:

      $set: {
        'logs.$[].data.talent.socialLinks.$[].url': '*****',
        'logs.$[].data.talent.webResults.$[].link': '*****',
        'logs.$[].data.talent.webResults.$[].title': '*****',
        'logs.$[].data.talent.firstName': '*****',
        'logs.$[].data.talent.lastName': '*****',
      },

and some playing around with this kind of aggregation query:

[{
      $set: {
        'talent.socialLinks.$[el].url': {
          $cond: [{ $ne: ['el.url', null] },'*****', undefined],
        },
      },
    }]

resulting in errors like: message: "The path 'logs.0.data.talent.socialLinks' must exist in the document in order to apply array updates.",

But I just can't get it to work... :(

Would love an explanation of exactly how to achieve this kind of set-only-if-exists behaviour. A working example would also be much appreciated, thanks.

1 Answer

I would suggest using $[<identifier>] (the filtered positional operator) together with arrayFilters to update the nested documents in the array field.

In arrayFilters, use $exists so that each identifier matches only the logs elements that actually contain the field to be updated.

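// updateMany updates every matching document; update() without
// { multi: true } would stop after the first one.
// Replace {} with a filter such as { matchId: { $in: [...] } } to target specific matches.
db.collection.updateMany({},
{
  $set: {
    // $[a]..$[d] select only the logs elements that pass the matching filter below
    "logs.$[a].data.talent.socialLinks.$[].url": "*****",
    "logs.$[b].data.talent.webResults.$[].link": "*****",
    "logs.$[b].data.talent.webResults.$[].title": "*****",
    "logs.$[c].data.talent.firstName": "*****",
    "logs.$[d].data.talent.lastName": "*****"
  }
},
{
  // one filter per identifier: the element qualifies only if the field exists
  arrayFilters: [
    {
      "a.data.talent.socialLinks": {
        $exists: true
      }
    },
    {
      "b.data.talent.webResults": {
        $exists: true
      }
    },
    {
      "c.data.talent.firstName": {
        $exists: true
      }
    },
    {
      "d.data.talent.lastName": {
        $exists: true
      }
    }
  ]
})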
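
Since the question asks to restrict the update to specific Match ids, here is a minimal sketch of how the same arguments could be generated in application code. The helper name buildMaskUpdate, the RULES table and the f0..f3 identifiers are made up for illustration and are not part of any MongoDB API. It also assumes the ids you pass in are already of the type stored in matchId (ObjectId values in the sample documents).

```javascript
// Hypothetical helper (not a MongoDB API): builds the three arguments
// for updateMany(). Each rule pairs a path that must exist on a logs
// element with the paths, relative to that element, to overwrite.
const RULES = [
  { exists: "data.talent.socialLinks", set: ["data.talent.socialLinks.$[].url"] },
  {
    exists: "data.talent.webResults",
    set: ["data.talent.webResults.$[].link", "data.talent.webResults.$[].title"],
  },
  { exists: "data.talent.firstName", set: ["data.talent.firstName"] },
  { exists: "data.talent.lastName", set: ["data.talent.lastName"] },
];

function buildMaskUpdate(matchIds, mask = "*****") {
  const $set = {};
  const arrayFilters = [];
  RULES.forEach((rule, i) => {
    // arrayFilters identifiers must start with a lowercase letter
    const id = `f${i}`;
    // the logs element qualifies only if the guarded field exists
    arrayFilters.push({ [`${id}.${rule.exists}`]: { $exists: true } });
    for (const path of rule.set) {
      $set[`logs.$[${id}].${path}`] = mask;
    }
  });
  return {
    filter: { matchId: { $in: matchIds } }, // only the requested matches
    update: { $set },
    options: { arrayFilters },
  };
}
```

With that in place, the call would look like db.collection.updateMany(filter, update, options) after destructuring the returned object. One caveat: the inner $[] writes url, link and title onto every element of an existing array, including elements that did not have that key before. If that matters, a second nested identifier with its own $exists filter on the inner element should cover it.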

Sample Mongo Playground

7 Comments

Thank you so much for the quick and correct answer. A follow-up question for the 'regular' non-array document: how should my $set look if I want to update each element of an array field, only if that field exists? Example document: { talent: {firstName: 'ab', socialLinks: [{a:1, url:2},{a:2, url:22}]}}. If I try: $set: {talent.webResults.$[].link: *****} I get an error: MongoError: The path 'talent.webResults' must exist in the document in order to apply array updates.
Hi, I think this demo may meet your requirement. You need to filter for the documents that have "talent.webResults": { $exists: true }.
Hmmm, do you want it in the same query as the answer or in a separate query? And is the sample document the same as in the question?
Wouldn't that mean only a subset of the documents, only the ones where that field exists? I'm actually looking to select multiple documents based on ID, and then for each document, if it has a webResults array field, iterate over the array elements and star out some data, as well as go over 'regular' fields I know exist, like first/lastName, and star them out as well. ----- This would be a different query on a different collection, without the infamous logs array field.
some example data here: mongoplayground.net/p/Bc_iQdBFekU
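
For the follow-up case discussed in these comments (documents without the logs array), one possible approach is to split the work into two updates and send them together through bulkWrite: one for the fields known to exist, and one that only touches documents where the array is present. This is only a sketch under those assumptions. The helper name buildPlainMaskOps is hypothetical, and the ids are assumed to already have the type stored in _id.

```javascript
// Hypothetical helper (not a MongoDB API): returns operations in the
// shape accepted by collection.bulkWrite().
function buildPlainMaskOps(ids, mask = "*****") {
  const byId = { _id: { $in: ids } };
  return [
    {
      // fields assumed to exist on every selected document
      updateMany: {
        filter: byId,
        update: { $set: { "talent.firstName": mask, "talent.lastName": mask } },
      },
    },
    {
      // array fields: the extra $exists condition skips documents that
      // would otherwise raise "The path ... must exist" for $[]
      updateMany: {
        filter: { ...byId, "talent.webResults": { $exists: true } },
        update: {
          $set: {
            "talent.webResults.$[].link": mask,
            "talent.webResults.$[].title": mask,
          },
        },
      },
    },
  ];
}
```

Calling db.collection.bulkWrite(buildPlainMaskOps(ids)) would then still cover every selected document for the plain fields, while the array update is limited to the subset that has webResults. Note that the first operation creates firstName and lastName if they are missing, so it relies on the statement above that those fields are known to exist.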