After searching for a whole day, I am starting to doubt whether MongoDB can fulfill the requirement below:

Q: How can I select only the documents that meet the following condition?

  • In the last array element of students_replies, there is a reply from a student whose name contains the string 'ason'.
id_1: first_school, students_replies: [
    {Date:20210101, replies: [
        {name: jack, reply: 'I do not like this idea'},
        {name: jason, reply: 'I would rather stay at home'},
        {name: charles, reply: 'I have an plan to improve'},
        ]}, 
    {Date:20210401, replies: [
        ...]}, 
    {Date:20210801, replies: [
        ...]},
]

id_2: second_school, students_replies: [..]
id_3: third_school, students_replies: [...]

Mongoplayground

5 Comments
  • Can you please provide valid JSON (use mongoplayground.net) and the expected result? Have you tried $elemMatch or a MongoDB aggregation? Commented Dec 15, 2021 at 8:00
  • Thanks for the reply. I have tried $in, $elemMatch, and $indexOfBytes in both queries and aggregations; none of them worked for me. Most of them match the whole value rather than a portion of it (substring matching, to be specific). For example, given {key: 'This is apple'}, my condition is that the value contains 'ple', not that the value equals 'This is apple'. Below is the valid JSON; the expected result is that only the documents with key: 1 and key: 3 are output. mongoplayground.net/p/_-MFlpzF6eY Commented Dec 15, 2021 at 8:49
  • What is the desired output? Commented Dec 15, 2021 at 9:58
  • Like this MongoPlayground? Commented Dec 15, 2021 at 10:07
  • Do you need the output documents themselves also filtered such that the array of replies only contains the matching replies? Commented Dec 15, 2021 at 10:18

2 Answers

1

Use $slice and $regex

For your example this becomes:

db.collection.aggregate([
  // keep only the last element of students_replies (as a one-element array)
  {
    "$project": {
      key: 1,
      last_reply: {
        "$slice": [
          "$students_replies",
          -1
        ]
      }
    }
  },
  // filter the documents
  {
    "$match": {
      "last_reply.replies.name": {
        "$regex": "ason"
      }
    }
  }
])

https://mongoplayground.net/p/a9piw2WQ8n6
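
To make the two stages above easier to follow, here is a minimal plain-JavaScript sketch of what they compute. It does not talk to MongoDB at all. The sample documents are hypothetical (they are not the asker's data), and `Array.prototype.slice(-1)` and `some` only stand in for `$slice` and for the dotted-path `$regex` match, which traverses the nested arrays.

```javascript
// Hypothetical sample documents, invented for illustration only.
const docs = [
  { key: 1, students_replies: [
    { Date: 20210101, replies: [{ name: "jack", reply: "no" }] },
    { Date: 20210801, replies: [{ name: "jason", reply: "yes" }] },
  ] },
  { key: 2, students_replies: [
    { Date: 20210101, replies: [{ name: "jason", reply: "yes" }] },
    { Date: 20210801, replies: [{ name: "charles", reply: "no" }] },
  ] },
  { key: 3, students_replies: [
    { Date: 20210801, replies: [{ name: "mason", reply: "maybe" }, { name: "jack", reply: "no" }] },
  ] },
];

// Stage 1 ($project with $slice: -1): keep a one-element array holding the last entry.
const projected = docs.map((d) => ({
  key: d.key,
  last_reply: d.students_replies.slice(-1),
}));

// Stage 2 ($match on "last_reply.replies.name" with $regex): keep the document
// if any name inside the remaining entry contains "ason".
const matchedKeys = projected
  .filter((d) => d.last_reply.some((entry) => entry.replies.some((r) => /ason/.test(r.name))))
  .map((d) => d.key);

console.log(matchedKeys); // [ 1, 3 ]
```

The document with key 2 is dropped because its only "jason" reply sits in an earlier element, not in the last one.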


5 Comments

It seems that it will also output the key: 2 document, which is not expected.
I followed your advice and worked out a much simpler solution, mongoplayground.net/p/cEaiYQXq8cN. Many thanks for your help.
@YanTian OK, it doesn't output key: 2 (not when I run it), but it does output replies other than the last one; I didn't catch that part. But you figured out that $slice can be used for this. I'll update the answer using your combined solution.
@YanTian Note that even if you have an index on students_replies.replies.name, it probably won't be used here because of the projection. For performance reasons, you could put an index on students_replies.replies.name and then add an additional $match on students_replies.replies.name before the $project. That way MongoDB can scan the index for the regex without fetching the documents, and it also needs to execute the $project and the second $match for fewer documents. It's always best to reduce the number of documents in the pipeline as early as possible.
Thanks for the insights. I totally agree that it's always best to reduce the number of documents in the pipeline as early as possible; @YuTing also pointed this out. Because of that best practice, I guess, he produced a very complicated answer even though he already knew the simpler ones.
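
The early-filter suggestion from the comments above could be sketched roughly as follows. This is only an illustration under assumptions: the collection name `collection` and the index are hypothetical, the `key` field comes from the playground example, and whether the planner really uses the index for an unanchored regex should be checked with `explain()` on real data. The first `$match` is a coarse superset filter, so the final result is the same as with the two-stage pipeline.

```javascript
// Pipeline definition only; nothing here contacts a server.
const namePattern = "ason";

const pipeline = [
  // 1. Coarse pre-filter: keeps documents where ANY reply in ANY element matches.
  //    This is the stage that may be able to use an index on the dotted path.
  { $match: { "students_replies.replies.name": { $regex: namePattern } } },
  // 2. Keep only the last element of students_replies (as a one-element array).
  { $project: { key: 1, last_reply: { $slice: ["$students_replies", -1] } } },
  // 3. Exact filter: the match must be in the last element.
  { $match: { "last_reply.replies.name": { $regex: namePattern } } },
];

// In mongosh, assuming a collection named `collection` (hypothetical):
//   db.collection.createIndex({ "students_replies.replies.name": 1 })
//   db.collection.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```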
1

Since you need the last array element of students_replies, use $arrayElemAt.

db.collection.aggregate([
  {
    "$match": {
      $expr: {
        $regexMatch: {
          input: {
            $reduce: {
              input: {
                $arrayElemAt: [
                  "$students_replies.replies",
                  -1
                ]
              },
              initialValue: "",
              in: {
                $concat: [
                  "$$value",
                  "$$this.name",
                  ","
                ]
              }
            }
          },
          regex: "ason"
        }
      }
    }
  },
  {
    "$project": {
      "students_replies": 0
    }
  }
])

mongoplayground


Another approach, using $map and $filter:

db.collection.aggregate([
  {
    $match: {
      $expr: {
        $ne: [
          {
            $filter: {
              input: {
                $map: {
                  input: {
                    $arrayElemAt: [
                      "$students_replies.replies",
                      -1
                    ]
                  },
                  as: "r",
                  in: "$$r.name"
                }
              },
              as: "s",
              cond: {
                $regexMatch: {
                  input: "$$s",
                  regex: "ason"
                }
              }
            }
          },
          []
        ]
      }
    }
  },
  {
    "$project": {
      "students_replies": 0
    }
  }
])

mongoplayground
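
As a rough illustration of how the two pipelines in this answer differ, here is a plain-JavaScript sketch that mirrors their logic on a hypothetical list of names (no MongoDB involved). `reduce` stands in for `$reduce` with `$concat`, and `filter` stands in for `$map` plus `$filter`. Edge cases such as an empty or missing students_replies array are not modeled here.

```javascript
// Hypothetical names taken from the last students_replies entry.
const names = ["jack", "jason", "charles"];

// First pipeline: $reduce + $concat builds "jack,jason,charles," and a single
// $regexMatch runs against that joined string.
const joined = names.reduce((value, name) => value + name + ",", "");
const viaReduce = (pattern) => new RegExp(pattern).test(joined);

// Second pipeline: $map extracts the names, $filter keeps those that match,
// and $ne against [] checks that at least one name survived.
const viaFilter = (pattern) =>
  names.filter((s) => new RegExp(pattern).test(s)).length !== 0;

console.log(viaReduce("ason"), viaFilter("ason")); // true true
console.log(viaReduce("k,j"), viaFilter("k,j")); // true false
```

For a pattern like "ason" both variants agree. They can differ only when the pattern is able to match across the "," separator, as the second line shows, so the per-name variant is the stricter of the two.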

5 Comments

Thanks for the reply. It works, though it looks a bit complicated and I need some time to digest your answer.
Sorry, YuTing. Although your answers output the expected results, I won't mark them as the accepted answer because they are too complicated and I found a much simpler solution based on the input from @herman. I pasted my solution in a comment on his answer. I appreciate your time and effort.
@YanTian $match should always come first in the aggregation pipeline to achieve the best query speed, especially when you have thousands or millions of documents.
Thanks for pointing that out; I will keep it in mind. It is reasonable to clear out all unwanted data before passing it through any further aggregation stages. Even though your answer is complicated for me right now, I still learned a lot from it, and I believe it will make me better prepared for more advanced aggregation in the future.
@YuTing Although you're filtering early, I don't think the regex filter in these solutions will actually use an index (assuming there is one). Also, when I run these, they don't actually include the last reply.
