I have multiple documents and each document has a set of tweets. I can find the document by name as follows:

from pymongo import MongoClient

client = MongoClient('localhost', 27017)
db = client['sample_app']
s = db['s']
s.find(
    {
        "name": "temp16"
    }
)

When I run the above query I get the following data:

{"_id": {"$oid": "5e57db66c6bb04eb902589a2"}, "name": "temp16", "tweets": [{"tweet_id": "1234762637361086465", "tweet_text": "Had an extensive review regarding preparedness on the COVID-19 Novel Coronavirus. Different ministries & states are working together, from screening people arriving in India to providing prompt medical attention.", "tweet_handle": "@narendramodi", "labels": ["A", "B", "C", "D", "E"]}, {"tweet_text": "There is no need to panic. We need to work together, take small yet important measures to ensure self-protection.", "tweet_id": "1234762662413660165", "tweet_handle": "@narendramodi", "labels": ["A", "B", "C", "D", "E", "F"]}]}

My intention is to get the tweet with id "1234762662413660165" in this document alone. So I try the following:

s.find(
    {
        "name": "temp16",
        "tweets": {"tweet_id": "1234762662413660165"}
    }
)

However, I get None.

What am I doing wrong?

2 Answers

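You need to use $elemMatch. A query of the form {'tweets': {"tweet_id": ...}} matches only when an array element equals that subdocument exactly (same fields, same order), whereas $elemMatch matches any element that satisfies the listed conditions.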

import pymongo
db = pymongo.MongoClient()['mydatabase']
db.mycollection.insert_one({"name": "temp16", "tweets": [{"tweet_id": "1234762637361086465", "tweet_text": "Had an ...", "tweet_handle": "@narendramodi", "labels": ["A", "B", "C", "D", "E"]}, {"tweet_text": "There is ...", "tweet_id": "1234762662413660165", "tweet_handle": "@narendramodi", "labels": ["A", "B", "C", "D", "E", "F"]}]})

tweets = db.mycollection.find({"name": "temp16", 'tweets': {'$elemMatch': {"tweet_id": "1234762662413660165"}}})

for tweet in tweets:
    print(tweet)

4 Comments

Hi, thank you for the answer, but I am getting the whole document
I only want the tweet
db.mycollection.find({"name":"temp16"}, {"tweets":{"$elemMatch": {"tweet_id": "1234762662413660165"}}}) worked for me
If you just want the tweet text you can change the last line to print(tweet.get('tweets')[0].get('tweet_text'))
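
The projection approach from the comment above can be sketched in pymongo roughly as follows. This assumes `collection` is a pymongo `Collection` holding documents shaped like the one in the question; the helper name `find_tweet` is hypothetical and not part of pymongo. An `$elemMatch` projection trims `tweets` to the first matching element only.

```python
from typing import Any, Optional


def find_tweet(collection: Any, name: str, tweet_id: str) -> Optional[dict]:
    """Return the tweet with `tweet_id` from the document called `name`, or None.

    `collection` is assumed to be a pymongo Collection; `find_tweet` itself is
    a hypothetical helper written for this example.
    """
    doc = collection.find_one(
        {"name": name},  # filter: select the document by name
        {
            "_id": 0,
            # projection: keep only the first array element that matches
            "tweets": {"$elemMatch": {"tweet_id": tweet_id}},
        },
    )
    if not doc:
        return None  # no document with that name, or no matching tweet
    tweets = doc.get("tweets") or []
    return tweets[0] if tweets else None
```

With the collection from the question, `find_tweet(s, "temp16", "1234762662413660165")` should return just that tweet's subdocument rather than the whole document.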

Here are two ways of doing it using aggregation pipelines:

db.collection.aggregate([
    { $match: { name: 'temp16' } },
    { $unwind: '$tweets' },
    { $match: { 'tweets.tweet_id': '1234762662413660165' } },
    { $replaceWith: '$tweets' }
])

db.collection.aggregate([
    { $match: { name: 'temp16' } },
    {
        $replaceWith: {
            $arrayElemAt: [
                {
                    $filter: {
                        input: '$tweets',
                        as: 'tweet',
                        cond: { $eq: ['$$tweet.tweet_id', '1234762662413660165'] }
                    }
                },
                0
            ]
        }
    }
])

The first one is short and sweet, but it has the added overhead of unwinding the array and creating a separate document for each element in memory.

2 Comments

Thank you for the answer, I can do it in mongo shell but unable to replicate in pymongo
I'm not familiar with pymongo, but this seems to be how you run aggregation pipelines with it. Basically, just supply each stage as an array item.
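
As a rough pymongo translation of the first pipeline in this answer: the pipeline becomes a Python list with one dict per stage, and every operator and field path is written as a quoted string. This is only a sketch. It assumes `collection` is a pymongo `Collection`, the helper names `build_tweet_pipeline` and `aggregate_tweet` are hypothetical, and `$replaceWith` requires MongoDB 4.2 or later.

```python
from typing import Any, Optional


def build_tweet_pipeline(name: str, tweet_id: str) -> list:
    """Build the $unwind-based pipeline as a Python list, one dict per stage."""
    return [
        {"$match": {"name": name}},                 # pick the document by name
        {"$unwind": "$tweets"},                     # one document per tweet
        {"$match": {"tweets.tweet_id": tweet_id}},  # keep the wanted tweet
        {"$replaceWith": "$tweets"},                # promote the tweet to root
    ]


def aggregate_tweet(collection: Any, name: str, tweet_id: str) -> Optional[dict]:
    """Run the pipeline and return the first resulting tweet, or None.

    `collection` is assumed to be a pymongo Collection, whose `aggregate`
    method returns an iterable cursor over the result documents.
    """
    cursor = collection.aggregate(build_tweet_pipeline(name, tweet_id))
    return next(iter(cursor), None)
```

Keeping the pipeline builder separate from the call makes the stage list easy to inspect or print before sending it to the server.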
