I have sample data like this:

[
{
    "_id": 1,
    "host": "host1",
    "type": "type1",
    "data": [
        {
            "t": 10000,
            "v": 90
        },
        {
            "t": 10001,
            "v": 94
        }
    ]
},
{
    "_id": 2,
    "host": "host1",
    "type": "type1",
    "data": [
        {
            "t": 10000,
            "v": 99
        },
        {
            "t": 10001,
            "v": 93
        }
    ]
},
{
    "_id": 3,
    "host": "host1",
    "type": "type1",
    "data": [
        {
            "t": 10000,
            "v": 94
        },
        {
            "t": 10001,
            "v": 100
        }
    ]
}]

My query is:

my_filter = {'host': 'host1', 'type': 'type1', 'data': {'$elemMatch': {'t': 10000}}}
projection = {'host': 1, 'type': 1, 'data': {'$elemMatch': {'t': 10000}}}
sort_key = 'data.0.v'

rs = db.find(my_filter, projection).sort(sort_key, 1)

rs = list(rs)
for v in rs:
    print(v["data"][0]['v'])

But the output shows that the sort doesn't work:

98
98
98
96
100
98
98

Notes:

  • Versions in use: Python==3.6.9, pymongo==3.10.1, MongoDB==4.2.6
  • The collection has 10000 documents, and each nested array has 1440 elements
  1. I only need the nested array elements that match the condition, not the whole array, because the array can be large
  2. I need the data sorted, but I can't change the order in which it is written
  3. I also tried aggregate(), but its performance is bad when the data is large, so I hope to do this with find()

The aggregation I tried looks like this:

    rs = db.aggregate([
    {"$match": {'host': 'host1', 'type': 'type1', 'data': {'$elemMatch': {'t': 10000}}}},
    {"$project": {'host': 1, 'type': 1,
                  'data': {"$filter": {
                      "input": "$data",
                      "as": "data",
                      "cond": {"$eq": ["$$data.t", 10000]}}
                  }
                  }},
    {"$sort": {'data.0.v': 1}}])

Sorry for my poor English. Is there a good solution?

  • Do you have an index on these fields? Commented Dec 22, 2020 at 8:52
  • What versions of Mongodb, python and pymongo are you running? Commented Dec 22, 2020 at 9:37
  • python==3.6.9, pymongo==3.10.1 , mongo version==4.2.6 Commented Dec 22, 2020 at 10:34
  • Because there are too many keys in the embedded array, I think it is not easy to create an index for each requirement, so I am trying to find another way. Commented Dec 22, 2020 at 10:58

2 Answers

I couldn't reproduce your issue using the software versions you mention. If you have bash and docker you could see if your results are different:

PROJECT_NAME=sort_with_elemmatch

MONGODB_VERSION=4.2.6
PYTHON_VERSION=3.6.9
PYMONGO_VERSION=3.10.1

docker network create local_temp 2> /dev/null
docker run --rm --network local_temp -d --name mongodb_temp mongo:${MONGODB_VERSION}

cd "$(mktemp -d)" || exit

cat << EOF > requirements.txt
pymongo==${PYMONGO_VERSION}
EOF

cat << 'EOF' > ${PROJECT_NAME}.py
from pymongo import MongoClient
from random import randint

db = MongoClient('mongodb://mongodb_temp')['mydatabase'].mycollection

for i in range(20):
    db.insert_one(
    {
        "host": "host1",
        "type": "type1",
        "data": [
            {
                "t": 10000,
                "v": randint(0, 100)
            },
            {
                "t": 10001,
                "v": randint(0, 100)
            },
        ]
    })

my_filter = {'host': 'host1', 'type': 'type1', 'data': {'$elemMatch': {'t': 10000}}}
projection = {'host': 1, 'type': 1, 'data': {'$elemMatch': {'t': 10000}}}
sort_key = 'data.0.v'

rs = db.find(my_filter, projection).sort(sort_key, 1)

rs = list(rs)
for v in rs:
    print(v["data"][0]['v'])
EOF

cat << EOF > Dockerfile
FROM python:${PYTHON_VERSION}
COPY ./* /
RUN pip install -r /requirements.txt
CMD ["python", "${PROJECT_NAME}.py"]
EOF

docker build --tag ${PROJECT_NAME}:latest .
docker run --rm --network local_temp --name ${PROJECT_NAME} ${PROJECT_NAME}:latest
docker stop "$(docker ps -a -q --filter name=mongodb_temp)" > /dev/null
docker image rm ${PROJECT_NAME}:latest > /dev/null
docker network rm local_temp > /dev/null

prints:

5
17
18
19
20
28
29
37
59
59
61
63
64
66
68
77
82
82
100
100

1 Comment

Thank you for trying. I found the problem; please look at the answer below.

I found the problem. It seems to be related to the order of the elements in the embedded array:

    import random

    for i in range(20):
        data_0 = {"t": 10000, "v": random.randint(0, 100)}
        data_1 = {"t": 10001, "v": random.randint(0, 100)}
        insert_d = {
            "host": "host1",
            "type": "type1",
            # Swap the element order for a single document (i == 10)
            "data": [data_0, data_1] if i != 10 else [data_1, data_0]
        }
        db.insert_one(insert_d)

    my_filter = {'host': 'host1', 'type': 'type1', 'data': {'$elemMatch': {'t': 10000}}}
    projection = {'host': 1, 'type': 1, 'data': {'$elemMatch': {'t': 10000}}}
    sort_key = 'data.0.v'

    rs = db.find(my_filter, projection).sort(sort_key, 1)

    rs = list(rs)
    for v in rs:
        print(v["data"][0]['v'])

If you try this, you will find that the sort doesn't work as expected: most of the values are in order, but one is out of place. So I want to know how sort works.
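
A likely explanation, as far as I understand the find() semantics: the sort runs on the stored documents before the projection is applied, so 'data.0.v' refers to the first element of the stored array, not the first element of the projected one. The sketch below models that in plain Python without a database. The three documents and the helper names (project_elem_match, server_like, client_side) are made up for illustration and are not part of pymongo.

```python
# Hypothetical in-memory stand-ins for stored documents. The second one has
# its array elements in swapped order, like the i == 10 case above.
stored = [
    {"_id": 1, "data": [{"t": 10000, "v": 30}, {"t": 10001, "v": 70}]},
    {"_id": 2, "data": [{"t": 10001, "v": 5}, {"t": 10000, "v": 90}]},
    {"_id": 3, "data": [{"t": 10000, "v": 50}, {"t": 10001, "v": 10}]},
]


def project_elem_match(doc, t):
    """Mimic the $elemMatch projection: keep only the first element with this t."""
    first = next(e for e in doc["data"] if e["t"] == t)
    return {"_id": doc["_id"], "data": [first]}


def server_like(docs, t):
    """Sort on the stored data[0].v first, then project (assumed find() order)."""
    ordered = sorted(docs, key=lambda d: d["data"][0]["v"])
    return [project_elem_match(d, t) for d in ordered]


def client_side(docs, t):
    """Project first, then sort in Python on the projected element's v."""
    projected = [project_elem_match(d, t) for d in docs]
    return sorted(projected, key=lambda d: d["data"][0]["v"])


print([d["data"][0]["v"] for d in server_like(stored, 10000)])  # [90, 30, 50]
print([d["data"][0]["v"] for d in client_side(stored, 10000)])  # [30, 50, 90]
```

If that is what is happening, one option that stays with find() is to drop the .sort() call and sort the projected results on the client, for example sorted(db.find(my_filter, projection), key=lambda d: d["data"][0]["v"]). For about 10000 small projected documents this is probably acceptable, but that is an assumption worth measuring.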

1 Comment

I hunted around and maybe this might help ... stackoverflow.com/questions/28889240/…
