
When I use aggregate with mongoengine, it returns a CommandCursor instead of a list of mongoengine objects, which means that mongoengine is not really being used.

For example, if some document doesn't have a title field, an error will be raised. How can I convert my results to mongoengine objects?

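from mongoengine import Document, ReferenceField, StringField

# User is assumed to be another Document subclass defined elsewhere.
class Post(Document):
    title = StringField(max_length=120, required=True)
    author = ReferenceField(User)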

Post.objects()
# [<Post: Post object>, <Post: Post object>, ...]

pipeline = [
    {
        "$match": {
            'types': type,
        }
    },
    {
        "$project": {
            "name": 1,
            'brating': {
                "$divide": [
                    {"$add": ["$total_score", 60]},
                    {"$add": ["$total_votes", 20]}
                ]
            }
        }
    },
    {"$sort": {"brating": -1}},
    {"$limit": 100}

]

Post.objects.aggregate(*pipeline)
# <class 'pymongo.command_cursor.CommandCursor'>

list(Post.objects.aggregate(*pipeline))
# <class 'list'>
  • It may be worth posting a new question which explains why you need to use aggregate at all here. Usually an aggregation pipeline is used to transform and combine multiple documents rather than just retrieve them. Commented Dec 29, 2016 at 16:16

1 Answer


The aggregate function is just a shortcut to the underlying pymongo function.

The documents that come back from aggregate may have passed through a $group or other stage, in which case they bear no relation to your object model, so mongoengine cannot convert them to mongoengine objects.

In the case of your pipeline you are using a $project stage to return a new type of document which only has name and brating fields.
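
To make that shape concrete, here is a plain-Python imitation of the four stages run over a list of dictionaries. It is only a sketch: it does not touch MongoDB or pymongo, and the `run_pipeline` name and the sample data are made up for illustration.

```python
def run_pipeline(docs, wanted_type, limit=100):
    """Imitate the $match, $project, $sort and $limit stages in plain Python."""
    # $match: keep documents whose 'types' value equals the requested one.
    matched = [d for d in docs if d.get("types") == wanted_type]

    # $project: keep _id (included by default) and name, and compute brating.
    projected = [
        {
            "_id": d["_id"],
            "name": d["name"],
            "brating": (d["total_score"] + 60) / (d["total_votes"] + 20),
        }
        for d in matched
    ]

    # $sort with -1: highest brating first, then $limit.
    projected.sort(key=lambda d: d["brating"], reverse=True)
    return projected[:limit]


# Hypothetical sample data, not taken from the question.
sample = [
    {"_id": 1, "name": "a", "types": "hotel", "title": "x",
     "total_score": 40, "total_votes": 5},
    {"_id": 2, "name": "b", "types": "hotel", "title": "y",
     "total_score": 140, "total_votes": 20},
    {"_id": 3, "name": "c", "types": "bar", "title": "z",
     "total_score": 10, "total_votes": 1},
]

results = run_pipeline(sample, "hotel")
```

Each entry in `results` carries only `_id`, `name` and `brating`; none of the fields declared on `Post` (such as `title`) survive the projection.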

Mongoengine isn't going to be able to do what you want here, so you have a couple of options:

  • Store the brating field on the Post documents. Initialise the rating at 0 when the post is created and when $total_score or $total_votes are updated, also update the rating.

  • Accept that you are getting back non-mongoengine objects and handle them accordingly. The cursor will yield normal Python dictionaries, whose fields you can then access as post['name'] or post['brating'] in your client code.

  • Use a normal .objects query and sort on the client side.

The last option will obviously be a problem if you have lots of documents, but for a small number try something like:

posts = Post.objects(types=type).only("name", "total_score", "total_votes")
# Highest rating first, matching {"$sort": {"brating": -1}} in the pipeline.
top_posts = sorted(posts, key=lambda p: (p.total_score + 60) / (p.total_votes + 20), reverse=True)[:100]
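
If mongoengine objects are still wanted after an aggregation, one possible approach (a sketch, not something mongoengine does for you) is to take the `_id` values from the aggregate results and fetch the matching documents in one extra query. This relies on `$project` keeping `_id` unless it is explicitly excluded, and on mongoengine's `QuerySet.in_bulk`, which returns a dict keyed by id. The helper name `attach_documents` is hypothetical, and `queryset` is assumed to be something like `Post.objects`.

```python
def attach_documents(rows, queryset):
    """Pair each aggregate row with its document, preserving the row order.

    rows: iterable of dicts from aggregate(), each with "_id" and "brating".
    queryset: any object offering in_bulk(ids) -> {id: document},
              for example a mongoengine QuerySet.
    """
    rows = list(rows)  # a cursor can only be consumed once
    docs_by_id = queryset.in_bulk([row["_id"] for row in rows])

    paired = []
    for row in rows:
        doc = docs_by_id.get(row["_id"])
        if doc is not None:  # skip rows whose document was not found
            paired.append((doc, row["brating"]))
    return paired
```

With the model from the question this might be called as `attach_documents(Post.objects.aggregate(*pipeline), Post.objects)`, at the cost of a second round trip to the database.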

6 Comments

The problem is that I have to use aggregate for some complex queries that mongoengine can't handle. Is there no way to use them at the same time?
Why can't mongoengine handle the query? Maybe post your pipeline in the question.
I just added the pipeline code. Please take a look and tell me how I can do it with mongoengine's .objects notation.
It's hard to answer because you haven't given a complete question, but I've added some options. You are not going to be able to do what you want with mongoengine. Although it is technically possible to implement what you are requesting, you would have to write it yourself, so it may be easier to look at other ways of doing this.
Thanks. The example code works in my case. What I'm still not sure about is which method has better performance: using mongoengine objects, then iterating over them, calculating the new field and sorting, or using pymongo's aggregate and then checking each field manually in the Flask template. BTW: my collection has about 4000 documents.