0

I am writing a Java program that processes a MongoDB collection as a queue, so when I dequeue, I want the document that was inserted first.

To do that, I have a field called created, which holds the timestamp of the document's creation. My initial idea was to use the aggregation $min operator to find the oldest document by its created field.

However, it occurred to me: why not use findOne() without any argument? It will always return the first document in the collection.

So my question is: should I do that? Would it be a good approach to use findOne() to dequeue the first record from the Mongo queue? And what are the drawbacks if I do?

PS: The Mongo queue program serves requests to devices on a first-come, first-served basis. A request takes some time to execute, and a device can't accept another request while it is processing one. So, to avoid dropping requests, I am using the queue to process them one by one.

12
  • You might want to check github.com/gaillard/mongo-queue-java. There are n number of ways to do a thing and it totally depends on the usecase if we also include points from performance perspective. Commented Jun 11, 2015 at 10:45
  • 1
    As per Javadoc, findOne() method returns a single object from this collection. It does not say it's first one. Commented Jun 11, 2015 at 10:46
  • @chridam $natural is the order on disk and not the first document in the collection. The first document is basically the lowest _id value. Commented Jun 11, 2015 at 10:55
  • 1
    @user3561036 you should try and tell us. Commented Jun 11, 2015 at 10:57
  • 2
    @findOne always returns a random document from a collection. Commented Jun 11, 2015 at 10:57

3 Answers

4

Interesting how many people here commented incorrectly, but you are right in that a raw .findOne() with a blank query or .findOne({}) will return the first document in the collection, that being "the document with the lowest _id value".

Ideally for a queue processing system, you want to remove the document at the same time as doing this. For this purpose the Java API supports a .findAndRemove() method:

    DBCollection data = mongoOperation.getCollection("data");
    // DBObject is an interface, so the empty query is built with BasicDBObject
    DBObject removed = data.findAndRemove(new BasicDBObject());

So that will return the first document in the collection as described and "remove" it from the collection so that no other operations can find it.
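The point of finding and removing in one step can be illustrated without a database. The sketch below is an in-memory analogy only, not MongoDB driver code: the class name `AtomicDequeueSketch` and its method are hypothetical, and a `ConcurrentLinkedQueue` stands in for the collection. Several workers call `poll()`, which returns and removes the head atomically, so no request should reach two workers.

```java
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory stand-in for the queue collection; not the MongoDB driver.
class AtomicDequeueSketch {

    // Starts `workers` threads that poll until the queue is empty.
    // Returns {distinct requests delivered, requests delivered more than once}.
    static int[] run(int requests, int workers) {
        Queue<Integer> queue = new ConcurrentLinkedQueue<>();
        for (int i = 0; i < requests; i++) {
            queue.add(i);
        }
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        AtomicInteger duplicates = new AtomicInteger();

        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            threads[w] = new Thread(() -> {
                Integer request;
                // poll() finds and removes the head in a single atomic step.
                while ((request = queue.poll()) != null) {
                    if (!seen.add(request)) {
                        duplicates.incrementAndGet();
                    }
                }
            });
            threads[w].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return new int[] { seen.size(), duplicates.get() };
    }
}
```

With a separate "find" followed by a "remove", two workers could both read the same head before either removes it, which is the situation the single atomic call avoids.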

Alternatively, you can call .findAndModify() and set all the options yourself, but if all you are after is "oldest document first", which is what the _id guarantees, then this is all you need.
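The "oldest first" property of a default ObjectId comes from its layout: the first 4 of its 12 bytes are the creation time in seconds since the Unix epoch. As a rough sketch, the timestamp can be read from the 24-character hex form in plain Java. The class name `ObjectIdTimestampSketch` and the sample ids used with it are made up for illustration; this is not the driver's own ObjectId class.

```java
import java.time.Instant;

// Hypothetical helper; parses the hex string form of an ObjectId by hand.
class ObjectIdTimestampSketch {

    // Reads the creation time encoded in the first 4 bytes (8 hex characters)
    // of a 24-character ObjectId hex string.
    static Instant creationTime(String objectIdHex) {
        if (objectIdHex == null || objectIdHex.length() != 24) {
            throw new IllegalArgumentException("expected 24 hex characters");
        }
        long epochSeconds = Long.parseLong(objectIdHex.substring(0, 8), 16);
        return Instant.ofEpochSecond(epochSeconds);
    }

    // True when id `a` carries an earlier timestamp than id `b`.
    static boolean createdBefore(String a, String b) {
        return creationTime(a).isBefore(creationTime(b));
    }
}
```

Because the timestamp has one-second resolution, ids generated by different clients within the same second are not guaranteed to sort in exact insertion order, so a strict queue may still want an explicit sort field.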


4 Comments

So the id in the collection looks like a string plus an integer. Aren't ids assigned randomly? If so, how would it return the document with the lowest id?
@StackPointer Not if you stick with the default ObjectId. It is "monotonic", which means "ever increasing", so each one you add is greater than the previous one. If you put something else in, then make sure you include a "timestamp" as part of a composite _id value, or use the "sort" option on .findAndModify() together with the remove flag, sorting on another field that contains a timestamp.
Considering that findAndRemove is not actually a command within mongodb core I would mostly say that this method is nothing special but just a helper for the remove flag on findandmodify. It is good to note that findandmodify will result in massive performance loss due to the way it works.
@Sammaye The reference to .findAndModify() is already made here. As for "performance loss", considering the "use case" in question how would you say the performance (and accuracy) compares to separate .find() and then .remove() operations which involve multiple calls over the wire to the database? It's a rhetorical question really.
3

findOne returns documents in natural order, which is not necessarily the same as insertion order. Natural order is the order in which documents appear on disk. It may look as if documents are retrieved in insertion order, but with deletes and inserts you will start seeing documents appear out of order.
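A toy model makes the difference concrete. The sketch below assumes a storage layer that reuses the first freed slot on insert; that policy is an assumption chosen for illustration, not a description of how any MongoDB storage engine actually places documents, and `NaturalOrderSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Toy storage model: slots on "disk", where a freed slot is reused by the next insert.
class NaturalOrderSketch {

    static final class Doc {
        final String name;
        final long created;

        Doc(String name, long created) {
            this.name = name;
            this.created = created;
        }
    }

    private final List<Doc> slots = new ArrayList<>();

    // Reuses the first free slot if there is one, otherwise appends.
    void insert(Doc doc) {
        int free = slots.indexOf(null);
        if (free >= 0) {
            slots.set(free, doc);
        } else {
            slots.add(doc);
        }
    }

    // Frees the slot holding the named document.
    void delete(String name) {
        for (int i = 0; i < slots.size(); i++) {
            Doc d = slots.get(i);
            if (d != null && d.name.equals(name)) {
                slots.set(i, null);
                return;
            }
        }
    }

    // Analogue of an unsorted findOne(): first document in slot order.
    Doc firstOnDisk() {
        for (Doc d : slots) {
            if (d != null) {
                return d;
            }
        }
        return null;
    }

    // Analogue of sorting by the created field and taking the first result.
    Doc oldest() {
        Doc best = null;
        for (Doc d : slots) {
            if (d != null && (best == null || d.created < best.created)) {
                best = d;
            }
        }
        return best;
    }
}
```

After inserting a, b, c, deleting a, and inserting d, the slot order is d, b, c: the unsorted lookup returns the newest document, while the explicit sort by created still returns b.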

One way to guarantee that documents always appear in insertion order is to use a capped collection. If your application is not affected by its restrictions, a capped collection might be the simplest way to implement a queue.

Capped collections can also be used with a tailable cursor, so that the logic retrieving items from the queue can keep waiting when no items are available to process.
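The main restriction to weigh is that a capped collection has a fixed size and overwrites its oldest documents once it is full. A rough in-memory sketch of that behaviour follows; the class `CappedCollectionSketch` is hypothetical and bounds the collection by document count, whereas a real capped collection is bounded by size in bytes, optionally with a document limit.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of a capped collection, bounded by document count for simplicity.
class CappedCollectionSketch<T> {

    private final int capacity;
    private final Deque<T> docs = new ArrayDeque<>();

    CappedCollectionSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Inserts at the end; when full, the oldest document is dropped first.
    void insert(T doc) {
        if (docs.size() == capacity) {
            docs.removeFirst();
        }
        docs.addLast(doc);
    }

    // Documents in insertion order, oldest first.
    List<T> inInsertionOrder() {
        return new ArrayList<>(docs);
    }
}
```

For a queue that must never drop a request, this means the collection has to be sized so that unprocessed requests are not overwritten before a worker reaches them.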

Update: If you cannot use a capped collection, you would have to sort the result by _id if it is an ObjectId, or keep a timestamp-based field in the collection and order the result by that field.


0

findOne returns documents using the $natural order within the internal MongoDB B-tree that exists behind the scenes.

The function does not sort by _id by default, nor will it pick the lowest _id.

If you find that it regularly returns the lowest _id, that is because of how the documents happen to be positioned in $natural order.

Getting the first document of the collection and the first document of a sorted set are two totally different things.

If you wanted to use findAndModify to grab a document off the pile (personally, I would recommend an optimistic lock instead), you would need to use:

    db.collection.findAndModify({
        sort: {_id: 1},   // ascending _id: oldest document first
        remove: true
    })

The reason I would not recommend this approach is that if the process crashes, or a server goes down in the distributed worker set, you have lost that data point. Instead, you want a temporary (optimistic-type) lock that can be released if the document has not been processed correctly.
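One way to picture such a temporary lock is a lease: a worker claims the oldest unlocked request for a limited time, removes it only after processing succeeds, and the request becomes claimable again if the lease expires first. The sketch below is an in-memory illustration under that assumption. `LeaseQueueSketch`, its `lockedUntil` field, and the explicit `now` parameter are all hypothetical names; in MongoDB the claim step would presumably be a findAndModify that updates a lock field instead of removing the document.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical lease-based queue; time is passed in explicitly to keep it deterministic.
class LeaseQueueSketch {

    static final class Job {
        final String payload;
        final long created;
        long lockedUntil; // 0 means the job has never been claimed

        Job(String payload, long created) {
            this.payload = payload;
            this.created = created;
        }
    }

    private final List<Job> jobs = new ArrayList<>();

    synchronized void enqueue(String payload, long created) {
        jobs.add(new Job(payload, created));
    }

    // Claims the oldest job whose lock is absent or expired, or returns null.
    synchronized Job claim(long now, long leaseMillis) {
        Job oldest = null;
        for (Job j : jobs) {
            boolean available = j.lockedUntil <= now;
            if (available && (oldest == null || j.created < oldest.created)) {
                oldest = j;
            }
        }
        if (oldest != null) {
            oldest.lockedUntil = now + leaseMillis;
        }
        return oldest;
    }

    // Called only after the job was processed successfully.
    synchronized void complete(Job job) {
        jobs.remove(job);
    }

    synchronized int size() {
        return jobs.size();
    }
}
```

If a worker crashes after `claim` and never calls `complete`, the job is not lost: once `now` passes `lockedUntil`, the next `claim` hands it out again.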

