
I'm having a problem with MongoDB in Java when adding documents with a custom _id field. When I insert a new document into the collection, I want it to be ignored if its _id already exists.

In the Mongo shell, collection.save() can be used in this case, but I cannot find an equivalent method in the MongoDB Java driver.

Just to add an example:

  • I have a collection of documents containing websites' information, with the URL as the _id field (which is unique).
  • I want to add some more documents. Some of those new documents might already exist in the current collection, so I want to add all the new documents except for the duplicates.
  • This can be achieved with collection.save() in the Mongo shell, but I can't find the equivalent method in the MongoDB Java driver.

Hopefully someone can share the solution. Thanks in advance!

2 Answers

1

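In the MongoDB Java driver, you could try using a BulkWriteOperation object obtained from the initializeUnorderedBulkOperation() method of the DBCollection object (the one that holds your collection). An ordered bulk operation stops at the first error, so the unordered variant is the one to use here. It is used as follows:

// Driver classes are from com.mongodb (legacy 2.x API); List and ArrayList are from java.util
MongoClient mongo = new MongoClient("localhost", port_number);
DB db = mongo.getDB("db_name");
DBCollection col = db.getCollection("collection_name");

List<DBObject> objectList = new ArrayList<>(); // Fill this list with your objects to insert
// Unordered: a failed insert does not stop the remaining ones
BulkWriteOperation operation = col.initializeUnorderedBulkOperation();

for (DBObject object : objectList) {
    operation.insert(object);
}

int inserted;
try {
    inserted = operation.execute().getInsertedCount();
} catch (BulkWriteException e) {
    // Thrown when at least one insert failed, e.g. on a duplicate _id
    inserted = e.getWriteResult().getInsertedCount();
}

With this method, each insert is attempted independently: documents with a duplicate _id fail with a duplicate key error, but the operation still continues with the rest of the documents. Because of those errors, execute() throws a BulkWriteException instead of returning normally. Either way, the getInsertedCount() method of the BulkWriteResult object tells you how many documents were really inserted.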

This can prove to be a bit inefficient if a lot of data is inserted this way, though. This is just sample code (found on journaldev.com and edited to fit your situation), so you may need to adapt it to your current configuration. It is also untested.
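To make the "keep going after a duplicate" behaviour concrete, here is a minimal in-memory sketch. It is not the MongoDB driver: a plain Map stands in for the collection (keyed by _id), and the names BulkInsertModel, Result and bulkInsert are hypothetical, chosen only for illustration. It assumes the usual bulk-write semantics, where an ordered run stops at the first failed insert and an unordered run attempts every document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for a collection keyed by _id. NOT the MongoDB driver.
class BulkInsertModel {

    /** Outcome of a simulated bulk insert. */
    static final class Result {
        final int insertedCount;
        final List<String> duplicateIds;

        Result(int insertedCount, List<String> duplicateIds) {
            this.insertedCount = insertedCount;
            this.duplicateIds = duplicateIds;
        }
    }

    /**
     * Inserts each {id, body} pair unless the id already exists.
     * ordered = true stops at the first duplicate; ordered = false skips it and continues.
     */
    static Result bulkInsert(Map<String, String> collection, List<String[]> docs, boolean ordered) {
        int inserted = 0;
        List<String> duplicates = new ArrayList<>();
        for (String[] doc : docs) {
            String id = doc[0];
            if (collection.containsKey(id)) {
                duplicates.add(id);      // models a duplicate key error
                if (ordered) {
                    break;               // ordered: remaining documents are not attempted
                }
                continue;                // unordered: move on to the next document
            }
            collection.put(id, doc[1]);
            inserted++;
        }
        return new Result(inserted, duplicates);
    }

    public static void main(String[] args) {
        Map<String, String> collection = new LinkedHashMap<>();
        collection.put("http://a.example", "existing page");

        List<String[]> docs = Arrays.asList(
                new String[] {"http://b.example", "new page B"},
                new String[] {"http://a.example", "duplicate of A"},
                new String[] {"http://c.example", "new page C"});

        Result result = bulkInsert(collection, docs, false);
        System.out.println("inserted=" + result.insertedCount
                + ", duplicates=" + result.duplicateIds);
    }
}
```

With the 3.x driver API, the same unordered behaviour should be reachable through insertMany(docs, new InsertManyOptions().ordered(false)), with the duplicate key failures reported in a MongoBulkWriteException. Treat that as a pointer to check against the driver documentation rather than tested code.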


3 Comments

Thanks for your answer. But I think it doesn't really solve my question. The situation here is that I want to add a list of new documents to an existing collection. Some of the new documents can have the same _id as a document already in the collection. What I want is to keep adding all the new documents while ignoring the ones with a duplicate _id.
@dumblebee Oh, looks like I misunderstood your question. I edited my answer. Thanks for the clarification.
Hi, thanks for the recommendation about using BulkWriteOperation. I think your usage example is using the older version of MongoDB Java Driver (2.x). But yes, I will figure it out with the latest version 3.2 :D
0

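I guess save is doing something like this: generate an _id if the document has none, then replace the document with that _id, inserting it if it does not exist yet.

import com.mongodb.client.MongoCollection
import com.mongodb.client.model.Filters
import com.mongodb.client.model.ReplaceOptions // driver 3.7+; older 3.x versions use UpdateOptions
import org.bson.Document
import org.bson.types.ObjectId

fun save(doc: Document, col: MongoCollection<Document>) {
    if (doc["_id"] == null) {
        doc["_id"] = ObjectId() // generate a new id only when none is set
    }
    // upsert = true inserts the document when no document with this _id exists
    col.replaceOne(Filters.eq("_id", doc["_id"]), doc, ReplaceOptions().upsert(true))
}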

Maybe they removed save so you decide how to generate the new id.
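One thing worth spelling out: a save-style upsert replaces an existing document with the same _id, which is not the same as ignoring the new one. The following in-memory sketch contrasts the two behaviours. It does not touch the driver; a Map stands in for the collection, and the names SaveVersusInsertModel, save and insertIfAbsent are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory model of two write strategies keyed by _id. NOT the MongoDB driver.
class SaveVersusInsertModel {

    /** Models a save-style upsert: the stored document is replaced, or inserted if absent. */
    static void save(Map<String, String> collection, String id, String body) {
        collection.put(id, body);
    }

    /**
     * Models "ignore duplicates": the document is stored only when the id is absent.
     * Returns true if it was inserted. Assumes stored bodies are never null.
     */
    static boolean insertIfAbsent(Map<String, String> collection, String id, String body) {
        return collection.putIfAbsent(id, body) == null;
    }

    public static void main(String[] args) {
        Map<String, String> collection = new HashMap<>();
        collection.put("http://a.example", "original");

        boolean inserted = insertIfAbsent(collection, "http://a.example", "ignored");
        System.out.println("inserted=" + inserted + ", stored=" + collection.get("http://a.example"));

        save(collection, "http://a.example", "overwritten");
        System.out.println("stored after save=" + collection.get("http://a.example"));
    }
}
```

If the goal is the second behaviour against a real collection, one option to look into is an upsert whose update document uses only $setOnInsert, so an existing document is left untouched. That is a suggestion to verify, not something stated in the answers here.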
