I take the text from a search box and insert it into MongoDB as a document using a regular insert query. For the word "cancer", the data is stored in the collection in the following format, with a unique "_id":

{
  "_id": {
    "$oid": "553862fa49aa20a608ee2b7b"
  },
  "0": "c",
  "1": "a",
  "2": "n",
  "3": "c",
  "4": "e",
  "5": "r"
}

Each document holds a single word in the format shown above, and I have many such documents. I want to remove the duplicate documents from the collection, but I cannot figure out how. How can I do this?

  • Does stackoverflow.com/questions/14184099/… help? Or stackoverflow.com/questions/13190370/…? Commented Apr 23, 2015 at 9:08
  • No, Sourabh. Here, I am confused about why each letter of the word is stored as a separate value. Commented Apr 23, 2015 at 9:12
  • Normally you would do this by making the word the key, since that is unique. Commented Apr 23, 2015 at 9:15
  • Now I have a large number of duplicate documents with the same word. How can I remove them? Commented Apr 23, 2015 at 9:21
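
One plausible cause of the per-letter fields, assuming the insert is issued from JavaScript (the question does not say which client is used): when a bare string is passed where an object is expected, its characters can end up copied as index-keyed properties. The sketch below only reproduces that shape with Object.assign and contrasts it with wrapping the string under a named field, in the spirit of the comment about making the word the key. The field name `word` and the variable names are hypothetical.

```javascript
// Assumption: the search text reaches the insert call as a bare string.
const searchText = "cancer";

// Copying a string's own enumerable properties yields one field per character,
// which matches the "0", "1", "2", ... layout shown in the question.
const accidentalDoc = Object.assign({}, searchText);

// Wrapping the string under a named field (hypothetical name `word`)
// keeps the word whole and easy to index or compare.
const intendedDoc = { word: searchText };

console.log(accidentalDoc);
console.log(intendedDoc);
```

If the documents were stored in the second shape, a single unique index on that one field would be enough to prevent further duplicates.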

1 Answer

An easy solution in the mongo shell:

use your_db
// List every letter position, from '0' up to the longest word you expect (six positions shown here).
db.your_collection.createIndex({'0': 1, '1': 1, '2': 1, '3': 1, '4': 1, '5': 1}, {unique: true, dropDups: true, sparse: true, name: 'dropdups'})
db.your_collection.dropIndex('dropdups')

Notes:

  • If you have many documents, expect this procedure to take a very long time.
  • Be careful: this removes documents in place, so clone your collection first and try it there.
  • The dropDups option was removed in MongoDB 3.0, so this only works on earlier versions.
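
Where the dropDups option is not available, the duplicates can instead be identified on the client and deleted by `_id`. The following is a minimal sketch of that alternative, not the method from the answer above. The helper names `wordOf` and `findDuplicateIds` are hypothetical, and the sketch assumes every document uses consecutive keys "0", "1", "2", ... as in the question.

```javascript
// Rebuild the word from the index-keyed fields "0", "1", "2", ...
// Stops at the first missing position.
function wordOf(doc) {
  let word = "";
  for (let i = 0; Object.prototype.hasOwnProperty.call(doc, String(i)); i++) {
    word += doc[String(i)];
  }
  return word;
}

// Return the _id of every document whose word has already been seen,
// so the first document for each word is kept.
function findDuplicateIds(docs) {
  const seen = new Set();
  const duplicates = [];
  for (const doc of docs) {
    const word = wordOf(doc);
    if (seen.has(word)) {
      duplicates.push(doc._id);
    } else {
      seen.add(word);
    }
  }
  return duplicates;
}

// Possible use in a recent mongo shell (loads the whole collection into memory):
//   var ids = findDuplicateIds(db.your_collection.find().toArray());
//   db.your_collection.deleteMany({_id: {$in: ids}});
```

As with the index-based approach, this deletes documents in place, so it is safer to try it on a copy of the collection first. For collections too large to hold in memory, the same grouping would need to be done in batches or on the server.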