4

I have a large database of elements each of which has unique key. Every so often (once a minute) I get a load more items which need to be added to the database but if they are duplicates of something already in the database they are discarded.

My question is - is it better to...:

  1. Get Django to give me a list (or set) of all of the unique keys and then, before trying to add each new item, check if its key is in the list or,
  2. have a try/except statement around the save call on the new item and reply on Django catching duplicates?

Cheers, Jack

1 Answer 1

1

If you're using MySQL, you have the power of INSERT IGNORE at your finger tips and that would be the most performant solution. You can execute custom SQL queries using the cursor API directly. (https://docs.djangoproject.com/en/1.9/topics/db/sql/#executing-custom-sql-directly)

If you are using Postgres or some other data-store that does not support INSERT IGNORE then things are going to be a bit more complicated.

In the case of Postgres, you can use rules to essentially make your own version of INSERT IGNORE.

It would look something like this:

CREATE RULE "insert_ignore" AS ON INSERT TO "some_table"
WHERE EXISTS (SELECT 1 FROM some_table WHERE pk=NEW.pk) DO INSTEAD NOTHING;

Whatever you do, avoid the "selecting all rows and checking first approach" as the worst-case performance is O(n) in Python and essentially short-circuits any performance advantage afforded by your database since the check is being performed on the app machine (and also eventually memory-bound).

The try/except approach is marginally better than the "select all rows" approach but it still requires constant hand-off to the app server to deal with each conflict, albeit much quicker. Better to make the database do the work.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.