I have been trying to do a write operation of about half a million records on my SQLite DB. For context, I was working on a project that initially wasn't expected to hold a lot of data, so I went ahead with SQLite, which held up well until more data started coming in and write performance degraded badly. It now takes ages to insert half a million records. By the way, I am using C# code in a single thread to insert the data. Any suggestions to improve write performance would be much appreciated.
1 Answer
Any time you do more than a trivial number of writes, you need to surround all of them with a transaction.
If you don't do this, SQLite wraps each individual INSERT in its own transaction, and this is what is slowing you down. If you use your own transaction, SQLite uses that one and does not create transactions for each INSERT.
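As a rough illustration, here is a minimal sketch in C#, assuming the System.Data.SQLite provider; the connection string, table name, columns, and row type are placeholders for your own schema, not anything from the question:

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SQLite;

class BulkInsertExample
{
    // Inserts all rows inside a single transaction so SQLite syncs to disk
    // once at Commit(), instead of once per INSERT.
    static void InsertAll(string connectionString, IEnumerable<(string Name, int Value)> rows)
    {
        using (var conn = new SQLiteConnection(connectionString))
        {
            conn.Open();

            using (var tx = conn.BeginTransaction())
            using (var cmd = conn.CreateCommand())
            {
                cmd.Transaction = tx;
                // Hypothetical table and columns -- replace with your schema.
                cmd.CommandText = "INSERT INTO records (name, value) VALUES (@name, @value)";
                var pName = cmd.Parameters.Add("@name", DbType.String);
                var pValue = cmd.Parameters.Add("@value", DbType.Int32);

                foreach (var row in rows)
                {
                    pName.Value = row.Name;
                    pValue.Value = row.Value;
                    cmd.ExecuteNonQuery();   // runs inside the open transaction
                }

                tx.Commit();   // one commit for the whole batch
            }
        }
    }
}
```

Preparing the command once and reusing the parameters inside the loop also avoids re-parsing the SQL for each of the half-million rows.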
6 Comments
Marc Gravell
I'm genuinely curious: do you know if it does any better now that SSDs are pretty normal?
asherber
Good question – I don't know the answer.
SineR
@asherber Thanks for the information. I have one doubt about using a transaction, though: what happens to the entire transaction when constraints prevent a few records from being inserted? And how do we work around such a situation?
asherber
@user1693597 When you say 'constraints', do you mean errors? If there are any errors inserting, then the whole transaction will roll back and no records will be inserted – that's the definition of what a transaction does. Depending on your use case, it may make sense to wrap smaller groups of INSERTs into transactions, say a transaction every 1,000 records, so that when one INSERT fails it doesn't take all the others with it (see the sketch after these comments). Just keep in mind that the number of transactions limits speed far more than the number of INSERTs.
Franck
@MarcGravell For a bulk insert of around 800-900 rows I have seen a huge difference on an SSD over my old 7200 rpm drive. For small normal transactions of 2-3 inserts I haven't seen any difference, but the quantity of data is so low that the user cannot tell the difference between 20 ms and 2 ms. The bulk insert was around 5 seconds and is now instantaneous, and that is visible to the user. (Update: I am using SQLite 3.0.)
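A sketch of the batched approach described in the comments, under the same assumptions as above (System.Data.SQLite, placeholder table and columns): each batch gets its own transaction, so a constraint violation only rolls back that batch.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SQLite;

class BatchedInsertExample
{
    // Commits every 1,000 rows so a failed INSERT only rolls back its own
    // batch instead of the entire half-million-row load.
    static void InsertInBatches(SQLiteConnection conn, IList<(string Name, int Value)> rows, int batchSize = 1000)
    {
        for (int start = 0; start < rows.Count; start += batchSize)
        {
            using (var tx = conn.BeginTransaction())
            using (var cmd = conn.CreateCommand())
            {
                cmd.Transaction = tx;
                // Hypothetical table and columns -- replace with your schema.
                cmd.CommandText = "INSERT INTO records (name, value) VALUES (@name, @value)";
                var pName = cmd.Parameters.Add("@name", DbType.String);
                var pValue = cmd.Parameters.Add("@value", DbType.Int32);

                try
                {
                    int end = Math.Min(start + batchSize, rows.Count);
                    for (int i = start; i < end; i++)
                    {
                        pName.Value = rows[i].Name;
                        pValue.Value = rows[i].Value;
                        cmd.ExecuteNonQuery();
                    }
                    tx.Commit();      // one commit per batch of 1,000
                }
                catch (SQLiteException)
                {
                    tx.Rollback();    // only this batch is lost
                }
            }
        }
    }
}
```

The batch size is a trade-off: larger batches mean fewer commits (faster overall), but a single failure discards more rows.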