I'm having trouble getting Hibernate to perform a bulk insert on MySQL.

I'm using Hibernate 3.3 and MySQL 5.1.

At a high level, this is what's happening:

@Transactional
public Set<Long> doUpdate(Project project, IRepository externalSource) {
    List<IEntity> entities = externalSource.loadEntities();
    buildEntities(entities, project);
    persistEntities(project);
    return null; // actual return value omitted from this high-level sketch
}

public void persistEntities(Project project) {
    projectDAO.update(project);
}

This results in n log entries (1 for every row) as follows:

Hibernate: insert into ProjectEntity (name, parent_id, path, project_id, state, type) values (?, ?, ?, ?, ?, ?)

I'd like to see this get batched, so the update is more performant. It's possible that this routine could result in tens-of-thousands of rows generated, and a db trip per row is a killer.

Why isn't this getting batched? (It's my understanding that Hibernate is supposed to batch inserts by default where appropriate.)

3 Answers

Pascal's answer is correct. However, because you are using MySQL, I also highly recommend that you try using the rewriteBatchedStatements=true parameter in your JDBC URL.

This parameter causes the JDBC driver to dynamically re-write your INSERT batches to use a single "multi-valued" INSERT, e.g.:

INSERT INTO mytable (mycol) VALUES (0);
INSERT INTO mytable (mycol) VALUES (1);
INSERT INTO mytable (mycol) VALUES (2);

will be re-written to:

INSERT INTO mytable (mycol) VALUES (0), (1), (2);

This can make a significant difference in some cases. See http://www.jroller.com/mmatthews/entry/speeding_up_batch_inserts_for for some example measurements.
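For reference, enabling it is just a matter of appending the parameter to the connection URL. A minimal sketch, assuming Hibernate-managed connection properties (host, port, and database name below are placeholders):

```properties
# Illustrative settings; only rewriteBatchedStatements is the point here.
hibernate.connection.url=jdbc:mysql://localhost:3306/mydb?rewriteBatchedStatements=true
```

Note that the rewrite only kicks in when the driver actually receives a JDBC batch, so this works in tandem with the hibernate.jdbc.batch_size setting from Pascal's answer, not instead of it.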

As documented in Chapter 13. Batch processing of the Hibernate reference documentation:

If you are undertaking batch processing you will need to enable the use of JDBC batching. This is absolutely essential if you want to achieve optimal performance. Set the JDBC batch size to a reasonable number (10-50, for example):

hibernate.jdbc.batch_size 20

Hibernate disables insert batching at the JDBC level transparently if you use an identity identifier generator.

Don't forget to flush and then clear the session regularly, or you'll get an OutOfMemoryError, as documented in 13.1. Batch inserts.

But IMO, for tens-of-thousands of rows, you should consider using the StatelessSession interface.
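To illustrate the flush/clear pattern the docs describe, here is a sketch (it assumes Hibernate 3.x on the classpath, an open SessionFactory, and a hypothetical Customer entity, so it is not runnable as-is):

```java
// Periodic flush/clear pattern from Chapter 13 of the Hibernate reference.
// Customer and its constructor are hypothetical stand-ins for your entity.
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
for (int i = 0; i < 100000; i++) {
    Customer customer = new Customer("customer " + i);
    session.save(customer);
    if (i % 20 == 0) { // 20: same value as hibernate.jdbc.batch_size
        session.flush(); // push the current batch of inserts to the database
        session.clear(); // detach the inserted entities so memory stays bounded
    }
}
tx.commit();
session.close();
```

With a StatelessSession the loop becomes statelessSession.insert(customer) and the flush/clear dance goes away, because a stateless session has no first-level cache to fill up.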

2 Comments

Will this enable batching of SQL statements? (Whether the session keeps state sounds rather orthogonal to batching, but perhaps I infer too much from the class name.)
The part about the identifier generator is a little cryptic - this thread explains it: forum.hibernate.org/viewtopic.php?p=2436697
Pascal has pretty much nailed it in the Hibernate context. As an alternative, you can use Spring's JdbcTemplate with BatchSqlUpdate. However, I have to warn you that Hibernate-cached instances may not reflect changes made this way. In our project we had to take precautionary measures to overcome this, by running it on a separate schedule (another problem created, but one within our control).
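To sketch what the Spring JDBC route looks like (using JdbcTemplate#batchUpdate rather than BatchSqlUpdate for brevity; the accessors on IEntity are hypothetical, and the column list from the question is abbreviated):

```java
// Batched insert via Spring's JdbcTemplate, bypassing the Hibernate session.
// Assumes a configured DataSource and a List<IEntity> named entities in scope.
JdbcTemplate jdbcTemplate = new JdbcTemplate(dataSource);
jdbcTemplate.batchUpdate(
    "insert into ProjectEntity (name, parent_id, project_id) values (?, ?, ?)",
    new BatchPreparedStatementSetter() {
        public void setValues(PreparedStatement ps, int i) throws SQLException {
            IEntity e = entities.get(i); // hypothetical accessors below
            ps.setString(1, e.getName());
            ps.setLong(2, e.getParentId());
            ps.setLong(3, e.getProjectId());
        }
        public int getBatchSize() {
            return entities.size();
        }
    });
```

Because this bypasses the Session entirely, any already-loaded Hibernate instances of these rows will be stale; that is the caching caveat mentioned above.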
