I'm writing a program that runs a batch process over hundreds of thousands of entities of a few related types. Originally I used a single transaction per persist, which seemed very slow, so I switched to somewhat naive batch updates as described in http://docs.jboss.org/hibernate/core/3.3/reference/en/html/batch.html, with longer transactions and an occasional flush+clear. Now I'm running into a ConstraintViolationException for some of my entity types, because they have unique field constraints. I'm unsure how to check for existing instances: I currently use a Criteria query to list collisions, but it does not seem to return entities that I have passed to saveOrUpdate within the same transaction.
A made-up example may help:
Entities: Family, Person, Name.
A Family has many Persons (one-to-many).
A Person has many Names, and different Persons can have the same Name (many-to-many).
My updates include persisting a Family along with its Persons and Names, but I'm not sure how to dedupe Names (a Name may collide with an existing Name in the db or with another Name in the same update batch). I could just keep track of new entities' unique constraint fields outside of Hibernate, but I suspect that should not be necessary. Is there any built-in way of checking for duplicates both in the db and among uncommitted changes? I saw "Hibernate batch updates with constraintviolationexception", but I would rather not use exceptions in the normal code path. Thanks, I appreciate any guidance.
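To make the "keep track outside of Hibernate" fallback concrete, here is a minimal sketch of what that bookkeeping might look like. It is plain Java with no Hibernate calls; the class name `UniqueKeyRegistry`, its methods `resolve` and `clear`, and the `databaseLookup` function are all hypothetical names for illustration, and it assumes Java 8+ for `java.util.function.Function`. In real use, `databaseLookup` would presumably wrap the existing Criteria query on the unique field.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical helper (not part of Hibernate): remembers which unique keys
 * have already been seen in the current batch and falls back to a database
 * lookup for keys it has not seen yet.
 */
final class UniqueKeyRegistry<K, E> {

    // Keys resolved since the last clear(), mapped to the instance to reuse.
    private final Map<K, E> seenInBatch = new HashMap<>();

    // Assumption: returns the already-persisted entity for a key, or null.
    private final Function<K, E> databaseLookup;

    UniqueKeyRegistry(Function<K, E> databaseLookup) {
        this.databaseLookup = databaseLookup;
    }

    /**
     * Returns the instance that should be used for the given unique key:
     * one already seen in this batch, else one found by the lookup,
     * else the candidate itself (which is then remembered).
     */
    E resolve(K key, E candidate) {
        E known = seenInBatch.get(key);
        if (known != null) {
            return known;
        }
        E persisted = databaseLookup.apply(key);
        E winner = (persisted != null) ? persisted : candidate;
        seenInBatch.put(key, winner);
        return winner;
    }

    /** Forget everything seen so far, e.g. right after a flush+clear. */
    void clear() {
        seenInBatch.clear();
    }
}
```

The idea would be to call `resolve(name.getValue(), name)` (with `getValue()` standing in for whatever the unique field's getter is) before attaching a Name to a Person, and to call `clear()` whenever the session is flushed and cleared, since instances held in the map become detached at that point and the flushed rows should then be visible to queries in the same transaction.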