You should not override them. It is wrong from the perspective of what an entity is supposed to be.
Why is it wrong to compare entities?
A hibernate-entity (an instance of an entity-class) is your program's representation of an entry in the database (a 'row', in the simplest case of a single SQL table). This row exists outside of your program logic and memory. It is similar to an object representing a 'file' or a 'thread': it is not contained in your program, it is a proxy to something outside of your program.
At what point does it make sense to say that two file-objects are the same? When
they represent the same file. But how often do you even compare file-objects?
You don't, because the same file should not have two different representations
in the same context.
In this way, the hibernate-concept of an "Entity" matches the Domain-Driven-Design (DDD) concept of an entity, which is probably no accident. You can read up on it on SO:
What is an Entity?
Difference Entity vs ValueObject
But isn't an equals/hashcode-contract helpful for disambiguating entities?
Hibernate aims to preserve the abstraction of the entity representing your row.
Even if you read the same row from
two queries in the same session, it will return the same object again.
You can only get two different java-objects representing the same database-row
if you detach/reattach or use multiple sessions.
This is not a standard use case of hibernate, but it can certainly happen, for example when multithreading.
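A minimal sketch of both situations, assuming a configured SessionFactory and a mapped Person entity with id 3 (the setup and the Person mapping are omitted here):

```java
import org.hibernate.Session;
import org.hibernate.SessionFactory;

public class TwoRepresentations {
    static void demo(SessionFactory sessionFactory) {
        try (Session first = sessionFactory.openSession()) {
            Person a = first.get(Person.class, 3L);
            Person b = first.get(Person.class, 3L);
            System.out.println(a == b); // true: one session, one object per row

            try (Session second = sessionFactory.openSession()) {
                Person c = second.get(Person.class, 3L);
                System.out.println(a == c); // false: two sessions, two java-objects
                                            // representing the same database row
            }
        }
    }
}
```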
So imagine your program finds itself with these two objects:
Person(id=3, name='Sarah', occupation='programmer')
Person(id=3, name='Sarah', occupation='manager')
Which occupation should you continue with for Sarah? Is this an error? Does one version 'win'?
You cannot decide from these two objects alone; the context of what you are doing matters.
And the context to decide this is not included in a HashMap or an entity-class,
it is in the code where these two appear.
If you were to define some equality on Person and throw these two objects into
the same HashSet or whatever, then one of two things would happen:
- 'first-save wins': one version of Sarah gets discarded based on ordering, which is probably an implementation detail of your code
- you end up with two conflicting versions of Sarah in your datastructure
Both are bad. Notice that it doesn't matter how the equality is defined; any general equality would lead to these results. The mistake is not the specific definition of the equality, the mistake is to leave disambiguation to a part of the code that does not have the right context to decide it.
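To make the first outcome concrete, here is a self-contained sketch; the id-based equality is just one hypothetical choice, any definition produces one of the two outcomes above:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical Person with id-based equality, to show the silent 'first-save wins'.
record Person(long id, String name, String occupation) {
    @Override public boolean equals(Object o) {
        return o instanceof Person p && p.id == id;
    }
    @Override public int hashCode() { return Long.hashCode(id); }
}

public class FirstSaveWins {
    public static void main(String[] args) {
        Set<Person> people = new HashSet<>();
        people.add(new Person(3, "Sarah", "programmer"));
        // The set considers the second Sarah a duplicate and silently drops it.
        boolean added = people.add(new Person(3, "Sarah", "manager"));
        System.out.println(added);  // false
        System.out.println(people); // [Person[id=3, name=Sarah, occupation=programmer]]
    }
}
```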
Why are the given approaches wrong?
Three approaches are given in the answers on this site:

1. base equality on a business key
2. have hashcode always return the same value for all objects of the class
3. base equality on ID
Number 1 might actually work, but bear in mind that business keys are immutable in most, but not all, use cases - the classic example is a user (entity) changing their email address (business key). And even then it does not help you resolve the ambiguity described above; having an equality that you shouldn't be using is not useful.
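For reference, a minimal sketch of this approach, assuming email as the business key (the class and names are illustrative, not taken from any of the cited answers):

```java
import java.util.Objects;

// Approach 1: equality based on a business key (here: email).
// Hazardous when the key can change, e.g. a user updating their email address
// while the object sits in a hash-based collection.
public class User {
    private Long id;        // database identity, deliberately ignored for equality
    private String email;   // business key

    public User(String email) { this.email = email; }

    public void setEmail(String email) { this.email = email; } // mutates the key!

    @Override public boolean equals(Object o) {
        return o instanceof User u && Objects.equals(email, u.email);
    }
    @Override public int hashCode() { return Objects.hashCode(email); }
}
```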
Number 2 completely defeats the purpose of hash-based datastructures. It is plain silly to do this: your HashSets decay into linked lists. You are not hashing anything, so you cannot have a hash set.
I think a lot of people (myself included) did not notice this flaw in the provided implementations because the code looks so careful and advanced (taking care of Proxies and whatnot).
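A quick, self-contained demonstration of the decay (the class is hypothetical; the constant 42 stands in for any fixed hash value):

```java
import java.util.HashSet;
import java.util.Set;

// A constant hashCode puts every element into the same bucket, so lookups
// fall back to equals() comparisons over the whole bucket instead of O(1) hashing.
class ConstantHashEntity {
    private final long id;
    ConstantHashEntity(long id) { this.id = id; }

    @Override public boolean equals(Object o) {
        return o instanceof ConstantHashEntity e && e.id == id;
    }
    @Override public int hashCode() { return 42; } // the "always same value" advice

    public static void main(String[] args) {
        Set<ConstantHashEntity> set = new HashSet<>();
        for (long i = 0; i < 10_000; i++) set.add(new ConstantHashEntity(i));
        long start = System.nanoTime();
        set.contains(new ConstantHashEntity(9_999)); // scans the single bucket
        System.out.println((System.nanoTime() - start) / 1_000 + " µs");
    }
}
```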
Number 3 works best, as the ID is the property that actually matches your object to its database-counterpart. It does not work with unpersisted entities, so it is limited, and it, too, does not solve the problems mentioned above: you always need custom disambiguation logic if you have two java-objects representing the same database-row.
The approach works better if null IDs (of not-yet-persisted entities) are not a problem. It is possible to generate IDs in the application rather than the database and simply never face the situation that an ID field changes. Read the excellent article Don't let hibernate steal your identity to see how this solves the problem.
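A minimal sketch of that pattern (mapping annotations omitted; the details depend on your Hibernate setup, so take this as an assumption-laden outline rather than the article's exact code):

```java
import java.util.Objects;
import java.util.UUID;

// The application assigns the ID at construction time, so it is never null
// and never changes - id-based equality is then safe even before persisting.
public class Person {
    private UUID id = UUID.randomUUID(); // generated by the application, not the database

    public UUID getId() { return id; }

    @Override public boolean equals(Object o) {
        return o instanceof Person p && Objects.equals(id, p.getId());
    }
    @Override public int hashCode() { return id.hashCode(); }
}
```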
What should I do instead?
If you feel the need to hash your entities, by using them as keys in a map or as
elements in a set, consider these alternatives:
- Use identity-based collections. The JDK offers IdentityHashMap, and an identity-based Set can be built with Collections.newSetFromMap(new IdentityHashMap<>()) (there is no IdentityHashSet class in the JDK; Guava provides Sets.newIdentityHashSet()). These determine equality with reference equality (==), not .equals() - see the sketch after this list.
- Extract business keys or IDs and use them as keys, as also shown below. Any Set<Person> can be a Map<PersonKey, Person>, and any Map<Person, Something> can be a Map<PersonKey, Something>.
- Just use a List<Person>; unlike sorted or hashed containers, it does not restrict the mutability of its elements.
- If you frequently store the same extra information alongside your entities in a Map<Person, Extra>, make Extra a transient field on Person.
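A sketch of the first two alternatives (the Person stand-in and the choice of the ID as map key are illustrative):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;

public class Alternatives {
    // Minimal stand-in for an entity; note: no equals()/hashCode() override.
    static class Person {
        final long id;
        Person(long id) { this.id = id; }
    }

    public static void main(String[] args) {
        Person sarah = new Person(3);

        // Alternative 1: identity-based collection, equality is reference (==).
        Set<Person> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        seen.add(sarah);

        // Alternative 2: key the map by an extracted value, not by the entity.
        // A Set<Person> becomes a Map<Long, Person>; the entity is only a value.
        Map<Long, Person> byId = new HashMap<>();
        byId.put(sarah.id, sarah);
    }
}
```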
Is this problem exclusive to Hibernate?
There is a very general problem at play here: you cannot modify an object with respect to its hash value while it is in a hash-based datastructure (a similar rule holds for sorting-based datastructures). Java, like many other languages, has no compiler rules to help you here; you have to ensure this yourself, otherwise your objects get 'lost'. No hibernate here.
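A pure-Java demonstration, with a hypothetical email-based equality:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Mutating a field that feeds hashCode() while the object sits in a HashSet
// 'loses' the object: lookups search the wrong bucket. No Hibernate involved.
public class LostInTheSet {
    static class User {
        String email;
        User(String email) { this.email = email; }
        @Override public boolean equals(Object o) {
            return o instanceof User u && Objects.equals(email, u.email);
        }
        @Override public int hashCode() { return Objects.hashCode(email); }
    }

    public static void main(String[] args) {
        Set<User> users = new HashSet<>();
        User sarah = new User("sarah@old.example");
        users.add(sarah);

        sarah.email = "sarah@new.example"; // hash changes while inside the set

        System.out.println(users.contains(sarah)); // false - wrong bucket searched
        System.out.println(users.size());          // 1 - the object is still in there
    }
}
```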
Where hibernate comes into play is that it may assign an ID to your object when it saves it, so mutations may happen at places where a beginner may not expect them - but they still happen in a controlled manner.
Making broken and wrong code bearably predictable by silently converting all HashSets into linked lists is an incredibly duct-tapey solution that should not be the recommendation given to people asking how to properly write Entity-classes.