10

Which index will perform better for foreign keys of type integer in postgresql 9.3?

I would assume a hash index, because foreign key comparisson are always made with =

Or does a btree compare as fast as a hash when used for JOINS on foreign keys?

Because in postgresql primary keys use btree's that would suggest they are also better for foreign keys.

0

2 Answers 2

10

From the manual On PostgreSQL 9.3:

Caution Hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash if there were unwritten changes. Also, changes to hash indexes are not replicated over streaming or file-based replication after the initial base backup, so they give wrong answers to queries that subsequently use them. For these reasons, hash index use is presently discouraged.

There is also no proof that an hash index has any performance benefits over a btree.

Sign up to request clarification or add additional context in comments.

5 Comments

Very strange. You'd think a hash index would be much faster, but numerous benchmarks have shown that it's only slightly faster at best. Hash is much faster than tree-based for typical in-memory data structures (e.g. Java's HashMap vs TreeMap). Any reason it's slow in Postgres, or is it just because Postgres's hash index implementation is not well-maintained?
This is no longer relevant in postgresql 10+ postgresql.org/docs/10/sql-createindex.html
@sudo big trees have much better data locality, while hash tables are randomly distributed. The cache performance makes a big difference especially when reading from disk.
IIRC even in Postgres 10+ switching my indexes to hash didn't show any improvement. I forget what the situation was. Data locality... I don't understand why that'd help during equality checks unless one side is sorted.
@sudo if one side is a btree index scan, then yes it will be sorted
3

It depends.

If you're not doing any queries using the foreign key, then you don't need any index. The referential integrity is enforced using the index of the referenced primary key.

The question of which index type to use (if any) is thus the same for any column(s). If your queries would benefit from an index for a = comparison, and you're using PostgreSQL 10 or newer, then a HASH index is a reasonable choice. If the same column is involved in any ordering operations (ORDER BY, <, >=, etc.) then you may as well use a BTREE.

If you are concerned about the relative performance, then you'll need to test them yourself, using your own data distribution and query load. A tree index may still perform better than a hash, due to data access locality (sequential vs. random).

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.