3

How is the foreign key implemented, for example, in PostgreSQL?

I noticed that a lot of hashing is involved in the creation of a foreign key, so I suppose that a hash based index is created on the foreign key column that references the primary key column. If so (for instance, when we want to remove a row from the referenced table - this one with the primary key or the so called master table) we can easily check if the row from the referenced table is actually referenced or not. What's more, probably DBMS requires that there is at least a B+ tree index on the referenced primary key column, because when we want to insert a new row to the referencing table, we can easily check if a row with the required primary key value exists in the referenced table. Some sources claim that a trigger is used to ensure the foreign key constraint.

2
  • 2
    A (foreign key) constraint is an abstract concept. Implementations may implement it anyway they want. In practice, for performance reasons it is almost necessary to add an index. For theoretical reasons it is required that the referenced flelds in the FK are at least an unique way to address a row in the target table. That in practice often implies an index. Commented Mar 15, 2015 at 23:57
  • Yes, I know that there is a b+tree index on the primary (or unique) key, but is there any additional structure (for example, a hash-based index) on the referencing column(s)? Commented Mar 17, 2015 at 9:02

2 Answers 2

2

In Postgres, the referenced column(s) in the master table need to have a UNIQUE or PRIMARY KEY constraint. Per documentation:

The referenced columns must be the columns of a non-deferrable unique or primary key constraint in the referenced table.

Both of which are currently always implemented with a btree index. So there is always a btree index on the referenced column(s):

The referencing column(s) do(es) not have to be indexed at all. If rows in the master table are never updated or deleted, this may be good enough. Else, the referencing column(s) should also be indexed, but that's not enforced by the system. This is just about performance optimization, not data integrity.

The actual implementation of the FK constraint itself is an entry in the system catalog pg_constraint, a special internal trigger and another entry in pg_depend.

Sign up to request clarification or add additional context in comments.

5 Comments

Thank you for your answer. As far as I know the b+tree index is created by default when a new primary key is defined (so this is a CPU intensive task because the rows have to be sorted before creating the b+tree in a bulk mode). I observed that a lot of hashing was involved in the creation of the FK in postgres, so I infer that probably an additional structure (maybe a hash-based index) is created on the referencing column(s) apart from the new entries in the pg_constraint and pg_depend.
@adam.cajf: No additional structure is created other than what I already listed. What you observe is the verification of relational integrity. Upon creation, Postgres has to establish that existing values obey the new constraint.
Yes, it makes sense to check if the values exist, but in this case the b+tree on the primary key should be used. So, why there is a lot of hashing?
@adam.cajf: Not sure what you mean. But an index would hardly be useful when visiting most of the table anyway. The index is probably not going to be used for this particular purpose.
What you probably saw was the implementation of an outer join between the primary key and the foreign key to detect foreign key entries not in the primary key.
0

I checked that, indeed, there is no index created by PostgreSQL for a foreign key (using this query: https://stackoverflow.com/a/25596855/1245175).

On the other hand, a few triggers are created for a foreign key:

test=# SELECT tgname AS trigger_name
  FROM pg_trigger
 WHERE tgname !~ '^pg_';
 trigger_name
--------------
(0 rows)

test=# ALTER TABLE LINEITEM ADD CONSTRAINT LINEITEM_FK1 FOREIGN KEY (L_ORDERKEY)  REFERENCES ORDERS;
ALTER TABLE
test=# SELECT tgname AS trigger_name                                                               
  FROM pg_trigger
 WHERE tgname !~ '^pg_';
         trigger_name        
------------------------------
 RI_ConstraintTrigger_a_16419
 RI_ConstraintTrigger_a_16420
 RI_ConstraintTrigger_c_16421
 RI_ConstraintTrigger_c_16422

So, I suppose that during the foreign key creation in PostgreSQL, a hash map is created for the referenced table and then a probing is executed for each row of the referencing table.

Interestingly enough, MonetDB creates indexes of different types for primary and foreign keys (probably join-index and hash-index, respectively).

sql>select * from sys.idxs;
+------+----------+------+-------------+
| id   | table_id | type | name        |
+======+==========+======+=============+
| 6467 |     6446 |    0 | orders_pk   |
| 6470 |     6464 |    1 | lineitem_fk |
+------+----------+------+-------------+
2 tuples (3.921ms)

What's more, Oracle enforces primary key constraints using indexes and by default it does not create any index for a foreign key, however, there are some foreign key indexing tips: https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:292016138754

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.