I have a partitioned table (call it A) with a serial primary key that is referenced by another table (call it B). I know that I can't actually create a foreign key from one to the other (since I don't know in which partition the data is actually stored), so instead I am attempting to mimic the behavior of a foreign key using check constraints. Something like the following:

CREATE TABLE A (
    MyKey SERIAL PRIMARY KEY
);

CREATE TABLE B (
    AKey INT  -- Should have: REFERENCES A (MyKey),
              -- but can't due to Postgres limitations
);

CREATE TABLE APart1 (
    Field1 INT,
    PRIMARY KEY (MyKey)
) INHERITS (A);

CREATE TABLE APart2 (
    Field2 INT,
    PRIMARY KEY (MyKey)
) INHERITS (A);

CREATE FUNCTION ValidateKeyInA(aKey INT) RETURNS BOOL AS $$
    BEGIN
        PERFORM * FROM A WHERE MyKey = aKey;
        IF FOUND THEN
            RETURN TRUE;
        END IF;
        RETURN FALSE;
    END;
$$ LANGUAGE PLPGSQL;

ALTER TABLE B ADD CHECK (ValidateKeyInA(AKey));

WITH aKey AS (INSERT INTO APart1 (Field1) VALUES (1) RETURNING MyKey)
INSERT INTO B (AKey) SELECT * FROM aKey;

WITH aKey AS (INSERT INTO APart2 (Field2) VALUES (2) RETURNING MyKey)
INSERT INTO B (AKey) SELECT * FROM aKey;

This works just fine, until I go to dump and restore the database. At that point, Postgres doesn't know that table B depends on the data in table A (and its partitions), and B happens to be dumped prior to table A. I tried to add the "DEFERRABLE" keyword to the line where I add the constraint, but Postgres doesn't support deferrable check constraints.

My proposed approach is to convert my check constraint to a constraint trigger, which I CAN defer, and then import my database dump in a transaction. Is there a more straightforward approach to this? For example, is there some way for me to tell Postgres not to dump table B until table A and all of its partitions have been dumped (e.g., add dependencies from B to the partitions of A)? Is there some other pattern that I should be using instead? Thank you.

  • I don't understand how you implemented table partitioning. Which one is the master table? Why don't you use inheritance? By the way, your check function could be simpler: CREATE FUNCTION ValidateKeyInA(aKey INT) RETURNS BOOL AS $$ SELECT count(*) > 0 FROM A WHERE MyKey = aKey; $$ LANGUAGE sql; Commented Mar 3, 2015 at 7:22
  • @TommasoDiBucchianico I edited my question to explicitly show how I am creating the partitions of A. I was unaware that writing a single select statement in a PLPGSQL function would implicitly return the result. Thanks for the advice (I learn something new almost every time I visit the SO page). Commented Mar 3, 2015 at 12:51
  • Only SQL functions implicitly return the result of a query. My function is written in SQL, not PL/pgSQL. Commented Mar 3, 2015 at 12:59
  • pg_dump dumps the tables in alphabetical order, so table A is actually dumped before table B. In your example the order in the dump should be: A, apart1, apart2, B. Can you name your tables so that the alphabetical order corresponds to the "functional" order? Commented Mar 3, 2015 at 19:22
  • @TommasoDiBucchianico Can you cite your source for that? I found this thread, which indicates that not only is the order not alphabetical, but it is in fact non-deterministic (referring to Postgres v8.4.2). I looked through the pg_dump documentation for 9.3 here, and also couldn't find where any ordering is guaranteed. Commented Mar 3, 2015 at 21:06

2 Answers

pg_dump automatically sorts the tables alphabetically (see my comment above). However, if you want to change the order in which the tables are dumped and restored but cannot rename your tables to match the desired order, you can use the --use-list option of pg_restore. See http://www.postgresql.org/docs/9.3/static/app-pgrestore.html

With the --use-list option, pg_restore lets you control the order in which the database elements are restored.

First, dump the database in an archive format such as the custom format (-Fc); a plain-text SQL dump cannot be restored with pg_restore:

pg_dump -Fc -f database.dump your_database

Then generate a file that lists all elements in the dump:

pg_restore --list database.dump > backup.txt

The file backup.txt will be used as input for the pg_restore option --use-list, but first you can edit the file and reorder its lines by cutting and pasting. You can change the order of table creation and of data loading independently. Take care that the list remains consistent. You can also delete lines entirely, or comment them out with a leading semicolon, to exclude elements from the restore.

Finally, restore your dump with the option --use-list:

pg_restore -d your_database --use-list backup.txt database.dump

I tested this procedure with your example, swapping the order of tables A and B. If table A is restored first, the dump is restored without errors. If B is restored first, the restore fails as expected with the following error:

pg_restore: [archiver (db)] COPY failed for table "b": ERROR: new row for relation "b" violates check constraint "b_akey_check"
DETAIL: Failing row contains (1).
CONTEXT: COPY b, line 1: "1"
WARNING: errors ignored on restore: 1
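
The manual reordering of backup.txt could also be scripted, so the list does not have to be edited by hand after every dump. The following is only a minimal sketch: the function name reorder_list is hypothetical, and it assumes that data entries in the list look like "2037; 0 16390 TABLE DATA public b postgres" (dump ID, two OIDs, the words TABLE DATA, schema, table name, owner) and that the names contain no spaces. It moves the data entry of one table to directly after the last TABLE DATA entry and leaves every other line in place.

```shell
#!/bin/sh
# Hypothetical helper: reads a pg_restore list on stdin and writes it to
# stdout with the "TABLE DATA" entry of SCHEMA.TABLE moved to directly
# after the last "TABLE DATA" entry.
# Assumed entry layout (not verified against every PostgreSQL version):
#   2037; 0 16390 TABLE DATA public b postgres
# Usage: reorder_list SCHEMA TABLE < list.txt > reordered.txt
reorder_list() {
    awk -v schema="$1" -v table="$2" '
        BEGIN { target = 0; last = 0 }
        {
            lines[NR] = $0
            # Lines starting with ";" are comments and never match here.
            if ($0 ~ /^[0-9]+; [0-9]+ [0-9]+ TABLE DATA /) {
                last = NR                        # last data entry so far
                if ($6 == schema && $7 == table)
                    target = NR                  # the entry to move
            }
        }
        END {
            for (i = 1; i <= NR; i++) {
                if (i != target)
                    print lines[i]
                # Emit the held-back entry right after the last data entry.
                if (i == last && target)
                    print lines[target]
            }
        }
    '
}
```

With the file names used above, this might be run as: pg_restore --list database.dump | reorder_list public b > backup.txt. If the named table is not found, the list is passed through unchanged.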


Both of the options given by @TommasoDiBucchianico are valid approaches, but I still wanted something different due to the following pitfalls:

Option #1: Rename the tables such that the alphabetical ordering of them matches the order in which to load the tables.

I avoided this one because 1) it relies on an undocumented feature of pg_dump, and 2) it forces me to give less-than-optimal names to each of the tables.

Option #2: Provide a text file that contains the tables in the order in which I want them to be loaded by pg_restore.

I really liked this option, but the downside is that anytime a table is renamed, added, or dropped, someone would have to manually modify the text file to re-define the ordering.

Instead of trying to reorder the data, I decided to convert all of my offending check constraints to constraint triggers. In a dump, check constraints belong to the pre-data section, while constraint triggers belong to the post-data section. The constraint triggers are therefore not created until ALL of the data has been loaded, so the restore works regardless of the order in which the tables are loaded. The following shows how I modified the example in the original post to use constraint triggers:

CREATE TABLE A (
    MyKey SERIAL PRIMARY KEY
);

CREATE TABLE B (
    AKey INT
);

CREATE TABLE APart1 (
    Field1 INT,
    PRIMARY KEY (MyKey)
) INHERITS (A);

CREATE TABLE APart2 (
    Field2 INT,
    PRIMARY KEY (MyKey)
) INHERITS (A);

CREATE FUNCTION ValidateKeyInA() RETURNS TRIGGER AS $$
    BEGIN
        PERFORM * FROM A WHERE MyKey = NEW.AKey;
        IF NOT FOUND THEN
            RAISE EXCEPTION '%: AKey not found in A', TG_NAME;
        END IF;
        RETURN NEW;
    END;
$$ LANGUAGE PLPGSQL;

CREATE CONSTRAINT TRIGGER "ValidateTableB"
    AFTER INSERT OR UPDATE ON B FROM A
    FOR EACH ROW EXECUTE PROCEDURE ValidateKeyInA();

WITH aKey AS (INSERT INTO APart1 (Field1) VALUES (1) RETURNING MyKey)
INSERT INTO B (AKey) SELECT * FROM aKey;

WITH aKey AS (INSERT INTO APart2 (Field2) VALUES (2) RETURNING MyKey)
INSERT INTO B (AKey) SELECT * FROM aKey;
