I recently hit a PostgreSQL data corruption problem: some rows in one of my tables were duplicated, primary key included. These are the errors addressed in this post:

ERROR:  uncommitted xmin 393410960 from before xid cutoff 393413059 needs to be frozen

OR

ERROR:  failed to find parent tuple for heap-only tuple at (3,8) in table "your_table"

The xmin value (393410960), the cutoff value (393413059) and the ctid value (3,8) will most likely be different in your case.
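Since the symptom was duplicated primary keys, it can help to see how many keys are affected before touching anything. The following is only a sketch: your_table and pkey are placeholder names, and the planner settings merely discourage index use so that the counts come from the table itself rather than from a possibly corrupt index.

```sql
BEGIN;
-- Discourage index access for this transaction only.
SET LOCAL enable_indexscan = off;
SET LOCAL enable_indexonlyscan = off;
SET LOCAL enable_bitmapscan = off;

-- Primary key values that occur more than once, with the
-- physical location (ctid) of each copy.
SELECT pkey,
       count(*)               AS copies,
       array_agg(ctid::text)  AS ctids
FROM your_table
GROUP BY pkey
HAVING count(*) > 1;

-- Nothing was modified; this just ends the transaction
-- and discards the SET LOCAL settings.
ROLLBACK;
```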


How I got these errors:

If your table has this problem, this is how you can reproduce the errors:

your_db=# VACUUM FULL your_table;
ERROR:  uncommitted xmin 393410960 from before xid cutoff 393413059 needs to be frozen

And to get the second error:

your_db=# REINDEX TABLE your_table;
ERROR:  failed to find parent tuple for heap-only tuple at (3,8) in table "your_table"

DON'T PANIC! The solution to this is given below :)

2 Answers

Before you read this, please note I take no responsibility for data loss or corruption or any problems this causes!

Back up everything first! Don't use pg_dump for this; take a full filesystem-level backup of the data directory instead (ideally with the server stopped), for example with rsync, and keep the copy somewhere else.

There may be other solutions out there, so do more research before trying this one. I can state, however, that it worked for me.

To fix this, I followed the advice given in this post:

http://www.postgresql-archive.org/BUG-10189-Limit-in-9-3-4-no-longer-works-when-ordering-using-a-composite-multi-type-index-td5802079.html

Basically, what I did was the following:

your_db=# BEGIN;
BEGIN
your_db=# DELETE FROM your_table WHERE ctid='(3,8)';
DELETE 1
your_db=# END;
COMMIT
your_db=# VACUUM FULL your_table;
VACUUM
your_db=# REINDEX TABLE your_table;
REINDEX

Only the lines that start with your_db=# are commands I typed; the rest is output. As you can see, I deleted the offending row (the ctid reported in the error) and then ran VACUUM FULL and REINDEX. If the reindex fails again, delete the next offending row it reports and repeat until it succeeds.
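Because the DELETE is irreversible once committed, it may be worth copying the offending row aside first. This is a sketch under assumptions: your_table_quarantine is a hypothetical table name, and the copy only works if the row is still readable. The system columns are aliased because a regular column cannot be named ctid, xmin or xmax.

```sql
-- Keep a copy of the row, plus its system columns, before deleting it.
CREATE TABLE your_table_quarantine AS
SELECT t.ctid::text AS old_ctid,
       t.xmin::text AS old_xmin,
       t.xmax::text AS old_xmax,
       t.*
FROM your_table AS t
WHERE t.ctid = '(3,8)';
```

For further offending rows, an INSERT INTO your_table_quarantine SELECT with the same column list and the next ctid would add them to the same table.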

Hope this helps.

I encountered the following issue:

ERROR:  failed to find parent tuple for heap-only tuple at (3,8) in table "your_table"

when running:

REINDEX INDEX CONCURRENTLY index_name

After investigating, I found that the query:

SELECT * from your_table where ctid='(3,8)'

returns one row. I also noticed that xmax was 0, meaning the row exists in the table and hasn't been deleted. However, when I queried the row using its primary key, I got zero results, which points to corruption in the primary key index.
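The two lookups described above can be put side by side roughly like this. It is only a sketch: pkey is a placeholder for the primary key column, and 1234 stands for whatever key value the first query returns.

```sql
-- 1) Fetch the row by physical location, including the system columns.
--    xmax = 0 means no transaction has deleted or locked the row.
SELECT ctid, xmin, xmax, pkey
FROM your_table
WHERE ctid = '(3,8)';

-- 2) Look the same row up by the key value returned above.
--    If this returns no row while query 1 did, the lookup by key
--    and the table contents disagree.
SELECT ctid, xmin, xmax, pkey
FROM your_table
WHERE pkey = 1234;
```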

Knowing this, I wrote a PL/pgSQL function that scans the table and reports every "corrupted row", meaning a row that cannot be found again through its primary key:

CREATE OR REPLACE FUNCTION find_corruptions(
) RETURNS VOID AS
$$
DECLARE
    _ctid  TID;     -- physical location of the row being checked
    _ctid2 TID;     -- location returned by the primary key lookup
    _pkey  BIGINT;  -- primary key value of the row being checked
BEGIN
    RAISE NOTICE '% STARTING find_corruptions()', clock_timestamp();
    -- Walk every row of the table ...
    FOR _ctid, _pkey IN SELECT ctid, pkey FROM your_table
        LOOP
            -- ... and look the same row up again by its primary key.
            SELECT ctid INTO _ctid2 FROM your_table WHERE pkey = _pkey;
            IF NOT FOUND THEN
                -- The row is in the table, but the key lookup cannot find it.
                RAISE NOTICE '[CORRUPTION] ctid=% pkey=%', _ctid, _pkey;
            END IF;
        END LOOP;

    RAISE NOTICE '% COMPLETED find_corruptions()', clock_timestamp();
END;
$$ LANGUAGE PLPGSQL SECURITY DEFINER
                    SET client_min_messages TO 'notice'
                    SET log_min_messages TO 'notice';
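Calling the function could look like the sketch below. The function returns nothing; its findings arrive as NOTICE messages, so the session has to display them. Note that it performs one lookup per row, so it may take a while on a large table.

```sql
-- Make sure NOTICE messages are shown in this session, then run the scan.
SET client_min_messages TO notice;
SELECT find_corruptions();
```

Every line tagged [CORRUPTION] in the output names a ctid and pkey pair to look at.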

Once identified, you can easily delete the corrupted rows using queries like:

DELETE FROM your_table where pkey=1234 and ctid='(3,8)'
...
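A slightly more cautious variant, offered as a sketch rather than a required procedure, wraps each delete in a transaction and uses RETURNING to confirm what was removed before committing. The values 1234 and (3,8) are the same placeholders as above.

```sql
BEGIN;

-- RETURNING shows exactly which row the DELETE removed.
DELETE FROM your_table
WHERE pkey = 1234
  AND ctid = '(3,8)'
RETURNING ctid, pkey;

-- If exactly the expected row came back, make it permanent:
COMMIT;
-- Otherwise, run ROLLBACK instead of COMMIT to undo the delete.
```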
