13

I'm having trouble understanding how locks interact with transactions in Postgres.

When I run this (long) query, I am surprised by the high degree of locking that occurs:

BEGIN;
TRUNCATE foo;
\COPY foo FROM 'backup.txt';
COMMIT;

The documentation for \COPY doesn't mention what level of lock it requires, but this post indicates that it only gets a RowExclusiveLock. But when I run this query during the \COPY:

SELECT mode, granted FROM pg_locks
WHERE relation='foo'::regclass::oid;

I get this:

mode    granted
RowExclusiveLock    true
ShareLock   true
AccessExclusiveLock true

Where the heck is that AccessExclusiveLock coming from? I assume it's coming from the TRUNCATE, which requires an AccessExclusiveLock. But the TRUNCATE finishes quickly, so I'd expect the lock to release quickly as well. This leaves me with a few questions.

When a lock is acquired by a command within a transaction, is that lock released at the end of the command (before the end of the transaction)? If so, why do I observe the above behavior? If not, why not? In fact, since transactions don't touch the table until the COMMIT, why does a TRUNCATE in a transaction need to block the table at all?

I don't see any discussion of this in the documentation for transactions in PG.

1 Answer 1

29

There are a couple of misconceptions to be cleared up here.

First, a transaction does touch the table before it is committed. The comment you are quoting says that a ROLLBACK (and a COMMIT as well) don't touch the table, which is something different. They record the transaction state in the commit log (in pg_clog), and COMMIT flushes the transaction log to disk (one notable exception to this is TRUNCATE, which is relevant for your question: the old table is kept around until the end of the transaction and gets deleted during COMMIT).

If all changes were held back until COMMIT and no locks would be taken, COMMIT would be quite expensive and would routinely fail because of concurrent modifications. The transaction would have to remember the state of the database as it was before and check if the changes still apply. This way of handling concurrency is called optimistic concurreny control, and while it is a decent strategy for an application, it won't work well for a relational database, where COMMIT should be efficient and should not fail (unless there are major problems with the infrastructure).

So what relational databases use is pessimistic concurrency control or locking, i.e. they lock a database object before they access it to prevent concurrent activity from getting in their way.

Second, relational databases use two-phase locking, in which locks (at least the user-visible, so-called heavyweight locks) are always held until the end of the transaction. This is necessary (but not sufficient) to keep transactions in a logical order (serializable) and consistent. What if you release a lock, and somebody else removes the row that your inserted, but uncommitted row refers to via a foreign key constraint?

Answer to the question

The upshot of all this is that your table will keep the ACCESS EXCLUSIVE lock from TRUNCATE until the end of the transaction. Isn't it evident why that is necessary? If other transactions were allowed to even read the table after the (as of yet uncommitted) TRUNCATE, they would find it empty, since TRUNCATE really empties the table and does not adhere to MVCC semantics. Such a dirty read (of uncommitted data that might yet be rolled back) cannot be allowed.

If you really need read access to the table during the refill, you could use DELETE instead of TRUNCATE. The downside is that this is a much more expensive operation that will leave the table with a lot of “dead tuples” that have to be removed by autovacuum, resulting in a lot of empty space (table bloat). But if you are willing to live with a table and indexes that are bloated such that table and index scans will take at least twice as long, it is an option.

Sign up to request clarification or add additional context in comments.

6 Comments

Thanks, @Laurenz! I'm still struggling to grok how a TRUNCATE inside a transaction works, though. If "TRUNCATE really empties the table", and "ROLLBACK [doesn't] touch the table", then how does TRUNCATE get rolled back?
I'm also having trouble understanding what you mean by "dirty read" (I see it mentioned in the docs), but that probably warrants its own question.
I explained "dirty read" and added a link. You are right that my explanation is a bit inconsistent when it comes to TRUNCATE. As described in the code, the old table is not thrown away immediately, but at commit time, so in that case COMMIT does not exactly touch the table, but removes it. I'll try to make that clearer in my explanation.
@laurenzalbe what happens if one operation tries to update a row that is already locked by previous operation within the same transaction? Does that latter operation waits for the lock throughout the end of the transaction??
@MarçalJuan Thank you.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.