Description
The following table lists two database servers that use PostgreSQL streaming replication for synchronization:
| Server | Role | Method | Mode |
|---|---|---|---|
| db01 | Primary | | |
| db02 | Standby | Streaming replication | async |
db02 is configured as a failover target and is also used as a reporting database: SQL queries run on db02 while it is kept up to date with db01 through standard asynchronous streaming replication.
Issue
It seems that the Standby (db02) can slow down the Primary (db01). This happens after a long-running query (e.g. a report query) on the Standby (db02) has prevented it from staying up to date with the Primary (db01), so that the Standby falls behind and then has to catch up.
Breaking it down into steps:
- A long-running SQL query on the Standby (db02) prevents it from staying up to date with the Primary (db01).
- The SQL query completes (or is killed after reaching the timeout) on the Standby (db02).
- The Primary (db01) then appears to give priority to the Standby (db02) so that it can catch up. During this time performance is impacted: the Primary (db01) processes fewer queries from the application while the Standby (db02) is catching up.
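To put numbers on the steps above, it may help to measure how far db02 is behind at each stage. The following is a minimal sketch, assuming the LSN values are read by hand from the `sent_lsn` and `replay_lsn` columns of `pg_stat_replication` on the Primary; the function names and the sample values are hypothetical and not taken from db01 or db02.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a byte position.

    An LSN is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits and the low 32 bits of a 64-bit WAL position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(ahead_lsn: str, behind_lsn: str) -> int:
    """Return how many bytes of WAL separate two positions."""
    return lsn_to_int(ahead_lsn) - lsn_to_int(behind_lsn)


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    sent = "16/B374D848"      # WAL already sent to the standby
    replayed = "16/B3000000"  # WAL the standby has actually applied
    print(f"replay lag: {lag_bytes(sent, replayed)} bytes")
```

The same difference can be computed on the server with `pg_wal_lsn_diff(sent_lsn, replay_lsn)`. A large gap between `sent_lsn` and `replay_lsn` would suggest that the WAL has already reached db02 and that only replay is delayed, whereas a large gap between `pg_current_wal_lsn()` and `sent_lsn` would suggest that the Primary still has WAL to ship.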
Question
Is it expected that a Primary DB could be impacted by a Standby DB falling behind, even when using standard streaming replication, which is asynchronous by default?
Configuration
Here's the configuration for both servers:
db01 (Primary)
listen_addresses = '*'
port = 5432
data_directory = '/data/postgres/14/pg_data'
shared_buffers = 6GB
effective_cache_size = 12GB
archive_mode = on
archive_timeout = 15min
archive_command = '/usr/bin/pgbackrest --stanza=prod archive-push %p'
autovacuum = on
autovacuum_vacuum_scale_factor = 0.1
autovacuum_vacuum_threshold = 50
checkpoint_completion_target = 0.9
default_statistics_target = 100
huge_pages = on
logging_collector = on
log_autovacuum_min_duration = 60s
log_directory = pg_log
log_filename = 'postgresql-%Y-%m-%d.log'
log_line_prefix = '%t [%p]: user=%u,db=%d,app=%a,client=%h'
log_lock_waits = on
log_min_messages = warning
log_rotation_age = 0
log_rotation_size = 1GB
max_locks_per_transaction = 512
max_wal_senders = 10
max_wal_size = 4GB
min_wal_size = 2GB
password_encryption = 'scram-sha-256'
ssl = on
ssl_ciphers = 'HIGH:+3DES:!aNULL'
ssl_min_protocol_version = 'TLSv1.2'
ssl_prefer_server_ciphers = on
superuser_reserved_connections = 3
synchronous_commit = on
track_counts = on
track_activity_query_size = 8192
wal_buffers = '-1'
wal_compression = off
wal_keep_size = 1600MB
wal_level = replica
work_mem = 64MB
fsync = on
autovacuum_max_workers = 5
autovacuum_work_mem = 256MB
checkpoint_timeout = 600s
maintenance_work_mem = 520MB
max_connections = 90
db02 (Standby)
listen_addresses = '*'
port = 5432
data_directory = '/data/postgres/14/pg_data'
shared_buffers = 6GB
effective_cache_size = 12GB
archive_mode = on
archive_timeout = 15min
archive_command = '/bin/true'
autovacuum = on
autovacuum_vacuum_scale_factor = 0.1
autovacuum_vacuum_threshold = 50
checkpoint_completion_target = 0.9
default_statistics_target = 100
huge_pages = on
logging_collector = on
log_autovacuum_min_duration = 60s
log_directory = pg_log
log_filename = 'postgresql-%Y-%m-%d.log'
log_line_prefix = '%t [%p]: user=%u,db=%d,app=%a,client=%h'
log_lock_waits = on
log_min_messages = warning
log_rotation_age = 0
log_rotation_size = 1GB
max_locks_per_transaction = 512
max_wal_senders = 10
max_wal_size = 4GB
min_wal_size = 2GB
password_encryption = 'scram-sha-256'
ssl = on
ssl_ciphers = 'HIGH:+3DES:!aNULL'
ssl_min_protocol_version = 'TLSv1.2'
ssl_prefer_server_ciphers = on
superuser_reserved_connections = 3
synchronous_commit = on
track_counts = on
track_activity_query_size = 8192
wal_buffers = '-1'
wal_compression = off
wal_keep_size = 1600MB
wal_level = replica
work_mem = 64MB
fsync = on
autovacuum_max_workers = 5
autovacuum_work_mem = 256MB
checkpoint_timeout = 600s
maintenance_work_mem = 520MB
max_connections = 90
max_standby_archive_delay = 900s
max_standby_streaming_delay = 900s
primary_conninfo = 'host=db01 port=5432 user=repuser'
synchronous_standby_names is empty on db01?
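On the `synchronous_standby_names` point: the listings above set `synchronous_commit = on` but do not set `synchronous_standby_names`, whose default is empty. With an empty list, commits on the Primary do not wait for any standby. The sketch below is one rough way to check a configuration fragment for that combination. It assumes a simplified `key = value` parser (it does not handle `#` inside quoted values or include directives), and the helper names are hypothetical. It also cannot see values set elsewhere, for example through `ALTER SYSTEM`, so running `SHOW synchronous_standby_names;` on db01 remains the authoritative check.

```python
def parse_conf(text: str) -> dict:
    """Parse simple 'key = value' lines from a postgresql.conf fragment."""
    settings = {}
    for raw in text.splitlines():
        # Simplification: treat everything after '#' as a comment.
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip().lower()] = value.strip().strip("'")
    return settings


def commits_wait_for_standby(settings: dict) -> bool:
    """Return True if commits on this server would wait for a standby.

    That requires a non-empty synchronous_standby_names AND a
    synchronous_commit level that involves the standby.
    """
    names = settings.get("synchronous_standby_names", "")
    commit = settings.get("synchronous_commit", "on")
    return bool(names) and commit in {"on", "remote_write", "remote_apply"}


if __name__ == "__main__":
    fragment = """
    synchronous_commit = on
    wal_level = replica
    """
    print(commits_wait_for_standby(parse_conf(fragment)))
```

If this returns `False` for the effective settings on db01, the slowdown would have to come from something other than commits waiting on db02.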