Standby setup¶
Command-line interface¶
From the command-line interface, creating a standby instance is done through
the instance create
command:
$ pglift instance create standby --standby-for <primary dsn> --standby-password
Password for the replication user:
Repeat for confirmation:
The --standby-for
option should be a connection string to the primary
server (e.g. host=primary port=5433
).
If the primary is also a pglift instance, you must use the dedicated
replication
user, set user=replication
in the dsn.
A replication slot can be specified with --standby-slot <slot name>
. The
slot has to be first created on the primary instance, either at instance
creation with the --slot <name>
option or through instance alter
--create-slot <name>
if the primary already exists.
pglift will call pg_basebackup utility to create a standby by default. However, if pgBackRest is used on site, and a backup is available in the repository, pglift will create the standby from this backup, which can be significantly more efficient than a basebackup.
Note
If the primary instance has a password set for the super-user role, and is
needed for local authentication through the password file in particular, it
might be useful to provide the same password through --surole-password
option when creating the standby.
Note
If Prometheus postgres_exporter was set up on the primary instance and is
wanted on the standby, don’t forget to provide --prometheus-password
option to the above command with the same password as on the primary
instance.
Promoting a standby instance:
$ pglift instance promote standby
Note
If pgBackRest was set up on the instance, a pgbackrest check
command may
be automatically executed (after confirmation) upon promotion of the standby.
If the promoted instance is meant for testing only and further intended to be
demoted, this “check” step may be avoided.
Demotion of a former primary instance is possible through instance demote
command:
$ pglift instance demote instance_b \
--from="host=instance_a.dbserver role=rewinduser" --password
This operation first runs pg_rewind
using the connection string
of the “source server” specified through --from
option along with the
--password
option for the role used for rewinding. Then it restores the
original configuration of the instance as it was before “demotion” but with
standby replication set up.
Ansible¶
Standby instance management is also possible from the Ansible interface. Please refer to the tutorial for starting up with this.
The dalibo.pglift.instance
module exposes a standby
field to define
connection to the primary, password and replication slot.
First, we will define a password for the replication user on the control
node and store that in Ansible vault:
user@control:~$ cat << EOF | ansible-vault encrypt > pglift-vars
replication_role_password: $(openssl rand -base64 9)
EOF
The following playbook will create 2 instances on the same host:
pg1
as a primary, port5432
pg2
as a standby, port5433
---
- name: Setup standby
hosts: localhost
tasks:
- name: Create primary instance 14/pg1
dalibo.pglift.instance:
name: pg1
state: started
port: "{{ pg1.port }}"
settings:
listen_addresses: '*'
shared_buffers: 1GB
replrole_password: "{{ replication_role_password }}"
replication_slots:
- name: replislot
- name: Create standby instance 14/pg2
dalibo.pglift.instance:
name: pg2
state: started
port: "{{ pg2.port }}"
settings:
listen_addresses: '*'
shared_buffers: 1GB
standby:
primary_conninfo: host=127.0.0.1 user=replication port={{ pg1.port }}
password: "{{ replication_role_password }}"
slot: replislot
And finally, run the playbook:
user@control:~$ ansible-playbook --extra-vars @pglift-vars --ask-vault-password standby-setup.yml
Vault password:
PLAY [standby setup] ***********************************************************************************************************************
TASK [Gathering Facts] *********************************************************************************************************************
ok: [localhost]
TASK [primary instance 14/main] ************************************************************************************************************
changed: [localhost]
TASK [replica instance 14/main] ************************************************************************************************************
changed: [localhost]
PLAY RECAP *********************************************************************************************************************************
localhost : ok=3 changed=2 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
We can find our instances running on managed
node :
user@managed:~$ pglift instance list
┏━━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ name ┃ version ┃ port ┃ datadir ┃ status ┃
┡━━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ pg1 │ 14 │ 5432 │ /srv/pgsql/14/pg1/data │ running │
│ pg2 │ 14 │ 5433 │ /srv/pgsql/14/pg2/data │ running │
└──────┴─────────┴──────┴────────────────────────┴─────────┘
The following playbook will promote pg2
and rebuild pg1
as a
standby:
---
- name: Promote standby
hosts: localhost
tasks:
- name: Promote standby instance 14/pg2, add a new slot
dalibo.pglift.instance:
name: pg2
state: started
port: "{{ pg2.port }}"
settings:
listen_addresses: '*'
shared_buffers: 1GB
standby:
primary_conninfo: host=127.0.0.1 user=replication port={{ pg1.port }}
status: promoted
replication_slots:
- name: newslot
- name: Drop 14/pg1 instance
dalibo.pglift.instance:
name: pg1
state: absent
- name: Rebuild 14/pg1 instance as a standby
dalibo.pglift.instance:
name: pg1
state: started
port: "{{ pg1.port }}"
settings:
listen_addresses: '*'
shared_buffers: 1GB
standby:
primary_conninfo: host=127.0.0.1 user=replication port={{ pg2.port }}
password: "{{ replication_role_password }}"
slot: newslot
Run the playbook:
user@control:~$ ansible-playbook --extra-vars @pglift-vars --ask-vault-password standby-promote.yml
Vault password:
PLAY [promote standby] **************************************************************************************************
TASK [Gathering Facts] **************************************************************************************************
ok: [localhost]
TASK [promote standby instance 14/pg2] **********************************************************************************
ok: [localhost]
TASK [delete 14/pg1 instance] *******************************************************************************************
changed: [localhost]
TASK [rebuild 14/pg1 instance as a standby] *****************************************************************************
changed: [localhost]
PLAY RECAP **************************************************************************************************************
localhost : ok=5 changed=2 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Finally we can see pg1
as a standby and both running :
user@managed:~$ pglift instance get pg1
name version port data_checksums locale encoding extensions pending_restart standby
pg1 14 5432 False C UTF8 False primary_conninfo: user=replication
channel_binding=prefer
host=127.0.0.1 port=5433
sslmode=prefer sslcompression=0
sslsni=1
ssl_min_protocol_version=TLSv1.2
gssencmode=prefer
krbsrvname=postgres
target_session_attrs=any
status: demoted
slot:
replication_lag: 0
user@managed:~$ pglift instance list
┏━━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ name ┃ version ┃ port ┃ datadir ┃ status ┃
┡━━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ pg1 │ 14 │ 5432 │ /srv/pgsql/14/pg1/data │ running │
│ pg2 │ 14 │ 5433 │ /srv/pgsql/14/pg2/data │ running │
└──────┴─────────┴──────┴────────────────────────┴─────────┘