Standby setup

Command-line interface

From the command-line interface, creating a standby instance is done through the instance create command:

$ pglift instance create standby --standby-for <primary dsn> --standby-password
Password for the replication user:
Repeat for confirmation:

The --standby-for option should be a connection string to the primary server (e.g. host=primary port=5433). If the primary is also a pglift instance, you must use the dedicated replication user, set user=replication in the dsn.

A replication slot can be specified with --standby-slot <slot name>. The slot has to be first created on the primary instance, either at instance creation with the --slot <name> option or through instance alter --create-slot <name> if the primary already exists.

pglift will call pg_basebackup utility to create a standby by default. However, if pgBackRest is used on site, and a backup is available in the repository, pglift will create the standby from this backup, which can be significantly more efficient than a basebackup.

Note

If the primary instance has a password set for the super-user role, and is needed for local authentication through the password file in particular, it might be useful to provide the same password through --surole-password option when creating the standby.

Note

If Prometheus postgres_exporter was set up on the primary instance and is wanted on the standby, don’t forget to provide --prometheus-password option to the above command with the same password as on the primary instance.

Promoting a standby instance:

$ pglift instance promote standby

Note

If pgBackRest was set up on the instance, a pgbackrest check command may be automatically executed (after confirmation) upon promotion of the standby. If the promoted instance is meant for testing only and further intended to be demoted, this “check” step may be avoided.

Demotion of a former primary instance is possible through instance demote command:

$ pglift instance demote instance_b \
    --from="host=instance_a.dbserver role=rewinduser" --password

This operation first runs pg_rewind using the connection string of the “source server” specified through --from option along with the --password option for the role used for rewinding. Then it restores the original configuration of the instance as it was before “demotion” but with standby replication set up.

Ansible

Standby instance management is also possible from the Ansible interface. Please refer to the tutorial for starting up with this.

The dalibo.pglift.instance module exposes a standby field to define connection to the primary, password and replication slot.

First, we will define a password for the replication user on the control node and store that in Ansible vault:

user@control:~$ cat << EOF | ansible-vault encrypt > pglift-vars
replication_role_password: $(openssl rand -base64 9)
EOF

The following playbook will create 2 instances on the same host:

  • pg1 as a primary, port 5432

  • pg2 as a standby, port 5433

standby-setup.yml
---
- name: Setup standby
  hosts: localhost
  tasks:
    - name: Create primary instance 14/pg1
      dalibo.pglift.instance:
        name: pg1
        state: started
        port: "{{ pg1.port }}"
        settings:
          listen_addresses: '*'
          shared_buffers: 1GB
        replrole_password: "{{ replication_role_password }}"
        replication_slots:
          - name: replislot

    - name: Create standby instance 14/pg2
      dalibo.pglift.instance:
        name: pg2
        state: started
        port: "{{ pg2.port }}"
        settings:
          listen_addresses: '*'
          shared_buffers: 1GB
        standby:
          primary_conninfo: host=127.0.0.1 user=replication port={{ pg1.port }}
          password: "{{ replication_role_password }}"
          slot: replislot

And finally, run the playbook:

user@control:~$ ansible-playbook --extra-vars @pglift-vars --ask-vault-password standby-setup.yml
Vault password:

PLAY [standby setup] ***********************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************
ok: [localhost]

TASK [primary instance 14/main] ************************************************************************************************************
changed: [localhost]

TASK [replica instance 14/main] ************************************************************************************************************
changed: [localhost]

PLAY RECAP *********************************************************************************************************************************
localhost                  : ok=3    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

We can find our instances running on managed node :

user@managed:~$ pglift instance list
┏━━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ name ┃ version ┃ port ┃ datadir                ┃ status  ┃
┡━━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ pg1  │ 14      │ 5432 │ /srv/pgsql/14/pg1/data │ running │
│ pg2  │ 14      │ 5433 │ /srv/pgsql/14/pg2/data │ running │
└──────┴─────────┴──────┴────────────────────────┴─────────┘

The following playbook will promote pg2 and rebuild pg1 as a standby:

standby-promote.yml
---
- name: Promote standby
  hosts: localhost
  tasks:
    - name: Promote standby instance 14/pg2, add a new slot
      dalibo.pglift.instance:
        name: pg2
        state: started
        port: "{{ pg2.port }}"
        settings:
          listen_addresses: '*'
          shared_buffers: 1GB
        standby:
          primary_conninfo: host=127.0.0.1 user=replication port={{ pg1.port }}
          status: promoted
        replication_slots:
          - name: newslot

    - name: Drop 14/pg1 instance
      dalibo.pglift.instance:
        name: pg1
        state: absent

    - name: Rebuild 14/pg1 instance as a standby
      dalibo.pglift.instance:
        name: pg1
        state: started
        port: "{{ pg1.port }}"
        settings:
          listen_addresses: '*'
          shared_buffers: 1GB
        standby:
          primary_conninfo: host=127.0.0.1 user=replication port={{ pg2.port }}
          password: "{{ replication_role_password }}"
          slot: newslot

Run the playbook:

user@control:~$ ansible-playbook --extra-vars @pglift-vars --ask-vault-password standby-promote.yml
Vault password:

PLAY [promote standby] **************************************************************************************************

TASK [Gathering Facts] **************************************************************************************************
ok: [localhost]

TASK [promote standby instance 14/pg2] **********************************************************************************
ok: [localhost]

TASK [delete 14/pg1 instance] *******************************************************************************************
changed: [localhost]

TASK [rebuild 14/pg1 instance as a standby] *****************************************************************************
changed: [localhost]

PLAY RECAP **************************************************************************************************************
localhost                  : ok=5    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

Finally we can see pg1 as a standby and both running :

user@managed:~$ pglift instance get pg1
 name  version  port  data_checksums  locale  encoding  extensions  pending_restart  standby
 pg1   14       5432  False           C       UTF8                  False            primary_conninfo: user=replication
                                                                                     channel_binding=prefer
                                                                                     host=127.0.0.1 port=5433
                                                                                     sslmode=prefer sslcompression=0
                                                                                     sslsni=1
                                                                                     ssl_min_protocol_version=TLSv1.2
                                                                                     gssencmode=prefer
                                                                                     krbsrvname=postgres
                                                                                     target_session_attrs=any
                                                                                     status: demoted
                                                                                     slot:
                                                                                     replication_lag: 0

user@managed:~$ pglift instance list
┏━━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ name ┃ version ┃ port ┃ datadir                ┃ status  ┃
┡━━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ pg1  │ 14      │ 5432 │ /srv/pgsql/14/pg1/data │ running │
│ pg2  │ 14      │ 5433 │ /srv/pgsql/14/pg2/data │ running │
└──────┴─────────┴──────┴────────────────────────┴─────────┘