Andy Kerber (@dbakerber), Senior Consultant
In this blog, we will cover the installation and configuration of PostgreSQL for load balancing and high availability (HA). This is a fairly basic, traditional configuration that uses a hot standby kept current with streaming replication, rather than SQL-level (logical) replication.
Servers
First, a summary of the servers involved.
Pgstream1 and pgstream2 are the Postgres database servers: pgstream1 hosts the primary database and pgstream2 hosts the hot standby.
Pgpool1 and pgpool2 are the load-balancing and connection-pooling servers for the Postgres databases; they are set up as a watchdog cluster. Automatic activation of the hot standby is not configured here, but we will discuss it in a future blog.
The steps from the section titled ‘DB Server Setup’ through the start of ‘Hot Standby Setup’ must be performed on both pgstream1 and pgstream2; they are identical on both servers.
DB Server Setup
The hosts file is below. This file is the same on all servers discussed in this blog. The -priv addresses are used for communication between the primary and standby Postgres instances, and the -hb addresses are used for the heartbeat between the two pgpool servers. Isolating these networks is a best practice:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.12.1.61 pgstream2 pgstream2.localdomain
10.12.1.225 pgstream1 pgstream1.localdomain
192.168.57.10 pgstream1-priv pgstream1-priv.localdomain
192.168.57.11 pgstream2-priv pgstream2-priv.localdomain
10.12.1.142 pgpool1 pgpool1.localdomain
10.12.1.204 pgpool2 pgpool2.localdomain
192.168.31.20 pgpool1-hb pgpool1-hb.localdomain
192.168.31.21 pgpool2-hb pgpool2-hb.localdomain
For this blog, we used the open source (community) version of Postgres with the open source repos. Below are the commands we used to install the required repos:
yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
curl -o pgdg-redhat-repo-latest.noarch.rpm https://download.postgresql.org/pub/repos/yum/11/redhat/rhel-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
yum -y install pgdg-redhat-repo-latest.noarch.rpm
Next, with the repos configured and everything downloaded, install the Postgres software:
yum install postgresql11-server
Configure the data directories for Postgres, using LVM for space management. The steps below create a partition on a new disk device, add it to LVM, create an XFS file system on it, and label it. The Postgres data could go on the root file system, but putting it in a separate volume group makes space management simpler.
fdisk /dev/sdb
Create a new primary partition on /dev/sdb using all the space on the drive. The name of the partition is /dev/sdb1.
[root@pgstream2 ~]# fdisk /dev/sdb
Welcome to fdisk (util-linux 2.23.2).

Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table
Building a new DOS disklabel with disk identifier 0x57646c50.

Command (m for help): n
Partition type:
   p   primary (0 primary, 0 extended, 4 free)
   e   extended
Select (default p):
Using default response p
Partition number (1-4, default 1):
First sector (2048-41943039, default 2048):
Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-41943039, default 41943039):
Using default value 41943039
Partition 1 of type Linux and of size 20 GiB is set

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
Next, create a new volume group for the Postgres data. The commands below create a new physical volume from /dev/sdb1 and a new volume group called data containing that physical volume. Then a logical volume named pgdata is created that uses all of the free space in the volume group; it will hold the data. Finally, the logical volume is formatted with XFS, labeled, and a mount point is created.
pvcreate /dev/sdb1
vgcreate data /dev/sdb1
lvcreate -n pgdata -l 100%FREE data
mkfs.xfs /dev/data/pgdata
xfs_admin -L pgdata /dev/data/pgdata
mkdir -p /postgres/data
Next, do the same for /dev/sdc; the name of the new partition is /dev/sdc1. Create the volume group and logical volume as indicated below; this logical volume will hold the WAL (archive) logs.
fdisk /dev/sdc
pvcreate /dev/sdc1
vgcreate archive /dev/sdc1
lvcreate -n archive_vol -l 100%FREE archive
mkfs.xfs -L archive /dev/archive/archive_vol
mkdir -p /postgres/archive
Then, add lines in /etc/fstab for these two new file systems:
LABEL=pgdata  /postgres/data    xfs defaults 0 0
LABEL=archive /postgres/archive xfs defaults 0 0
Mount the devices using mount -a. Mounting the devices this way accomplishes two things: it verifies that the fstab entries are correct, and it mounts the drives.
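To confirm that both file systems mounted where expected, a quick check (df is just one way to verify):

mount -a
df -h /postgres/data /postgres/archive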
Next, we need to initialize the database. Set the PGDATA environment variable to point to our data directory and run initdb via the setup script. (Depending on the packaging, postgresql-11-setup may take PGDATA from the systemd service file rather than from the environment; if the cluster ends up in the default /var/lib/pgsql/11/data, make the service-file change described below first and rerun initdb.)
export PGDATA=/postgres/data/pgdata
/usr/pgsql-11/bin/postgresql-11-setup initdb
This creates the initial data directory (cluster, in Postgres parlance) with an empty database and the required configuration files.
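If initdb succeeded, the new cluster directory contains the configuration files referenced in the next steps; a quick look (listing abbreviated):

ls /postgres/data/pgdata
# expect to see, among others: base/ global/ pg_wal/ pg_hba.conf pg_ident.conf postgresql.conf PG_VERSION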
Next, edit the pg_hba.conf file to allow connections as desired. Our pg_hba.conf file is below:
local all all trust
host all all 0.0.0.0/0 md5
host all all pgstream1 trust
host all all pgstream2 trust
# IPv4 local connections:
host all all 127.0.0.1/32 ident
# IPv6 local connections:
host all all ::1/128 ident
# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication all trust
host replication all 127.0.0.1/32 ident
host replication all ::1/128 ident
host replication all 0.0.0.0/0 md5
At this point, the initial basic setup is complete. Now we need to configure the database for hot standby replication.
Configure Postgres to start as a service. First, edit the file /usr/lib/systemd/system/postgresql-11.service and change the line beginning with ‘Environment=’ to read:
Environment=PGDATA=/postgres/data/pgdata
Now, enable and start the service:
systemctl enable postgresql-11.service
systemctl start postgresql-11.service
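A quick sanity check that the service is running and using the intended data directory (one way to do it):

systemctl status postgresql-11.service
sudo -u postgres psql -c 'show data_directory;'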
Hot Standby Setup – Primary
On the primary server, make the following changes to the postgresql.conf file. Only the non-default parameters are noted here:
listen_addresses = '*'
shared_buffers = 1024MB                 # min 128kB
wal_level = logical                     # minimal, replica, or logical
full_page_writes = on                   # recover from partial page writes
max_wal_size = 1GB
min_wal_size = 80MB
archive_mode = on                       # enables archiving; off, on, or always
archive_command = 'test ! -f /postgres/archive/%f && cp %p /postgres/archive/%f'
hot_standby = on                        # "off" disallows queries during recovery
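Changing wal_level and archive_mode requires a restart, and the archive directory referenced by archive_command must be writable by the postgres user. After that, forcing a WAL switch is a simple way to confirm that archiving works; a sketch, assuming the paths above:

chown postgres:postgres /postgres/archive   # skip if ownership is already correct
systemctl restart postgresql-11.service
sudo -u postgres psql -c 'select pg_switch_wal();'
ls /postgres/archive                        # an archived WAL segment should appear here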
Create the replication role; this is the account the standby will use to connect to the primary and stream WAL:
postgres=# create role replication with REPLICATION password 'mypassword' login;
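To confirm the role exists and has the replication attribute, an optional check:

sudo -u postgres psql -c "select rolname, rolreplication from pg_roles where rolname = 'replication';"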
Hot Standby Setup – Standby
On the standby server, stop the Postgres service:
systemctl stop postgresql-11.service
Then, delete all files in the PGDATA directory:
rm -rf /postgres/data/pgdata
Create the hot standby files using the pg_basebackup command:
pg_basebackup -h pgstream1-priv -U replication -D /postgres/data/pgdata
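pg_basebackup should be run as the postgres OS user so that the copied files end up with the correct ownership; the full path is shown in case /usr/pgsql-11/bin is not on the PATH, and -P simply adds a progress report. A sketch of the same command:

sudo -u postgres /usr/pgsql-11/bin/pg_basebackup -h pgstream1-priv -U replication -D /postgres/data/pgdata -P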
Next, before starting the database, create the recovery.conf file (with the contents below) in the PGDATA directory. This tells Postgres that the instance is a standby and where to stream WAL from:
standby_mode = 'on'
primary_conninfo = 'host=pgstream1-priv port=5432 user=replication password=mypassword'
trigger_file = '/var/lib/pgsql/activate.now'
Start the standby:
systemctl start postgresql-11.service
After starting the service, monitor the log file in $PGDATA/log/postgresql-<day-of-week>.log (for example, postgresql-Tue.log). Lines similar to those below indicate that the standby is working properly:
2019-04-30 14:29:47.664 CDT [20428] LOG: database system is ready to accept read only connections
2019-04-30 14:29:47.675 CDT [20436] LOG: started streaming WAL from primary at 0/7000000 on timeline 1
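Another way to confirm the pair is healthy is to query the replication views: pg_stat_replication on the primary should show a streaming WAL sender, and pg_is_in_recovery() on the standby should return true. For example:

# on pgstream1 (primary)
sudo -u postgres psql -c 'select client_addr, state, sync_state from pg_stat_replication;'
# on pgstream2 (standby)
sudo -u postgres psql -c 'select pg_is_in_recovery();'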
Should it become necessary to activate the standby in read-write mode, run the command ‘touch /var/lib/pgsql/activate.now’ on the standby server. The existence of the file created by the ‘touch’ command is the signal for Postgres to promote the standby.
Pgpool Setup
Pgpool is connection pooling software for the Postgres database. It can be used to direct communications to the standby or primary node, as well as manage failover.
More about managing failover will be discussed in later blogs. In this blog, we are going to set up a two-node Pgpool watchdog cluster.
First, install the Pgpool software. For this example, we are using the most recent available version, pgpool-II-10; the command to install it is yum install pgpool-II-10. Note that for this version of Pgpool, the authors chose to keep the full version information in the directory names, so you will use the pgpool-II-10 directories rather than the pgpool-II directories, which can be confusing.
Next, modify the service file so that Pgpool runs as root; running it as root makes management easier (in particular, bringing the delegate IP up and down). Just remove the line that tells it to run as the postgres user, so that /usr/lib/systemd/system/pgpool-II-10.service reads like this:
[Unit]
Description=PGPool-II Middleware Between PostgreSQL Servers And PostgreSQL Database Clients
After=syslog.target network.target

[Service]
EnvironmentFile=-/etc/sysconfig/pgpool-II-10
ExecStart=/usr/pgpool-10/bin/pgpool -f /etc/pgpool-II-10/pgpool.conf $OPTS
ExecStop=/usr/pgpool-10/bin/pgpool -f /etc/pgpool-II-10/pgpool.conf -m fast stop
ExecReload=/bin/kill -HUP $MAINPID
KillMode=mixed
KillSignal=SIGINT

[Install]
WantedBy=multi-user.target
Update /etc/pgpool-II-10/pgpool.conf on pgpool1 as follows (only the non-default settings are listed):
listen_addresses = '*'
port = 5432
pcp_listen_addresses = '*'
pcp_port = 9898
backend_hostname0 = '10.12.1.225'
backend_port0 = 5432
backend_data_directory0 = '/postgres/data/pgdata'
backend_flag0 = 'DISALLOW_TO_FAILOVER'
backend_hostname1 = '10.12.1.61'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/postgres/data/pgdata'
backend_flag1 = 'DISALLOW_TO_FAILOVER'
enable_pool_hba = on
pool_passwd = 'pool_passwd'
log_destination = 'syslog'
syslog_facility = 'LOCAL6'
syslog_ident = 'pgpool-II-10'
load_balance_mode = on
master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_user = 'postgres'
sr_check_password = 'password'
sr_check_database = 'postgres'
health_check_user = 'postgres'
health_check_password = 'password'
use_watchdog = on
wd_hostname = '192.168.31.20'
wd_port = 9000
delegate_IP = '10.12.1.111'
if_up_cmd = 'ip addr add $_IP_$/24 dev ens33 label ens33:111'
if_down_cmd = 'ip addr del $_IP_$/24 dev ens33'
arping_cmd = 'ping -w 1 $_IP_$'
wd_lifecheck_method = 'heartbeat'
wd_heartbeat_port = 9694
heartbeat_destination0 = '192.168.31.21'
heartbeat_destination_port0 = 9694
other_pgpool_hostname0 = '192.168.159.21'
other_pgpool_port0 = 5432
other_wd_port0 = 9000
On pgpool2, the changes are very similar; just swap the watchdog and heartbeat IP addresses of the two nodes:
listen_addresses = '*'
port = 5432
pcp_listen_addresses = '*'
pcp_port = 9898
backend_hostname0 = '10.12.1.225'
backend_port0 = 5432
backend_data_directory0 = '/postgres/data/pgdata'
backend_flag0 = 'DISALLOW_TO_FAILOVER'
backend_hostname1 = '10.12.1.61'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/postgres/data/pgdata'
backend_flag1 = 'DISALLOW_TO_FAILOVER'
enable_pool_hba = on
pool_passwd = 'pool_passwd'
log_destination = 'syslog'
syslog_facility = 'LOCAL6'
syslog_ident = 'pgpool-II-10'
load_balance_mode = on
master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_user = 'postgres'
sr_check_password = 'password'
sr_check_database = 'postgres'
health_check_user = 'postgres'
health_check_password = 'password'
use_watchdog = on
wd_hostname = '192.168.31.21'
wd_port = 9000
delegate_IP = '10.12.1.111'
if_up_cmd = 'ip addr add $_IP_$/24 dev ens33 label ens33:111'
if_down_cmd = 'ip addr del $_IP_$/24 dev ens33'
arping_cmd = 'ping -w 1 $_IP_$'
wd_lifecheck_method = 'heartbeat'
wd_heartbeat_port = 9694
heartbeat_destination0 = '192.168.31.20'
heartbeat_destination_port0 = 9694
other_pgpool_hostname0 = '192.168.159.20'
other_pgpool_port0 = 5432
other_wd_port0 = 9000
Next, on each server add this line to /etc/rsyslog.conf:
local6.* /var/log/pgpool-II-10/pgpool-II-10.log
This line causes pgpool-II-10 to log information to the file /var/log/pgpool-II-10/pgpool-II-10.log.
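rsyslog usually needs to be restarted (or reloaded) before the new rule takes effect, and the target directory must exist; for example:

mkdir -p /var/log/pgpool-II-10
systemctl restart rsyslog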
Then, on either server, use the pg_md5 utility to populate the pool_passwd file:
pg_md5 --md5auth --username=postgres password
The pool_passwd file is located in /etc/pgpool-II-10. Copy it from the current server to the same directory on the other server.
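For reference, pool_passwd is a plain-text file of username:md5hash entries; after running pg_md5 it should contain a line similar to the following (the hash shown here is a placeholder, not a real value):

postgres:md5xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx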
Edit the pool_hba.conf file to allow connections as required. Below is the pool_hba.conf for this configuration:
# "local" is for Unix domain socket connections only
local all all trust
# IPv4 local connections:
host all all 127.0.0.1/32 trust
host all all ::1/128 trust
host all all 0.0.0.0/0 md5
host all all pgstream1 trust
host all all pgstream2 trust
Copy the file to both nodes.
At this point, you are ready to start pgpool. On the first node, start the service using this command:
systemctl start pgpool-II-10.service
A successful startup of the first node will look like the following in the log file:
Apr 30 16:39:50 pgpool1 pgpool-II-10[17955]: [1-1] 2019-04-30 16:39:50: pid 17955: WARNING: checking setuid bit of if_up_cmd
Apr 30 16:39:50 pgpool1 pgpool-II-10[17955]: [1-2] 2019-04-30 16:39:50: pid 17955: DETAIL: ifup[/usr/sbin/ip] doesn't have setuid bit
Apr 30 16:39:50 pgpool1 pgpool-II-10[17955]: [2-1] 2019-04-30 16:39:50: pid 17955: WARNING: checking setuid bit of if_down_cmd
Apr 30 16:39:50 pgpool1 pgpool-II-10[17955]: [2-2] 2019-04-30 16:39:50: pid 17955: DETAIL: ifdown[/usr/sbin/ip] doesn't have setuid bit
Apr 30 16:39:50 pgpool1 pgpool-II-10[17955]: [3-1] 2019-04-30 16:39:50: pid 17955: WARNING: checking setuid bit of arping command
Apr 30 16:39:50 pgpool1 pgpool-II-10[17955]: [3-2] 2019-04-30 16:39:50: pid 17955: DETAIL: arping[/bin/ping] doesn't have setuid bit
Apr 30 16:39:50 pgpool1 pgpool-II-10[17955]: [4-1] 2019-04-30 16:39:50: pid 17955: LOG: Backend status file /var/log/pgpool-II-10/pgpool_status discarded
Apr 30 16:39:50 pgpool1 pgpool-II-10[17955]: [5-1] 2019-04-30 16:39:50: pid 17955: LOG: waiting for watchdog to initialize
Apr 30 16:39:50 pgpool1 pgpool-II-10[17956]: [5-1] 2019-04-30 16:39:50: pid 17956: LOG: setting the local watchdog node name to "192.168.31.20:5432 Linux pgpool1.localdomain"
Apr 30 16:39:50 pgpool1 pgpool-II-10[17956]: [6-1] 2019-04-30 16:39:50: pid 17956: LOG: watchdog cluster is configured with 1 remote nodes
Apr 30 16:39:50 pgpool1 pgpool-II-10[17956]: [7-1] 2019-04-30 16:39:50: pid 17956: LOG: watchdog remote node:0 on 192.168.159.21:9000
Apr 30 16:39:50 pgpool1 pgpool-II-10[17956]: [8-1] 2019-04-30 16:39:50: pid 17956: LOG: interface monitoring is disabled in watchdog
Apr 30 16:39:50 pgpool1 pgpool-II-10[17956]: [9-1] 2019-04-30 16:39:50: pid 17956: LOG: watchdog node state changed from [DEAD] to [LOADING]
Apr 30 16:39:55 pgpool1 pgpool-II-10[17956]: [10-1] 2019-04-30 16:39:55: pid 17956: LOG: watchdog node state changed from [LOADING] to [JOINING]
Apr 30 16:39:59 pgpool1 pgpool-II-10[17956]: [11-1] 2019-04-30 16:39:59: pid 17956: LOG: watchdog node state changed from [JOINING] to [INITIALIZING]
Apr 30 16:40:00 pgpool1 pgpool-II-10[17956]: [12-1] 2019-04-30 16:40:00: pid 17956: LOG: I am the only alive node in the watchdog cluster
Apr 30 16:40:00 pgpool1 pgpool-II-10[17956]: [12-2] 2019-04-30 16:40:00: pid 17956: HINT: skipping stand for coordinator state
Apr 30 16:40:00 pgpool1 pgpool-II-10[17956]: [13-1] 2019-04-30 16:40:00: pid 17956: LOG: watchdog node state changed from [INITIALIZING] to [MASTER]
Apr 30 16:40:00 pgpool1 pgpool-II-10[17956]: [14-1] 2019-04-30 16:40:00: pid 17956: LOG: I am announcing my self as master/coordinator watchdog node
Apr 30 16:40:04 pgpool1 pgpool-II-10[17956]: [15-1] 2019-04-30 16:40:04: pid 17956: LOG: I am the cluster leader node
Apr 30 16:40:04 pgpool1 pgpool-II-10[17956]: [15-2] 2019-04-30 16:40:04: pid 17956: DETAIL: our declare coordinator message is accepted by all nodes
Apr 30 16:40:04 pgpool1 pgpool-II-10[17956]: [16-1] 2019-04-30 16:40:04: pid 17956: LOG: setting the local node "192.168.31.20:5432 Linux pgpool1.localdomain" as watchdog cluster master
Apr 30 16:40:04 pgpool1 pgpool-II-10[17956]: [17-1] 2019-04-30 16:40:04: pid 17956: LOG: I am the cluster leader node. Starting escalation process
Apr 30 16:40:04 pgpool1 pgpool-II-10[17956]: [18-1] 2019-04-30 16:40:04: pid 17956: LOG: escalation process started with PID:17963
Apr 30 16:40:04 pgpool1 pgpool-II-10[17955]: [6-1] 2019-04-30 16:40:04: pid 17955: LOG: watchdog process is initialized
Apr 30 16:40:04 pgpool1 pgpool-II-10[17956]: [19-1] 2019-04-30 16:40:04: pid 17956: LOG: new IPC connection received
Apr 30 16:40:04 pgpool1 pgpool-II-10[17955]: [7-1] 2019-04-30 16:40:04: pid 17955: LOG: Setting up socket for 0.0.0.0:5432
Apr 30 16:40:04 pgpool1 pgpool-II-10[17955]: [8-1] 2019-04-30 16:40:04: pid 17955: LOG: Setting up socket for :::5432
Apr 30 16:40:04 pgpool1 pgpool-II-10[17955]: [9-1] 2019-04-30 16:40:04: pid 17955: LOG: find_primary_node_repeatedly: waiting for finding a primary node
Apr 30 16:40:04 pgpool1 pgpool-II-10[17963]: [18-1] 2019-04-30 16:40:04: pid 17963: LOG: watchdog: escalation started
Apr 30 16:40:04 pgpool1 pgpool-II-10[17956]: [20-1] 2019-04-30 16:40:04: pid 17956: LOG: new IPC connection received
Apr 30 16:40:04 pgpool1 pgpool-II-10[17964]: [7-1] 2019-04-30 16:40:04: pid 17964: LOG: 2 watchdog nodes are configured for lifecheck
Apr 30 16:40:04 pgpool1 pgpool-II-10[17964]: [8-1] 2019-04-30 16:40:04: pid 17964: LOG: watchdog nodes ID:0 Name:"192.168.31.20:5432 Linux pgpool1.localdomain"
Apr 30 16:40:04 pgpool1 pgpool-II-10[17964]: [8-2] 2019-04-30 16:40:04: pid 17964: DETAIL: Host:"192.168.31.20" WD Port:9000 pgpool-II port:5432
Apr 30 16:40:04 pgpool1 pgpool-II-10[17964]: [9-1] 2019-04-30 16:40:04: pid 17964: LOG: watchdog nodes ID:1 Name:"Not_Set"
Apr 30 16:40:04 pgpool1 pgpool-II-10[17964]: [9-2] 2019-04-30 16:40:04: pid 17964: DETAIL: Host:"192.168.159.21" WD Port:9000 pgpool-II port:5432
Apr 30 16:40:04 pgpool1 pgpool-II-10[17955]: [10-1] 2019-04-30 16:40:04: pid 17955: LOG: find_primary_node: primary node is 0
Apr 30 16:40:04 pgpool1 pgpool-II-10[17955]: [11-1] 2019-04-30 16:40:04: pid 17955: LOG: find_primary_node: standby node is 1
Apr 30 16:40:04 pgpool1 pgpool-II-10[17955]: [12-1] 2019-04-30 16:40:04: pid 17955: LOG: pgpool-II successfully started. version 4.0.3 (torokiboshi)
Apr 30 16:40:04 pgpool1 pgpool-II-10[17955]: [13-1] 2019-04-30 16:40:04: pid 17955: LOG: node status[0]: 1
Apr 30 16:40:04 pgpool1 pgpool-II-10[17955]: [14-1] 2019-04-30 16:40:04: pid 17955: LOG: node status[1]: 2
Apr 30 16:40:05 pgpool1 pgpool-II-10[17999]: [10-1] 2019-04-30 16:40:05: pid 17999: LOG: set SO_REUSEPORT option to the socket
Apr 30 16:40:05 pgpool1 pgpool-II-10[17999]: [11-1] 2019-04-30 16:40:05: pid 17999: LOG: creating socket for sending heartbeat
Apr 30 16:40:05 pgpool1 pgpool-II-10[17999]: [11-2] 2019-04-30 16:40:05: pid 17999: DETAIL: set SO_REUSEPORT
Apr 30 16:40:05 pgpool1 pgpool-II-10[17998]: [10-1] 2019-04-30 16:40:05: pid 17998: LOG: set SO_REUSEPORT option to the socket
Apr 30 16:40:05 pgpool1 pgpool-II-10[17998]: [11-1] 2019-04-30 16:40:05: pid 17998: LOG: creating watchdog heartbeat receive socket.
Apr 30 16:40:05 pgpool1 pgpool-II-10[17998]: [11-2] 2019-04-30 16:40:05: pid 17998: DETAIL: set SO_REUSEPORT
Apr 30 16:40:07 pgpool1 pgpool-II-10[17963]: [19-1] 2019-04-30 16:40:07: pid 17963: LOG: successfully acquired the delegate IP:"10.12.1.111"
Apr 30 16:40:07 pgpool1 pgpool-II-10[17963]: [19-2] 2019-04-30 16:40:07: pid 17963: DETAIL: 'if_up_cmd' returned with success
Apr 30 16:40:07 pgpool1 pgpool-II-10[17956]: [21-1] 2019-04-30 16:40:07: pid 17956: LOG: watchdog escalation process with pid: 17963 exit with SUCCESS.
Next, use the same command to start Pgpool on the second node. In the log on the first node, a successful startup of the second node will look like this:
Apr 30 16:42:28 pgpool1 pgpool-II-10[17956]: [22-1] 2019-04-30 16:42:28: pid 17956: LOG: new watchdog node connection is received from "192.168.159.21:11962"
Apr 30 16:42:28 pgpool1 pgpool-II-10[17956]: [23-1] 2019-04-30 16:42:28: pid 17956: LOG: new node joined the cluster hostname:"192.168.31.21" port:9000 pgpool_port:5432
Apr 30 16:42:28 pgpool1 pgpool-II-10[17956]: [24-1] 2019-04-30 16:42:28: pid 17956: LOG: new outbound connection to 192.168.159.21:9000
Apr 30 16:42:29 pgpool1 pgpool-II-10[17956]: [25-1] 2019-04-30 16:42:29: pid 17956: LOG: adding watchdog node "192.168.31.21:5432 Linux pgpool2" to the standby list
Apr 30 16:42:29 pgpool1 pgpool-II-10[17955]: [15-1] 2019-04-30 16:42:29: pid 17955: LOG: Pgpool-II parent process received watchdog quorum change signal from watchdog
Apr 30 16:42:29 pgpool1 pgpool-II-10[17956]: [26-1] 2019-04-30 16:42:29: pid 17956: LOG: new IPC connection received
Apr 30 16:42:29 pgpool1 pgpool-II-10[17955]: [16-1] 2019-04-30 16:42:29: pid 17955: LOG: watchdog cluster now holds the quorum
Apr 30 16:42:29 pgpool1 pgpool-II-10[17955]: [16-2] 2019-04-30 16:42:29: pid 17955: DETAIL: updating the state of quarantine backend nodes
Apr 30 16:42:29 pgpool1 pgpool-II-10[17956]: [27-1] 2019-04-30 16:42:29: pid 17956: LOG: new IPC connection received
On node 2, it will look something like this:
Apr 30 16:42:28 pgpool2 pgpool-II-10[17193]: [1-1] 2019-04-30 16:42:28: pid 17193: WARNING: checking setuid bit of if_up_cmd
Apr 30 16:42:28 pgpool2 pgpool-II-10[17193]: [1-2] 2019-04-30 16:42:28: pid 17193: DETAIL: ifup[/usr/sbin/ip] doesn't have setuid bit
Apr 30 16:42:28 pgpool2 pgpool-II-10[17193]: [2-1] 2019-04-30 16:42:28: pid 17193: WARNING: checking setuid bit of if_down_cmd
Apr 30 16:42:28 pgpool2 pgpool-II-10[17193]: [2-2] 2019-04-30 16:42:28: pid 17193: DETAIL: ifdown[/usr/sbin/ip] doesn't have setuid bit
Apr 30 16:42:28 pgpool2 pgpool-II-10[17193]: [3-1] 2019-04-30 16:42:28: pid 17193: WARNING: checking setuid bit of arping command
Apr 30 16:42:28 pgpool2 pgpool-II-10[17193]: [3-2] 2019-04-30 16:42:28: pid 17193: DETAIL: arping[/bin/ping] doesn't have setuid bit
Apr 30 16:42:28 pgpool2 pgpool-II-10[17193]: [4-1] 2019-04-30 16:42:28: pid 17193: LOG: Backend status file /var/log/pgpool-II-10/pgpool_status discarded
Apr 30 16:42:28 pgpool2 pgpool-II-10[17193]: [5-1] 2019-04-30 16:42:28: pid 17193: LOG: waiting for watchdog to initialize
Apr 30 16:42:28 pgpool2 pgpool-II-10[17194]: [5-1] 2019-04-30 16:42:28: pid 17194: LOG: setting the local watchdog node name to "192.168.31.21:5432 Linux pgpool2"
Apr 30 16:42:28 pgpool2 pgpool-II-10[17194]: [6-1] 2019-04-30 16:42:28: pid 17194: LOG: watchdog cluster is configured with 1 remote nodes
Apr 30 16:42:28 pgpool2 pgpool-II-10[17194]: [7-1] 2019-04-30 16:42:28: pid 17194: LOG: watchdog remote node:0 on 192.168.159.20:9000
Apr 30 16:42:28 pgpool2 pgpool-II-10[17194]: [8-1] 2019-04-30 16:42:28: pid 17194: LOG: interface monitoring is disabled in watchdog
Apr 30 16:42:28 pgpool2 pgpool-II-10[17194]: [9-1] 2019-04-30 16:42:28: pid 17194: LOG: watchdog node state changed from [DEAD] to [LOADING]
Apr 30 16:42:28 pgpool2 pgpool-II-10[17194]: [10-1] 2019-04-30 16:42:28: pid 17194: LOG: new outbound connection to 192.168.159.20:9000
Apr 30 16:42:28 pgpool2 pgpool-II-10[17194]: [11-1] 2019-04-30 16:42:28: pid 17194: LOG: setting the remote node "192.168.31.20:5432 Linux pgpool1.localdomain" as watchdog cluster master
Apr 30 16:42:28 pgpool2 pgpool-II-10[17194]: [12-1] 2019-04-30 16:42:28: pid 17194: LOG: watchdog node state changed from [LOADING] to [INITIALIZING]
Apr 30 16:42:28 pgpool2 pgpool-II-10[17194]: [13-1] 2019-04-30 16:42:28: pid 17194: LOG: new watchdog node connection is received from "192.168.159.20:7321"
Apr 30 16:42:28 pgpool2 pgpool-II-10[17194]: [14-1] 2019-04-30 16:42:28: pid 17194: LOG: new node joined the cluster hostname:"192.168.31.20" port:9000 pgpool_port:5432
Apr 30 16:42:29 pgpool2 pgpool-II-10[17194]: [15-1] 2019-04-30 16:42:29: pid 17194: LOG: watchdog node state changed from [INITIALIZING] to [STANDBY]
Apr 30 16:42:29 pgpool2 pgpool-II-10[17194]: [16-1] 2019-04-30 16:42:29: pid 17194: LOG: successfully joined the watchdog cluster as standby node
Apr 30 16:42:29 pgpool2 pgpool-II-10[17194]: [16-2] 2019-04-30 16:42:29: pid 17194: DETAIL: our join coordinator request is accepted by cluster leader node "192.168.31.20:5432 Linux pgpool1.localdomain"
Apr 30 16:42:29 pgpool2 pgpool-II-10[17193]: [6-1] 2019-04-30 16:42:29: pid 17193: LOG: watchdog process is initialized
Apr 30 16:42:29 pgpool2 pgpool-II-10[17193]: [7-1] 2019-04-30 16:42:29: pid 17193: LOG: Pgpool-II parent process received watchdog quorum change signal from watchdog
Apr 30 16:42:29 pgpool2 pgpool-II-10[17194]: [17-1] 2019-04-30 16:42:29: pid 17194: LOG: new IPC connection received
Apr 30 16:42:29 pgpool2 pgpool-II-10[17193]: [8-1] 2019-04-30 16:42:29: pid 17193: LOG: watchdog cluster now holds the quorum
Apr 30 16:42:29 pgpool2 pgpool-II-10[17193]: [8-2] 2019-04-30 16:42:29: pid 17193: DETAIL: updating the state of quarantine backend nodes
Apr 30 16:42:29 pgpool2 pgpool-II-10[17194]: [18-1] 2019-04-30 16:42:29: pid 17194: LOG: new IPC connection received
Apr 30 16:42:29 pgpool2 pgpool-II-10[17194]: [19-1] 2019-04-30 16:42:29: pid 17194: LOG: new IPC connection received
Apr 30 16:42:29 pgpool2 pgpool-II-10[17193]: [9-1] 2019-04-30 16:42:29: pid 17193: LOG: we have joined the watchdog cluster as STANDBY node
Apr 30 16:42:29 pgpool2 pgpool-II-10[17193]: [9-2] 2019-04-30 16:42:29: pid 17193: DETAIL: syncing the backend states from the MASTER watchdog node
Apr 30 16:42:29 pgpool2 pgpool-II-10[17194]: [20-1] 2019-04-30 16:42:29: pid 17194: LOG: new IPC connection received
Apr 30 16:42:29 pgpool2 pgpool-II-10[17194]: [21-1] 2019-04-30 16:42:29: pid 17194: LOG: received the get data request from local pgpool-II on IPC interface
Apr 30 16:42:29 pgpool2 pgpool-II-10[17194]: [22-1] 2019-04-30 16:42:29: pid 17194: LOG: get data request from local pgpool-II node received on IPC interface is forwarded to master watchdog node "192.168.31.20:5432 Linux pgpool1.localdomain"
Apr 30 16:42:29 pgpool2 pgpool-II-10[17194]: [22-2] 2019-04-30 16:42:29: pid 17194: DETAIL: waiting for the reply...
Apr 30 16:42:29 pgpool2 pgpool-II-10[17193]: [10-1] 2019-04-30 16:42:29: pid 17193: LOG: master watchdog node "192.168.31.20:5432 Linux pgpool1.localdomain" returned status for 2 backend nodes
Apr 30 16:42:29 pgpool2 pgpool-II-10[17193]: [11-1] 2019-04-30 16:42:29: pid 17193: LOG: Setting up socket for 0.0.0.0:5432
Apr 30 16:42:29 pgpool2 pgpool-II-10[17193]: [12-1] 2019-04-30 16:42:29: pid 17193: LOG: Setting up socket for :::5432
Apr 30 16:42:29 pgpool2 pgpool-II-10[17194]: [23-1] 2019-04-30 16:42:29: pid 17194: LOG: new IPC connection received
Apr 30 16:42:29 pgpool2 pgpool-II-10[17195]: [7-1] 2019-04-30 16:42:29: pid 17195: LOG: 2 watchdog nodes are configured for lifecheck
Apr 30 16:42:29 pgpool2 pgpool-II-10[17195]: [8-1] 2019-04-30 16:42:29: pid 17195: LOG: watchdog nodes ID:0 Name:"192.168.31.21:5432 Linux pgpool2"
Apr 30 16:42:29 pgpool2 pgpool-II-10[17195]: [8-2] 2019-04-30 16:42:29: pid 17195: DETAIL: Host:"192.168.31.21" WD Port:9000 pgpool-II port:5432
Apr 30 16:42:29 pgpool2 pgpool-II-10[17195]: [9-1] 2019-04-30 16:42:29: pid 17195: LOG: watchdog nodes ID:1 Name:"192.168.31.20:5432 Linux pgpool1.localdomain"
Apr 30 16:42:29 pgpool2 pgpool-II-10[17195]: [9-2] 2019-04-30 16:42:29: pid 17195: DETAIL: Host:"192.168.159.20" WD Port:9000 pgpool-II port:5432
Apr 30 16:42:29 pgpool2 pgpool-II-10[17193]: [13-1] 2019-04-30 16:42:29: pid 17193: LOG: pgpool-II successfully started. version 4.0.3 (torokiboshi)
Apr 30 16:42:29 pgpool2 pgpool-II-10[17193]: [14-1] 2019-04-30 16:42:29: pid 17193: LOG: node status[0]: 0
Apr 30 16:42:29 pgpool2 pgpool-II-10[17193]: [15-1] 2019-04-30 16:42:29: pid 17193: LOG: node status[1]: 0
Apr 30 16:42:30 pgpool2 pgpool-II-10[17233]: [10-1] 2019-04-30 16:42:30: pid 17233: LOG: set SO_REUSEPORT option to the socket
Apr 30 16:42:30 pgpool2 pgpool-II-10[17233]: [11-1] 2019-04-30 16:42:30: pid 17233: LOG: creating socket for sending heartbeat
Apr 30 16:42:30 pgpool2 pgpool-II-10[17233]: [11-2] 2019-04-30 16:42:30: pid 17233: DETAIL: set SO_REUSEPORT
Apr 30 16:42:30 pgpool2 pgpool-II-10[17232]: [10-1] 2019-04-30 16:42:30: pid 17232: LOG: set SO_REUSEPORT option to the socket
Apr 30 16:42:30 pgpool2 pgpool-II-10[17232]: [11-1] 2019-04-30 16:42:30: pid 17232: LOG: creating watchdog heartbeat receive socket.
Apr 30 16:42:30 pgpool2 pgpool-II-10[17232]: [11-2] 2019-04-30 16:42:30: pid 17232: DETAIL: set SO_REUSEPORT
At this point, Pgpool should be properly configured. To test, log in as the postgres user on either of the Pgpool nodes and set the variables below:
export PGPORT=5432
export PGDATA=/postgres/data/pgdata
export PGHOSTADDR=10.12.1.111
Then enter the command psql. It should prompt for the postgres password; if it does and the connection succeeds, the configuration was successful.
-bash-4.2$ psql
Password:
psql (10.7, server 11.2)
WARNING: psql major version 10, server major version 11.
         Some psql features might not work.
Type "help" for help.

postgres=# select inet_server_addr();
 inet_server_addr
------------------
 10.12.1.61
(1 row)

postgres-# \quit
-bash-4.2$ psql
Password:
psql (10.7, server 11.2)
WARNING: psql major version 10, server major version 11.
         Some psql features might not work.
Type "help" for help.

postgres=# select inet_server_addr();
 inet_server_addr
------------------
 10.12.1.225
(1 row)

postgres=#
Connect again, as shown above, to verify that load balancing is working properly. As the output shows, successive connections were served by both nodes, so we can reach both instances. Pgpool sends write queries to the primary node and load-balances read-only queries across the backends according to their weights. The configuration of PostgreSQL with Pgpool is complete.
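As an optional final check, Pgpool's show pool_nodes command reports the status and role of each backend when issued through a Pgpool connection, here via the delegate IP (the column layout varies slightly between Pgpool versions):

psql -h 10.12.1.111 -p 5432 -U postgres -c 'show pool_nodes;'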
Conclusion
In this blog, we have installed and configured PostgreSQL and pgpool for high availability. We did not cover automatic failover, and typically you will not want to fail over automatically. My next blog will cover failover and recovery for PostgreSQL.
Please note: this blog contains code examples provided for your reference. All sample code is provided for illustrative purposes only. Use of information appearing in this blog is solely at your own risk. Please read our full disclaimer for details.