High-Availability Solutions: Patroni Failover and Switchover

November 06, 2023

Introduction

Patroni is an open-source tool for managing replication and failover of a PostgreSQL cluster. It is a popular choice for implementing high availability because of the way it handles replication and its fast failover and switchover times. To quantify those times, we benchmarked failover and switchover performance within a Patroni cluster.

To learn more about Patroni: https://www.enterprisedb.com/docs/supported-open-source/patroni/.

Benchmark Specifications

The purpose of this benchmark is to measure the downtime of a Patroni cluster during two different scenarios:

  1. Switchover - one standby is gracefully promoted to leader.
  2. Failover - one standby is promoted to leader following a Postgres crash on the leader.

In each case, downtime is measured with traffic generated by pgbench.

Environment

The Patroni cluster is composed of:

  • 3 PostgreSQL nodes, all located in the same region, each also running an etcd node for the DCS (Distributed Configuration Store)
  • 1 PgBouncer pooler node configured with HAProxy

System Characteristics

The characteristics of the cloud hosts involved in the test are as follows:

AWS Instance Type    r5b.2xlarge
Operating System     Rocky Linux 8
vCPU                 8
Memory               64 GB

Postgres Configuration

The following are configuration parameters found in the postgresql.conf file:

shared_buffers = 16GB
checkpoint_completion_target = 0.93
checkpoint_timeout = 30min
max_wal_size = 200GB
effective_cache_size = 80GB
random_page_cost = 1
seq_page_cost = 1
effective_io_concurrency = 200

Patroni Configuration

The following are parameters found in each server's Patroni configuration file:

ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576

Downtime Measurement

In this benchmark, downtime is defined as the time during which the database system is not able to handle incoming SQL requests. Incoming requests are made through one HAProxy connection.

Downtime is measured with the help of a script that connects through the pooler instance and inserts records into a table as fast as possible.

Each record contains:

  • the current timestamp returned by the now() Postgres function;
  • the name of the Patroni node that actually performed the insert.

In case of connection loss, the script retries connecting until a new connection can be made.

With this, we can find:

  • The timestamp of the last record inserted before the switchover or failover happened;
  • The timestamp of the first record inserted after the switchover or failover happened.

Measurement Table

The following is the DDL used to create the ping table that records the timestamps:

CREATE TABLE ping(node TEXT, timestamp TIMESTAMPTZ DEFAULT NOW(), cur_lsn pg_lsn);
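The post does not include the measurement script itself; the following is a minimal sketch of what such a loop could look like, assuming Python with psycopg2, a placeholder connection string pointing at the pooler endpoint, and that cur_lsn is filled with pg_current_wal_lsn(). The node column is meant to hold the Patroni node name; inet_server_addr() stands in for it here.

# downtime_probe.py -- sketch of the measurement loop (not the original script).
# Assumptions: psycopg2 driver, placeholder DSN for the pooler endpoint,
# and the ping table created with the DDL shown above.
import time
import psycopg2

DSN = "host=pooler-host port=5000 dbname=bench user=bench"  # placeholder connection string

# The timestamp column is filled by its DEFAULT now(); the original script records the
# Patroni node name in the node column -- the backend server address stands in here.
INSERT = """
    INSERT INTO ping (node, cur_lsn)
    VALUES (inet_server_addr()::text, pg_current_wal_lsn())
"""

def run():
    conn = None
    while True:
        try:
            if conn is None or conn.closed:
                conn = psycopg2.connect(DSN, connect_timeout=2)
                conn.autocommit = True  # every insert is committed immediately
            with conn.cursor() as cur:
                cur.execute(INSERT)
        except psycopg2.Error:
            # Connection lost (switchover/failover in progress): retry until the
            # cluster accepts writes again.
            conn = None
            time.sleep(0.01)

if __name__ == "__main__":
    run()

The downtime for one test run is then the gap between the last row written before the promotion and the first row written after it.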

PgBouncer Settings

PgBouncer was run in transaction pooling mode (pool_mode = transaction).

HAProxy Settings

The following are the relevant portions of the HAProxy configuration file:

defaults
   retries 2
   timeout connect 1s
   timeout check 1s
   timeout server 30m
   timeout client 30m

listen postgres
   bind *:5000                    # listen address/port assumed; not shown in the original
   stick-table type ip size 1
   stick on dst
   option httpchk OPTIONS /primary
   http-check expect status 200
   default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
   # Backend server lines reconstructed for illustration: one per PostgreSQL node,
   # health-checked against the Patroni REST API (default port 8008).
   server postgres1 xx.x.x.xxx:5432 check port 8008
   server postgres2 xx.x.x.xxx:5432 check port 8008
   server postgres3 xx.x.x.xxx:5432 check port 8008
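The /primary health check is served by the Patroni REST API (default port 8008), which answers with HTTP 200 only on the current leader and 503 elsewhere, so HAProxy always routes traffic to the primary. The following is a small sketch that mimics the check by hand; the hostnames are placeholders, and it sends GET rather than the OPTIONS request HAProxy uses (Patroni's health endpoints respond to both).

# check_primary.py -- reproduce HAProxy's health check against the Patroni REST API.
# Assumption: the Patroni REST API listens on its default port 8008 on each node.
from urllib import request, error

for host in ("postgres1", "postgres2", "postgres3"):  # placeholder hostnames
    try:
        with request.urlopen(f"http://{host}:8008/primary", timeout=1) as resp:
            print(host, resp.status)   # 200 -> this node is the current leader
    except error.HTTPError as exc:
        print(host, exc.code)          # typically 503 on replicas
    except OSError as exc:
        print(host, "unreachable:", exc)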

Results

Switchover

A Patroni switchover is triggered by running the following command on the current leader:

$ patronictl switchover --candidate postgres2 --force

The following are the statistical results of ten switchover tests:

Min        15.049 ms
Max        15.354 ms
Average    15.199 ms
Std. Dev.   0.0989 ms
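The post does not show how each downtime sample is pulled out of the ping table. A minimal sketch, assuming psycopg2, a placeholder connection string, and that the downtime of a run is the largest gap between consecutive rows recorded during that run:

# downtime_report.py -- sketch of extracting one downtime sample from the ping table.
import psycopg2

GAP_QUERY = """
    SELECT max(gap) AS downtime
    FROM (
        SELECT "timestamp" - lag("timestamp") OVER (ORDER BY "timestamp") AS gap
        FROM ping
    ) AS gaps
"""

with psycopg2.connect("host=pooler-host port=5000 dbname=bench user=bench") as conn:  # placeholder DSN
    with conn.cursor() as cur:
        cur.execute(GAP_QUERY)
        print("downtime:", cur.fetchone()[0])

This returns the interval between the last insert handled by the old leader and the first insert handled after the promotion, which matches the definition of downtime given earlier.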

Failover

A Patroni failover is triggered by running the following command on the current leader, where the postgres user owns both the PostgreSQL and Patroni services:

$ sudo pkill -u postgres

The following are the statistical results of ten failover tests:

Min        15.037 ms
Max        15.347 ms
Average    15.231 ms
Std. Dev.   0.1215 ms

Patroni Timeline

Each failover or switchover promotes a new leader, which moves the PostgreSQL timeline (the TL column below) forward by one.

The following patronictl output shows the cluster topology as a switchover occurs:

# Before switchover happens
# postgres1 is leader with replicas postgres2 and postgres3
Current cluster topology
+ Cluster: main -------------------------------+----+-----------+
| Member    | Host       | Role    | State     | TL | Lag in MB |
+-----------+------------+---------+-----------+----+-----------+
| postgres1 | xx.x.x.xxx | Leader  | running   | 22 |           |
| postgres2 | xx.x.x.xxx | Replica | streaming | 22 |         0 |
| postgres3 | xx.x.x.xxx | Replica | streaming | 22 |         0 |
+-----------+------------+---------+-----------+----+-----------+

# Switchover occurs, postgres1 is stopped and postgres2 is new leader
+ Cluster: main -----------------------------+----+-----------+
| Member    | Host       | Role    | State   | TL | Lag in MB |
+-----------+------------+---------+---------+----+-----------+
| postgres1 | xx.x.x.xxx | Replica | stopped |    |   unknown |
| postgres2 | xx.x.x.xxx | Leader  | running | 22 |           |
| postgres3 | xx.x.x.xxx | Replica | running | 22 |         0 |
+-----------+------------+---------+---------+----+-----------+

# postgres1 is now replica caught up with cluster
# timeline moved forward from 22 to 23
+ Cluster: main -------------------------------+----+-----------+
| Member    | Host       | Role    | State     | TL | Lag in MB |
+-----------+------------+---------+-----------+----+-----------+
| postgres1 | xx.x.x.xxx | Replica | streaming | 23 |         0 |
| postgres2 | xx.x.x.xxx | Leader  | running   | 23 |           |
| postgres3 | xx.x.x.xxx | Replica | streaming | 23 |         0 |
+-----------+------------+---------+-----------+----+-----------+
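The timeline reported by patronictl can also be confirmed directly from PostgreSQL on any node. A small sketch, with the connection string as a placeholder (the value reflects the timeline of the most recent checkpoint):

# show_timeline.py -- confirm the timeline reported by patronictl directly from Postgres.
import psycopg2

with psycopg2.connect("host=pooler-host port=5000 dbname=bench user=bench") as conn:  # placeholder DSN
    with conn.cursor() as cur:
        cur.execute("SELECT timeline_id FROM pg_control_checkpoint()")
        print("current timeline:", cur.fetchone()[0])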

Conclusion

Both failover and switchover times in a Patroni cluster are consistent and quick. With a pooler and a proxy in place, downtime can be kept below one second. Using a proxy is recommended so that your cluster exposes a single endpoint for clients to connect to. To improve durability, we also recommend adding a standalone etcd server to the etcd cluster so that the DCS remains available during system crashes.
