Skip to content

PostgreSQL Streaming Replication

This guide will lead you through the process of setting up your first PostgreSQL streaming replication cluster (primary-replica), complete with an HAProxy load balancer, using ClusterControl. By following this tutorial, you'll have a fully functional database cluster that you can effortlessly monitor, manage, and scale.

PostgreSQL streaming replication is a built-in feature that allows data from a primary PostgreSQL server to be streamed in real-time to one or more replica servers. This architecture improves read scalability, offers data redundancy, and provides a basis for high availability when combined with a failover mechanism. This tutorial applies to major versions of PostgreSQL starting from 10 and above.

See also

Prerequisites

Before proceeding, ensure the following requirements are met:

  • ClusterControl is installed and running. If not, follow the instructions in Quickstart or use the Installer Script.
  • At least four hosts (bare-metal or virtual machines) with a supported OS:
    • One host for the PostgreSQL primary server.
    • Two hosts for the PostgreSQL replica servers.
    • One host for the load balancer (HAProxy).
  • SSH access to all servers involved.
  • All nodes must have internet access during the deployment stage.
  • NTP is configured and running on all hosts to keep their clocks synchronized.

Architecture

Below is a high-level architecture diagram of the final architecture:

flowchart TB
   a{{**Users/<br>clients/<br>apps**}} ==> |RW-5433<br>RO-5434| x[**lb1** <br>#40;HAProxy#41;<br>192.168.99.100] ==>|RW| d[(**db1**<br>Primary<br>192.168.99.101)] 
   x ==>|RO| e[(**db2**<br>Replica<br>192.168.99.102)] & f[(**db3**<br>Replica<br>192.168.99.103)]
   d -.-> |streaming replication| e & f
   d & e & f & x -.-|manages| y[/**ClusterControl server**<br>192.168.99.5/]
    subgraph Streaming Replication
        d
        e
        f
    end

Step 1: Set up SSH key-based authentication

  1. On the ClusterControl server, generate a new SSH key as root user:

    ssh-keygen -t rsa
    
  2. Copy the SSH public key to all target nodes (specify the password if prompted):

    ssh-copy-id -i /root/.ssh/id_rsa [email protected]  # lb1
    ssh-copy-id -i /root/.ssh/id_rsa [email protected]  # db1
    ssh-copy-id -i /root/.ssh/id_rsa [email protected]  # db2
    ssh-copy-id -i /root/.ssh/id_rsa [email protected]  # db3
    

    If the target node only requires a key-based authentication, you can use the -o flag to customize the SSH option:

    ssh-copy-id -i /root/.ssh/id_rsa -p 22 -o 'IdentityFile /root/myprivatekey.pem' [email protected]  # lb1
    ssh-copy-id -i /root/.ssh/id_rsa -p 22 -o 'IdentityFile /root/myprivatekey.pem' [email protected]  # db1
    ssh-copy-id -i /root/.ssh/id_rsa -p 22 -o 'IdentityFile /root/myprivatekey.pem' [email protected]  # db2
    ssh-copy-id -i /root/.ssh/id_rsa -p 22 -o 'IdentityFile /root/myprivatekey.pem' [email protected]  # db3
    
  3. Perform some tests and make sure you can execute the following commands from the ClusterControl server without any password prompt:

    ssh [email protected] "ls /root"
    ssh [email protected] "ls /root"
    ssh [email protected] "ls /root"
    ssh [email protected] "ls /root"
    

Step 2: Deploy a new cluster

  1. Open your browser and navigate to your ClusterControl server’s IP or domain name.

  2. From the ClusterControl dashboard, click on Deploy a cluster in the top-right corner and choose Create a database cluster.

  3. Choose "PostgreSQL Streaming" from the Database dropdown and the appropriate Vendor and Version (e.g., "PostgreSQL" and "16") from the wizard. Click Continue.

  4. In the Deploy PostgreSQL wizard, configure the cluster as follows:

    • Name: Enter a name for your cluster (e.g., My First PostgreSQL Cluster).
    • Tags: Enter one or more tags, separated by pressing "Enter" (e.g., production, replication, dc1).
    • SSH user: root
    • SSH user key path: /root/.ssh/id_rsa
    • SSH port: 22
    • SSH sudo password: (leave blank if using key-based auth)
    • Install software: On (default)
    • Disable firewall: Checked (default)
    • Disable SELinux/AppArmor: Checked (default)
    • Server data directory: /var/lib/pgsql (default varies by distro)
    • Server port: 5432 (default)
    • User: postgres
    • Password: Enter a secure password for the PostgreSQL postgres superuser
    • Repository: Use vendor repositories (default)
    • Enable SSL encryption: On (default)
    • Primary node:
      • Specify "192.168.99.101" as the primary node and press Enter.
    • Replica node:
      • Specify "192.168.99.102" as replica and press Enter.
      • Specify "192.168.99.103" as another replica and press Enter.
    • Wait until everything is green. If a red warning appears, inspect the error and fix it.
    • Turn "On" Synchronous Replication if required.
    • Proceed only if all nodes are reachable (shown in green).
    • Leave the toggle button "Off" if you don't require the extensions, else turn "On" "pgvector" or "PostGIS" if you want to enable them.
    • Review the summary. You can go back to modify any configuration. The settings persist until you exit the wizard.
  5. Click Finish to start the deployment.

  6. ClusterControl will now start provisioning the PostgreSQL streaming replication cluster. You can monitor the progress from Activity center, where you will see detailed logs of the deployment. This process can take a few minutes. Once the deployment is complete, the cluster will appear in the Home page.

Step 3: Monitor your cluster

After deployment, you are redirected to the Home page. This page provides an at-a-glance view of the cluster's status:

  • Cluster health: The Home page provides the cluster state.
  • Node health: Rollover on the honeycomb diagram and get a more detailed view of a particular node. You can also see more detailed histograms under ClusterControl GUI → Clusters → choose the cluster → Dashboards.
  • Recent alarms: View most recent alarms of your cluster.
  • Automatic recovery status: Green means ClusterControl will raise an alarm and perform recovery procedures if it detects the database service is down. Red means ClusterControl will raise an alarm but the recovery procedures will be skipped.
  • Topology viewer: Roll over on each cluster entry to see the summary of the cluster visualized in topology view to better understand your overall database state. Also available under ClusterControl GUI → Clusters → choose the cluster → Nodes → Topology.

Step 4: Import data

In a primary-replica (asynchronous) streaming replication setup, you must import the data on the primary, and changes will replicate automatically to the replicas.

To import data from a dump file on the primary (db1: 192.168.99.101), you can use psql command as below:

psql -U postgres -h 192.168.99.101 -f dumpfile.sql

Once the import completes and the replicas have caught up, you can proceed with normal read/write operations on the primary, while replicas remain in sync.

Step 5: Deploy a new load balancer

Integrating HAProxy with PostgreSQL improves high availability and load balancing. ClusterControl allows you to automate the deployment of HAProxy.

  1. Go to ClusterControl GUI → Clusters → locate the cluster → Actions → Add new → Load balancer → Create balancer → HAProxy → Continue.

  2. In the Create a load balancer wizard, configure HAProxy as below:

    • Server address: 192.168.99.100. Press "Enter" and wait until it turns green.
    • Policy: leastconn (default) (1)

      1. You can choose the policy that suits your requirement.
    • Listen port (read/write): 5433 (default)

    • Listen port (read-only): 5434 (default)
    • Install for read/write splitting (master-slave replication): On (default)
    • Disable firewall: Checked (default)
    • Disable SELinux/AppArmor: Checked (default)
    • Overwrite existing /usr/local/sbin/postgreschk_rw_split on targets: On (default)
    • Stats socket: /var/run/haproxy.socket (default)
    • Admin port: 9600 (default)
    • Admin user: admin
    • Admin password: Enter the password for the Admin user
    • Backend name (RW): haproxy_5433_rw (default)
    • Backend name (RO): haproxy_5434_ro (default)
    • Timeout server (in seconds): 10800 (default)
    • Timeout client (in seconds): 10800 (default)
    • Max connections frontend: 8192 (default)
    • Max connections backend (per instance): 1024
    • xinted allow connections from: 0.0.0.0/0 (default)
    • 192.168.99.101:5432 (Primary): On (default)
      • Advance options
        • Role: Active (default)
        • Connection address: External Address [192.168.99.101]
    • 192.168.99.102:5432 (Primary): On (default)
      • Advance options
        • Role: Active (default)
        • Connection address: External Address [192.168.99.102]
    • 192.168.99.103:5432 (Primary): On (default)
      • Advance options
        • Role: Active (default)
        • Connection address: External Address [192.168.99.103]
    • Review the summary. You can go back to any previous section to modify configurations if needed.
  3. Click Finish to deploy HAProxy.

  4. ClusterControl will now start provisioning the HAProxy instance. You can monitor the progress from Activity center, where you will see detailed logs of the deployment. This process can take a few minutes. Once the deployment is complete, the load balancer instance will appear in the Nodes page.

Tips

For production environment, it is recommended to have at least two load balancers to eliminate single-point of failure. You can deploy another HAProxy instance and tie them together with a virtual IP address. See Load Balancers.

Step 6: Connecting to the database cluster

Start by creating a new database user by going to ClusterControl GUI → Clusters → choose the cluster → Manage → DB Users → Create DB User. Ensure the Hostname is the IP address of the application/client/user. Under the Privileges section, set a grant on the database and tables and add more privileges for the user by clicking Add privileges if necessary. Click Create and the database user will be created and granted on all of the database nodes.

Tip

You can create a new database by going to ClusterControl GUI → Clusters → choose the cluster → Actions → Create database.

Next, the application/client/user can access the database cluster via:

  • Load balancer instance, 192.168.99.100 on port 5433 for read-write workloads. For read-only workloads, send the database connections to port 5434. These are load-balanced PostgreSQL connections via HAProxy.
  • If there are two or more load balancers, tie them together with a virtual IP address provided by Keepalived. Use the virtual IP address in your connection string instead of the load balancers' primary IP address.
  • For read-write workloads, send the database connections to the PostgreSQL primary server, 192.168.99.101 on port 5432. For read-only workloads, send the database connections to either of the PostgreSQL replica servers, 192.168.99.102 or 192.168.99.103 on port 5432.

Note

If a failover happens, 192.168.99.101 might not be running as primary, and you have to find the new primary from ClusterControl GUI (or by querying directly to the database server) and send the read-write workloads there. That is why having a load balancer is a better approach for your application.

Step 7: Enable automatic backups

To protect your data, set up automatic backups. ClusterControl supports both logical (pg_dumpall) and physical (pg_basebackup, pgBackRest) backups for PostgreSQL.

  1. Go to ClusterControl GUI → Clusters → choose the cluster → Backups to open the backup overview.

  2. Go to Create Backup → Schedule a Backup. The backup wizard appears.

  3. In the Create a backup schedule wizard:

    • Schedule name: e.g., Daily Backup
    • Cluster: (auto-selected)
    • Backup host: db3 (select one of the replicas to offload backup load)
    • Backup method: pg_dump or pg_basebackup
    • PITR enabled: On (default for other than pg_basebackup)
    • Upload backup to cloud: Off (default, can be enabled if needed)
    • Compression: On
    • Compression level: 6
    • Enable encryption: Off or On (optional)
    • Verify backup: Off (default)
    • Retention: On, configure how many backups to keep
    • Failover backup if node is down: Off (default)
    • Storage location: Store on controller (default)
    • Storage directory: /root/backups (default)
    • Backup subdirectory: BACKUP-%I (default)
    • Set backup schedule: Simple
    • Every: day at 2:00
    • Timezone: Your preferred timezone
    • Review the summary. You can always go back to any previous section to modify settings.
  4. Click Create to set the backup schedule.

ClusterControl will handle the scheduled backups automatically. Advanced backup features like partial backups, encryption, verification, and point-in-time recovery can also be configured.

Step 8: Configure alerts

To keep track of any issues or incidents in your cluster, it's important to set up alerting. ClusterControl supports sending alarms and alerts to email, web hooks and third-party notification services like Slack or Telegram. In this example, we are going to use email.

Firstly, configure the mail server. Go to ClusterControl GUI → Settings → You don't have a mail server configured. Configure now → SMTP Server. Fill up all necessary information about the SMTP server. You can also opt for Sendmail, however a mail transfer agent (sendmail, postfix or exim) must be installed on the ClusterControl server.

Once configured, we can configure the alert and recipients as below:

  1. Go to ClusterControl GUI → choose the cluster → Settings → Email Notifications.
  2. Choose a user group where you are belong to from the User group dropdown.
  3. Choose your email address to from the Users in the selected group dropdown.
  4. Click Enable.
  5. Set the Digest delivery time when you want a digested (summarized events) to be sent to you every day.
  6. Set all Critical events to "Deliver" (default), all Warning events to "Digest" and you may ignore the Info events.

This ensures you are always notified when critical issues arise and can act quickly to resolve them. You will also get an email with summarized warning events every day at the configured delivery time.

Tip

You can also configure alarms to be sent to third-party notification systems (Slack, Telegram), incident management systems (PagerDuty, ServiceNow, OpsGenie) or web hooks. See Integration → Notification Services.

Step 9: Manage your cluster and load balancer

ClusterControl offers many features for ongoing management:

  • PostgreSQL upgrades: Perform rolling major or minor version upgrade to keep the cluster up-to-date. This feature is available at ClusterControl GUI → choose cluster → Actions → Upgrades.
  • Node management: Start/stop/restart a node, promote a replica to primary, reboot host, etc. These features are available at ClusterControl GUI → choose cluster → Nodes → Actions.
  • Load balancer management: Manage and configure HAProxy backends, frontend ports, perform service restarts, and advanced routing. These features are available at ClusterControl GUI → choose cluster → Nodes → Actions (for HAProxy nodes).
  • Configuration management: Adjust PostgreSQL settings globally or on individual nodes. This feature is available at ClusterControl GUI → choose cluster → Manage → Configuration.
  • User management: Manage PostgreSQL roles and privileges. These features are available at ClusterControl GUI → choose cluster → Manage → DB Users.
  • Backup management: Create, schedule, restore, perform point-in-time recovery, upload backups to cloud, verify backups, and set retention. These features are available at ClusterControl GUI → choose cluster → Backups.
  • Maintenance mode: Activate or schedule maintenance for nodes. These features are available at ClusterControl GUI → choose cluster → Nodes → Actions → Schedule maintenance.
  • SSH console: Directly access nodes via a web-based SSH console. This feature is available at ClusterControl GUI → choose cluster → Nodes → Actions → SSH Console.

Step 10: Scale your cluster

A key advantage of ClusterControl is easy scalability. You can add more replica nodes to handle greater read workloads or to distribute backups, analytics, or failover targets.

  1. To add a new replica, go to ClusterControl GUI → choose cluster → Actions → Add new → Replica node.

  2. In the Create a replica node wizard, configure:

    • Port: 5432 (or your chosen port)
    • Data directory: /var/lib/pgsql
    • Install software: On
    • Disable firewall: Checked
    • Disable SELinux/AppArmor: Checked
    • Include in load balancer set: On (if you want it in HAProxy)
    • Synchronous replication: 0ff (default)
    • Instance name: Enter instance name
    • Version: Pre-selected by default
    • Primary node: Pick the primary node from the dropdown 192.168.99.101.
    • Node: Type the new replica node IP, then Enter and wait until green:
    • Confirm your deployment summary.
  3. Click Finish to trigger the job.

  4. Monitor the deployment from Activity center. When complete, the new replica node appears in Nodes page.

Conclusion

Congratulations! You've successfully deployed, monitored, managed, and scaled your first PostgreSQL Streaming Replication cluster with ClusterControl. Explore additional features like failover management, advanced monitoring, backup management, and more to further optimize your PostgreSQL infrastructure for high availability and performance.