Database Point-in-time Recovery

In this article, we explore how to perform a Database Point-in-time Recovery (PITR) using ClusterControl. Point-in-time recovery allows you to restore your database to a specific time, often just before a disruptive event (for example, accidental data deletion or corruption). By combining full backups with transaction logs (binlogs, WAL, or equivalent), ClusterControl can help you recover to a precise moment to minimize data loss.

Use cases

Point-in-time recovery is useful in scenarios such as:

Accidental Data Modification: A user or application drops a table or updates critical rows by mistake.
Data Corruption: Software or hardware errors that corrupt database pages or files, requiring you to roll back to a clean state.
Ransomware or Security Incidents: Restoring to the moment before an intrusion or malicious activity was detected.
Testing & Compliance: Validating that your backups and binlogs/WAL archives can reconstruct the database to any point in time to meet audit or compliance requirements.

Configuring point-in-time recovery

Enable transaction log archiving

Depending on your database type, you must ensure the transaction logs are retained long enough to allow PITR. Below are examples for MySQL-based and PostgreSQL-based systems:

MySQL/MariaDB

Make sure binary logging is enabled and that the retention is appropriately set in the MySQL configuration file (usually my.cnf):

[mysqld]
log_bin              = binlog
binlog_format        = ROW
expire_logs_days     = 7
server_id            = 100

Restart the MySQL/MariaDB server for changes to take effect.

PostgreSQL

Configure continuous WAL archiving by editing postgresql.conf:

wal_level = replica
archive_mode = on
archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'

Ensure sufficient storage space for WAL archives and that the retention policy aligns with your PITR requirements.
Restart PostgreSQL for changes to take effect.

Note

You can also enable binary logging from the ClusterControl UI.

Enable full backup

Before PITR can be performed, you need consistent base backups:

Choose a backup method

In the ClusterControl UI, create or scheduled a backup using a supported method (mysqldump, xtrabackup, pg_basebackup, etc).
Navigate to Backups on your cluster.
Create or schedule a backup job.
Verify the backup storage location (local or/and cloud).

Validate backups

Check that backups are running successfully (in Jobs or Backups sections).
Periodically restore test backups to ensure they are valid (ClusterControl Verify backup feature helps on this task).

Performing point-in-time recovery

Once full backups and transaction log archiving are in place, initiate a PITR from ClusterControl.

Using the ClusterControl GUI

Identify the desired recovery time
- Determine the timestamp just before an unwanted change or event.
- Alternatively, note the binlog or WAL position if you have that information.
Navigate to the Backups section
- Under the Backups section, locate a full backup that predates your desired recovery point.
- Select Restore.
Confirm how to restore the backup
- Restore on node or on standalone host.
Specify the recovery point
- Enable PITR and provide a timestamp or binlog/WAL sequence.
- Adjust any additional settings as needed (Temporary directory).
Initiate the PITR job
- ClusterControl will restore the base backup.
- After the base restore completes, it will replay the transaction logs or WAL files up until the specified time.
- Monitor the Jobs tab or Logs to track progress.

Using the ClusterControl CLI

Below is an example flow for using ClusterControl CLI for MySQL/MariaDB point-in-time recovery tasks:

List clusters and backups:
```
s9s backup --list --long
```
Identify the --cluster-id (CID) and --backup-id (ID) of your full backup.
Initiate PITR:
```
s9s backup \
        --restore \
        --cluster-id=2 \
        --backup-id=3 \
        --pitr-stop-time="2025-02-10 23:20:09" \
        --nodes="10.118.0.3:3306" \
        --log
```
Where:
- --cluster-id: ID for the cluster from the previous step.
- --backup-id: ID for the backup from the previous step.
- --pitr-stop-time: the timestamp in YYYY-MM-DD HH:MM:SS format to which you want to roll forward.
- --nodes: specify which node to restore on.
- --log: Watch the logs in real time.
Monitor the restore job:
```
s9s job --list --long
```
Check the job status to confirm when the PITR operation completes.

Verifying the recovery

After the point-in-time recovery job completes:

In the ClusterControl UI or CLI logs, ensure there are no errors or warnings.
Query your restored database to confirm data is as expected for the specified point in time.
Reconnect your application to the restored node/cluster and validate functionality.