Maintenance
Maintenance is something that an operations team cannot avoid. Servers have to keep up with the latest software, hardware, and technology so that systems stay stable and run with the lowest possible risk, while making use of newer features to improve overall performance.
Undoubtedly, there is a long list of maintenance tasks that system administrators have to perform, especially on critical systems. Some tasks have to be performed at regular intervals, such as daily, weekly, monthly, or yearly. Others have to be done urgently, right away. Either way, a maintenance operation should not lead to a bigger problem, and it has to be handled with extra care to avoid any interruption to the business.
ClusterControl provides a maintenance mode where you can put an individual node or a whole cluster into maintenance. This prevents ClusterControl from raising alarms and sending notifications for the specified duration. You can set the maintenance period for a predefined time or schedule it in advance. You can also record the reason for the maintenance (RAM upgrade, OS patching, etc.), which is useful for auditing purposes. During maintenance mode, ClusterControl will not degrade the node, so the node’s state remains as is unless you perform an action that changes it.
Note
If automatic cluster/node recovery is enabled, ClusterControl will always recover a node/cluster regardless of the maintenance mode status. Don’t forget to disable cluster/node recovery to avoid ClusterControl interfering with your maintenance tasks.
Configuring maintenance mode
Maintenance mode can be configured from the ClusterControl UI or with the ClusterControl CLI tool, “s9s”.
Cluster-wide maintenance mode
Log in to your ClusterControl GUI → choose a database cluster → Actions → Schedule maintenance. This will bring up a panel where you can activate maintenance mode.
Node maintenance mode
Log in to your ClusterControl GUI → choose a database cluster → Nodes → select the node → Actions → Schedule maintenance. This will bring up a panel where you can activate maintenance mode.
Alarms and notifications are reactivated once the maintenance period is over, or when the operator explicitly disables maintenance mode by going to Nodes → select the node → Actions → Disable Maintenance Mode (for a node), or choose a database cluster → Actions → Disable Maintenance Mode (for a cluster).
Create a maintenance period for PostgreSQL node 10.35.112.21, starting at 05:44:55 AM and lasting one full day (cmon expects the begin and end times in UTC):
```bash
s9s maintenance \
--create \
--nodes=10.35.112.21:5432 \
--begin=2024-05-15T05:44:55.000Z \
--end=2024-05-16T05:44:55.000Z \
--reason='Upgrading RAM' \
--batch
```
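Because cmon expects UTC, it can help to build the `--begin` and `--end` values from an epoch timestamp instead of typing them by hand. The following is a minimal sketch, assuming GNU `date` is available; the helper names `utc_stamp` and `window_end` are hypothetical and not part of s9s. It only prints the two timestamps in the same format used in the example above and does not call s9s itself.

```bash
# Hypothetical helper (not part of s9s): format an epoch timestamp
# as a UTC string in the form used by the example above.
utc_stamp() {
    date -u -d "@$1" '+%Y-%m-%dT%H:%M:%S.000Z'
}

# Hypothetical helper: UTC end time of a window.
# $1 = start time as epoch seconds, $2 = duration in hours.
window_end() {
    utc_stamp "$(( $1 + $2 * 3600 ))"
}

# Build a 24-hour window starting now and print both values.
start=$(date +%s)
begin=$(utc_stamp "$start")
end=$(window_end "$start" 24)
echo "begin=$begin end=$end"
```

The printed values can then be passed to `--begin` and `--end` in the `s9s maintenance --create` command shown above.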
Create a new maintenance period for 192.168.1.121 which shall start tomorrow and finish an hour later:
```bash
s9s maintenance --create \
--nodes=192.168.1.121 \
--begin="$(date --date='now + 1 day' --utc '+%Y-%m-%d %H:%M:%S')" \
--end="$(date --date='now + 1day + 1 hour' --utc '+%Y-%m-%d %H:%M:%S')" \
--reason="Upgrading software."
```
List all nodes that are under a maintenance period:
```bash
s9s maintenance --list --long
```
Delete a maintenance period for UUID 70346c3:
```bash
s9s maintenance --delete --uuid=70346c3
```