
How it works

At its core, ClusterControl operates as a central management layer, integrating with multiple database technologies to streamline complex tasks and ensure high availability and performance. Using automation and intelligent monitoring, it allows you to deploy, scale, back up, and recover clusters with minimal manual intervention.

This page explains how ClusterControl functions behind the scenes, detailing the mechanisms and processes that make it a trusted solution for managing and optimizing database infrastructure efficiently and securely.

Components

ClusterControl consists of several components:

| Component | Package name | Description |
| --- | --- | --- |
| ClusterControl Controller (cmon) | clustercontrol-controller | The brain of ClusterControl. A backend service that performs automation, management, monitoring, and scheduling tasks on the managed nodes. |
| ClusterControl GUI | clustercontrol2 | A modern web user interface to visualize and manage clusters. It interacts with the CMON controller via remote procedure calls (RPC). |
| ClusterControl CLI | s9s-tools | An open-source command-line tool to manage and monitor clusters provisioned by ClusterControl. |
| ClusterControl SSH | clustercontrol-ssh | A web-based SSH client module implemented using JavaScript and WebSockets. |
| ClusterControl Notifications | clustercontrol-notifications | A service and interface module for integration with third-party messaging and communication tools. |
| ClusterControl Cloud | clustercontrol-cloud | A service and interface module for integration with cloud providers. |
| ClusterControl Cloud File Manager | clustercontrol-clud | A command-line interface module to interact with cloud storage objects. |

For large deployments, ClusterControl provides an additional suite, ClusterControl Multi-Controller (Ops-C), for provisioning and managing multiple ClusterControl instances at once. It consists of:

| Component | Package name | Description |
| --- | --- | --- |
| ClusterControl Multi-Controller | clustercontrol-mcc | A web user interface for centralized management of multiple ClusterControl Controller instances, also known as ClusterControl Operations Center (Ops-C). |
| ClusterControl Proxy | clustercontrol-proxy | A controller proxying service for ClusterControl Operations Center. |

Note

ClusterControl Multi-Controller (Ops-C) requires a separate installation, typically done after you already have one or more ClusterControl instances to manage.

ClusterControl also provides a number of integration components:

| Component | Description |
| --- | --- |
| Terraform Provider for ClusterControl | A Terraform provider that enables interaction with the ClusterControl database platform. |
| cmon_exporter | A Prometheus exporter for the ClusterControl Controller (cmon) service. |

See Components for more details on every ClusterControl component.

How ClusterControl works

ClusterControl components must reside on an independent node, apart from your database cluster. For example, if you have a three-node Galera Cluster, ClusterControl should be installed on a fourth node. The following is an example deployment of a Galera Cluster with ClusterControl:

ClusterControl sample deployment for 3 nodes

Once the cmon service is started, it loads all configuration options from /etc/cmon.cnf and /etc/cmon.d/cmon_*.cnf (if they exist) into the CMON database. Each configuration file represents a cluster with a distinct cluster ID. The controller starts by registering hosts, collecting information, and periodically performing check-ups and scheduled jobs on all managed nodes over SSH, connecting as ssh_user with the SSH key defined in ssh_identity in the CMON configuration file. Setting up passwordless SSH (key-based authentication) is therefore vital for ClusterControl's agentless management. For monitoring, ClusterControl uses an agent-based setup built on Prometheus and exporters, including cmonagent for query monitoring.
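
As a rough illustration, a per-cluster CMON configuration file might look like the sketch below. The values are hypothetical, and parameter names other than cluster_id, ssh_user, and ssh_identity (such as rpc_key) are assumptions that may vary by cluster type and ClusterControl version.

    # /etc/cmon.d/cmon_1.cnf -- hypothetical example for a cluster with ID 1
    cluster_id=1
    # OS user and private key used for SSH access to the managed nodes
    ssh_user=root
    ssh_identity=/root/.ssh/id_rsa
    # Token used by clients to authenticate against the controller's RPC interface
    rpc_key=ChangeMeToARandomToken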

To get started, access the ClusterControl GUI at https://{ClusterControl_host} (or use the ClusterControl CLI) and manage your database infrastructure from there. Begin by importing an existing database cluster or creating a new database server or cluster, on-premises or in the cloud, as long as the target nodes are reachable via SSH. ClusterControl supports managing multiple clusters and cluster types under a single ClusterControl instance, as shown in the following figure:
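
If you prefer the CLI route, the following s9s commands are a sketch of deploying and then listing a cluster. The IP addresses, vendor, version, and cluster name are placeholders; refer to the s9s-tools documentation for the full option set.

    # Deploy a new three-node Galera cluster (addresses and versions are placeholders)
    s9s cluster --create \
        --cluster-type=galera \
        --nodes="10.0.0.11;10.0.0.12;10.0.0.13" \
        --vendor=percona \
        --provider-version=8.0 \
        --os-user=root \
        --cluster-name="galera-prod" \
        --wait

    # List the clusters managed by this ClusterControl instance
    s9s cluster --list --long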

ClusterControl sample deployment for multiple clusters

The ClusterControl Controller exposes all functionality through remote procedure calls (RPC) on port 9500 (authenticated by an RPC token) and port 9501 (RPC with TLS), and integrates with a number of modules such as notifications (9510), cloud (9518), and web SSH (9511). The client components, ClusterControl GUI and ClusterControl CLI, interact with these interfaces to retrieve monitoring data (cluster load, host status, alarms, backup status, etc.) or to send management commands (add/remove nodes, run backups, upgrade a cluster, etc.).
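
As a sketch of talking to the controller directly, the following curl calls authenticate against the RPC v2 interface and then retrieve cluster information. The endpoint paths, operation names, and credentials shown here are assumptions that may differ between versions; -k skips certificate verification, which is typically needed with the default self-signed certificate.

    # Authenticate and store the session cookie
    curl -sk -c /tmp/cmon-session.txt \
        -H "Content-Type: application/json" \
        -X POST "https://{ClusterControl_host}:9501/v2/auth" \
        -d '{"operation": "authenticateWithPassword", "user_name": "admin", "password": "secret"}'

    # Reuse the session cookie to retrieve information about all clusters
    curl -sk -b /tmp/cmon-session.txt \
        -H "Content-Type: application/json" \
        -X POST "https://{ClusterControl_host}:9501/v2/clusters" \
        -d '{"operation": "getAllClusterInfo"}'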

The following diagram illustrates the internal architecture of ClusterControl and how the components interact with each other:

flowchart TD
    A[/**User**/] --> Z{ClusterControl<br>clients}
    Z --> B(**ClusterControl GUI**)
    Z --> C(**ClusterControl CLI**)
    Z --> D(**Terraform provider for ClusterControl**)
    B -->|RPC| E[**ClusterControl Controller<br>Port: 9501**]
    C -->|RPC| E
    D -->|RPC| E
    F[[ClusterControl cloud module<br>Port: 9518]] <--> B
    G[[ClusterControl web SSH module<br>Port: 9511]] <--> B
    H[[ClusterControl notification module<br>Port: 9510]] <--> B
    E --> |MySQL| J[(CMON DB)]
    K@{ shape: procs, label: "**Database and load balancer nodes**"}
    E -..-> |SSH| K
    E <--> H
    G <-..-> |SSH| K
    F ----> L[/**Cloud providers**/]
    E --> M[(Prometheus)]
    B --> |PromQL| M
    M ----> |scrape| K
    H ----> N[/**Third-party<br>providers**/]

ClusterControl has minimal performance impact and will not cause any downtime to your database server or cluster. In fact, it will perform automatic recovery (if enabled) when it finds a failed database node or cluster. See Automatic Recovery.
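
Automatic recovery can also be toggled per cluster from the CLI. This is a sketch assuming the --enable-recovery and --disable-recovery options available in recent s9s-tools releases; the cluster ID is a placeholder.

    # Temporarily disable automatic recovery for cluster ID 1, e.g. during maintenance
    s9s cluster --disable-recovery --cluster-id=1 --log

    # Re-enable automatic recovery afterwards
    s9s cluster --enable-recovery --cluster-id=1 --log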

Management and automation operations

ClusterControl performs management and deployment jobs by pushing remote commands over SSH to the target nodes. Users only need to install the ClusterControl Controller package on the ClusterControl host and make sure that SSH key-based authentication and the CMON database user GRANTs are properly set up on each of the managed hosts. This mechanism requires no agent, which simplifies the configuration of the whole infrastructure. The agent-based mode of operation is only supported for monitoring jobs, as described in the next section.
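
A minimal sketch of preparing key-based SSH from the ClusterControl host to a managed node is shown below; the key path, user, and IP address are assumptions and should match the ssh_user and ssh_identity values in the CMON configuration.

    # On the ClusterControl host: generate a key pair if one does not already exist
    ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa -N ""

    # Copy the public key to each managed node (repeat per node)
    ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.0.0.11

    # Verify passwordless access
    ssh -i ~/.ssh/id_rsa root@10.0.0.11 "hostname"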

Monitoring operation

For monitoring, ClusterControl uses Prometheus and exporters to store time-series monitoring data. For query monitoring, ClusterControl uses either a query agent (ClusterControl Agent, cmonagent) or SSH pulling, depending on the cluster type. Other operations like management, scaling, and deployment are performed through an agentless approach, as described in Management and automation operations.

Info

ClusterControl used to support both agentless (SSH sampling) and agent-based monitoring. However, since v1.9.8, ClusterControl defaults to agent-based monitoring using Prometheus. Agentless monitoring is no longer recommended.

Monitoring tools

ClusterControl uses a Prometheus server (default port 9090) to store time-series monitoring data, and every monitored node is configured with at least three exporters (depending on the node's role):

| Node type | Exporter name | Port | Package |
| --- | --- | --- | --- |
| All | Process exporter | 9011 | process-exporter |
| All | Node/system metrics exporter | 9100 | node_exporter |
| MySQL/MariaDB | MySQL exporter | 9104 | mysqld_exporter |
| MongoDB | MongoDB exporter | 9216 | mongodb_exporter |
| PostgreSQL/TimescaleDB | Postgres exporter | 9187 | postgres_exporter |
| Redis | Redis exporter | 9121 | redis_exporter |
| SQL Server | SQL exporter | 9999 | sql_exporter |
| Elasticsearch | Elasticsearch exporter | 9114 | elasticsearch_exporter |
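
To spot-check that an exporter is serving metrics, you can query its HTTP endpoint directly. The host below is a placeholder and the ports follow the table above.

    # Node/system metrics exporter (present on all monitored nodes)
    curl -s http://10.0.0.11:9100/metrics | head

    # MySQL exporter on a MySQL/MariaDB node
    curl -s http://10.0.0.11:9104/metrics | grep -m 5 '^mysql_'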

On every monitored host, ClusterControl configures and daemonizes the exporter processes using systemd. An Internet connection is recommended when installing the necessary packages so that the Prometheus deployment can be automated. For offline installation, the packages must be pre-downloaded into /var/cache/cmon/packages on the ClusterControl node; for the list of required packages and download links, refer to /usr/share/cmon/templates/packages.conf. Apart from the Prometheus scrape process, ClusterControl also connects to the process exporter via HTTP calls to determine the process state of the node, and uses PromQL to visualize and trend the monitoring data.
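
The following sketch shows how you might verify the exporter services and the offline package cache; the exact systemd unit names are assumptions and may differ between versions.

    # On a monitored node: list exporter units and check one of them
    systemctl list-units --type=service | grep -i exporter
    systemctl status node_exporter

    # On the ClusterControl node: packages pre-downloaded for offline installation,
    # and the template listing the required packages and download links
    ls -l /var/cache/cmon/packages
    cat /usr/share/cmon/templates/packages.conf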

Note

ClusterControl depends on a working Prometheus for accurate reporting on management and monitoring data. Therefore, Prometheus and exporter processes are managed by the internal process manager thread. A non-working Prometheus will have a significant impact on the CMON process.

Since ClusterControl allows multiple database instances per host (only for PostgreSQL-based clusters), it avoids port conflicts by automatically incrementing the exporter port for every additional instance of the same monitored process. For example, if a host runs two PostgreSQL instances (ports 5432 and 5437), ClusterControl configures the first instance's exporter on the default port, 9187, and the second instance's exporter on port 9188, incremented by one.
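
To confirm which ports the exporters ended up on, you can inspect the listening sockets on the host; the two-instance layout below mirrors the example above.

    # One postgres_exporter per PostgreSQL instance: 9187 for the first, 9188 for the second
    ss -ltnp | grep -E ':(9187|9188)'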

How ClusterControl Ops-C works

For large deployments, it is possible to have multiple ClusterControl instances managing your fleet of database clusters. For example, for 1000+ database nodes, you will probably need around 5 to 8 ClusterControl instances (depending on the hardware sizing) to manage and automate the whole set. You can use ClusterControl Multi-Controller, also known as Ops-C (short for "Operations Center"), as a centralized management system to oversee all of these clusters.

The ClusterControl Ops-C GUI is accessible via a browser on port 19052, served by a service called ccmgr (provided by the ClusterControl Proxy package). It connects to multiple ClusterControl instances to retrieve status and information about all clusters managed by every ClusterControl Controller, then aggregates and visualizes this data in real time, giving you a centralized view of what is happening without having to locate and check each corresponding ClusterControl Controller.
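
To verify that the Ops-C front end is up, you can check the proxying service and its listening port on the Ops-C host. The systemd unit name ccmgr and the hostname below are assumptions for this sketch.

    # Confirm the ccmgr service is running and listening on port 19052
    systemctl status ccmgr
    ss -ltnp | grep 19052

    # The Ops-C GUI should then be reachable in a browser at a URL such as:
    #   https://opsc.example.com:19052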

The following diagram shows how ClusterControl Ops-C communicates with ClusterControl instances:

flowchart TD
    A[/User/] ----> B[**ClusterControl Proxy - ccmgr**<br>Port: 19052]
    C[**ClusterControl Ops-C**]
    D[**ClusterControl Controller #1**<br>/api/v2]
    F[**ClusterControl Controller #N**<br>/api/v2]
    E@{ shape: procs, label: "**Database/Load balancer nodes**"}
    G@{ shape: procs, label: "**Database/Load balancer nodes**"}
    C <--> |served by| B
    B --> |RPC| D
    B --> |RPC| F
    D --> E
    F --> G

ClusterControl Ops-C can run standalone or be co-located on one of the ClusterControl instances. Apart from the standard ClusterControl user authentication method, it also supports authentication against an existing directory service such as LDAP or Active Directory. Simply add a controller by specifying the username and password of a ClusterControl user (which must belong to the admins group) and you are good to go.

See Also