
Elasticsearch High-Availability Cluster

This guide walks you through setting up your first high-availability Elasticsearch cluster using ClusterControl. By following this tutorial, you’ll have a fully functional high-availability Elasticsearch cluster that you can monitor, manage, and later scale if your requirements change.

A high-availability Elasticsearch cluster keeps your service available even when individual nodes, racks, or whole zones go down. It provides no single point of failure (SPOF), data redundancy, automatic failover and self-healing (resiliency), zone/rack awareness, rolling maintenance and upgrades, horizontal scalability, a consistent cluster state, and durable snapshots with fast restore.

This tutorial applies to the deployment of a database cluster for Elasticsearch by Elastic.

Prerequisites

Before proceeding, ensure you have:

  • ClusterControl installed and running. If not, follow the instructions in Quickstart or use the Installer Script.
  • At least six hosts (bare-metal or virtual machines):
    • One for the ClusterControl server.
    • Five for the Elasticsearch nodes (three master-data and two data nodes in this example).
  • SSH access to all servers.
  • Internet access on the database hosts to install required packages.
  • Network Time Protocol (NTP) configured and running on all hosts to keep their clocks synchronized.

Architecture

Below is a simple flow diagram showing how our deployment is managed and monitored with ClusterControl while the client/user interacts with the Elasticsearch nodes (master/data nodes), emphasizing a high-availability cluster.

%%{init: { "flowchart": { "wrap": false, "subGraphTitleMargin": {"top": 20, "bottom": 30} }}}%%
graph TD
    %% ------------- Groups -------------
    subgraph Client/User
        C[Client / User]
    end

    %% ------------- Elasticsearch cluster -------------
    subgraph EST["ES Cluster (3 – 5 HA nodes)"]
        direction LR
        ES1[(Node 1)]
        ES2[(Node 2)]
        ES3[(Node 3)]
        ES4[(Node 4)]
        ES5[(Node 5)]
    end

    %% ------------- ClusterControl -------------
    subgraph ClusterControl
        CC[ClusterControl]
    end

    %% ------------- Traffic -------------
    C -- "Search / Index<br>requests" --> ES1
    C --> ES2
    C --> ES3
    C -.-> ES4
    C -.-> ES5

    %% ------------- Monitoring / Mgmt -------------
    CC ===|Monitors & manages| ES1
    CC === ES2
    CC === ES3
    CC === ES4
    CC === ES5

    %% ------------- Styling -------------
    style ES4 fill:#f0f0f0,stroke-dasharray:5 5
    style ES5 fill:#f0f0f0,stroke-dasharray:5 5

Elasticsearch in ClusterControl is deployed without any additional supporting load balancers. Elasticsearch has built-in load-balancing capability, but that is outside the scope of this topic.

In the diagram above, the nodes ES1, ES2, ES3, ES4, and ES5 have the following IP addresses and roles in this example deployment:

  • ES1: 192.168.40.70 (master-data)
  • ES2: 192.168.40.71 (master-data)
  • ES3: 192.168.40.72 (master-data)
  • ES4: 192.168.40.73 (data)
  • ES5: 192.168.40.74 (data)
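Three master-eligible nodes are the minimum for real high availability, because master election requires a majority of the master-eligible nodes. The quorum arithmetic can be illustrated as follows (a quick sketch, not part of the deployment itself):

```shell
# Quorum = majority of master-eligible nodes: floor(n/2) + 1
for masters in 3 5 7; do
  quorum=$(( masters / 2 + 1 ))
  tolerated=$(( masters - quorum ))
  echo "masters=$masters quorum=$quorum tolerated_failures=$tolerated"
done
```

With the three master-data nodes above (ES1, ES2, ES3), the cluster keeps electing a master as long as two of them remain reachable.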

Step 1: Set up SSH key-based authentication

  1. On the ClusterControl server, generate a new SSH key as the root user:

    ssh-keygen -t rsa
    

    Copy the public key to your first Elasticsearch node, ES1 (replace 192.168.40.70 with your node’s IP/hostname):

    ssh-copy-id -i /root/.ssh/id_rsa root@192.168.40.70
    

    If the target node uses a custom SSH key or port, you can add options:

    ssh-copy-id -i /root/.ssh/id_rsa -p 22 -o 'IdentityFile /root/myprivatekey.pem' root@192.168.40.70
    

    For a more advanced setup where a non-root user (for example, mymainacc) is only allowed key-based access (password authentication disabled) but has sudo privileges, you can run the following as the root OS user:

    [root@pupnode7 ~]# ssh -i /home/mymainacc/.ssh/id_rsa mymainacc@192.168.40.70 "sudo bash -c '
    umask 077
    mkdir -p /root/.ssh
    cat >> /root/.ssh/authorized_keys
    '" < ~/.ssh/id_rsa.pub
    

    This copies your root public key to the target node that you will include in the Elasticsearch cluster deployment.

  2. Test passwordless SSH from the ClusterControl server:

    ssh root@192.168.40.70 "stat \$PWD/"

    If the command returns the directory status with no password prompt, you're set.

  3. Apply the same steps to the rest of the nodes: ES2, ES3, ES4, and ES5.
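The per-node key distribution can be scripted. The loop below prints the ssh-copy-id command for each example node; it is shown in dry-run form (remove the echo to actually distribute the key):

```shell
# Dry run: print the key-distribution command for every node in the example cluster
for ip in 192.168.40.70 192.168.40.71 192.168.40.72 192.168.40.73 192.168.40.74; do
  echo ssh-copy-id -i /root/.ssh/id_rsa "root@${ip}"
done
```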

Step 2: Deploy a new cluster (high-availability)

  1. Open a web browser and go to the ClusterControl server’s IP or hostname.

  2. On the ClusterControl dashboard, click Deploy a cluster (top-right) → Create a database cluster. This opens the Deploy cluster wizard.

  3. Select Elasticsearch from the Database dropdown. Also choose your desired version from the Version dropdown. Click Continue.

  4. In the Deploy Elasticsearch wizard, configure the database cluster as below:

    • Name: For example, Elasticsearch-HA.
    • Tags: (Optional) e.g., ha, production, dc1.
    • SSH user: root
    • SSH user key path: /root/.ssh/id_rsa (ClusterControl will also autofill this field)
    • SSH port: 22 (default port)
    • SSH sudo password: (leave blank if you rely on key-based auth)
    • Install software: On (default)
    • Disable firewall: Checked (default)
    • Disable SELinux/AppArmor: Checked (default)
    • HTTP port: 9200 (disabled text field with the default port specified)
    • Transport port: 9300 (disabled text field with the default port specified)
    • Admin user: admin
    • Admin password: Password to be assigned to the database admin user
    • Repository: Use vendor repositories (default)
    • Eligible master: (Fill in the IP/hostname or FQDN of the master nodes. In this exercise, input ES1, ES2, and ES3)
    • Data nodes: (Fill in the IP/hostname or FQDN of the data nodes. In this exercise, input ES4 and ES5)
    • Repository name: (Fill in the repository name you want. For example, es-s9s-repo)
    • Storage host: (Click the drop-down field and choose your desired node to host the NFS shared directory using IP/hostname or FQDN entry)
    • Default snapshot location: (Path of the shared filesystem used to store your snapshots. For example, /mnt/data/backups/es-snapshot-repositories)
    • Configure shared filesystem: On (default)
    • Review your configuration. You can go back and adjust if necessary.
  5. Click Finish to start deployment.

  6. ClusterControl will now install and configure your Elasticsearch High-Availability (HA) Cluster. You can track progress in the Activity center. After a few minutes, your Elasticsearch HA Cluster will appear on the Home page.
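Once the job completes, it is worth confirming that all five nodes joined and every shard is allocated. The sketch below shows the health call (commented out, since the CA path and admin password depend on your deployment) and extracts the status field from a sample response:

```shell
# The real call would be something like (admin credentials from the wizard above):
#   curl --cacert /etc/elasticsearch/certs/elasticsearch-ca.pem \
#        -u admin:yourpassword 'https://192.168.40.70:9200/_cluster/health?pretty'
# A healthy five-node cluster returns a response shaped like this sample:
cat > /tmp/health.json <<'EOF'
{
  "cluster_name" : "Elasticsearch-HA",
  "status" : "green",
  "number_of_nodes" : 5,
  "unassigned_shards" : 0
}
EOF
# "green" means every primary and replica shard is allocated
grep -o '"status" : "[a-z]*"' /tmp/health.json
```

A yellow status means some replicas are unassigned; red means at least one primary shard is missing.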

Step 3: Monitor your cluster

Once deployed, you’ll see:

  • Cluster health: The Home page provides the cluster state.
  • Node health: Hover on the honeycomb diagram or check the Nodes tab. You can also see more detailed histograms under ClusterControl GUI → Clusters → choose the cluster → Dashboards.
  • Recent alarms: Any triggered alarms will appear if there are configuration or resource issues.
  • Automatic recovery status: If enabled, ClusterControl can attempt to restart a crashed Elasticsearch node automatically.
  • Topology viewer: You’ll see the topology of your five-node cluster.

Step 4: Import data

  • You can import data into the cluster in a variety of ways:

    NDJSON stands for newline-delimited JSON. For example, you can load a Wikimedia Foundation CirrusSearch dump as follows.

    First, add the index action lines to the file:

    $ wget https://dumps.wikimedia.org/other/cirrussearch/20250414/abwiki-20250414-cirrussearch-content.json.gz
    $ gzip -d abwiki-20250414-cirrussearch-content.json.gz
    $ jq -nc --arg idx abwiki '
      foreach inputs as $doc (0;                # counter starts at 0
        . + 1;                                  # next counter value
        ({index:{_index:$idx,_id:.}}),           # emit meta line
        $doc                                     # emit original line
      )
    '  abwiki-20250414-cirrussearch-content.json > abwiki.ndjson
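The resulting abwiki.ndjson alternates one action line with one document line, which is exactly the shape the _bulk API expects (illustrative values):

```json
{"index":{"_index":"abwiki","_id":1}}
{"title":"First sample page","text":"document body"}
{"index":{"_index":"abwiki","_id":2}}
{"title":"Second sample page","text":"another document body"}
```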
    
    Then load it with the _bulk API:

    curl --cacert /etc/elasticsearch/certs/elasticsearch-ca.pem \
         -u elastic-s9s:myPassw0rd \
         -H 'Content-Type: application/x-ndjson' \
         --data-binary @abwiki.ndjson \
         -X POST 'https://192.168.40.72:9200/_bulk?pretty'
    

    Alternatively, you can use elasticdump; install elasticdump first.

    For example, if you migrated from another host (in this example, 192.168.10.100), you can dump your data using elasticdump as follows:

    ~/node_modules/elasticdump/bin/elasticdump \
         --cert /etc/elasticsearch/certs/elasticsearch-ca.pem \
         --input=https://elastic-s9s:[email protected]:9200/abwiki \
         --output=/backups/abwiki.json --type=data \
         --limit=10000  
    

    Then create an empty index on your target node (in this example, 192.168.40.72):

    curl --cacert /etc/elasticsearch/certs/elasticsearch-ca.pem \
         -u elastic-s9s:myPassw0rd \
         -H 'Content-Type: application/json' -X PUT 'https://192.168.40.72:9200/abwiki_play' \
         -d '{ "settings": { "number_of_replicas": 1 } }'
    

    Then restore the dump to the target node:

    NODE_TLS_REJECT_UNAUTHORIZED=0 ~/node_modules/elasticdump/bin/elasticdump \
      --input=/backups/abwiki.json \
      --output=https://elastic-s9s:[email protected]:9200/abwiki_play \
      --type=data \
      --limit=10000 \
      --concurrency=4
    

    NODE_TLS_REJECT_UNAUTHORIZED=0 disables TLS certificate verification; pass it only when the target uses a self-signed certificate.

For a high-availability cluster, make sure your indices are replicated so there is always another copy in case the node holding a primary shard goes down.
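Replication is controlled per index by the number_of_replicas setting. For the five-node example above, index settings like the following (a hypothetical example, applied at index creation or later through the _settings endpoint) keep two extra copies of every shard:

```json
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 2
  }
}
```

With two replicas per primary, the index stays fully available even if two of the shard-holding nodes fail at once.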

Step 5: Connect to the database

Your application or client can connect directly to any of your nodes (master or data nodes). In this example, we connect to node ES1:

  • Host: 192.168.40.70
  • Port: 9200
  • User/Password: The credentials specified during deployment. You can also inspect the file /etc/cmon.d/cmon_$CID.cnf where $CID is the cluster id of your Elasticsearch deployment.

No load balancer or additional ports are involved.
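A quick smoke test of the connection is a simple authenticated request. The command is shown in dry-run form (prefixed with echo) because the CA path and credentials depend on your deployment; remove the echo to run it for real:

```shell
ES_HOST=192.168.40.70   # ES1 from the example above
ES_PORT=9200
# Remove "echo" to actually query the cluster (credentials from Step 2)
echo curl --cacert /etc/elasticsearch/certs/elasticsearch-ca.pem \
     -u admin:yourpassword \
     "https://${ES_HOST}:${ES_PORT}/_cluster/health?pretty"
```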

Step 6: Enable automatic backups

Elasticsearch does not offer backups in the traditional sense. Instead, it uses snapshots, a more capable and robust way to create a consistent backup copy. Snapshots are the recommended method for creating consistent point-in-time backups of an entire cluster or specific indices, and they are native to Elasticsearch.
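Under the hood, the snapshots land in a snapshot repository of type fs backed by the shared filesystem configured in Step 2. Conceptually, the repository ClusterControl registers corresponds to a definition like this (path taken from the deployment example):

```json
{
  "type": "fs",
  "settings": {
    "location": "/mnt/data/backups/es-snapshot-repositories"
  }
}
```

Every master and data node must be able to reach this path, which is why the wizard provisions it as an NFS share.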

For a high-availability cluster, it is very important to always have a reliable backup copy of your data in case disaster strikes. ClusterControl makes it easy to schedule backups automatically.

  1. Go to ClusterControl GUI → choose the cluster → Backups.

  2. Click Create Backup → Schedule a Backup. The Create a backup schedule wizard will open and proceed to configure your backup as below:

    • Schedule name: Daily ES-HA cluster snapshot
    • Cluster: (defaults to your Elasticsearch HA cluster)
    • Repository: (defaults to the repository you created during deployment)
    • Backup method: elasticsearch-snapshot (default) [disabled text field]
    • Retention: On (default)
    • Retention [textfield]: 4 (Set to your desired number of days to retain backup)
    • Set backup schedule: Simple
    • Every: day at 02:00
    • Timezone: select your local timezone
    • Verify all settings. You can go back to adjust if needed.
  3. Click Create to schedule it.

ClusterControl will now automatically perform your backups. You may also restore your backups from the elasticsearch snapshots in the future if necessary for disaster recovery.

Step 7: Configure alerts

To keep track of any issues or incidents in your cluster, it's important to set up alerting. ClusterControl supports sending alarms and alerts to email, web hooks and third-party notification services like Slack or Telegram. In this example, we are going to use email.

Firstly, configure the mail server. Go to ClusterControl GUI → Settings → You don't have a mail server configured. Configure now → SMTP Server. Fill in all necessary information about the SMTP server. You can also opt for Sendmail; however, a mail transfer agent (sendmail, postfix, or exim) must be installed on the ClusterControl server.

Once configured, we can configure the alert and recipients as below:

  1. ClusterControl GUI → choose the cluster → Settings → Email Notifications.
  2. Select a User group (your group) from the User group dropdown.
  3. Select your email address from the Users in the selected group dropdown.
  4. Click Enable.
  5. Set the Digest delivery time to when you want a daily digest (summarized events) to be sent to you.
  6. Set all Critical events to "Deliver" (default), all Warning events to "Digest" and you may ignore the Info events.

This ensures timely notifications when something goes wrong.

Tip

You can also configure alarms to be sent to third-party notification systems (Slack, Telegram), incident management systems (PagerDuty, ServiceNow, OpsGenie) or web hooks. See Integration → Notification Services.

Step 8: Manage your node

ClusterControl provides monitoring for your cluster and a system overview that displays workloads based on metrics. Once your Elasticsearch cluster is deployed, agents using Prometheus exporters are installed to gather metrics and provide more granular monitoring of your cluster and system workload.

Apart from the monitoring, you can manage your node with the most available options for this cluster:

  • Database node management: Start, stop, or restart a node, or reboot the host. These features are available at ClusterControl GUI → choose the cluster → Nodes → Actions.
  • Configuration management: Perform database configuration changes globally. This feature is available at ClusterControl GUI → choose the cluster → Manage → Configuration.
  • Backup management: Create, schedule, and restore backups, and store snapshots in an off-cluster location such as AWS S3 or any S3-compatible storage, with a configurable retention period for your backup snapshots. These features are available at ClusterControl GUI → choose the cluster → Backups → Actions and ClusterControl GUI → choose the cluster → Backups → More.
  • Maintenance management: Activate, deactivate, annotate, and schedule maintenance mode for all nodes. This feature is available at ClusterControl GUI → choose the cluster → Nodes → Actions → Schedule maintenance.
  • SSH console: Access your nodes directly from the ClusterControl GUI via the web SSH console. This feature is available at ClusterControl GUI → choose the cluster → Nodes → Actions → SSH Console.

Step 9: Scale your cluster

One of the key benefits of using ClusterControl is the ease of scaling your cluster, either by scaling out or scaling in (horizontal scaling). Removing nodes (scaling in) can be done from ClusterControl GUI → choose the cluster → Nodes → Actions → Remove node. In this section, we focus on adding a node, that is, scaling out.

When scaling out in Elasticsearch using ClusterControl, you are required to choose which role the new node carries. The available roles are:

  • master-data
  • master
  • data
  • coordinator

Adding a node to your cluster increases its availability and allows more sharding or replication of your data. However, a large number of nodes also incurs penalties such as coordination overhead that can become a bottleneck, so plan ahead and make sure the scaling of your cluster is properly set up.
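When planning how far to scale, remember that every replica multiplies the shard count. A back-of-the-envelope calculation (illustrative numbers only):

```shell
# Total shard copies = primaries * (1 + replicas), spread across the data-holding nodes
primaries=3
replicas=2
data_nodes=5
total=$(( primaries * (1 + replicas) ))
per_node=$(( (total + data_nodes - 1) / data_nodes ))   # ceiling division
echo "total_shard_copies=$total approx_shards_per_node=$per_node"
```

Adding data nodes lowers the per-node shard load, while adding replicas raises the total number of copies the cluster must maintain.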

Lastly, ClusterControl assumes that the new node you want to add meets the requirements described under Prerequisites and is configured with proper SSH key-based authentication as shown in Step 1: Set up SSH key-based authentication.

Let's go over how you can scale out your Elasticsearch cluster:

  1. To add a new node, go to ClusterControl GUI → choose the cluster → Actions → Add new → Replica node → Add node.

  2. In the Create a database node wizard, configure the following:

    • Port: 9200 (default)
    • Install software: On (default)
    • Disable firewall: Checked (default)
    • Disable SELinux/AppArmor: Checked (default)
    • Node: Specify the IP address, hostname, or FQDN of the node that you want to add and press Enter.
    • Node role: Displayed after the node is specified. Select the role you would like this node to be assigned in the dropdown.
    • Review the summary of the deployment. You can always go back to any previous section to modify your configuration; the deployment configuration settings are kept until you exit the wizard.
  3. Click Finish to trigger the deployment job.

  4. ClusterControl will start provisioning the new database node. You can monitor the progress from Activity center, where you will see detailed logs of the deployment. This process can take a few minutes. Once the deployment is complete, the node will appear in the Nodes page.

Conclusion

Congratulations! You’ve deployed, monitored, and managed a high-availability Elasticsearch cluster using ClusterControl. With a high-availability cluster, you can scale in or scale out as your requirements change with demand. You can also manage your cluster by scheduling backups or running them on demand, monitor the nodes, and set up maintenance mode when a version upgrade is necessary. While ClusterControl makes this easier and more convenient, make sure not to miss updates, and file feature requests if your requirements call for them. We also advise that you always practice the following:

  • Keep your backup snapshots current and up to date.
  • Monitor performance and resource usage.
  • Secure your nodes by restricting access and using strong passwords.
  • Scale in or scale out your cluster as your workload requires.

Enjoy your streamlined high-availability Elasticsearch cluster database deployment, backed by ClusterControl’s powerful operations for backups, alerts, scaling, and more!