
Common Issues

This section covers common issues when configuring and dealing with ClusterControl components, with possible troubleshooting steps and solutions. There is also a community forum available with knowledge base sections for public reference.

If you need further assistance, please contact us via our support channel by submitting a support request or posting a new thread in our community help forum.

ClusterControl Installation

This section covers common issues encountered during ClusterControl installation and the installer script.

Failed to start MySQL Server during ClusterControl installation

Description:

During installation, the installer script fails to start the MySQL/MariaDB server on the ClusterControl host and returns the following lines:

=> Starting database. This may take a couple of minutes. Do NOT press any key.
mysqld: unrecognized service
=> Failed to start the MySQL Server. ...
Please contact Severalnines support at http://support.severalnines.com if you have installation issues that cannot be resolved.

Troubleshooting steps:

  1. Examine the MySQL error log for possible reasons why MySQL fails to start. Typically, the log file is located under /var/log/mysqld.log or /var/lib/mysql/error.log.
  2. Try starting the MySQL server manually using the systemctl or service command, as shown in the example below.
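
For example, on a systemd-based host, inspecting the log and starting the server manually may look like the following (the log path and service name, e.g. mysql, mysqld or mariadb, depend on your distribution):

$ sudo tail -n 50 /var/log/mysqld.log    # or /var/lib/mysql/error.log
$ sudo systemctl start mysqld            # or: sudo service mysql start
$ sudo systemctl status mysqld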

Solutions:

If you are running an older operating system, ClusterControl might not support the distribution and some issues are expected. See Requirements for details. Once the MySQL server is up and running, run the ClusterControl installer script again:

$ ./install-cc

ClusterControl Controller (CMON)

This section covers common issues related to the ClusterControl Controller (CMON).

User ‘{username}’ suspended for previous authentication failures.

Description:

When executing an s9s command, the command returns an error indicating that the current s9s user is suspended due to authentication failures. This happens if the key is no longer recognized by the CMON controller when processing a request.

Example error:

When running an s9s command, the following error occurs:

$ s9s user --list
User 'admin' suspended for previous authentication failures.

Solution:

1. Move the s9s user configuration files out of the way (/etc/s9s.conf, ~/.s9s/s9s.conf):

$ mv ~/.s9s/s9s.conf ~/.s9s/s9s.conf.bak
$ mv /etc/s9s.conf /etc/s9s.conf.bak

2. Create another user for the s9s CLI. In this example, we create a user called dba under the “admins” group:

$ s9s user --create --generate-key --controller="https://localhost:9501" --group=admins dba
Grant user 'dba' succeeded

3. Use the new user to unlock the “original” suspended user (assuming the suspended user is “admin”):

$ s9s user --cmon-user=dba --enable admin
OK.

Performing the above clears the user’s “disabled” flag so that the user can log in again. The “suspended” flag is also cleared, the failed login counter is reset to 0, and the date and time of the last failed login are deleted, so users who were suspended for failed login attempts will also be able to log in.

CMON is unable to restart MySQL using the service command

Description:

When scheduling a start/restart job, ClusterControl fails to start the node with the error “Command not found”.

Example error:

galera1.domain.com: Starting mysqld failed: Error: Command not found (host: galera1.domain.com): service mysql restart
galera1.domain.com: Starting mysqld

Troubleshooting steps:

1. SSH into the DB node and check the user’s environment path variable:

$ ssh -tt -i /home/admin/.ssh/id_rsa [SSH user]@[database node IP] "sudo env | grep PATH"
PATH=/usr/local/bin:/bin:/usr/bin

2. Look at the PATH output.

Solution:

  • Ensure the /sbin path is included. Otherwise, ClusterControl won’t be able to automatically locate and run the “service” command.
  • If the /sbin path is not listed in the PATH, add it using the following commands:
PATH=$PATH:/sbin
export PATH

However, the above won’t persist if the user logs out of the terminal. To make it persistent, add those lines to /home/{SSH user}/.bash_profile or /home/{SSH user}/.bashrc, as in the example below.
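
For example, assuming the SSH user’s login shell is bash, the following appends /sbin to the PATH permanently:

$ echo 'export PATH=$PATH:/sbin' >> ~/.bash_profile
$ source ~/.bash_profile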

CMON always tries to recover failed database nodes during my maintenance window.

Description:

By default, CMON is configured to perform recovery of failed nodes or clusters. This behavior can be overridden by disabling automatic recovery or enabling maintenance mode for the node.

Solution:

  1. Enable maintenance mode for selected nodes (recommended). To enable the maintenance window, go to Nodes → select the node → Schedule Maintenance Mode → Enable. You have to specify the reason and duration of the maintenance window. During this period, any alarms and notifications raised for this node will be disabled.
  2. Disable automatic recovery. To disable automatic recovery temporarily, you can simply click on the ‘power’ icon for the node and cluster. Red means automatic recovery is turned off, while green indicates recovery is turned on. This setting will not persist if CMON is restarted. To make the change persistent, disable node or cluster auto-recovery by adding the following line to the CMON configuration file of the respective cluster. For example, if you want to disable automatic recovery for cluster ID 1, inside /etc/cmon.d/cmon_1.cnf, set the following line: enable_autorecovery=0 (see the sketch after this list).
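
A minimal sketch of the persistent change for cluster ID 1, followed by a CMON restart so the new setting is loaded (the configuration file name follows the /etc/cmon.d/cmon_{cluster ID}.cnf pattern):

# /etc/cmon.d/cmon_1.cnf
enable_autorecovery=0

$ service cmon restart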

CMON process dies with “Critical error (mysql error code 1)”

Description:

After starting CMON service, it stops and /var/log/cmon.log shows the following error:

(ERROR) Critical error (mysql error code 1) occurred - shutting down

Troubleshooting steps:

1. Run the following command on the ClusterControl host to check whether it can connect to the DB host with the current credentials:

$ mysql -ucmon -p -h[database node IP] -P[MySQL port] -e 'SHOW STATUS'

2. Check the grants for the cmon user on each database host:

mysql> SHOW GRANTS FOR 'cmon'@'[ClusterControl IP address]';

Solution:

It is not recommended to mix public IP addresses and internal IP addresses. For the GRANT, use the IP address that your database nodes use to communicate with each other. If the SHOW STATUS statement returns ERROR 1130 (HY000): Host '[ClusterControl IP address]' is not allowed to connect to this MySQL server, the database host is missing the cmon user grant. Run the following commands to reset the cmon user privileges:

mysql> GRANT ALL PRIVILEGES ON *.* TO 'cmon'@'[ClusterControl IP]' IDENTIFIED BY '[cmon password]' WITH GRANT OPTION;
mysql> FLUSH PRIVILEGES;

Here, [ClusterControl IP] is the ClusterControl IP address and [cmon password] is the mysql_password value inside the CMON configuration file.

Job fails with ‘host is already in another cluster’ error

Description:

When deploying a new node, or adding a node into an existing cluster managed by ClusterControl, the deployment fails with the following error:

"Host 1.2.3.4:nnnn is already in another cluster."

Solution:

A host can only exist in one cluster at a time. Check if you have an /etc/cmon.d/cmon_X.cnf file (where X is an integer) that contains the hostname, and remove the file if the corresponding cluster does not exist in the UI (be careful not to remove the wrong cmon_X.cnf file):

$ rm /etc/cmon.d/cmon_X.cnf

Otherwise, delete the host from the server_node, mysql_server, and hosts tables:

mysql> DELETE FROM cmon.server_node WHERE hostname='1.2.3.4';
mysql> DELETE FROM cmon.mysql_server WHERE hostname='1.2.3.4';
mysql> DELETE FROM cmon.hosts WHERE hostname='1.2.3.4';

Restart CMON to load the new changes:

$ service cmon restart

You may have to execute the deletion several times for each hostname/IP of the cluster you are trying to add.

ClusterControl UI

This section covers common issues related to the ClusterControl UI.

Cluster details cannot be retrieved. Please check the CMON process status (service cmon status)

Description:

When listing out the database cluster, ClusterControl reports the following:

Error Message:

Cluster details cannot be retrieved. Please check the CMON process status (service cmon status). Also, ensure the dcps.apis token matches the rpc_key in /etc/cmon.cnf.

Additionally, ClusterControl UI shows a toaster notification (on the top right of the UI) indicating that it has an authentication problem connecting to cluster 0 (0 means the global view of clusters under ClusterControl management):

Authentication required on '/0/auth'

Troubleshooting steps:

Retrieve the value of the global token inside /etc/cmon.cnf and /var/www/html/clustercontrol/bootstrap.php:

$ grep rpc_key /etc/cmon.cnf
$ grep RPC_TOKEN /var/www/html/clustercontrol/bootstrap.php

Solutions:

Verify that the RPC_TOKEN value in /var/www/html/clustercontrol/bootstrap.php matches the token defined as rpc_key in /etc/cmon.cnf, as illustrated below. If you manipulate /etc/cmon.cnf directly, you must restart cmon for the change to take effect.
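
As a rough illustration (the token below is just a placeholder, and the exact bootstrap.php line may vary between versions), the two values should be consistent along these lines:

# /etc/cmon.cnf
rpc_key=[token value]

# /var/www/html/clustercontrol/bootstrap.php
define('RPC_TOKEN', '[token value]');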

Database connection “Mysql” is missing, or could not be created

Description:

When opening ClusterControl UI on the browser, ClusterControl shows the following error:

Error Message

Error Details Database connection “Mysql” is missing, or could not be created. An Internal Error Has Occurred.

Troubleshooting steps:

1. Verify the installed PHP version:

$ php --version

2. Verify if php-mysql is installed:

$ rpm -qa | grep -i php-mysql    # RHEL/CentOS
$ dpkg -l | egrep 'php.*mysql'   # Ubuntu/Debian

3. Verify if MySQL/MariaDB is running:

$ ps -ef | grep -i mysql

Solution:

ClusterControl requires the php-mysql package to be installed together with a running MySQL server. See Requirements. In some cases, the php-mysqlnd package was installed instead (due to phpMyAdmin dependencies), which would cause the ClusterControl UI to fail to establish a connection to the MySQL server using the standard MySQL calls. A custom package repository could also install a different version of PHP than expected. Using the OS’s default package repository is highly recommended during the installation.
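
If php-mysql is missing, a typical installation from the default OS repository looks like the following (exact package names can vary with the PHP version shipped by your distribution); restart the Apache web server afterwards so the module is loaded:

$ yum install php-mysql       # RHEL/CentOS
$ apt-get install php-mysql   # Ubuntu/Debian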

Authentication required on ‘/{cluster_id}/auth’

Description:

The ClusterControl UI shows a toaster notification (on the top right of the UI) indicating that it has an authentication problem connecting to a specific cluster ID.

Troubleshooting steps:

Run the following command to verify if a token is set correctly for the corresponding cluster:

mysql> SELECT cluster_id, token FROM dcps.clusters;

Solution:

In this case, you need to update the token column in the dcps.clusters table for cluster_id={ID} so that it matches the rpc_key in /etc/cmon.d/cmon_{ID}.cnf. These tokens must match. Execute the following update query on the dcps database:

mysql> UPDATE dcps.clusters SET token='[rpc_key]' WHERE cluster_id=[ID];

 

Failed to deploy Prometheus service

Description:

The deployment of the Prometheus service fails because prometheus.service does not exist, as follows:

Failed to enable prometheus.service: Command exited with return code 1 on host xxxx. Command: systemctl --no-ask-password enable prometheus.service stdErr: Failed to enable unit: Unit file prometheus.service does not exist. stderr: Failed to enable unit: Unit file prometheus.service does not exist

Troubleshooting steps:

Run the following command on the ClusterControl host to check the details of why Prometheus won’t start:

$ journalctl -t setroubleshoot

If the result contains a message similar to “SELinux is preventing systemd from read access on the file prometheus.service”, check the status of SELinux:

$ sestatus
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   enforcing
Mode from config file:          enforcing
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Memory protection checking:     actual (secure)
Max kernel policy version:      33

Solution:

In this case, you need to update the SELINUX parameter in the SELinux configuration on CentOS/Red Hat-based systems. The configuration can be found at /etc/selinux/config. Change the value to permissive or disabled, as shown below. On Debian/Ubuntu-based systems, you need to stop the AppArmor service instead.
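
For example, on CentOS/Red Hat the following switches SELinux to permissive mode immediately and makes the change persistent across reboots (assuming the default /etc/selinux/config location):

$ sudo setenforce 0     # permissive until the next reboot
$ sudo sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config
$ sestatus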

Unable to authenticate to LDAP server

Description:

Unable to log in to ClusterControl using an LDAP user after LDAP Settings are configured.

Troubleshooting steps:

  1. Make sure that you have mapped ClusterControl’s Roles to the respective LDAP Group Name under ClusterControl → Sidebar → User Management → LDAP Settings.
  2. Verify if the configured LDAP Settings are still correct. Go to ClusterControl → Sidebar → User Management → LDAP Settings → Settings and hit the ‘Verify and Save’ button once more. Note that Windows Active Directory DNs are case-sensitive.
  3. For failed LDAP events, ClusterControl captures the error log under /var/www/html/clustercontrol/app/log/cc-ldap.log. Examine this log to look for clues about why LDAP authentication fails.
  4. On the ClusterControl node, try to list the directory branch using the admin DN and password with the ldapsearch command (the openldap-clients package is required; see the installation example after this list):
$ ldapsearch -H ldaps://ad.company.com -x -b 'CN=DBA,OU=Groups,DC=ad,DC=company,DC=com' -D 'CN=Administrator,OU=Users,DC=ad,DC=company,DC=com' -W
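
If ldapsearch is not available, the client tools can usually be installed from the default repository:

$ yum install openldap-clients   # RHEL/CentOS
$ apt-get install ldap-utils     # Ubuntu/Debian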

Solutions:

ClusterControl supports Active Directory, FreeIPA, and OpenLDAP authentication; see LDAP Settings for details. If the above troubleshooting steps do not help you solve the issue, please contact us via our support channel for further assistance.

 
