Query Monitor (version 2)
Query Monitor v2 (version 2) was introduced in ClusterControl 1.9.0. It is a new feature that deploys agents to monitor the database nodes, particularly the database queries. It is only available for MySQL and PostgreSQL database systems.
Features you can use with Query Monitor v2 are the following:
- Install / remove query monitor agents on the db nodes.
- Start / stop collecting query stats with the agents.
- New ‘Query Workload’ overview showing query digests, latency, throughput and concurrency with a scatter chart.
Query Monitor Agents
When you click the
The agent configuration is stored in /etc/cmnd.conf, and the Top Query configuration is stored under the /etc/cmnd.d directory. The /etc/cmnd.conf file contains the full details of the agent's options. By default, the following options are set:
$ cat /etc/cmnd.conf | sed -e 's/#.*$//' -e '/^$/d' -e '/^$/N;/^\n$/D' | sed '/^\t*$/d'
[cmnd]
data_directory = "/var/lib/cmnd"
log_file = "/var/log/cmnd.log"
pid_file = "/var/run/cmnd.pid"
plugin_names = [ "libpluginMySql.so", "libpluginSqlite.so", "libpluginPgSql.so" ]
port = 4433
cert_file = "/etc/ssl/cmnd/cert.pem"
key_file = "/etc/ssl/cmnd/key.pem"
You can change the port to suit your organization's firewall policy. Note that the configured port (4433 by default) must be open for the agent to work properly.
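As a rough illustration of checking which port the agent is configured to use, the sketch below parses option lines like the ones shown above. It assumes the file keeps the simple `key = value` layout of that sample. The helper name `read_cmnd_options` and the embedded `SAMPLE` text are hypothetical and not part of ClusterControl.

```python
import re

# Sample text modelled on the default options shown above (illustrative only).
SAMPLE = '''
[cmnd]
data_directory = "/var/lib/cmnd"
log_file = "/var/log/cmnd.log"
plugin_names = [ "libpluginMySql.so", "libpluginSqlite.so", "libpluginPgSql.so" ]
port = 4433   # listening port
cert_file = "/etc/ssl/cmnd/cert.pem"
'''

def read_cmnd_options(text):
    """Return scalar `key = value` options as a dict (hypothetical helper)."""
    options = {}
    for raw in text.splitlines():
        # Drop trailing comments, much like the sed call does.
        # Assumption: values themselves never contain a '#'.
        line = raw.split("#", 1)[0].strip()
        match = re.match(r"^(\w+)\s*=\s*(.+)$", line)
        if not match:
            continue  # section headers and blank lines
        key, value = match.group(1), match.group(2).strip()
        if value.startswith("["):
            continue  # list values are skipped in this sketch
        options[key] = int(value) if value.isdigit() else value.strip('"')
    return options

options = read_cmnd_options(SAMPLE)
print(options["port"])
```

To inspect a real node, the same function could be given the contents of /etc/cmnd.conf instead of `SAMPLE`, provided the file follows the layout assumed here.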
You can list the agent's other configuration files and directories as follows:
$ find /etc/ -name "cmn*"
/etc/ssl/cmnd
/etc/logrotate.d/cmnd
/etc/cmnd.conf
/etc/cmnd.d
/etc/cmn.passwd
The Query Monitor agents are installed as a systemd service, so you can use systemd to start and stop the service. To check whether the agents are running, do the following:
$ systemctl status cmon-daemon
or verify it with netstat, as shown below:
$ netstat -ntlvp46|grep cmnd
tcp 0 0 0.0.0.0:4433 0.0.0.0:* LISTEN 8456/cmnd
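The netstat check can also be done remotely, for example from the controller host, to confirm that a firewall is not blocking the agent port. The following is a minimal sketch that only tests TCP reachability. The function name `is_port_open` is hypothetical, and the host and port in the usage line are placeholders.

```python
import socket

def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False

if __name__ == "__main__":
    # 4433 is the default agent port; replace the host with a database node.
    print(is_port_open("127.0.0.1", 4433))
```

A successful connection only shows that something is listening on the port. It does not verify that the listener is the agent or that its TLS certificate is valid.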
Query Monitoring Workflow
The agent collects samples from the pg_stat_statements. A database client sends a query inquiry to the agent. The agent then reads the input data, produces the query statistics, and makes them available to the Query Monitor v2 dashboard. All information collected by Query Monitoring therefore depends on the following conditions.
- The reply should be sent in a timely fashion no matter how big the input data is. If the computer is very slow or the processing is too complicated, the query statistics should be produced and made available progressively so that a responsive UI can be implemented. This means that while the data is being processed, the client program can access the data that has already been processed.
- The agent should be able to update the query statistics by itself when the input data is updated by the database server. A simple query or RPC call is invoked so that the agent keeps monitoring the input data and keeps the statistics up to date.
- The query statistics should be consistent with the query inquiry and the input data. In simpler words, the produced statistical data should be in sync with the input data and the query the user sent. Some query statistics may be derived from the input data.
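The idea of producing statistics progressively can be sketched with a generator that emits a snapshot after each chunk of input, so a client can read partial results while processing continues. This is only an illustration of the concept, not the agent's actual implementation. The input shape (pairs of digest and execution time), the `chunk_size` parameter and the function name are all assumptions made for the example.

```python
from collections import defaultdict

def progressive_stats(samples, chunk_size=2):
    """Yield a snapshot of per-digest totals after each chunk of samples.

    `samples` is an iterable of (digest, exec_time_seconds) pairs.
    """
    totals = defaultdict(lambda: {"count": 0, "total_time": 0.0})
    pending = 0
    for digest, exec_time in samples:
        entry = totals[digest]
        entry["count"] += 1
        entry["total_time"] += exec_time
        pending += 1
        if pending == chunk_size:
            pending = 0
            # Copy the entries so later updates do not alter this snapshot.
            yield {d: dict(v) for d, v in totals.items()}
    if pending:
        # Emit the final partial chunk.
        yield {d: dict(v) for d, v in totals.items()}

samples = [("SELECT ?", 0.5), ("UPDATE ?", 1.0), ("SELECT ?", 1.5)]
for snapshot in progressive_stats(samples):
    print(snapshot)
```

Each snapshot is consistent with the input processed so far, which is the property the conditions above ask for.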
PostgreSQL requires pg_stat_statements to be enabled. If you deployed your PostgreSQL cluster using ClusterControl, this is not a problem, since pg_stat_statements is loaded and installed as an extension by default. Otherwise, you cannot use the capabilities of Query Monitor v2 until it is enabled.
For more details (including how to enable pg_stat_statements), see the blog post How to Identify PostgreSQL Performance Issues with Slow Queries.
Once it is enabled, you can install and enable the agents.
To install the agents, click
Using The Overview Dashboard
The Overview dashboard contains the following:
- The drop-down lists for selecting one of your PostgreSQL databases and a time range.
- The database metrics, which include the following:
  - Throughput, which is based on queries per second / query count
  - Concurrency, which is based on lock time (in seconds)
  - Average Latency, which is based on average query time (in seconds)
  - Errors, which counts all database connectivity errors per second
- The digest queries, which list queries with the following fields:
  - Digest, the query in fingerprint format
  - Count, the number of times the query was detected
  - Rows, which contains the Sent, Examined, and Affected sub-fields
  - Exec Time (Execution Time), which contains the Average and Total sub-fields
This is an aggregated list of all your top queries running on all the nodes of your database cluster. The list can be ordered by Occurrence or Execution Time, to show the most common or slowest queries respectively. It is also possible to filter and review queries from one particular node.
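To make the aggregation concrete, the sketch below merges per-node digest rows into one cluster-wide list and orders it either by occurrence or by execution time. The row layout (node, digest, count, total execution time) and the function name `aggregate_digests` are assumptions for illustration. The dashboard's internal data model is not described here.

```python
def aggregate_digests(rows, order_by="count"):
    """Merge per-node digest rows and sort them.

    `rows` holds (node, digest, count, total_time_seconds) tuples.
    `order_by` is "count" (most common first) or "total_time" (slowest first).
    """
    merged = {}
    for _node, digest, count, total_time in rows:
        entry = merged.setdefault(
            digest, {"digest": digest, "count": 0, "total_time": 0.0}
        )
        entry["count"] += count
        entry["total_time"] += total_time
    for entry in merged.values():
        # Average execution time per call; guard against empty counts.
        entry["avg_time"] = (
            entry["total_time"] / entry["count"] if entry["count"] else 0.0
        )
    return sorted(merged.values(), key=lambda e: e[order_by], reverse=True)

rows = [
    ("node1", "SELECT ?", 10, 2.0),
    ("node2", "SELECT ?", 5, 1.0),
    ("node1", "UPDATE ?", 2, 8.0),
]
print([e["digest"] for e in aggregate_digests(rows, "count")])
print([e["digest"] for e in aggregate_digests(rows, "total_time")])
```

Filtering by one node would amount to passing only that node's rows to the function.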
You can see the EXPLAIN output of a query by selecting it in the list. Review Settings → Query Monitor to configure which queries to log (e.g. only log queries that take more than 1 second to execute).
Configures the Query Monitor settings, as explained below:
|Long Query Time||
|Log queries not using indexes?||
Top Queries Table
This page auto-refreshes every 30 seconds. You can change the refresh rate by clicking on the Refresh rate dropdown at the top right. The following describes the Top Queries table columns:
|Total Exec Time||
View the queries currently running on your database nodes, similar to the select * from pg_stat_activity command in PostgreSQL. You can stop a running query by killing the connection that started it. The process list can be filtered by host.
This page auto-refreshes every 30 seconds. You can change the refresh rate by clicking on the arrow beside the green Refresh button.
Shows queries that are outliers. An outlier is a query that takes longer than the normal query of that type. Use this feature to filter the outliers for a certain time period. Once ClusterControl has collected enough samples, it can determine whether a query's latency is higher than normal (average_query_time + 2 sigmas). If so, the query is an outlier and is added to Query Outliers.
This feature depends on the Top Queries feature above. If Query Monitoring is enabled and Top Queries are captured and populated, Query Outliers will summarize these and provide a filter based on timestamp. You can view query history up to one year old.
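The outlier rule described above (latency greater than the average query time plus 2 sigmas) can be sketched as follows. This is an illustration of the arithmetic only. The `min_samples` threshold, the use of the population standard deviation, and computing the baseline from the same samples being tested are assumptions, since the exact method ClusterControl uses is not stated here.

```python
from statistics import mean, pstdev

def find_outliers(latencies, min_samples=5):
    """Return latencies above mean + 2 * standard deviation.

    Returns an empty list until at least `min_samples` values are available.
    """
    if len(latencies) < min_samples:
        return []  # not enough stats yet to judge what is "normal"
    threshold = mean(latencies) + 2 * pstdev(latencies)
    return [value for value in latencies if value > threshold]

# Nine queries at about 1 second and one at 10 seconds.
latencies = [1.0] * 9 + [10.0]
print(find_outliers(latencies))
```

For this sample the mean is 1.9 seconds and the standard deviation is 2.7 seconds, so the threshold is 7.3 seconds and only the 10 second query is flagged.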
|Avg Query Time||
|Max Query Time||
|Max Lock Time||
This feature was introduced in v1.7.1.
View advanced query statistics of an individual PostgreSQL server. Some statistics are collected at the database level and some are server-wide, as explained in the following table:
|Access by sequential or index scans||Identify whether tables are being accessed by sequential scans or index scans.|
|Table I/O statistics||The ratio of heap blocks read from memory vs. disk I/O for a given table.|
|Index I/O statistics||Disk I/O for every index on a table.|
|Database wide statistics||Server-wide database statistics like
|Table bloat and index bloat||The estimated amount of bloat in your tables and indices.|
|Top 10 largest tables||The largest top 10 tables in the selected database.|
|Database sizes||Every database’s size in MB.|
|Last analyzed or vacuumed||The last time a table was analyzed or vacuumed.|
|Unused indexes||Returns unused indexes.|
|Duplicate indexes||Returns duplicate indexes.|
|Exclusive lock waits||Returns exclusive lock waits.|
|Logical Replication Latency||Since PostgreSQL 9.4, this view contains replication statistics for each slave the master connects to for sending data. Details at pg_stat_replication View.|
|Logical Replication Slot||Since PostgreSQL 9.4, this view lists all replication slots (and their stats) existing on the database node. Details at pg_replication_slots.|
|Logical Publication||Since PostgreSQL 10
|Logical Subscription||Since PostgreSQL 10
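For the Table I/O statistics row, the memory-versus-disk ratio can be illustrated with the two counters PostgreSQL exposes per table in pg_statio_user_tables, heap_blks_hit and heap_blks_read. The sketch below shows one common way to turn them into a hit ratio. Whether ClusterControl presents the value in exactly this form is an assumption, and the function name is hypothetical.

```python
def heap_hit_ratio(heap_blks_hit, heap_blks_read):
    """Return the share of heap block requests served from memory.

    Returns None when the table has had no block requests at all.
    """
    total = heap_blks_hit + heap_blks_read
    if total == 0:
        return None
    return heap_blks_hit / total

# 90 blocks found in shared buffers, 10 read from disk.
print(heap_hit_ratio(90, 10))
```

A ratio close to 1 suggests the table is mostly served from memory, while a low ratio points to frequent disk reads.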