package
3.3.3+incompatible
Repository: https://github.com/ibm-messaging/mq-metric-samples.git
Documentation: pkg.go.dev

# README

MQ Exporter for Prometheus monitoring

This directory contains the code for a monitoring solution that exports queue manager data to a Prometheus data collection system. It also contains configuration files to run the monitor program.

The monitor collects metrics published by an MQ V9 queue manager or the MQ Appliance. Prometheus then calls the monitor program at regular intervals to pull those metrics into its database, where they can be queried directly or used by other packages such as Grafana.

You can see data such as disk or CPU usage, queue depths, and MQI call counts. Channel status is also reported.

An example Grafana dashboard is included to show how queries might be constructed. To use the dashboard, create a data source in Grafana called "MQ Prometheus" that points at your database server, and then import the JSON file. This dashboard was built using Grafana v5.3.1.

There is also a script to start the collector so that it processes the statistics generated by the MQ Bridge for Salesforce, included in MQ V9.0.2.

Building

  • You need to have the MQ client libraries installed first.
  • Set up an environment for compiling Go programs
  export GOPATH=~/go (or wherever you want to put it)
  export GOROOT=/usr/lib/golang  (or wherever you have installed it)
  mkdir -p $GOPATH/src
  cd $GOPATH/src
  • Clone this GitHub repository for the monitoring programs into your GOPATH. The repository contains the prerequisite packages at a suitable version in the vendor tree.
  git clone https://github.com/ibm-messaging/mq-metric-samples ibm-messaging/mq-metric-samples
  • From the root of your GOPATH you can then compile the code
  cd $GOPATH
  export CGO_LDFLAGS_ALLOW='-Wl,-rpath.*'
  go build -o bin/mq_prometheus ibm-messaging/mq-metric-samples/cmd/mq_prometheus/*.go

Configuring MQ

It is convenient to run the monitor program as a queue manager service whenever possible.

This directory contains an MQSC script to define the service. The service definition points at a simple script which sets up any necessary environment and builds the command line parameters for the real monitor program. Because the last line of the script is an "exec", the monitor program inherits the process id of the script, so the queue manager can check on its status and can drive a suitable STOP SERVICE operation during queue manager shutdown.

Edit the MQSC script and the shell script to point at appropriate directories where the program exists, and where you want to put stdout/stderr. Ensure that the ID running the queue manager has permission to access the programs and output files.
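As an illustration, such a wrapper script might look like the sketch below. The paths, queue manager name, flag values and output locations are examples only, and the exact flag spellings should be checked against the scripts shipped in this directory.

  #!/bin/sh
  # Example wrapper started by the queue manager SERVICE definition.
  # All paths, the queue manager name and flag values are illustrative.
  export CGO_LDFLAGS_ALLOW='-Wl,-rpath.*'
  # Using exec means the monitor keeps this script's process id,
  # so the queue manager can track it and stop it at shutdown.
  exec /usr/local/bin/mq_prometheus \
    -ibmmq.queueManager=QM1 \
    -ibmmq.monitoredQueuesFile=/var/mqm/mqprometheus/queues.txt \
    >/var/mqm/mqprometheus/stdout.log 2>/var/mqm/mqprometheus/stderr.log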

If you cannot run the monitor as a service, for example when trying to monitor the MQ Appliance which does not support service definitions, then you can run the monitor as an MQ client connecting remotely. Setting the ibmmq.client property to true forces client connections. All the usual MQ client configuration then comes into play (the MQSERVER environment variable, use of CCDT files, etc.).
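For example, a client connection driven purely by the MQSERVER environment variable might look like this sketch; the channel, hostname and queue manager name are illustrative, and the exact flag spelling should be checked against the shipped startup script.

  # Connect as a client rather than with local bindings.
  export MQSERVER='SYSTEM.DEF.SVRCONN/TCP/appliance.example.com(1414)'
  bin/mq_prometheus -ibmmq.queueManager=QM1 -ibmmq.client=true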

The monitor listens for calls from Prometheus on a TCP port. The default port, reserved for this monitor in the Prometheus list of default port allocations, is 9157. If you want to use a different number, then use the -ibmmq.httpListenPort command parameter.

The monitor always collects all of the available queue manager-wide metrics. It can also be configured to collect statistics for specific sets of queues. The sets of queues can be given either directly on the command line with the -ibmmq.monitoredQueues flag, or put into a separate file which is also named on the command line, with the -ibmmq.monitoredQueuesFile flag. An example is included in the startup shell script.
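As a sketch, a non-default listener port and a set of queue patterns could be combined in a single invocation like this; the queue manager name, port and patterns are illustrative.

  # Listen on a non-default port and monitor two sets of queues.
  bin/mq_prometheus -ibmmq.queueManager=QM1 \
    -ibmmq.httpListenPort=9158 \
    -ibmmq.monitoredQueues='APP.*,SYSTEM.DEFAULT.LOCAL.QUEUE'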

Note that the queue patterns are expanded only at startup of the monitor program. If you want to change the patterns, or new queues are defined that match an existing pattern, the monitor must be restarted with a STOP SERVICE and START SERVICE pair of commands.
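For example, assuming the service was defined with a name such as MQPROMETHEUS (the name here is illustrative), the restart can be driven through runmqsc:

  # Restart the monitor so queue patterns are re-expanded.
  echo 'STOP SERVICE(MQPROMETHEUS)'  | runmqsc QM1
  echo 'START SERVICE(MQPROMETHEUS)' | runmqsc QM1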

Channel Status

The monitor program can now process channel status, reporting that back into Prometheus.

The channels to be monitored are set on the command line, similarly to the queue patterns, with -ibmmq.monitoredChannels or -ibmmq.monitoredChannelFile. Unlike the queue monitoring, wildcards are handled automatically by the channel status API. So you do not need to restart this monitor in order to pick up newly-defined channels that match an existing pattern.

Another command line parameter is pollInterval. This determines how frequently the channel status is collected. You may want to have it collected at a different rate to the queue data, as it may be more expensive to extract the channel status. The default pollInterval is 0, which means that the channel status is collected every time Prometheus asks for the queue and queue manager statistics. Setting it to 1m means that a minimum time of one minute will elapse between asking for channel status even if the queue statistics are gathered more frequently.
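As a sketch, channel monitoring and the poll interval could be combined like this; the names and values are illustrative, and the exact flag spellings should be checked against the shipped startup script.

  # Watch sender and server-connection channels, but refresh channel
  # status at most once a minute even if queue data is scraped more often.
  bin/mq_prometheus -ibmmq.queueManager=QM1 \
    -ibmmq.monitoredChannels='TO.*,SYSTEM.DEF.SVRCONN' \
    -pollInterval=1m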

Channel Metrics

A few of the responses from the DISPLAY CHSTATUS command have been selected as metrics. The key values returned are the status and number of messages processed.

The message count for SVRCONN channels is the number of MQI calls made by the client program.

There are actually two versions of the channel status returned. The channel_status metric has the value corresponding to one of the MQCHS_* values. There are about 15 of these possible values. There is also a channel_status_squash metric which returns one of only three values, compressing the full set into a simpler value that is easier to put colours against in Grafana. From this squashed set, you can readily see if a channel is stopped, running, or somewhere in between.

Channel Instances and Labels

Channel metrics are given labels to assist in distinguishing them. These can be displayed in Grafana or used as part of the filtering. When there is more than one instance of an active channel, the combination of channel name, connection name and MCA job name will be unique.

The channel type (SENDER, SVRCONN etc) and the name of the remote queue manager are also given as labels on the metric.

Channel Dashboard Panels

The example Grafana dashboard shows how these labels and metrics can be combined to show some channel status. The Channel Status table panel demonstrates a couple of features. It uses the labels to select unique instances of channels. It also uses a simple number-to-text map to show the channel status as a word (and colour the cell) instead of a raw number.

The metrics for the table are selected and have '0' added to them. This may be a workaround for a Grafana bug, or it may really be how Grafana is designed to work. Without that '+0' on the metric line, the table showed multiple versions of the status for each channel; with it, the table combines multiple metrics on the same line.

Authentication

This monitor can be configured to authenticate to the queue manager, sending a userid and password.

The userid is configured using the -ibmmq.userid flag. The password can be set either by using the -ibmmq.password flag, or by passing it via stdin. That allows it to be piped from an external stash file or some other mechanism. Command line flags for controlling passwords are not recommended!
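For example, the password might be piped in from a protected stash file rather than given as a flag; the file path, userid and queue manager name here are illustrative.

  # Read the password from stdin; only the userid appears on the command line.
  cat /home/mqmonitor/.mqpassword | bin/mq_prometheus \
    -ibmmq.queueManager=QM1 -ibmmq.userid=mqmonitor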

Configuring Prometheus

The Prometheus server has to know how to contact the MQ monitor. The simplest way is to add a reference to the monitor in the server's configuration file, for example by adding this block to /etc/prometheus/prometheus.yml:

  # Adding a reference to an MQ monitor. All we have to do is
  # name the host and port on which the monitor is listening.
  # Port 9157 is the reserved default port for this monitor.
  - job_name: 'ibmmq'
    scrape_interval: 15s
    static_configs:
      - targets: ['hostname.example.com:9157']

The server documentation has information on more complex options, including the ability to pull information on which hosts should be monitored from a variety of discovery tools.

Metrics

Once the monitor program has been started, and Prometheus has been refreshed to connect to it, you will see the metrics become available in the Prometheus console. All of the metrics are given a prefix which by default is ibmmq. The name can be configured with the -namespace option on the command line.

More information on the metrics collected through the publish/subscribe interface can be found in the MQ Knowledge Center, with further description in an MQDev blog entry.

The queue and queue manager metrics shown in the Prometheus console are named after the descriptions that you can see when running the amqsrua sample program, but with some minor modifications to match the required style.

The channel metrics all begin with channel.

Note: This update to the Prometheus collector has changed the names of queue-level metrics. Instead of beginning with object_, they now begin with queue_. Dashboards built on this collector will need to be updated to use the new names. The change was made because of the inclusion of channels as an object type, and to allow for future options with other object types.

z/OS Support

Because the QSTATUS and CHSTATUS commands can be used on z/OS, the Prometheus monitor can now support showing some limited information from a z/OS queue manager. There is nothing special needed to configure it, beyond the client connectivity that allows an application to connect to the z/OS system.

The ibmmq.qStatus parameter must be set to true to use the DIS QSTATUS command.
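For example, a client connection to a z/OS queue manager with queue status collection enabled might look like this sketch; the queue manager name is illustrative, and the flag spellings should be checked against the shipped scripts.

  # Collect queue status via DIS QSTATUS from a remote z/OS queue manager.
  bin/mq_prometheus -ibmmq.queueManager=MQZ1 -ibmmq.client=true -ibmmq.qStatus=true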