# ServerStatus

Yet another ServerStatus backend, using Prometheus as the data source.
## Quick Start

### Scraping
First, set up node-exporter on each target host, and Prometheus (or any Prometheus-compatible scraper such as vmagent) on the host you want to scrape metrics from.

Since the region, location, and virtualization type of a target host cannot be derived from the exported metrics, you should set these attributes as labels in the Prometheus scrape config:
```yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: node
    scrape_interval: 15s
    scrape_timeout: 15s
    static_configs:
      - targets: ['1.1.1.1:9100']
        labels:
          hostname: "host-a"
          virt_type: "kvm"
          region: "FR"
          location: "Paris"
      - targets: ['2.2.2.2:9100']
        labels:
          hostname: "host-b"
          virt_type: "kvm"
          region: "JP"
          location: "Osaka"
    metric_relabel_configs:
      - action: labeldrop
        regex: (region|location)
```
We recommend adding `hostname` to the target labels to identify hosts, but the auto-generated `instance` label can also be used; the displayed hostname can be overridden later in the ServerStatus configuration.

The `labeldrop` action drops the listed labels from scraped metrics before they are stored, while metrics auto-generated by Prometheus, such as `up`, keep them. It is not an elegant approach, but it works well for our purpose: keeping this extra information at a lower storage cost.
For example, the auto-generated metrics keep all the labels:

```
up{hostname="host-a", instance="1.1.1.1:9100", job="node", location="Paris", region="FR", virt_type="kvm"} 1
```

while the metrics scraped from node-exporter have them dropped:

```
node_boot_time_seconds{hostname="host-a", instance="1.1.1.1:9100", job="node"} 3333
```
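With `hostname` kept on every series, a single host's metrics can be selected by that label, in the Prometheus UI or by the backend. As an illustrative sketch (this is a generic query over node-exporter metrics, not necessarily one this backend issues):

```promql
# Approximate CPU usage (%) for host-a over the last 5 minutes:
# 1 minus the average idle fraction across all cores, scaled to percent.
100 * (1 - avg(rate(node_cpu_seconds_total{hostname="host-a", mode="idle"}[5m])))
```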
### Querying
```jsonc
{
    "version": 1,
    "listen": "127.0.0.1:30000",
    "refresh_interval": 120,           # configuration refresh interval; the node list is reloaded automatically
    "scrape_interval": 5,              # how often we query the prometheus data source
    "log_path": "/path/to/logdir",
    "nodes": {
        "default_data_source": "prometheus_name",
        "id_label": "hostname",        # label name used to identify a host
        "mode": "AUTO",                # AUTO or STATIC; AUTO discovers hosts from queries, STATIC uses the list below
        "network_overwrites": {        # pre-aggregated metrics used to calculate the total network traffic
            "enable": true,
            "rx": "node_network_receive_bytes_total:30m_inc",
            "tx": "node_network_transmit_bytes_total:30m_inc",
            "align": "30m"
        },
        "list": [
            {
                "hostname": "host-a",
                "overwrites": {
                    "hostname": "DisplayNameForHostA",
                    "net_devices": ["eth4", "pppoe0"]
                }
            },
            {
                "hostname": "host-b",
                "overwrites": {
                    "hostname": "DisplayNameForHostB",
                    "net_devices": ["eth3", "eth4", "pppoe0", "pppoe1"],
                    "billing_date": "2023-09-15T00:00:00+08:00"  # network traffic resets on this day and hour of each month
                }
            }
        ],
        "global_matcher": [
            { "label": "job", "op": "=", "value": "node" }
        ]
    },
    "data_sources": [
        {
            "type": "prometheus",
            "name": "prometheus_name",
            "url": "https://127.0.0.1:9090"
        }
    ]
}
```
If you did not add `hostname` in the Prometheus configuration, you can set `id_label` to `instance` here, fill in `hostname` as `1.1.1.1:9100` or `2.2.2.2:9100` in the list, and put the replacement display name in the `overwrites` section.
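A minimal sketch of the relevant fields under that setup (other fields omitted; the display name is an assumed example):

```jsonc
{
    "nodes": {
        "id_label": "instance",              # identify hosts by the auto-generated instance label
        "list": [
            {
                "hostname": "1.1.1.1:9100",  # must match the instance label value
                "overwrites": {
                    "hostname": "DisplayNameForHostA"  # name actually shown in the UI
                }
            }
        ]
    }
}
```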
### Pre-aggregated network metrics
At the end of a month or billing cycle, calculating the total network traffic usage can become a time-consuming query. To mitigate this, you can use recording rules in vmalert or Prometheus to pre-aggregate the network traffic metrics. For instance, a rule that aggregates the increase in traffic over a 30-minute range reduces the number of data points to 1/120th of the original if your scrape interval is 15 seconds.

You can enable this feature in the `network_overwrites` section. Please refer to ./doc/vm/vmalert_rule.yml for instructions on how to add recording rules.
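As a sketch of what such a rule looks like (the authoritative version lives in ./doc/vm/vmalert_rule.yml; the record names below are the ones referenced by the `network_overwrites` example above, and the group name is an assumption):

```yaml
groups:
  - name: serverstatus-network   # assumed group name
    interval: 30m
    rules:
      # Pre-aggregate the 30-minute increase of the network counters so that
      # monthly traffic totals can be summed from far fewer data points.
      - record: node_network_receive_bytes_total:30m_inc
        expr: increase(node_network_receive_bytes_total[30m])
      - record: node_network_transmit_bytes_total:30m_inc
        expr: increase(node_network_transmit_bytes_total[30m])
```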
### systemd and reverse proxy config

Please refer to ./doc/ for details.