package
1.33.2
Repository: https://github.com/influxdata/telegraf.git
Documentation: pkg.go.dev

# README

Intel PowerStat Input Plugin

This plugin gathers power statistics on Intel-based platforms, providing insights into power saving and workload migration. These statistics help monitoring and analytics systems take preventive or corrective actions based on platform busyness, CPU temperature, actual CPU utilization and power statistics.

โญ Telegraf v1.17.0 ๐Ÿท๏ธ hardware, system ๐Ÿ’ป linux

Requirements

Kernel modules

The plugin relies mostly on Linux kernel modules that expose specific metrics over sysfs or devfs interfaces. The following dependencies are expected:

  • intel-rapl kernel module, which exposes Intel Running Average Power Limit (RAPL) metrics over sysfs (/sys/devices/virtual/powercap/intel-rapl),
  • msr kernel module, which provides access to processor model specific registers over devfs (/dev/cpu/%d/msr),
  • cpufreq kernel module, which exposes per-CPU frequency over sysfs (/sys/devices/system/cpu/cpu%d/cpufreq/scaling_cur_freq),
  • intel-uncore-frequency kernel module, which exposes Intel uncore frequency metrics over sysfs (/sys/devices/system/cpu/intel_uncore_frequency).

Make sure the required kernel modules are loaded and running. Modules might have to be manually enabled by using modprobe. Depending on the kernel version, run the following commands:

# rapl modules:
## kernel < 4.0
sudo modprobe intel_rapl
## kernel >= 4.0
sudo modprobe rapl
sudo modprobe intel_rapl_common
sudo modprobe intel_rapl_msr

# msr module:
sudo modprobe msr

# cpufreq module:
### integrated in kernel

# intel-uncore-frequency module:
## only for kernel >= 5.6.0
sudo modprobe intel-uncore-frequency
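
To see which of the modules above are already loaded, the names in /proc/modules can be compared against the expected list. The following is a minimal sketch and not part of the plugin: the helper names are hypothetical, the expected list assumes the kernel >= 4.0 variant, and modules compiled into the kernel (such as cpufreq) do not appear in /proc/modules, so a "not loaded" result is only a hint.

```python
from pathlib import Path

# Module names taken from the modprobe commands above (kernel >= 4.0 variant).
# Hyphens and underscores are interchangeable in module names, so both sides
# are normalized to underscores before comparing.
EXPECTED = ["rapl", "intel_rapl_common", "intel_rapl_msr", "msr",
            "intel-uncore-frequency"]


def loaded_modules(proc_modules_text: str) -> set[str]:
    """Return the module names listed in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0].replace("-", "_"))
    return names


def missing_modules(expected: list[str], proc_modules_text: str) -> list[str]:
    """Return the expected modules that are not in the loaded list."""
    loaded = loaded_modules(proc_modules_text)
    return [m for m in expected if m.replace("-", "_") not in loaded]


if __name__ == "__main__":
    path = Path("/proc/modules")
    text = path.read_text() if path.exists() else ""
    print("not loaded:", missing_modules(EXPECTED, text))
```

Any module reported here can then be loaded with the matching modprobe command.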

Kernel's perf interface

For perf-related metrics, when Telegraf is not running as root, the following capability should be added to the Telegraf executable:

sudo setcap cap_sys_admin+ep <path_to_telegraf_binary>

Alternatively, /proc/sys/kernel/perf_event_paranoid has to be set to a value less than 1.

Depending on the environment and configuration (number of monitored CPUs and number of enabled metrics), it might be necessary to increase the limit on the number of open file descriptors. This can be done, for example, with the ulimit -n command.
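
Both conditions can be inspected before starting Telegraf. The sketch below is illustrative only (the function names are hypothetical): it checks whether a perf_event_paranoid value is less than 1 and reports the current limits on open file descriptors using Python's standard resource module.

```python
import resource
from pathlib import Path


def perf_paranoid_ok(value_text: str) -> bool:
    """True if a perf_event_paranoid value is less than 1, as required above."""
    return int(value_text.strip()) < 1


def open_file_limits() -> tuple[int, int]:
    """Return the (soft, hard) limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    path = Path("/proc/sys/kernel/perf_event_paranoid")
    if path.exists():
        print("perf_event_paranoid ok:", perf_paranoid_ok(path.read_text()))
    soft, hard = open_file_limits()
    print(f"open files: soft={soft} hard={hard}")
```

The soft limit is the one that ulimit -n changes for the current shell.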

Root privileges

IMPORTANT: Telegraf with the Intel PowerStat plugin enabled may require root privileges to read all the metrics (depending on OS type or configuration).

Alternatively, the following capabilities can be added to the Telegraf executable:

#without perf-related metrics:
sudo setcap cap_sys_rawio,cap_dac_read_search+ep <path_to_telegraf_binary>

#with perf-related metrics:
sudo setcap cap_sys_rawio,cap_dac_read_search,cap_sys_admin+ep <path_to_telegraf_binary>

Supported hardware

Specific metrics require certain processor features to be present; otherwise the Intel PowerStat plugin won't be able to read them. Supported processor features can be detected by reading the /proc/cpuinfo file. The plugin assumes crucial properties are the same for all CPU cores in the system.

The following processor properties are examined in more detail in this section:

  • vendor_id
  • cpu family
  • model
  • flags

The following processor properties are required by the plugin:

  • Processor vendor_id must be GenuineIntel and cpu family must be 6, since the data used by the plugin is Intel-specific.
  • The following processor flags shall be present:
    • msr shall be present for the plugin to read platform data from processor model specific registers and collect the following metrics:
      • cpu_c0_state_residency
      • cpu_c1_state_residency
      • cpu_c3_state_residency
      • cpu_c6_state_residency
      • cpu_c7_state_residency
      • cpu_busy_cycles (DEPRECATED - superseded by cpu_c0_state_residency_percent)
      • cpu_busy_frequency
      • cpu_temperature
      • cpu_base_frequency
      • max_turbo_frequency
      • uncore_frequency (for kernel < 5.18)
    • aperfmperf shall be present to collect the following metrics:
      • cpu_c0_state_residency
      • cpu_c1_state_residency
      • cpu_busy_cycles (DEPRECATED - superseded by cpu_c0_state_residency_percent)
      • cpu_busy_frequency
    • dts shall be present to collect:
      • cpu_temperature
  • A supported CPU model. The Supported CPU models table shows which metrics are supported by each model. The following metrics depend on the processor model:
    • cpu_c1_state_residency
    • cpu_c3_state_residency
    • cpu_c6_state_residency
    • cpu_c7_state_residency
    • cpu_temperature
    • cpu_base_frequency
    • uncore_frequency
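
The vendor, family and flag requirements listed above can be checked against /proc/cpuinfo. The following is a minimal sketch under the same assumption the plugin makes (all cores share these properties), so only the first processor block is read. The function names are hypothetical, and a missing flag only affects the metrics listed under that flag.

```python
from pathlib import Path

# Flags named in the requirements above.
REQUIRED_FLAGS = {"msr", "aperfmperf", "dts"}


def parse_cpuinfo(text: str) -> dict[str, str]:
    """Return the key/value pairs of the first processor block."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            if info:
                break  # a blank line ends the first processor block
            continue
        key, _, value = line.partition(":")
        info[key.strip()] = value.strip()
    return info


def check_processor(info: dict[str, str]) -> list[str]:
    """Return a list of unmet requirements (empty if none were found)."""
    problems = []
    if info.get("vendor_id") != "GenuineIntel":
        problems.append("vendor_id is not GenuineIntel")
    if info.get("cpu family") != "6":
        problems.append("cpu family is not 6")
    flags = set(info.get("flags", "").split())
    for flag in sorted(REQUIRED_FLAGS - flags):
        problems.append(f"missing flag: {flag}")
    return problems


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    info = parse_cpuinfo(path.read_text()) if path.exists() else {}
    model = info.get("model", "")
    if model.isdigit():
        # /proc/cpuinfo prints the model in decimal; the table uses hex.
        print(f"model number: 0x{int(model):02X}")
    print("problems:", check_processor(info))
```

The printed model number can be looked up in the Supported CPU models table.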

Supported CPU models

Model number | Processor name | cpu_c1_state_residency, cpu_c6_state_residency, cpu_temperature, cpu_base_frequency | cpu_c3_state_residency | cpu_c7_state_residency | uncore_frequency
0x1E | Intel Nehalem | ✓✓
0x1F | Intel Nehalem-G | ✓✓
0x1A | Intel Nehalem-EP | ✓✓
0x2E | Intel Nehalem-EX | ✓✓
0x25 | Intel Westmere | ✓✓
0x2C | Intel Westmere-EP | ✓✓
0x2F | Intel Westmere-EX | ✓✓
0x2A | Intel Sandybridge | ✓✓✓
0x2D | Intel Sandybridge-X | ✓✓✓
0x3A | Intel Ivybridge | ✓✓✓
0x3E | Intel Ivybridge-X | ✓✓✓
0x3C | Intel Haswell | ✓✓✓
0x3F | Intel Haswell-X | ✓✓✓
0x45 | Intel Haswell-L | ✓✓✓
0x46 | Intel Haswell-G | ✓✓✓
0x3D | Intel Broadwell | ✓✓✓
0x47 | Intel Broadwell-G | ✓✓✓✓
0x4F | Intel Broadwell-X | ✓✓✓
0x56 | Intel Broadwell-D | ✓✓✓
0x4E | Intel Skylake-L | ✓✓✓
0x5E | Intel Skylake | ✓✓✓
0x55 | Intel Skylake-X | ✓✓
0x8E | Intel KabyLake-L | ✓✓✓
0x9E | Intel KabyLake | ✓✓✓
0xA5 | Intel CometLake | ✓✓✓
0xA6 | Intel CometLake-L | ✓✓✓
0x66 | Intel CannonLake-L | ✓✓
0x6A | Intel IceLake-X | ✓✓
0x6C | Intel IceLake-D | ✓✓
0x7D | Intel IceLake | ✓
0x7E | Intel IceLake-L | ✓✓
0x9D | Intel IceLake-NNPI | ✓✓
0xA7 | Intel RocketLake | ✓✓
0x8C | Intel TigerLake-L | ✓✓
0x8D | Intel TigerLake | ✓✓
0x8F | Intel Sapphire Rapids X | ✓✓
0xCF | Intel Emerald Rapids X | ✓✓
0xAD | Intel Granite Rapids X | ✓
0xAE | Intel Granite Rapids D | ✓
0x8A | Intel Lakefield | ✓✓
0x97 | Intel AlderLake | ✓✓✓
0x9A | Intel AlderLake-L | ✓✓✓
0xB7 | Intel RaptorLake | ✓✓✓
0xBA | Intel RaptorLake-P | ✓✓✓
0xBF | Intel RaptorLake-S | ✓✓✓
0xAC | Intel MeteorLake | ✓✓✓
0xAA | Intel MeteorLake-L | ✓✓✓
0xC6 | Intel ArrowLake | ✓✓
0xBD | Intel LunarLake | ✓✓
0x37 | Intel Atom® Bay Trail | ✓
0x4D | Intel Atom® Avaton | ✓
0x4A | Intel Atom® Merrifield | ✓
0x5A | Intel Atom® Moorefield | ✓
0x4C | Intel Atom® Airmont | ✓✓
0x5C | Intel Atom® Apollo Lake | ✓✓✓
0x5F | Intel Atom® Denverton | ✓
0x7A | Intel Atom® Goldmont | ✓✓✓
0x86 | Intel Atom® Jacobsville | ✓
0x96 | Intel Atom® Elkhart Lake | ✓✓
0x9C | Intel Atom® Jasper Lake | ✓✓
0xBE | Intel AlderLake-N | ✓✓
0xAF | Intel Sierra Forest | ✓
0xB6 | Intel Grand Ridge | ✓
0x57 | Intel Xeon® PHI Knights Landing | ✓
0x85 | Intel Xeon® PHI Knights Mill | ✓

Global configuration options

In addition to the plugin-specific configuration settings, plugins support additional global and plugin configuration settings. These settings are used to modify metrics, tags, and fields, or to create aliases and configure ordering, etc. See the CONFIGURATION.md for more details.

Configuration

# Intel PowerStat plugin enables monitoring of platform metrics (power, TDP)
# and per-CPU metrics like temperature, power and utilization. Please see the
# plugin readme for details on software and hardware compatibility.
# This plugin ONLY supports Linux.
[[inputs.intel_powerstat]]
  ## The user can choose which package metrics are monitored by the plugin with
  ## the package_metrics setting:
  ## - The default, will collect "current_power_consumption",
  ##   "current_dram_power_consumption" and "thermal_design_power".
  ## - Leaving this setting empty means no package metrics will be collected.
  ## - Finally, a user can specify individual metrics to capture from the
  ##   supported options list.
  ## Supported options:
  ##   "current_power_consumption", "current_dram_power_consumption",
  ##   "thermal_design_power", "max_turbo_frequency", "uncore_frequency",
  ##   "cpu_base_frequency"
  # package_metrics = ["current_power_consumption", "current_dram_power_consumption", "thermal_design_power"]

  ## The user can choose which per-CPU metrics are monitored by the plugin in
  ## cpu_metrics array.
  ## Empty or missing array means no per-CPU specific metrics will be collected
  ## by the plugin.
  ## Supported options:
  ##   "cpu_frequency", "cpu_c0_state_residency", "cpu_c1_state_residency",
  ##   "cpu_c3_state_residency", "cpu_c6_state_residency", "cpu_c7_state_residency",
  ##   "cpu_temperature", "cpu_busy_frequency", "cpu_c0_substate_c01",
  ##   "cpu_c0_substate_c02", "cpu_c0_substate_c0_wait"
  # cpu_metrics = []

  ## CPUs metrics to include from those configured in cpu_metrics array
  ## Can't be combined with excluded_cpus. Empty means all CPUs are gathered.
  ## e.g. ["0-3", "4,5,6"] or ["1-3,4"]
  # included_cpus = []

  ## CPUs metrics to exclude from those configured in cpu_metrics array
  ## Can't be combined with included_cpus. Empty means all CPUs are gathered.
  ## e.g. ["0-3", "4,5,6"] or ["1-3,4"]
  # excluded_cpus = []

  ## Filesystem location of JSON file that contains PMU event definitions.
  ## Mandatory only for perf-related metrics (cpu_c0_substate_c01, cpu_c0_substate_c02, cpu_c0_substate_c0_wait).
  # event_definitions = ""

  ## The user can set the timeout duration for MSR reading.
  ## Enabling this timeout can be useful in situations where, on heavily loaded systems,
  ## the code waits too long for a kernel response to MSR read requests.
  ## 0 disables the timeout (default).
  # msr_read_timeout = "0ms"

  1. The configuration of included_cpus or excluded_cpus may affect the ability to collect package_metrics. Some of them (max_turbo_frequency, cpu_base_frequency, and uncore_frequency) need to read data from exactly one processor for each package. If included_cpus or excluded_cpus exclude all processors from a package, reading the mentioned metrics for that package will not be possible.
  2. The event_definitions JSON file for a specific architecture can be found at perfmon. A script that downloads the event definitions appropriate for the current environment (event_download.py) is available at pmu-tools. For perf-related metrics supported by this plugin, an event definition JSON file with events for the core is required, e.g. sapphirerapids_core.json or GenuineIntel-6-8F-core.json.
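
The first note can be made concrete with a small sketch. It expands CPU list entries in the format shown in the configuration comments (e.g. ["0-3", "4,5,6"]) and reports packages that are left without any included CPU. This is an illustration, not the plugin's own parser: the function names and the CPU-to-package mapping are hypothetical.

```python
def parse_cpu_list(entries: list[str]) -> set[int]:
    """Expand entries such as ["0-3", "4,5,6"] into a set of CPU IDs."""
    cpus = set()
    for entry in entries:
        for part in entry.split(","):
            part = part.strip()
            if not part:
                continue
            if "-" in part:
                start, end = (int(x) for x in part.split("-", 1))
                cpus.update(range(start, end + 1))
            else:
                cpus.add(int(part))
    return cpus


def packages_without_cpus(cpu_to_package: dict[int, int],
                          included: set[int]) -> set[int]:
    """Return the package IDs for which no CPU is in the included set."""
    all_packages = set(cpu_to_package.values())
    covered = {pkg for cpu, pkg in cpu_to_package.items() if cpu in included}
    return all_packages - covered


if __name__ == "__main__":
    # Hypothetical topology: CPUs 0-1 on package 0, CPUs 2-3 on package 1.
    topology = {0: 0, 1: 0, 2: 1, 3: 1}
    included = parse_cpu_list(["0-1"])
    print("packages with no included CPU:",
          packages_without_cpus(topology, included))
```

With this hypothetical topology, including only CPUs 0-1 leaves package 1 without a CPU, so max_turbo_frequency, cpu_base_frequency and uncore_frequency could not be read for that package.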

Dependencies of metrics on system configuration

Details of these dependencies are discussed above:

Configuration option | Type | Dependency
current_power_consumption | package_metrics | rapl module
current_dram_power_consumption | package_metrics | rapl module
thermal_design_power | package_metrics | rapl module
max_turbo_frequency | package_metrics | msr module
uncore_frequency | package_metrics | intel-uncore-frequency module*
cpu_base_frequency | package_metrics | msr module
cpu_frequency | cpu_metrics | cpufreq module
cpu_c0_state_residency | cpu_metrics | msr module
cpu_c1_state_residency | cpu_metrics | msr module
cpu_c3_state_residency | cpu_metrics | msr module
cpu_c6_state_residency | cpu_metrics | msr module
cpu_c7_state_residency | cpu_metrics | msr module
cpu_busy_cycles (DEPRECATED, use cpu_c0_state_residency_percent) | cpu_metrics | msr module
cpu_temperature | cpu_metrics | msr module
cpu_busy_frequency | cpu_metrics | msr module
cpu_c0_substate_c01 | cpu_metrics | perf interface
cpu_c0_substate_c02 | cpu_metrics | perf interface
cpu_c0_substate_c0_wait | cpu_metrics | perf interface

*for all metrics enabled by the configuration option uncore_frequency, starting from kernel version 5.18, only the intel-uncore-frequency module is required. For older kernel versions, the metric uncore_frequency_mhz_cur requires the msr module to be enabled.
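
The table above can also be expressed as a lookup, which makes it easy to list what a given configuration needs before enabling it. The sketch below only restates the table. The function name is hypothetical, and the kernel-version caveat from the footnote (msr for uncore_frequency_mhz_cur on kernels older than 5.18) is not modelled.

```python
# Dependency of each configuration option, copied from the table above.
DEPENDENCIES = {
    "current_power_consumption": "rapl module",
    "current_dram_power_consumption": "rapl module",
    "thermal_design_power": "rapl module",
    "max_turbo_frequency": "msr module",
    "uncore_frequency": "intel-uncore-frequency module",
    "cpu_base_frequency": "msr module",
    "cpu_frequency": "cpufreq module",
    "cpu_c0_state_residency": "msr module",
    "cpu_c1_state_residency": "msr module",
    "cpu_c3_state_residency": "msr module",
    "cpu_c6_state_residency": "msr module",
    "cpu_c7_state_residency": "msr module",
    "cpu_busy_cycles": "msr module",
    "cpu_temperature": "msr module",
    "cpu_busy_frequency": "msr module",
    "cpu_c0_substate_c01": "perf interface",
    "cpu_c0_substate_c02": "perf interface",
    "cpu_c0_substate_c0_wait": "perf interface",
}


def required_dependencies(package_metrics: list[str],
                          cpu_metrics: list[str]) -> set[str]:
    """Return the dependencies needed by the configured metrics."""
    needed = set()
    for option in [*package_metrics, *cpu_metrics]:
        if option not in DEPENDENCIES:
            raise ValueError(f"unknown configuration option: {option}")
        needed.add(DEPENDENCIES[option])
    return needed


if __name__ == "__main__":
    print(sorted(required_dependencies(
        ["current_power_consumption", "max_turbo_frequency"],
        ["cpu_frequency", "cpu_c0_substate_c01"],
    )))
```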

Example: Configuration with no per-CPU telemetry

This configuration collects the default processor package metrics; no per-CPU metrics are collected:

[[inputs.intel_powerstat]]
  cpu_metrics = []

Example: Configuration with no per-CPU telemetry - equivalent case

This configuration collects the default processor package metrics; no per-CPU metrics are collected:

[[inputs.intel_powerstat]]

Example: Configuration for CPU Temperature and CPU Frequency

This configuration collects the default processor package metrics plus a subset of per-CPU metrics (CPU temperature and CPU frequency), which are gathered only for cpu_id = 0:

[[inputs.intel_powerstat]]
  cpu_metrics = ["cpu_frequency", "cpu_temperature"]
  included_cpus = ["0"]

Example: Configuration for CPU Temperature and CPU Frequency without default package metrics

This configuration collects only a subset of per-CPU metrics (CPU temperature and CPU frequency), which are gathered for all CPUs except cpu_id 1 through 3:

[[inputs.intel_powerstat]]
  package_metrics = []
  cpu_metrics = ["cpu_frequency", "cpu_temperature"]
  excluded_cpus = ["1-3"]

Example: Configuration with all available metrics

This configuration collects all processor package metrics and all per-CPU metrics:

[[inputs.intel_powerstat]]
  package_metrics = ["current_power_consumption", "current_dram_power_consumption", "thermal_design_power", "max_turbo_frequency", "uncore_frequency", "cpu_base_frequency"]
  cpu_metrics = ["cpu_frequency", "cpu_c0_state_residency", "cpu_c1_state_residency", "cpu_c3_state_residency", "cpu_c6_state_residency", "cpu_c7_state_residency", "cpu_temperature", "cpu_busy_frequency", "cpu_c0_substate_c01", "cpu_c0_substate_c02", "cpu_c0_substate_c0_wait"]
  event_definitions = "/home/telegraf/.cache/pmu-events/GenuineIntel-6-8F-core.json"
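
The file name in the event_definitions path above appears to combine the vendor_id, cpu family and model values of the processor (GenuineIntel, 6 and 0x8F, the Sapphire Rapids X model number). As a sketch, and assuming this naming pattern holds for other models as well (which this README does not confirm), the expected file name could be derived as follows. The function name is hypothetical.

```python
def event_file_name(vendor_id: str, cpu_family: int, model: int) -> str:
    """Build a core event file name following the pattern of the example
    GenuineIntel-6-8F-core.json (assumed pattern, uppercase hex model)."""
    return f"{vendor_id}-{cpu_family}-{model:X}-core.json"


if __name__ == "__main__":
    # /proc/cpuinfo reports the model in decimal: 143 == 0x8F.
    print(event_file_name("GenuineIntel", 6, 143))
```

The actual file should still be obtained from perfmon or with event_download.py; the helper only shows how the example name relates to the processor properties.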

Metrics

All metrics collected by the Intel PowerStat plugin are collected at fixed intervals. Metrics that report processor C-state residency or power are calculated over elapsed intervals.
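
To illustrate what "calculated over elapsed intervals" means for power, the sketch below derives an average power value from two readings of a RAPL energy counter (the energy_uj file, in microjoules) taken some time apart. It is a simplified illustration and not the plugin's implementation: the function name is hypothetical, and a counter wrap-around is approximated by adding the value from max_energy_range_uj once.

```python
def average_power_watts(energy_start_uj: int, energy_end_uj: int,
                        elapsed_seconds: float,
                        max_energy_range_uj: int) -> float:
    """Average power between two energy counter readings, in watts."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    delta_uj = energy_end_uj - energy_start_uj
    if delta_uj < 0:
        # The counter wrapped around between the two readings (assumed once).
        delta_uj += max_energy_range_uj
    # Microjoules to joules, then joules per second (watts).
    return delta_uj / 1_000_000 / elapsed_seconds


if __name__ == "__main__":
    # 35 J consumed over 1 s corresponds to 35 W.
    print(average_power_watts(1_000_000, 36_000_000, 1.0, 10**12))
```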

The following measurements are supported by Intel PowerStat plugin:

  • powerstat_core

    • The following tags are returned by plugin with powerstat_core measurements:

      Tag | Description
      package_id | ID of platform package/socket.
      core_id | ID of physical processor core.
      cpu_id | ID of logical processor core.

      Measurement powerstat_core metrics are collected per-CPU (cpu_id is the key) while core_id and package_id tags are additional topology information.

    • Available metrics for powerstat_core measurement:

      Metric name (field) | Description | Units
      cpu_frequency_mhz | Current operational frequency of CPU Core. | MHz
      cpu_busy_frequency_mhz | CPU Core Busy Frequency measured as frequency adjusted to CPU Core busy cycles. | MHz
      cpu_temperature_celsius | Current temperature of CPU Core. | Celsius degrees
      cpu_c0_state_residency_percent | Percentage of time that CPU Core spent in C0 Core residency state. | %
      cpu_c1_state_residency_percent | Percentage of time that CPU Core spent in C1 Core residency state. | %
      cpu_c3_state_residency_percent | Percentage of time that CPU Core spent in C3 Core residency state. | %
      cpu_c6_state_residency_percent | Percentage of time that CPU Core spent in C6 Core residency state. | %
      cpu_c7_state_residency_percent | Percentage of time that CPU Core spent in C7 Core residency state. | %
      cpu_c0_substate_c01_percent | Percentage of time that CPU Core spent in C0.1 substate out of the total time in the C0 state. | %
      cpu_c0_substate_c02_percent | Percentage of time that CPU Core spent in C0.2 substate out of the total time in the C0 state. | %
      cpu_c0_substate_c0_wait_percent | Percentage of time that CPU Core spent in C0_Wait substate out of the total time in the C0 state. | %
      cpu_busy_cycles_percent | (DEPRECATED - superseded by cpu_c0_state_residency_percent) CPU Core Busy cycles as a ratio of Cycles spent in C0 state residency to all cycles executed by CPU Core. | %
  • powerstat_package

    • The following tags are returned by plugin with powerstat_package measurements:

      Tag | Description
      package_id | ID of platform package/socket.
      active_cores | Specific tag for max_turbo_frequency_mhz metric. The maximum number of activated cores for reachable turbo frequency.
      hybrid | Specific tag for max_turbo_frequency_mhz metric. Available only for hybrid processors. Will be set to primary for primary cores of a hybrid architecture, and to secondary for secondary cores of a hybrid architecture.
      die | Specific tag for all uncore_frequency metrics. ID of die.
      type | Specific tag for all uncore_frequency metrics. Type of uncore frequency (current or initial).

      Measurement powerstat_package metrics are collected per processor package; the package_id tag indicates which package a metric refers to.

    • Available metrics for powerstat_package measurement:

      Metric name (field) | Description | Units
      thermal_design_power_watts | Maximum Thermal Design Power (TDP) available for processor package. | Watts
      current_power_consumption_watts | Current power consumption of processor package. | Watts
      current_dram_power_consumption_watts | Current power consumption of processor package DRAM subsystem. | Watts
      max_turbo_frequency_mhz | Maximum reachable turbo frequency for number of cores active. | MHz
      uncore_frequency_limit_mhz_min | Minimum uncore frequency limit for die in processor package. | MHz
      uncore_frequency_limit_mhz_max | Maximum uncore frequency limit for die in processor package. | MHz
      uncore_frequency_mhz_cur | Current uncore frequency for die in processor package. Available only with tag current. This value is available from intel-uncore-frequency module for kernel >= 5.18. For older kernel versions it needs to be accessed via MSR. In case of lack of loaded msr, only uncore_frequency_limit_mhz_min and uncore_frequency_limit_mhz_max metrics will be collected. | MHz
      cpu_base_frequency_mhz | CPU Base Frequency (maximum non-turbo frequency) for the processor package. | MHz
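
As a rough illustration of how interval-based percentages such as the residency metrics can be derived, the sketch below divides the growth of a per-state counter by the growth of a reference cycle counter over the same interval. It also shows one way to express a "busy frequency" as the reference rate scaled by the ratio of two cycle counters. These formulas are assumptions made for illustration, and the counter names (TSC, APERF, MPERF) and function names are not taken from this README; the plugin's actual calculation may differ.

```python
def residency_percent(state_delta: int, reference_delta: int) -> float:
    """Share of the interval spent in a state, as a percentage.

    Both arguments are counter increases over the same interval; the
    reference counter is assumed to tick at a constant rate (e.g. the TSC).
    """
    if reference_delta <= 0:
        raise ValueError("reference_delta must be positive")
    return 100.0 * state_delta / reference_delta


def busy_frequency_mhz(tsc_delta: int, aperf_delta: int, mperf_delta: int,
                       elapsed_seconds: float) -> float:
    """Reference rate scaled by the actual/reference cycle ratio (assumed)."""
    if mperf_delta <= 0 or elapsed_seconds <= 0:
        raise ValueError("mperf_delta and elapsed_seconds must be positive")
    return (tsc_delta / elapsed_seconds) * (aperf_delta / mperf_delta) / 1e6


if __name__ == "__main__":
    print(residency_percent(925_200_000, 1_000_000_000))
    print(busy_frequency_mhz(2_400_000_000, 500, 1000, 1.0))
```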

Known issues

Starting from Linux kernel version v5.4.77, due to this kernel change, resources such as /sys/devices/virtual/powercap/intel-rapl//*/energy_uj can only be accessed by the root user for security reasons. Therefore, this plugin requires root privileges to gather rapl metrics correctly.

If such strict security restrictions are not relevant, reading permissions for files in the /sys/devices/virtual/powercap/intel-rapl/ directory can be manually altered, for example, using the chmod command with custom parameters. For instance, read and execute permissions for all files in the intel-rapl directory can be granted to all users using:

sudo chmod -R a+rx /sys/devices/virtual/powercap/intel-rapl/
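
Whether the current user can read the energy counters can be checked without changing any permissions. The following sketch (the helper name is hypothetical) walks the intel-rapl directory without following symbolic links and lists the energy_uj files that are not readable by the calling user.

```python
import os


def unreadable_energy_files(root: str) -> list[str]:
    """Return energy_uj files under root that the current user cannot read."""
    blocked = []
    # os.walk does not follow symbolic links by default, which avoids
    # looping through the links that sysfs directories contain.
    for dirpath, _dirnames, filenames in os.walk(root):
        if "energy_uj" in filenames:
            path = os.path.join(dirpath, "energy_uj")
            if not os.access(path, os.R_OK):
                blocked.append(path)
    return sorted(blocked)


if __name__ == "__main__":
    root = "/sys/devices/virtual/powercap/intel-rapl/"
    for path in unreadable_energy_files(root):
        print("not readable:", path)
```

Running it as the user that Telegraf runs as shows whether the rapl metrics are affected on a given system.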

Example Output

powerstat_package,host=ubuntu,package_id=0 thermal_design_power_watts=160 1606494744000000000
powerstat_package,host=ubuntu,package_id=0 current_power_consumption_watts=35 1606494744000000000
powerstat_package,host=ubuntu,package_id=0 cpu_base_frequency_mhz=2400i 1669118424000000000
powerstat_package,host=ubuntu,package_id=0 current_dram_power_consumption_watts=13.94 1606494744000000000
powerstat_package,host=ubuntu,package_id=0,active_cores=0 max_turbo_frequency_mhz=3000i 1606494744000000000
powerstat_package,host=ubuntu,package_id=0,active_cores=1 max_turbo_frequency_mhz=2800i 1606494744000000000
powerstat_package,die=0,host=ubuntu,package_id=0,type=initial uncore_frequency_limit_mhz_min=800,uncore_frequency_limit_mhz_max=2400 1606494744000000000
powerstat_package,die=0,host=ubuntu,package_id=0,type=current uncore_frequency_mhz_cur=800i,uncore_frequency_limit_mhz_min=800,uncore_frequency_limit_mhz_max=2400 1606494744000000000
powerstat_core,core_id=0,cpu_id=0,host=ubuntu,package_id=0 cpu_frequency_mhz=1200.29 1606494744000000000
powerstat_core,core_id=0,cpu_id=0,host=ubuntu,package_id=0 cpu_temperature_celsius=34i 1606494744000000000
powerstat_core,core_id=0,cpu_id=0,host=ubuntu,package_id=0 cpu_c0_state_residency_percent=0.8 1606494744000000000
powerstat_core,core_id=0,cpu_id=0,host=ubuntu,package_id=0 cpu_c1_state_residency_percent=6.68 1606494744000000000
powerstat_core,core_id=0,cpu_id=0,host=ubuntu,package_id=0 cpu_c3_state_residency_percent=0 1606494744000000000
powerstat_core,core_id=0,cpu_id=0,host=ubuntu,package_id=0 cpu_c6_state_residency_percent=92.52 1606494744000000000
powerstat_core,core_id=0,cpu_id=0,host=ubuntu,package_id=0 cpu_c7_state_residency_percent=0 1606494744000000000
powerstat_core,core_id=0,cpu_id=0,host=ubuntu,package_id=0 cpu_busy_frequency_mhz=1213.24 1606494744000000000
powerstat_core,core_id=0,cpu_id=0,host=ubuntu,package_id=0 cpu_c0_substate_c01_percent=0 1606494744000000000
powerstat_core,core_id=0,cpu_id=0,host=ubuntu,package_id=0 cpu_c0_substate_c02_percent=5.68 1606494744000000000
powerstat_core,core_id=0,cpu_id=0,host=ubuntu,package_id=0 cpu_c0_substate_c0_wait_percent=43.74 1606494744000000000
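
Each output line above consists of the measurement name with its tags, the fields, and a timestamp, separated by spaces. The sketch below splits such a line into its parts. It is a deliberately minimal parser with a hypothetical function name, and it handles only lines shaped like the examples above (no escaped characters, no string or boolean fields), not the full line protocol.

```python
def parse_line(line: str) -> dict:
    """Split one example output line into measurement, tags, fields, time."""
    head, fields_text, timestamp = line.strip().split(" ")
    measurement, *tag_pairs = head.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    fields = {}
    for pair in fields_text.split(","):
        key, raw = pair.split("=", 1)
        # A trailing "i" marks an integer field; other values are floats.
        fields[key] = int(raw[:-1]) if raw.endswith("i") else float(raw)
    return {
        "measurement": measurement,
        "tags": tags,
        "fields": fields,
        "timestamp": int(timestamp),
    }


if __name__ == "__main__":
    sample = ("powerstat_core,core_id=0,cpu_id=0,host=ubuntu,package_id=0 "
              "cpu_temperature_celsius=34i 1606494744000000000")
    print(parse_line(sample))
```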

# Structs

PowerStat plugin enables monitoring of platform metrics.