0.0.0-20180227003457-e07f8c26812d
Repository: https://github.com/mxpv/nvml-go.git
Documentation: pkg.go.dev
# README
nvml-go
Go wrapper for the NVIDIA Management Library (NVML).
Basic example
    func ExampleNew() {
        // An empty path lets the wrapper locate the NVML library itself.
        nvml, err := New("")
        if err != nil {
            panic(err)
        }
        defer nvml.Shutdown()

        err = nvml.Init()
        if err != nil {
            panic(err)
        }

        driverVersion, err := nvml.SystemGetDriverVersion()
        if err != nil {
            panic(err)
        }
        log.Printf("Driver version:\t%s", driverVersion)

        nvmlVersion, err := nvml.SystemGetNVMLVersion()
        if err != nil {
            panic(err)
        }
        log.Printf("NVML version:\t%s", nvmlVersion)

        deviceCount, err := nvml.DeviceGetCount()
        if err != nil {
            panic(err)
        }

        // Query basic properties of every device.
        for i := uint32(0); i < deviceCount; i++ {
            handle, err := nvml.DeviceGetHandleByIndex(i)
            if err != nil {
                panic(err)
            }

            name, err := nvml.DeviceGetName(handle)
            if err != nil {
                panic(err)
            }
            log.Printf("Product name:\t%s", name)

            brand, err := nvml.DeviceGetBrand(handle)
            if err != nil {
                panic(err)
            }
            log.Printf("Product Brand:\t%s", brand)

            uuid, err := nvml.DeviceGetUUID(handle)
            if err != nil {
                panic(err)
            }
            log.Printf("GPU UUID:\t\t%s", uuid)

            fan, err := nvml.DeviceGetFanSpeed(handle)
            if err != nil {
                panic(err)
            }
            log.Printf("Fan Speed:\t\t%d", fan)
        }
    }
TODO
- Unit Queries
- Unit Commands
- Linux support
# Constants
Aggregate counts persist across reboots.
Graphics clock domain.
Default application clock target.
Target application clock.
Current actual clock value.
OEM-defined maximum clock rate.
Memory clock domain.
SM clock domain.
GPU clocks are limited by current setting of applications clocks.
Nothing is running on the GPU and the clocks are dropping to Idle state.
HW Power Brake Slowdown (reducing the core clocks by a factor of 2 or more) is engaged.
HW Slowdown (reducing the core clocks by a factor of 2 or more) is engaged.
HW Thermal Slowdown (reducing the core clocks by a factor of 2 or more) is engaged.
Bit mask representing no clocks throttling.
SW Power Scaling algorithm is reducing the clocks below requested clocks.
SW Thermal Slowdown. Indicates one or more of the following: the current GPU temperature is above the GPU Max Operating Temperature, or the current memory temperature is above the Memory Max Operating Temperature.
This GPU has been added to a Sync boost group with nvidia-smi or DCGM in order to maximize performance per watt.
Renamed to ClocksThrottleReasonApplicationsClocksSetting as the name describes the situation more accurately.
Video encoder/decoder clock domain.
Default compute mode - multiple contexts per device.
Only one context per device, usable from multiple threads at a time.
Support Removed.
No contexts per device.
WDDM driver model -- GPU treated as a display device.
WDM (TCC) model (recommended) -- GPU treated as a generic device.
Everything is enabled and running at full speed.
Designed for running only compute tasks.
Designed for running graphics applications that don't require high bandwidth double precision.
The ECC object determining the level of ECC support.
An object defined by OEM.
The power management object.
A memory error that was corrected. For ECC errors, these are single-bit errors; for texture memory, these are errors fixed by resend.
A memory error that was not corrected. For ECC errors, these are double-bit errors; for texture memory, these are errors where the resend fails.
CBU.
GPU Device Memory.
GPU L1 Cache.
GPU L2 Cache.
GPU Register File.
GPU Texture Memory.
Shared memory.
Page was retired due to a double-bit ECC error.
Page was retired due to multiple single-bit ECC errors.
How long did the board limit cause the GPU to be below application clocks.
How long did low utilization cause the GPU to be below application clocks.
How long did power violations cause the GPU to be below application clocks.
How long did the board reliability limit cause the GPU to be below application clocks.
How long did sync boost cause the GPU to be below application clocks.
How long did thermal violations cause the GPU to be below application clocks.
Total time the GPU was held below application clocks by any limiter (0 - 5 above).
Total time the GPU was held below base clocks.
Performance state 0 -- Maximum Performance.
Performance state 15 -- Minimum Performance.
APIs that change application clocks.
APIs that enable/disable Auto Boosted clocks.
Temperature sensor for the GPU die.
GPU Temperature at which the GPU can be throttled below base clock.
Memory Temperature at which the GPU will begin SW slowdown.
Temperature at which the GPU will shut down for HW protection.
Temperature at which the GPU will begin HW slowdown.
Volatile counts are reset each time the driver loads.
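The clock throttle reasons listed above are bit flags, so a single value can report several reasons at once. The following is a minimal sketch of decoding such a mask. It assumes the bit values from NVIDIA's `nvml.h` header; the Go identifiers (`reasonGPUIdle`, `decodeThrottleReasons`, and so on) are hypothetical names local to this example, not the package's own constants, which should be checked against the package source.

```go
package main

import (
	"fmt"
	"strings"
)

// Bit values follow NVIDIA's nvml.h (nvmlClocksThrottleReason*).
// The Go names below are hypothetical and local to this sketch.
const (
	reasonGPUIdle                   uint64 = 0x01
	reasonApplicationsClocksSetting uint64 = 0x02
	reasonSWPowerCap                uint64 = 0x04
	reasonHWSlowdown                uint64 = 0x08
	reasonSyncBoost                 uint64 = 0x10
	reasonSWThermalSlowdown         uint64 = 0x20
	reasonHWThermalSlowdown         uint64 = 0x40
	reasonHWPowerBrakeSlowdown      uint64 = 0x80
)

// reasonNames lists the known bits in ascending order so output is stable.
var reasonNames = []struct {
	bit  uint64
	name string
}{
	{reasonGPUIdle, "GPU idle"},
	{reasonApplicationsClocksSetting, "applications clocks setting"},
	{reasonSWPowerCap, "SW power cap"},
	{reasonHWSlowdown, "HW slowdown"},
	{reasonSyncBoost, "sync boost"},
	{reasonSWThermalSlowdown, "SW thermal slowdown"},
	{reasonHWThermalSlowdown, "HW thermal slowdown"},
	{reasonHWPowerBrakeSlowdown, "HW power brake slowdown"},
}

// decodeThrottleReasons returns the label of every known bit set in mask,
// or ["none"] when mask is zero. Bits not listed above are ignored.
func decodeThrottleReasons(mask uint64) []string {
	if mask == 0 {
		return []string{"none"}
	}
	var out []string
	for _, r := range reasonNames {
		if mask&r.bit != 0 {
			out = append(out, r.name)
		}
	}
	return out
}

func main() {
	// Example mask: power capping and hardware thermal slowdown together.
	mask := reasonSWPowerCap | reasonHWThermalSlowdown
	fmt.Println(strings.Join(decodeThrottleReasons(mask), ", "))
}
```

Keeping the names in an ordered slice rather than a map gives deterministic output, which is convenient for logging.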
# Variables
No description provided by the author
# Structs
BAR1Memory holds BAR1 memory allocation information for a device.
Detailed ECC error counts for a device.
Memory holds allocation information for a device.
PCIInfo represents PCI information about a GPU device.
ProcessInfo holds information about running compute processes on the GPU.
Utilization information for a device.
ViolationTime holds perf policy violation status data.
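As an illustration of how the memory allocation information might be consumed, here is a small sketch that turns byte counts into a usage percentage. The `memoryInfo` type is a hypothetical stand-in whose fields mirror NVML's `nvmlMemory_t` (total, free, used, in bytes); the actual field names of this package's `Memory` struct are an assumption and may differ.

```go
package main

import "fmt"

// memoryInfo is a hypothetical stand-in for the package's Memory struct.
// Field names mirror NVML's nvmlMemory_t (total, free, used), in bytes.
type memoryInfo struct {
	Total uint64
	Free  uint64
	Used  uint64
}

// usedPercent returns used memory as a percentage of total,
// or 0 when Total is zero to avoid dividing by zero.
func usedPercent(m memoryInfo) float64 {
	if m.Total == 0 {
		return 0
	}
	return float64(m.Used) / float64(m.Total) * 100
}

func main() {
	// Example values: an 8 GiB device with 2 GiB in use.
	m := memoryInfo{Total: 8 << 30, Free: 6 << 30, Used: 2 << 30}
	fmt.Printf("memory used: %.1f%%\n", usedPercent(m))
}
```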
# Type aliases
The Brand of the GPU.
Clock Ids.
No description provided by the author
Clock types.
Compute mode.
Device represents native NVML device handle.
Driver models.
ECC counter types.
Represents the type of encoder whose capacity can be queried.
GPUOperationMode represents GPU Operation Mode.
Represents level relationships within a system between two GPUs.
Available infoROM objects.
Memory error types.
Memory locations.
Causes for page retirement.
Represents the queryable PCIe utilization counters.
Represents the type of perf policy for which violation times can be queried.
PState represents allowed PStates.
API types that allow changes to default permission restrictions.
Temperature sensors.
Temperature thresholds.
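Enumerated types such as the performance state are easier to log when they carry a `String` method. The sketch below shows one way to do that. The `pState` type and `pStateUnknown` constant are hypothetical and local to this example; the numeric values are assumed to follow NVML's `nvmlPstates_t` (0 through 15, with 32 meaning unknown), and the package's own `PState` type may be defined differently.

```go
package main

import "fmt"

// pState is a hypothetical local type; the numeric values follow NVML's
// nvmlPstates_t (0 through 15, with 32 meaning unknown).
type pState uint32

const pStateUnknown pState = 32

// String labels the two extreme states and formats the rest as "P<n>".
// Any value above 15 is reported as "unknown".
func (p pState) String() string {
	switch {
	case p == 0:
		return "P0 (maximum performance)"
	case p == 15:
		return "P15 (minimum performance)"
	case p < 15:
		return fmt.Sprintf("P%d", uint32(p))
	default:
		return "unknown"
	}
}

func main() {
	for _, p := range []pState{0, 8, 15, pStateUnknown} {
		fmt.Println(p) // Println picks up the String method.
	}
}
```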