github.com/hashicorp/nomad-device-nvidia (module, package)
Version: 1.1.0
Repository: https://github.com/hashicorp/nomad-device-nvidia.git
Documentation: pkg.go.dev

# README

Nomad Nvidia Device Plugin

This repository provides an implementation of a Nomad device plugin for Nvidia GPUs.

Behavior

The Nvidia device plugin uses NVML bindings to query the available Nvidia devices and exposes them via the Fingerprint RPC. GPUs can be excluded from fingerprinting by setting the ignored_gpu_ids field (see below). The plugin sends statistics for fingerprinted devices every stats_period.

The plugin detects whether a GPU has Multi-Instance GPU (MIG) enabled. When MIG is enabled, every instance is fingerprinted as an individual GPU that can be addressed on its own.
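
The exclusion step described above can be pictured with a small Go sketch. It is illustrative only and is not the plugin's actual code: the gpu type and the filterIgnored function are hypothetical names, and the real plugin obtains its device list from NVML rather than from a literal slice.

```go
package main

import "fmt"

// gpu is a hypothetical stand-in for a device (or MIG instance)
// discovered during fingerprinting.
type gpu struct {
	UUID string
	Name string
}

// filterIgnored returns the devices whose UUID does not appear in
// ignoredIDs, preserving the original order.
func filterIgnored(devices []gpu, ignoredIDs []string) []gpu {
	// Build a set so each lookup is constant time.
	ignored := make(map[string]struct{}, len(ignoredIDs))
	for _, id := range ignoredIDs {
		ignored[id] = struct{}{}
	}

	kept := make([]gpu, 0, len(devices))
	for _, d := range devices {
		if _, skip := ignored[d.UUID]; skip {
			continue
		}
		kept = append(kept, d)
	}
	return kept
}

func main() {
	devices := []gpu{
		{UUID: "uuid1", Name: "GPU A"},
		{UUID: "uuid3", Name: "GPU B"},
	}
	// Only devices that are not listed in ignored_gpu_ids remain.
	for _, d := range filterIgnored(devices, []string{"uuid1", "uuid2"}) {
		fmt.Println(d.UUID, d.Name)
	}
}
```

Because MIG instances are fingerprinted as individual GPUs, a filter of this shape would apply to them in the same way, assuming each instance carries its own UUID.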

Config

The plugin is configured in the Nomad client's plugin block:

plugin "nvidia" {
  config {
    ignored_gpu_ids    = ["uuid1", "uuid2"]
    fingerprint_period = "5s"
  }
}

The valid configuration options are:

  • ignored_gpu_ids (list(string): []): list of GPU UUID strings that should not be exposed to Nomad.
  • fingerprint_period (string: "1m"): interval at which the fingerprint process is repeated to detect changes.

# Functions

NewNvidiaDevice returns a new nvidia device plugin.

# Constants

Mebibytes.
number of errors.
number of errors.
number of errors.
Attribute names and units for reporting Fingerprint output.
Mebibytes.
Nvidia-container-runtime environment variable names.
Attribute names for reporting stats output.
Celsius degrees.

# Variables

PluginConfig is the nvidia factory function registered in the plugin catalog.
PluginID is the nvidia plugin metadata registered in the plugin catalog.

# Structs

Config contains configuration information for the plugin.
NvidiaDevice contains all plugin specific data.