package
0.0.0-20250403192851-34a345b3f333
Repository: https://github.com/openshift/ci-tools.git
Documentation: pkg.go.dev
# README
gpu-scheduling-webhook
Motivation
Some of the nodes in our clusters have an Nvidia GPU. These nodes are expensive to run workloads on, so we use this mutating webhook to ensure that only pods requesting a GPU are scheduled on them and everything else is kept off.
How it works
A node that features an Nvidia GPU holds the following taint:
taints:
- effect: NoSchedule
  key: nvidia.com/gpu
  value: "true"
The webhook inspects the resource requests and limits of a pod's containers, both the init containers and the regular ones, and applies this toleration:
tolerations:
- key: nvidia.com/gpu
  operator: Equal
  value: "true"
  effect: NoSchedule
when it finds either a request such as:
requests:
  nvidia.com/gpu: <SOME_VALUE_HERE>
or the following limit:
limits:
  nvidia.com/gpu: <SOME_VALUE_HERE>
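The decision described above can be sketched in Go roughly as follows. This is a minimal illustration, not the webhook's actual code: the `Pod`, `Container`, `Resources`, and `Toleration` types are hypothetical, simplified stand-ins for the Kubernetes API types, and `requestsGPU` and `mutate` are names made up for this example. Skipping pods that already carry the toleration is also an assumption.

```go
package main

import "fmt"

// gpuResource is both the resource name and the taint key shown in this README.
const gpuResource = "nvidia.com/gpu"

// Hypothetical, simplified stand-ins for the Kubernetes pod types.
type Resources struct {
	Requests map[string]string
	Limits   map[string]string
}

type Container struct {
	Name      string
	Resources Resources
}

type Toleration struct {
	Key, Operator, Value, Effect string
}

type Pod struct {
	InitContainers []Container
	Containers     []Container
	Tolerations    []Toleration
}

// gpuToleration mirrors the toleration listed in this README.
var gpuToleration = Toleration{
	Key:      gpuResource,
	Operator: "Equal",
	Value:    "true",
	Effect:   "NoSchedule",
}

// requestsGPU reports whether any init or regular container
// lists nvidia.com/gpu under its requests or its limits.
func requestsGPU(pod Pod) bool {
	all := append(append([]Container{}, pod.InitContainers...), pod.Containers...)
	for _, c := range all {
		if _, ok := c.Resources.Requests[gpuResource]; ok {
			return true
		}
		if _, ok := c.Resources.Limits[gpuResource]; ok {
			return true
		}
	}
	return false
}

// mutate appends the GPU toleration when the pod asks for a GPU.
// It returns true only if the pod was changed.
// Assumption: a pod that already has the toleration is left untouched.
func mutate(pod *Pod) bool {
	if !requestsGPU(*pod) {
		return false
	}
	for _, t := range pod.Tolerations {
		if t == gpuToleration {
			return false
		}
	}
	pod.Tolerations = append(pod.Tolerations, gpuToleration)
	return true
}

func main() {
	pod := Pod{Containers: []Container{{
		Name:      "train",
		Resources: Resources{Limits: map[string]string{gpuResource: "1"}},
	}}}
	fmt.Println(mutate(&pod), pod.Tolerations)
}
```

A real admission webhook would additionally decode the incoming admission request and return the change as a patch; that plumbing is left out here to keep the focus on the matching rule.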