github.com/adobe/k8s-shredder
Version: 0.2.2
Repository: https://github.com/adobe/k8s-shredder.git
Documentation: pkg.go.dev


# README

K8s-shredder - a new way of parking in Kubernetes


As more teams running their workloads on Kubernetes deploy stateful applications (Kafka, ZooKeeper, RabbitMQ, Redis, etc.) on top of the platform, it can be challenging to keep alive the minion nodes (k8s worker nodes) where the pods belonging to a StatefulSet or Deployment are running. In particular, during a full cluster upgrade, worker nodes may need to keep running for an extended period of time in order to ensure no downtime at the application level while the worker nodes are rotated.

K8s-shredder introduces the concept of parked nodes, which aims to address some critical aspects of rotating worker nodes in a Kubernetes cluster during an upgrade:

  • allows teams running stateful apps to move their workloads off parked nodes at will, independently of the cluster upgrade lifecycle.
  • optimises cloud costs by dynamically purging unschedulable worker nodes (parked nodes).
  • notifies clients that they are running workloads on parked nodes so that they can take appropriate action.

Getting started

In order to enable k8s-shredder on a Kubernetes cluster, you can use the manifests described in the k8s-shredder spec.

Then, during a cluster upgrade, while rotating the worker nodes, label the nodes that you want parked with:

shredder.ethos.adobe.net/upgrade-status=parked
shredder.ethos.adobe.net/parked-node-expires-on=<Node_expiration_timestamp>
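
For illustration, a parked node's labels might look like the sketch below; the node name and the expiration value are placeholders, and the exact timestamp format k8s-shredder expects should be taken from the project documentation:

```yaml
# Illustrative only: a worker node labeled as parked.
# "worker-node-1" and the expiration value are placeholder examples.
apiVersion: v1
kind: Node
metadata:
  name: worker-node-1
  labels:
    shredder.ethos.adobe.net/upgrade-status: "parked"
    shredder.ethos.adobe.net/parked-node-expires-on: "1735689600"
```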

Additionally, if you want a pod to be exempted from the eviction loop until the parked node's TTL expires, you can label it with "shredder.ethos.adobe.net/allow-eviction=false" so that k8s-shredder knows to skip it.
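
As a sketch, the exemption label sits on the pod itself; for pods managed by a Deployment or StatefulSet, that means adding it to the pod template metadata. The names and image below are placeholders:

```yaml
# Illustrative only: pods created from this template carry the
# allow-eviction=false label, so k8s-shredder skips them until the
# parked node's TTL expires.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: example-app
spec:
  serviceName: example-app
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
        shredder.ethos.adobe.net/allow-eviction: "false"
    spec:
      containers:
        - name: app
          image: example/app:latest
```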

The following options can be used to customise the k8s-shredder controller:

| Name | Default Value | Description |
|---|---|---|
| EvictionLoopInterval | 60s | How often to run the eviction loop process |
| ParkedNodeTTL | 60m | Time a node can be parked before the force-eviction process starts |
| RollingRestartThreshold | 0.5 | How much time (as a percentage of ParkedNodeTTL) should pass before starting the rollout restart process |
| UpgradeStatusLabel | "shredder.ethos.adobe.net/upgrade-status" | Label used for identifying parked nodes |
| ExpiresOnLabel | "shredder.ethos.adobe.net/parked-node-expires-on" | Label used for identifying the TTL of parked nodes |
| NamespacePrefixSkipInitialEviction | "" | For pods in namespaces with this prefix, proceed directly with a rollout restart without waiting for the RollingRestartThreshold |
| RestartedAtAnnotation | "shredder.ethos.adobe.net/restartedAt" | Annotation used to mark a controller object for rollout restart |
| AllowEvictionLabel | "shredder.ethos.adobe.net/allow-eviction" | Label used to skip evicting pods that have explicitly set this label to false |
| ToBeDeletedTaint | "ToBeDeletedByClusterAutoscaler" | Node taint used to skip parked nodes that are already handled by cluster-autoscaler |
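
As a rough sketch, and assuming the controller reads these options from a YAML configuration file with keys matching the names above (the exact file name and wiring should be checked against the k8s-shredder manifests), a customised configuration might look like:

```yaml
# Hypothetical example: override a few defaults; keys mirror the
# option names from the table above.
EvictionLoopInterval: 60s
ParkedNodeTTL: 2h
RollingRestartThreshold: 0.3
UpgradeStatusLabel: "shredder.ethos.adobe.net/upgrade-status"
ExpiresOnLabel: "shredder.ethos.adobe.net/parked-node-expires-on"
NamespacePrefixSkipInitialEviction: ""
RestartedAtAnnotation: "shredder.ethos.adobe.net/restartedAt"
AllowEvictionLabel: "shredder.ethos.adobe.net/allow-eviction"
ToBeDeletedTaint: "ToBeDeletedByClusterAutoscaler"
```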

How it works

K8s-shredder periodically runs eviction loops, based on the configured EvictionLoopInterval, trying to clean up all the pods from the parked nodes. Once all the pods are cleaned up, cluster-autoscaler should chime in and recycle the parked node.

The diagram below describes a simple flow of how k8s-shredder handles StatefulSet applications:

[Diagram: k8s-shredder flow for StatefulSet applications]