# README
KubeResourcesMonitor
KubeResourcesMonitor is a Kubernetes controller designed to collect various counts of Kubernetes resources, provide these metrics to Prometheus, and offer autoscaling capabilities based on time intervals and RabbitMQ queue lengths.
Table of Contents
Introduction
KubeResourcesMonitor is a Kubernetes controller that provides valuable insights into your cluster's health and performance by collecting metrics and enabling autoscaling based on specific criteria.
Features
1. Collection of Kubernetes Resource Metrics
KubeResourcesMonitor collects various counts of Kubernetes cluster resources such as pods, services, configmaps, secrets, deployments, nodes, etc., and makes these metrics available for Prometheus to scrape.
Metrics Collected:
pod_count
: Number of podsservice_count
: Number of servicesconfigmap_count
: Number of configmapssecret_count
: Number of secretscronjob_count
: Number of cronjobsdeployment_count
: Number of deploymentsstatefulset_count
: Number of stateful setsdaemonset_count
: Number of daemon setsreplicaset_count
: Number of replica setsendpoint_count
: Number of endpointspersistentvolume_count
: Number of persistent volumesnamespace_count
: Number of namespacesnode_count
: Number of nodesnode_ready_count
: Number of nodes in ready conditionnode_memory_pressure_count
: Number of nodes with memory pressurenode_disk_pressure_count
: Number of nodes with disk pressurepvc_count
: Number of PersistentVolumeClaims (PVCs)event_count
: Number of eventsrestart_count
: Number of container restartscrash_count
: Number of container crashestotal_cpu_usage
: Total CPU usage across all nodestotal_memory_usage
: Total memory usage across all nodes
Using Prometheus Alertmanager: Setting up Alertmanager in Prometheus allows you to track cluster-wide health based on these metrics. You can configure alerts for various metrics to get notified when certain thresholds are crossed. For example:
- Pod Count: Alert if the number of pods exceeds a certain limit, indicating potential resource overuse.
- Service Count: Monitor the number of services to ensure no unexpected spikes.
- Node Conditions: Alerts on
node_memory_pressure_count
andnode_disk_pressure_count
to track node health. - Container Restarts and Crashes: Alerts on
restart_count
andcrash_count
to detect unstable deployments. - Resource Usage: Alerts on
total_cpu_usage
andtotal_memory_usage
to prevent resource exhaustion.
2. Time-based Autoscaling
KubeResourcesMonitor can scale specified deployments based on configured time intervals. This is useful for scaling applications during peak and off-peak hours.
Configuration: In the CustomResource definition (CRD), you can specify the deployments and their scaling times. Below is an example configuration:
apiVersion: monitor.example.com/v1alpha1
kind: KubeResourcesMonitor
metadata:
name: example-kuberesourcesmonitor
namespace: kuberesourcesmonitor-system
spec:
interval: "45s" # Interval to requeue the reconciler
prometheusEndpoint: "2112" # Endpoint for Prometheus metrics
configMapName: "kuberesourcesmonitor-config" # Name of the ConfigMap containing metrics configuration
deployments:
- name: "busybox-deployment"
namespace: "default"
minReplicas: 2
maxReplicas: 5
scaleTimes:
- startTime: "00:00" # Time to start scaling
endTime: "06:00" # Time to end scaling
replicas: 5 # Number of replicas during this time window
- startTime: "06:00"
endTime: "23:59"
replicas: 2
This configuration will scale the busybox-deployment
to 5 replicas between midnight and 6 AM, and scale it down to 2 replicas for the rest of the day.
3. Message Queue-based Autoscaling
KubeResourcesMonitor can scale consumer microservice deployments based on the number of messages in a RabbitMQ queue. This ensures that your application can handle varying loads efficiently.
Configuration: You need to create a secret containing the RabbitMQ URL and specify the message queue details in the CustomResource definition.
Example Secret:
apiVersion: v1
kind: Secret
metadata:
name: rabbitmq-credentials
namespace: kuberesourcesmonitor-system
type: Opaque
data:
# Base64 encode the RabbitMQ URL
queueUrl: YW1xcDovL25vdmVsbDpub3ZlbGxAMTIzQDE3Mi4xMDUuNTEuMjE2OjU2NzIv
Example Configuration:
apiVersion: monitor.example.com/v1alpha1
kind: KubeResourcesMonitor
metadata:
name: example-kuberesourcesmonitor
namespace: kuberesourcesmonitor-system
spec:
interval: "45s"
prometheusEndpoint: "2112"
configMapName: "kuberesourcesmonitor-config"
messageQueues:
- queueSecretName: "rabbitmq-credentials"
queueSecretKey: "queueUrl"
queueNamespace: "kuberesourcesmonitor-system"
queueName: "message.test.queue"
deploymentName: "rabbitmq-consumer"
deploymentNamespace: "default"
thresholdMessages: 10
scaleUpReplicas: 5
scaleDownReplicas: 1
This configuration will scale the rabbitmq-consumer
deployment based on the number of messages in the message.test.queue
. If the number of messages exceeds 10, the deployment will scale up to 5 replicas. If the number of messages falls below or equal to 10, the deployment will scale down to 1 replica.
Prerequisites
- Kubernetes cluster
- Helm installed
- Prometheus installed and configured
Installation
Install Helm from Public Repo
-
Add the Helm repository:
helm repo add krm https://anurag-2911.github.io/helm-charts helm repo update
-
Install the Helm chart in a specific namespace:
kubectl create namespace kuberesourcesmonitor helm install krm krm/kuberesourcesmonitor --namespace kuberesourcesmonitor --set global.prometheusReleaseLabel=kube-prometheus-stack
-
Verify the installation:
kubectl get all -n kuberesourcesmonitor
-
Uninstall the Helm chart:
helm uninstall krm --namespace kuberesourcesmonitor kubectl delete namespace kuberesourcesmonitor
-
List the Helm chart:
helm list --namespace kuberesourcesmonitor
Port Forward the Requests and Verify the Metrics
-
Get the pods in the namespace:
kubectl get pods -n kuberesourcesmonitor
-
Port forward the requests: For example, for a pod name kuberesourcesmonitor-kubemntr-f6847f6fd-kmj5q kubectl port-forward pod/kuberesourcesmonitor-kubemntr-f6847f6fd-kmj5q 8080:2112 -n kuberesourcesmonitor
-
Verify the metrics:
Open your browser and go to
http://localhost:8080/metrics
.
Configuration
Changing ConfigMap Values
To change the values in the ConfigMap
, use the kubectl patch
command:
kubectl patch configmap kuberesourcesmonitor-config -n kuberesourcesmonitor --type merge -p '{"data":{"collectMetrics":"true","rabbitMQAutoScale":"true","timeBasedAutoScale":"true"}}'
Usage
Monitor Kubernetes Resources
KubeResourcesMonitor provides various Kubernetes resource metrics to Prometheus, which can be visualized in Grafana dashboards.
Autoscaling
- Time-based Autoscaling: Configure time intervals for scaling deployments to handle peak and off-peak loads.
- Message Queue-based Autoscaling: Scale consumer deployments based on the number of messages in RabbitMQ queues.
Contributing
Contributions are welcome!
License
This project is licensed under the MIT License