# README
Kubernetes Backup Service
A Kubernetes-native backup service that automates rsync-based backups of remote servers using Custom Resource Definitions (CRDs). The service provides scheduled backups, retention management, and a web interface for monitoring and manual operations.
Features
- Kubernetes-Native: Define backup targets using Custom Resource Definitions
- Automated Scheduling: Cron-based backup execution with configurable intervals
- SSH/Rsync: Reliable file transfer using SSH-based rsync
- Retention Management: Automatic cleanup with configurable retention policies
- Web Interface: Monitor backup status and trigger manual operations
- Prometheus Integration: Built-in metrics for monitoring and alerting
- Multi-Target: Support for backing up multiple servers and services
Architecture
The service consists of several key components:
- BackupExecutor: Orchestrates rsync-based backup operations
- BackupCleaner: Manages retention policies and cleanup
- K8sConnector: Interfaces with Kubernetes API for Target discovery
- HTTP Server: Provides REST API and web interface
- Cron Scheduler: Manages automated backup and cleanup jobs
Quick Start
- Deploy the service to your Kubernetes cluster
- Create SSH keys for accessing target servers
- Define backup targets using Target CRDs
- Monitor backups through the web interface
apiVersion: backup.benjamin-borbe.de/v1
kind: Target
metadata:
name: my-server
namespace: backup
spec:
host: server.example.com
port: 22
user: root
dirs:
- /home
- /etc
excludes:
- /tmp
- /var/cache
Installation
Prerequisites
- Kubernetes cluster (1.19+)
- SSH access to target servers
- Persistent storage for backup data
- Docker registry access (for custom builds)
Deploy to Kubernetes
- Create namespace and apply RBAC configuration:
kubectl apply -f k8s/ns.yaml
kubectl apply -f k8s/backup-sa.yaml
kubectl apply -f k8s/backup-clusterrole.yaml
kubectl apply -f k8s/backup-clusterrolebinding.yaml
- Create secrets for SSH keys and monitoring:
# Create SSH private key secret
kubectl create secret generic backup \
--from-file=id_backup=/path/to/your/ssh/private/key \
--from-literal=sentry-dsn="your-sentry-dsn" \
-n backup
- Deploy the service:
# Set environment variables
export NAMESPACE=backup
export BRANCH=master # or your preferred image tag
export LOGLEVEL=2
export RANDOM=$(date +%s)
# Apply deployment
envsubst < k8s/backup-deploy.yaml | kubectl apply -f -
kubectl apply -f k8s/backup-svc.yaml
- Optional: Configure ingress for web access:
kubectl apply -f k8s/backup-ing.yaml
Configuration
The service is configured through environment variables in the deployment:
Variable | Description | Default |
---|---|---|
LISTEN | HTTP server bind address | :9090 |
NAMESPACE | Kubernetes namespace to watch | backup |
CRON_EXPRESSION | Backup schedule cron expression | 0 0 * * * 0 (weekly) |
SSH_KEY | Path to SSH private key | /secret/id_backup |
BACKUP_ROOT_DIR | Root directory for storing backups | /backup |
BACKUP_CLEANUP_ENABLED | Enable automatic cleanup | true |
BACKUP_KEEP_AMOUNT | Number of backups to retain | 2 |
SENTRY_DSN | Sentry DSN for error reporting | Required |
Target Configuration
Basic Target Specification
Targets are defined using Kubernetes Custom Resources with the following structure:
apiVersion: backup.benjamin-borbe.de/v1
kind: Target
metadata:
name: target-name
namespace: backup
spec:
host: hostname-or-ip
port: 22
user: ssh-username
dirs:
- /path/to/backup
excludes:
- /path/to/exclude
Real-World Examples
System Server Backup
apiVersion: backup.benjamin-borbe.de/v1
kind: Target
metadata:
name: home-server
namespace: backup
spec:
host: server.home.local
port: 22
user: root
dirs:
- /
excludes:
- /backup
- /boot
- /dev
- /proc
- /run
- /sys
- /tmp
- /var/cache
- /var/lib/docker
- /var/log
- /home/*/.*cache
- /swap.img
- /swapfile
Kubernetes Node Backup
apiVersion: backup.benjamin-borbe.de/v1
kind: Target
metadata:
name: k3s-master
namespace: backup
spec:
host: k3s-master.cluster.local
port: 22
user: root
dirs:
- /
excludes:
- /boot
- /dev
- /proc
- /run
- /sys
- /tmp
- /var/cache
- /var/lib/kubelet
- /var/lib/rancher/k3s/agent/containerd
- /var/log
Selective Directory Backup
apiVersion: backup.benjamin-borbe.de/v1
kind: Target
metadata:
name: database-server
namespace: backup
spec:
host: db.example.com
port: 22
user: backup
dirs:
- /var/lib/postgresql
- /etc/postgresql
- /home/backups
excludes:
- "*.log"
- "*.tmp"
Common Exclusion Patterns
Pattern | Reason |
---|---|
/dev , /proc , /sys | Virtual filesystems |
/tmp , /var/tmp | Temporary files |
/var/cache , ~/.cache | Cached data (regeneratable) |
/var/lib/docker | Container runtime data |
/var/lib/kubelet | Kubernetes runtime data |
/var/log | Log files (often rotated) |
/boot | Boot partition (usually separate) |
*.log , *.tmp | Temporary and log files |
/swap* , *.swap | Swap files |
Usage
Web Interface
Access the web interface at http://your-service:9090/
to:
- View backup status and history
- Monitor target configurations
- Trigger manual backups
- Manage cleanup operations
API Endpoints
Endpoint | Method | Description |
---|---|---|
/healthz | GET | Health check |
/readiness | GET | Readiness check |
/metrics | GET | Prometheus metrics |
/status | GET | Overall backup status |
/list | GET | List all backup targets |
/backup/all | POST | Trigger backup for all targets |
/backup/{name} | POST | Trigger backup for specific target |
/cleanup/all | POST | Cleanup all targets |
/cleanup/{name} | POST | Cleanup specific target |
Manual Operations
Trigger backup for all targets:
curl -X POST http://backup-service:9090/backup/all
Trigger backup for specific target:
curl -X POST http://backup-service:9090/backup/my-server
Check backup status:
curl http://backup-service:9090/status
Managing Targets
Create a new backup target:
kubectl apply -f my-target.yaml
List all targets:
kubectl get targets -n backup
View target details:
kubectl describe target my-server -n backup
Delete a target:
kubectl delete target my-server -n backup
Monitoring
Prometheus Metrics
The service exposes metrics at /metrics
for Prometheus scraping:
- Backup execution counts and durations
- Success/failure rates
- Target discovery metrics
- HTTP endpoint performance
Logging
Logs are written to stdout with configurable verbosity levels:
-v=1
: Basic operational information-v=2
: Detailed execution logs-v=3
: Debug information-v=4
: Trace-level debugging
Backup Storage
Backups are stored in the following directory structure:
/backup/
├── hostname1.example.com/
│ ├── current/ # Latest backup (symlink)
│ ├── 2023-12-01/ # Dated backup directories
│ ├── 2023-12-02/
│ └── ...
└── hostname2.example.com/
├── current/
└── ...
Recovery Procedures
Restore from Backup
Single directory restore:
ssh-add ~/.ssh/backup_key
ssh -A target-server.com
sudo rsync -av --progress backup-server:/backup/$(hostname -f)/current/path/to/data/ /path/to/restore/
Kubernetes PVC restore:
ssh-add ~/.ssh/backup_key
ssh -A k8s-node.com
sudo -E -s
cd /var/lib/rancher/k3s/storage
DIR=pvc-uuid_namespace_claim-name
rsync -av --progress --rsync-path="sudo rsync" \
"backup-user@backup-server:/backup/$(hostname -f)/current/var/lib/rancher/k3s/storage/${DIR}" .
Bulk PVC restore:
for DIR in $(ls -1d *-app-name-*); do
rsync -av --progress --rsync-path="sudo rsync" \
"backup-user@backup-server:/backup/$(hostname -f)/current/var/lib/rancher/k3s/storage/${DIR}" .
done
Development
Building
# Run tests and linting
make precommit
# Build Docker image
BRANCH=develop make build
# Upload to registry
make upload
# Full build, upload, clean, and deploy cycle
make buca
Code Generation
# Generate Kubernetes client code
make generatek8s
# Install development dependencies
make deps
Frontend Development
cd frontend/
npm install
npm run dev # Development server
npm run build # Production build
npm run lint # Linting
Testing
# Run all tests
make test
# Run specific test package
go test ./pkg/...
# Run with coverage
go test -cover ./...
Troubleshooting
Common Issues
SSH Connection Failed:
- Verify SSH key has correct permissions (600)
- Ensure target server allows key-based authentication
- Check firewall rules and network connectivity
- Verify user has appropriate sudo privileges
Backup Failed:
- Check disk space on backup storage
- Verify target directories exist and are accessible
- Review exclusion patterns for conflicts
- Check rsync logs in service output
Target Not Found:
- Verify Target CRD is in correct namespace
- Check Kubernetes RBAC permissions
- Ensure service has access to watch Target resources
Performance Issues:
- Adjust backup schedules to avoid conflicts
- Consider excluding large, frequently changing directories
- Monitor network bandwidth usage
- Review backup retention settings
Debug Procedures
Check service logs:
kubectl logs -f deployment/backup -n backup
Verify Target discovery:
curl http://backup-service:9090/list
Test SSH connectivity:
kubectl exec -it deployment/backup -n backup -- \
ssh -i /secret/id_backup user@target-host
Manual rsync test:
kubectl exec -it deployment/backup -n backup -- \
rsync -av -e "ssh -i /secret/id_backup" \
user@target-host:/test/path/ /tmp/test/
Advanced Configuration
Custom Schedules
Modify the CRON_EXPRESSION
environment variable for different backup schedules:
0 2 * * *
: Daily at 2 AM0 2 * * 0
: Weekly on Sunday at 2 AM0 2 1 * *
: Monthly on 1st at 2 AM@every 6h
: Every 6 hours
Retention Policies
Adjust retention by modifying these environment variables:
BACKUP_KEEP_AMOUNT
: Number of backups to retainBACKUP_CLEANUP_ENABLED
: Enable/disable automatic cleanup
Security Considerations
- Use dedicated SSH keys with restricted permissions
- Implement network policies to limit service access
- Regular key rotation and access auditing
- Monitor backup access patterns
- Encrypt backup storage if required
Integration with Keel (Optional)
For automatic image updates, configure Keel:
helm repo add keel https://charts.keel.sh
helm repo update
helm upgrade --install keel --namespace=kube-system keel/keel
The deployment includes Keel annotations for automatic updates:
keel.sh/policy: force
keel.sh/trigger: poll
keel.sh/pollSchedule: "@every 1m"
Contributing
- Fork the repository
- Create a feature branch
- Follow the coding guidelines in
CLAUDE.md
- Run
make precommit
before committing - Submit a pull request
For detailed development guidelines, see the project's coding standards and architecture patterns documentation.
License
BSD-style license. See LICENSE
file for details.