github.com/zcentric/vector-operator
module
0.0.0-20250202031945-e5653349673c
Repository: https://github.com/zcentric/vector-operator.git

# README

Vector Operator

A Kubernetes operator that simplifies the deployment and management of Vector observability pipelines in your Kubernetes cluster. This operator enables declarative configuration of Vector agents and data pipelines, making it easier to collect, transform, and forward observability data.

Overview

The Vector Operator provides three custom resources:

  • Vector: Manages the deployment of Vector agents (DaemonSet) in your cluster
  • VectorAggregator: Manages the deployment of Vector aggregators (Deployment) in your cluster
  • VectorPipeline: Defines observability data pipelines with sources, transforms, and sinks
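
Once the operator and CRDs are installed (see Quick Start below), the three kinds should be visible in the cluster's API. The API group below is taken from the manifests later in this README:

kubectl api-resources --api-group=vector.zcentric.com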

Key features:

  • Declarative configuration of Vector instances
  • Support for both agent (per-node) and aggregator (centralized) deployment types
  • Pipeline management with support for multiple sources, transforms, and sinks
  • Kubernetes-native deployment and management
  • Automatic configuration updates and reconciliation

Pipeline Validation

The operator includes a robust validation system to ensure Vector configurations are valid before deployment. See Pipeline Validation for details on:

  • How validation works
  • Checking validation status
  • Handling validation failures
  • Best practices
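
As a rough sketch of what checking status looks like, a pipeline's conditions can be read straight off the custom resource with kubectl; the exact fields the operator writes under .status are documented in Pipeline Validation, not in this README:

# Show the full resource, including the status block the operator maintains
kubectl get vectorpipeline kubernetes-logs -o yaml

# Or print only the status subtree
kubectl get vectorpipeline kubernetes-logs -o jsonpath='{.status}'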

Common Deployment Patterns

  1. Log Collection and Forwarding:

    • Deploy Vector agents (DaemonSet) to collect logs from all nodes
    • Deploy VectorAggregator instances (Deployment) to receive and process logs centrally
    • Configure agents to forward to the aggregators (a sketch of this wiring follows the list below)
  2. High Availability Aggregation:

    • Deploy multiple VectorAggregator replicas for redundancy
    • Use load balancing for even distribution of log processing
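
A minimal sketch of pattern 1, assuming the aggregator is exposed through a Service named vector-aggregator on port 6000 and that a VectorPipeline's vectorRef can name either kind of instance (neither assumption is spelled out in this README); Vector's native vector source/sink carry the traffic between the two tiers:

# Agent-side pipeline: ship collected logs to the aggregator Service
apiVersion: vector.zcentric.com/v1alpha1
kind: VectorPipeline
metadata:
  name: forward-to-aggregator
spec:
  vectorRef: vector-agent
  sources:
    k8s-logs:
      type: "kubernetes_logs"
  sinks:
    to-aggregator:
      type: "vector"
      inputs: ["k8s-logs"]
      address: "vector-aggregator.vector.svc.cluster.local:6000" # assumed Service name and port
---
# Aggregator-side pipeline: receive from agents and process centrally
apiVersion: vector.zcentric.com/v1alpha1
kind: VectorPipeline
metadata:
  name: aggregate-logs
spec:
  vectorRef: vector-aggregator
  sources:
    from-agents:
      type: "vector"
      address: "0.0.0.0:6000" # listen port for the agents' vector sink
  sinks:
    console:
      type: "console"
      inputs: ["from-agents"]
      encoding:
        codec: "json"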

Quick Start

Prerequisites

  • Kubernetes cluster v1.11.3+
  • kubectl v1.11.3+
  • go v1.21+ (for development)
  • docker v17.03+ (for development)

Installation

  1. Install the operator and CRDs:
kubectl apply -f https://raw.githubusercontent.com/zcentric/vector-operator/main/dist/install.yaml
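
Before creating any resources, it is worth confirming the install succeeded; the grep patterns below are only rough filters, since the operator's namespace and pod names are not stated in this README:

kubectl get crds | grep vector.zcentric.com
kubectl get pods --all-namespaces | grep vector-operator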

  2. Create Vector instances:

For an agent (runs on every node):

apiVersion: vector.zcentric.com/v1alpha1
kind: Vector
metadata:
  name: vector-agent
  namespace: vector
spec:
  image: "timberio/vector:0.38.0-distroless-libc"

For an aggregator (centralized processing):

apiVersion: vector.zcentric.com/v1alpha1
kind: VectorAggregator
metadata:
  name: vector-aggregator
  namespace: vector
spec:
  image: "timberio/vector:0.38.0-distroless-libc"
  replicas: 2 # optional, defaults to 1

  3. Define a pipeline:

apiVersion: vector.zcentric.com/v1alpha1
kind: VectorPipeline
metadata:
  name: kubernetes-logs
spec:
  vectorRef: vector-agent
  sources:
    k8s-logs:
      type: "kubernetes_logs"
  transforms:
    remap:
      type: "remap"
      inputs: ["k8s-logs"]
      source: |
        .timestamp = del(.timestamp)
        .environment = "production"
  sinks:
    console:
      type: "console"
      inputs: ["remap"]
      encoding:
        codec: "json"

Usage Examples

Vector Deployment Types

Agent Configuration (DaemonSet)

Use the Vector CRD when you need to collect logs and metrics from every node in your cluster:

apiVersion: vector.zcentric.com/v1alpha1
kind: Vector
metadata:
  name: vector-agent
spec:
  image: "timberio/vector:0.38.0-distroless-libc"
  api:
    enabled: true
    address: "0.0.0.0:8686"
  data_dir: "/vector-data"
  expire_metrics_secs: 30
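
Because the API is enabled on 0.0.0.0:8686, it can be reached from a workstation via a port-forward; the pod name is found at runtime, and the /health endpoint is part of Vector's own API rather than something this operator adds:

kubectl get pods                                    # find a vector-agent pod
kubectl port-forward pod/<vector-agent-pod> 8686:8686
curl http://localhost:8686/health                   # Vector's API health check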

Aggregator Configuration (Deployment)

Use the VectorAggregator CRD when you need centralized log processing and aggregation:

apiVersion: vector.zcentric.com/v1alpha1
kind: VectorAggregator
metadata:
  name: vector-aggregator
spec:
  image: "timberio/vector:0.38.0-distroless-libc"
  replicas: 2
  api:
    enabled: true
    address: "0.0.0.0:8686"
  data_dir: "/vector-data"
  expire_metrics_secs: 30
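
Since the replica count is part of the spec, one way to resize an existing aggregator is a generic merge patch against the custom resource (whether the CRD also exposes a scale subresource for kubectl scale is not stated in this README):

kubectl patch vectoraggregator vector-aggregator --type merge -p '{"spec":{"replicas":3}}'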

Pipeline with Multiple Sources and Transforms

apiVersion: vector.zcentric.com/v1alpha1
kind: VectorPipeline
metadata:
  name: multi-source-pipeline
spec:
  vectorRef: vector-agent
  sources:
    app-logs:
      type: "kubernetes_logs"
      extra_label_selector: "app=myapp"
    system-logs:
      type: "kubernetes_logs"
      extra_label_selector: "component=system"
  transforms:
    filter-errors:
      type: "filter"
      inputs: ["app-logs"]
      condition:
        type: "vrl"
        source: ".level == 'error'"
    add-metadata:
      type: "remap"
      inputs: ["system-logs"]
      source: |
        .metadata.cluster = "production"
  sinks:
    elasticsearch:
      type: "elasticsearch"
      inputs: ["filter-errors", "add-metadata"]

Contributing

Contributions are welcome! Here's how you can help:

  1. Fork the repository

  2. Create a feature branch:

    git checkout -b feature/my-new-feature
    
  3. Set up your development environment:

    # Install dependencies
    go mod download
    
    # Install CRDs
    make install
    
    # Run the operator locally
    make run
    
  4. Make your changes and add tests

  5. Run tests:

    make test
    
  6. Submit a pull request

Development Guidelines

  • Follow Go best practices and conventions
  • Add unit tests for new features
  • Update documentation as needed
  • Use meaningful commit messages
  • Run make lint before submitting PRs
  • All PRs are automatically tested using GitHub Actions

License

Copyright 2024.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
