Module: github.com/AccessibleAI/cnvrg-operator
Version: 4.1.17-eli-test-5+incompatible
Repository: https://github.com/accessibleai/cnvrg-operator.git
Documentation: pkg.go.dev

# README

cnvrg.io operator (v3)


Deploy cnvrg stack on EKS | AKS | GKE | OpenShift | On-Premise clusters

Architecture overview

The cnvrg operator can deploy the cnvrg stack in two different ways:

  1. Multiple cnvrg control planes within the same cluster, separated by namespaces - suitable for multi-tenant deployments
                            ---------cnvrg infra namespace----------
                            | Cluster scope prometheus             |
                            | Prometheus node exporter             |
                            | Kube state metrics                   |
                            | Cluster scope service monitors       |     
                            | Fluentbit                            |
                            | Istio control plane                  |
                            | Storage provisioners (hostpath/nfs)  |
                            ----------------------------------------           
---------cnvrg control plane 1 namespace-------  ---------cnvrg control plane 2 namespace-------
| cnvrg control plane (webapp, sidekiqs, etc.)|  | cnvrg control plane (webapp, sidekiqs, etc.)|
| PostgreSQL                                  |  | PostgreSQL                                  |
| ElasticSearch + Kibana                      |  | ElasticSearch + Kibana                      |
| Minio                                       |  | Minio                                       |
| Redis                                       |  | Redis                                       |
| Namespace scope Prometheus + Grafana        |  | Namespace scope Prometheus + Grafana        |
| Namespace scope service monitors            |  | Namespace scope service monitors            |
| Istio Gateway + VirtualServices             |  | Istio Gateway + VirtualServices             |
-----------------------------------------------  -----------------------------------------------
                    
  2. Single cnvrg control plane in a dedicated namespace
                        ----------------cnvrg namespace--------------------
                        | Cluster scope prometheus                        |
                        | Prometheus node exporter                        |
                        | Kube state metrics                              |
                        | Cluster scope service monitors                  |     
                        | Namespace scope service monitors                |     
                        | Fluentbit                                       |
                        | Istio control plane                             |
                        | Storage provisioners (hostpath/nfs)             |   
                        | cnvrg control plane (webapp, sidekiqs, etc.)    |
                        | PostgreSQL                                      |
                        | ElasticSearch + Kibana                          | 
                        | Minio                                           |
                        | Redis                                           |  
                        | IstioGateway + VirtualServices                  |
                        ---------------------------------------------------           

Configuration

Helm chart command line options

  1. Globals
  2. Cnvrg Control Plane options
  3. DataBases options
  4. Logging options
  5. Monitoring options
  6. Networking options
  7. SSO options
  8. LDAP (Active Directory)
  9. Storage options
  10. Tenancy options
  11. Registry options
  12. Labels and Annotations
  13. Automatic Config Reloader

Globals

| Flag | Default value | Description |
|---|---|---|
| clusterDomain | - | DNS A wildcard record resolving to the K8s Ingress IP/LoadBalancer, example: *.cnvrg.my-org.com -> 1.2.3.4 |
| spec | allinone | one of: allinone (single-namespace deployment), or infra and ccp (multi-namespace cnvrg deployments) |
| imageHub | docker.io/cnvrg | the images registry |
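
As a rough illustration of how the flag names in these tables map onto Helm's `--set` syntax, the Python sketch below flattens a nested values dictionary into `--set` arguments. The helper name `to_set_flags` and the sample domain value are assumptions made for illustration; they are not part of the chart.

```python
def to_set_flags(values, prefix=""):
    """Flatten a nested dict into a list of Helm-style --set arguments."""
    flags = []
    for key, value in values.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse so {"a": {"b": 1}} becomes a.b=1
            flags.extend(to_set_flags(value, path))
        else:
            # Render booleans the way the tables write them (true/false).
            if isinstance(value, bool):
                value = str(value).lower()
            flags.extend(["--set", f"{path}={value}"])
    return flags


# Hypothetical values for the infra half of a multi-namespace deployment.
infra_values = {"clusterDomain": "cnvrg.my-org.com", "spec": "infra"}
print(" ".join(to_set_flags(infra_values)))
# prints: --set clusterDomain=cnvrg.my-org.com --set spec=infra
```

The resulting list can be appended to whatever Helm command is used to install the chart.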

Control Plane options

| Flag | Default value | Description |
|---|---|---|
| controlPlane.image | cnvrg/core:3.6.99 | cnvrg control plane image |
| controlPlane.baseConfig.agentCustomTag | latest | cnvrg agent image tag |
| controlPlane.baseConfig.featureFlags | {} | map of strings, usage example: --set controlPlane.baseConfig.featureFlags.FOO="BAR" |
| controlPlane.baseConfig.intercom | true | set to false to disable intercom |
| controlPlane.hyper.enabled | true | set to false to disable hyper |
| controlPlane.objectStorage.type | minio | supported values: minio, aws, azure, gcp |
| controlPlane.objectStorage.bucket | cnvrg-storage | S3 bucket name |
| controlPlane.objectStorage.region | eastus | bucket region |
| controlPlane.objectStorage.accessKey | - | bucket access key (if blank, auto-generated) |
| controlPlane.objectStorage.secretKey | - | bucket secret key (if blank, auto-generated) |
| controlPlane.objectStorage.endpoint | - | bucket endpoint (if blank, auto-generated) |
| controlPlane.objectStorage.azureAccountName | - | azure storage account name |
| controlPlane.objectStorage.azureContainer | - | azure storage container name |
| controlPlane.objectStorage.gcpProject | - | gcp project |
| controlPlane.objectStorage.gcpSecretRef | gcp-storage-secret | gcp storage secret |
| controlPlane.searchkiq.enabled | true | set to false to disable searchkiq |
| controlPlane.sidekiq.enabled | true | set to false to disable sidekiq |
| controlPlane.sidekiq.split | true | set to false to disable sidekiq split |
| controlPlane.systemkiq.enabled | true | set to false to disable systemkiq |
| controlPlane.webapp.hap.enabled | true | set to false to disable HPA |
| controlPlane.webapp.hap.maxReplicas | 5 | set max replicas for HPA |
| controlPlane.sidekiq.hap.enabled | true | set to false to disable HPA |
| controlPlane.sidekiq.hap.maxReplicas | 5 | set max replicas for HPA |
| controlPlane.searchkiq.hap.enabled | true | set to false to disable HPA |
| controlPlane.searchkiq.hap.maxReplicas | 5 | set max replicas for HPA |
| controlPlane.systemkiq.hap.enabled | true | set to false to disable HPA |
| controlPlane.systemkiq.hap.maxReplicas | 5 | set max replicas for HPA |
| controlPlane.smtp.server | - | smtp server |
| controlPlane.smtp.port | 587 | smtp port |
| controlPlane.smtp.username | - | smtp username |
| controlPlane.smtp.password | - | smtp password |
| controlPlane.smtp.domain | - | smtp domain |
| controlPlane.smtp.opensslVerifyMode | - | openssl verify mode for cnvrg smtp client |
| controlPlane.smtp.sender | [email protected] | the email address of the sender |
| controlPlane.webapp.enabled | true | set to false to disable webapp |
| controlPlane.webapp.replicas | 1 | number of webapp replicas |
| controlPlane.mpi.enabled | true | set to false to disable mpi |
| controlPlane.mpi.image | mpioperator/mpi-operator:v0.2.3 | mpi operator image |
| controlPlane.mpi.kubectlDeliveryImage | mpioperator/kubectl-delivery:v0.2.3 | mpi kubectl delivery image |
| controlPlane.mpi.registry.url | docker.io | mpi registry url |
| controlPlane.mpi.registry.user | - | mpi registry user |
| controlPlane.mpi.registry.password | - | mpi registry password |
| controlPlane.mpi.extraArgs | {} | map of strings, usage example: --set controlPlane.mpi.extraArgs.FOO="BAR" |
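
The object storage flags depend on the chosen `controlPlane.objectStorage.type`. The Python sketch below shows one way to check a values dictionary before installing. Which fields each type needs is an assumption inferred from the flag names (the `azure*` and `gcp*` prefixes), not taken from the operator's source, and the function name `missing_storage_fields` and the sample account name are hypothetical.

```python
SUPPORTED_TYPES = ("minio", "aws", "azure", "gcp")

# Assumption: type-specific fields inferred from the flag name prefixes.
# gcpSecretRef is left out because the table lists a default for it.
TYPE_SPECIFIC_FIELDS = {
    "azure": ["azureAccountName", "azureContainer"],
    "gcp": ["gcpProject"],
}


def missing_storage_fields(object_storage):
    """Return the type-specific fields that are unset for the chosen type."""
    storage_type = object_storage.get("type", "minio")  # minio is the default
    if storage_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported objectStorage.type: {storage_type}")
    needed = TYPE_SPECIFIC_FIELDS.get(storage_type, [])
    return [name for name in needed if not object_storage.get(name)]


print(missing_storage_fields({"type": "azure", "azureAccountName": "acct"}))
# prints: ['azureContainer']
```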

DataBases options

| Flag | Default value | Description |
|---|---|---|
| dbs.es.enabled | true | set to false to disable elasticsearch |
| dbs.es.storageSize | 80Gi | storage size for elasticsearch |
| dbs.es.storageClass | - | storage class, if blank the default storage class will be used |
| dbs.minio.enabled | true | set to false to disable minio |
| dbs.minio.storageSize | 100Gi | storage size for minio |
| dbs.minio.storageClass | - | storage class, if blank the default storage class will be used |
| dbs.pg.enabled | true | set to false to disable postgresql |
| dbs.pg.storageSize | 80Gi | storage size for postgresql |
| dbs.pg.storageClass | - | storage class, if blank the default storage class will be used |
| dbs.pg.hugePages.enabled | false | set to true to enable HugePages support for postgresql |
| dbs.pg.hugePages.size | 2Mi | size of HugePages (1Mi, 2Mi, 1Gi) |
| dbs.pg.hugePages.memory | - | amount of memory to use from the HugePages, default 4Gi |
| dbs.redis.enabled | true | set to false to disable redis |
| dbs.redis.storageSize | 10Gi | storage size for redis |
| dbs.redis.storageClass | - | storage class, if blank the default storage class will be used |
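
For capacity planning, the default `storageSize` values above can be summed. The short Python sketch below does that arithmetic; it assumes that a disabled database requests no storage, which this README does not state explicitly, and the function name is hypothetical.

```python
# Default storageSize values (in Gi) copied from the table above.
DEFAULT_STORAGE_GI = {"es": 80, "minio": 100, "pg": 80, "redis": 10}


def total_storage_gi(sizes, disabled=()):
    """Sum storage sizes, skipping databases that are disabled."""
    return sum(size for name, size in sizes.items() if name not in disabled)


print(total_storage_gi(DEFAULT_STORAGE_GI))                      # prints: 270
print(total_storage_gi(DEFAULT_STORAGE_GI, disabled={"minio"}))  # prints: 170
```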

Logging options

| Flag | Default value | Description |
|---|---|---|
| logging.fluentbit.enabled | true | set to false to disable fluentbit |
| logging.elastalert.enabled | true | set to false to disable elastalert |
| logging.elastalert.storageSize | 30Gi | storage size for elastalert |
| logging.elastalert.storageClass | - | storage class, if blank default storage class will be used |
| logging.kibana.enabled | true | set to false to disable kibana |

Monitoring options

| Flag | Default value | Description |
|---|---|---|
| monitoring.dcgmExporter.enabled | true | set to false to disable dcgmExporter |
| monitoring.nodeExporter.enabled | true | set to false to disable nodeExporter |
| monitoring.kubeStateMetrics.enabled | true | set to false to disable kubeStateMetrics |
| monitoring.grafana.enabled | true | set to false to disable grafana |
| monitoring.prometheusOperator.enabled | true | set to false to disable prometheusOperator |
| monitoring.prometheus.enabled | true | set to false to disable prometheus |
| monitoring.prometheus.storageSize | 50Gi | storage size for the Prometheus instance |
| monitoring.prometheus.storageClass | - | storage class, if blank default storage class will be used |
| monitoring.defaultServiceMonitors.enabled | true | set to false to disable defaultServiceMonitors |
| monitoring.cnvrgIdleMetricsExporter.enabled | true | set to false to disable cnvrgIdleMetricsExporter |

Networking options

| Flag | Default value | Description |
|---|---|---|
| networking.https.enabled | false | set to true to enable https |
| networking.https.certSecret | - | K8s tls secret |
| networking.ingress.type | istio | ingress type: istio, ingress, openshift, or nodeport |
| networking.ingress.istioGwEnabled | true | whether to deploy the Istio GW |
| networking.ingress.istioGwName | istio-gw-[namespace] | name of the Istio GW (either an existing one to use, or the one to create and use if istioGwEnabled is true) |
| networking.istio.enabled | true | set to false to disable istio deployment |
| networking.istio.externalIp | [] | list of IPs to use for the istio ingress service, example: --set networking.istio.externalIp={10.0.0.22,10.0.0.33} |
| networking.istio.ingressSvcExtraPorts | [] | list of extra ports for the istio ingress service, example: --set networking.istio.ingressSvcExtraPorts={1111,2222} |
| networking.istio.lbSourceRanges | [] | list of extra LB source ranges, example: --set networking.istio.lbSourceRanges={1.1.1.1/32,2.2.2.2/30} |
| networking.istio.ingressSvcAnnotations | {} | map of strings for Istio SVC annotations, example: --set networking.istio.ingressSvcAnnotations.service\.beta\.kubernetes\.io\/aws-load-balancer-backend-protocol=tcp |
| networking.proxy.enabled | false | set to true when your K8s cluster is behind an HTTP/S proxy |
| networking.proxy.httpProxy | [] | list of http proxies to use, example: --set networking.proxy.httpProxy={http://172.17.0.5:3128} |
| networking.proxy.httpsProxy | [] | list of https proxies to use, example: --set networking.proxy.httpsProxy={http://172.17.0.5:3128} |
| networking.proxy.noProxy | .svc,.svc.cluster.local,[k8s-api-ip-calculated-automatically],127.0.0.1,kubernetes.default.svc,kubernetes.default.svc.cluster.local,localhost | list of extra no_proxy values to use (always appended to the default list), example: --set networking.proxy.noProxy={my.extra.domain.com} |
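
Several networking flags take lists or maps whose keys contain dots. The Python sketch below shows how such values could be rendered for `--set`: lists use Helm's `{a,b}` form, and dots inside a map key are backslash-escaped so Helm does not read them as path separators. The helper names `helm_list` and `escape_key` are hypothetical.

```python
def helm_list(items):
    """Render a Python sequence in Helm's {a,b,c} list syntax."""
    return "{" + ",".join(str(item) for item in items) + "}"


def escape_key(key):
    """Escape dots so Helm treats the key as one map entry, not a path."""
    return key.replace(".", "\\.")


print("--set networking.istio.externalIp=" + helm_list(["10.0.0.22", "10.0.0.33"]))

annotation = "service.beta.kubernetes.io/aws-load-balancer-backend-protocol"
print("--set networking.istio.ingressSvcAnnotations." + escape_key(annotation) + "=tcp")
```

When such arguments are typed directly into a shell, braces and backslashes usually need quoting so the shell passes them to Helm unchanged.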

SSO options

| Flag | Default value | Description |
|---|---|---|
| sso.enabled | false | set to true to enable sso |
| sso.adminUser | - | cnvrg cluster admin user |
| sso.provider | - | one of the following |
| sso.emailDomain | [] | list of emails allowed to login |
| sso.clientId | - | oauth2 client ID |
| sso.clientSecret | - | oauth2 client secret |
| sso.azureTenant | - | if sso.provider=azure set azureTenant |
| sso.oidcIssuerUrl | - | if sso.provider=oidc set oidcIssuerUrl |
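
The last two rows describe provider-specific requirements. A minimal Python sketch of that check follows; it encodes only the two rules stated in the table, and the function name `sso_problems` is hypothetical.

```python
# Only the two rules stated in the table: azure -> azureTenant, oidc -> oidcIssuerUrl.
PROVIDER_REQUIRED = {"azure": "azureTenant", "oidc": "oidcIssuerUrl"}


def sso_problems(sso):
    """Return a list of human-readable problems with an sso values dict."""
    if not sso.get("enabled", False):
        return []  # sso.enabled defaults to false, so there is nothing to check
    provider = sso.get("provider")
    required = PROVIDER_REQUIRED.get(provider)
    if required and not sso.get(required):
        return [f"sso.provider={provider} requires sso.{required}"]
    return []


print(sso_problems({"enabled": True, "provider": "azure"}))
# prints: ['sso.provider=azure requires sso.azureTenant']
```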

LDAP (Active Directory)

| Flag | Default value | Description |
|---|---|---|
| ldap.enabled | false | set to true to enable ldap |
| ldap.host | - | LDAP host |
| ldap.port | - | LDAP port |
| ldap.account | - | userPrincipalName |
| ldap.base | - | for example: dc=my-domain,dc=local |
| ldap.adminUser | - | admin user |
| ldap.adminPassword | - | admin password |
| ldap.ssl | "false" | ("true" or "false") |

Storage options

| Flag | Default value | Description |
|---|---|---|
| storage.hostpath.enabled | False | set to true to enable hostpath provisioner |
| storage.hostpath.defaultSc | False | set to true to make hostpath the default storage class (name: cnvrg-hostpath-storage) |
| storage.hostpath.path | /cnvrg-hostpath-storage | host directory for storage |
| storage.hostpath.image | quay.io/kubevirt/hostpath-provisioner | hostpath provisioner image |
| storage.hostpath.reclaimPolicy | Retain | storage class reclaim policy |
| storage.nfs.enabled | False | set to true to enable NFS client provisioner |
| storage.nfs.server | - | IP address of the NFS server |
| storage.nfs.path | - | NFS export path |
| storage.nfs.defaultSc | False | set to true to make NFS the default storage class (name: cnvrg-nfs-storage) |
| storage.nfs.image | gcr.io/k8s-staging-sig-storage/nfs-subdir-external-provisioner:v4.0.0 | NFS provisioner image |
| storage.nfs.reclaimPolicy | Retain | storage class reclaim policy |

Tenancy options

| Flag | Default value | Description |
|---|---|---|
| tenancy.enabled | False | when true, ccp workloads will be scheduled only on nodes that match node selector: purpose=cnvrg-control-plane |
| tenancy.key | purpose | node selector key |
| tenancy.value | cnvrg-control-plane | node selector value |
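
The node selector rule amounts to an exact label match. The Python sketch below illustrates which nodes would qualify under the default `tenancy.key` and `tenancy.value`; the node names and labels are made up for the example, and the function models the matching rule only, not the Kubernetes scheduler.

```python
def matches_tenancy(node_labels, key="purpose", value="cnvrg-control-plane"):
    """True if the node carries the label that the tenancy selector requires."""
    return node_labels.get(key) == value


# Hypothetical nodes and labels.
nodes = {
    "node-a": {"purpose": "cnvrg-control-plane"},
    "node-b": {"purpose": "gpu"},
    "node-c": {},
}
eligible = [name for name, labels in nodes.items() if matches_tenancy(labels)]
print(eligible)  # prints: ['node-a']
```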

Registry options

| Flag | Default value | Description |
|---|---|---|
| registry.url | docker.io | registry for pulling images |
| registry.user | - | registry user |
| registry.password | - | registry password |

Labels and Annotations

| Flag | Default value | Description |
|---|---|---|
| labels | owner: cnvrg-control-plane | key:value map of labels to be passed to every K8s resource deployed by the operator, usage example: --set labels.foo="bar" |
| annotations | - | key:value map of annotations to be passed to every K8s resource deployed by the operator, usage example: --set annotations.foo="bar" |

Automatic Config Reloader

| Flag | Default value | Description |
|---|---|---|
| configReloader.enabled | true | set to false to disable the config reloader; note that once it is disabled, the cnvrg admin has to manually restart the relevant pods on configuration changes |