
Components

For each component the chart deploys, this page describes what it does, where its configuration lives in values.yaml, and where to read more.

Prometheus Operator

What it is: A Kubernetes controller that watches Prometheus-related CRDs and reconciles them into real workloads.

Why it matters: It is the only component you really administer. Once it is running and healthy, you express everything else (Prometheus servers, Alertmanagers, scrape jobs, alerts) as Kubernetes resources.

Deployed as: A Deployment in the chart's namespace (default monitoring).

Values root: prometheusOperator.*

Common keys:

  • prometheusOperator.enabled
  • prometheusOperator.admissionWebhooks.enabled — validating webhook for PrometheusRule
  • prometheusOperator.tls.enabled
  • prometheusOperator.resources
  • prometheusOperator.image.tag (pin if needed)
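
As a rough illustration of how these keys fit together, a minimal values.yaml sketch might look like the following. The resource figures are placeholder assumptions, not recommendations from the chart; size them for your cluster.

```yaml
prometheusOperator:
  enabled: true
  admissionWebhooks:
    enabled: true        # validate PrometheusRule objects on admission
  tls:
    enabled: true
  resources:             # placeholder figures, chosen only for illustration
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      memory: 256Mi
```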

Reads: all CRDs from the operator group (monitoring.coreos.com).

Reference: Prometheus Operator docs.

Prometheus

What it is: The time-series database and scrape engine.

Deployed as: A StatefulSet materialized by the Operator from the Prometheus Custom Resource that the chart creates.

Values root: prometheus.*, with the most-edited subtree being prometheus.prometheusSpec.*.

Common keys:

  • prometheus.prometheusSpec.replicas
  • prometheus.prometheusSpec.retention, retentionSize
  • prometheus.prometheusSpec.storageSpec (PVC template)
  • prometheus.prometheusSpec.resources
  • prometheus.prometheusSpec.serviceMonitorSelector — controls which ServiceMonitors this Prometheus picks up
  • prometheus.prometheusSpec.ruleSelector — same for PrometheusRule
  • prometheus.prometheusSpec.externalLabels — labels added to all metrics leaving this Prometheus (important for federation / Thanos)
  • prometheus.ingress.enabled
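
A sketch of how the most-edited keys might be combined is shown below. All sizes, the retention period, and the `cluster` label value are illustrative assumptions; the `volumeClaimTemplate` layout follows the standard PersistentVolumeClaim spec.

```yaml
prometheus:
  prometheusSpec:
    replicas: 2
    retention: 15d               # time-based retention (assumed value)
    retentionSize: 45GB          # size-based retention, kept below the PVC size
    storageSpec:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 50Gi      # assumed size
    resources:
      requests:
        cpu: 500m
        memory: 2Gi
    externalLabels:
      cluster: example-cluster   # hypothetical label identifying this Prometheus
```

Keeping `retentionSize` somewhat below the requested storage leaves headroom for the write-ahead log and compaction.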

Selectors default to matching everything in the release

The chart sets selectors so that any ServiceMonitor / PrometheusRule with the chart's release labels is picked up. If you want to scrape resources from other releases or namespaces, you need to widen the selectors, e.g.:

prometheus:
  prometheusSpec:
    serviceMonitorSelectorNilUsesHelmValues: false
    serviceMonitorSelector: {}
    serviceMonitorNamespaceSelector: {}
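
If you would rather keep the default selectors, the other option is to label your own resources so they match. The sketch below assumes the default selector matches a `release` label equal to the Helm release name; inspect the rendered Prometheus CR to confirm what your installation actually selects. The names `example-app`, `example-namespace`, `my-release`, and the port name `metrics` are hypothetical.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: example-namespace
  labels:
    release: my-release          # assumption: the Helm release name of this chart
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: example-app   # labels on the target Service
  endpoints:
    - port: metrics              # named port on the Service
      interval: 30s
```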

Alertmanager

What it is: Receives alerts from Prometheus, deduplicates and groups them, and sends notifications through receivers (Slack, email, PagerDuty, webhook, ...).

Deployed as: A StatefulSet materialized by the Operator from the Alertmanager CR.

Values root: alertmanager.*

Common keys:

  • alertmanager.enabled
  • alertmanager.config.route — routing tree
  • alertmanager.config.receivers — notification destinations
  • alertmanager.alertmanagerSpec.replicas
  • alertmanager.alertmanagerSpec.storage
  • alertmanager.ingress.enabled
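
A minimal sketch of a routing tree with one receiver follows. The receiver name and webhook URL are hypothetical. The `"null"` receiver is kept on the assumption that the chart's default routes may still reference it: Helm merges maps but replaces lists, so dropping it can produce an invalid configuration.

```yaml
alertmanager:
  config:
    route:
      receiver: default-webhook
      group_by: ["alertname", "namespace"]
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 4h
    receivers:
      - name: "null"             # assumed to be referenced by the chart's default routes
      - name: default-webhook
        webhook_configs:
          - url: http://alert-receiver.example.internal/hook   # hypothetical endpoint
```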

You can also create namespace-scoped AlertmanagerConfig CRs to allow teams to manage their own routing without editing chart values.
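
As a sketch, a team-owned AlertmanagerConfig could look like this. The namespace, names, and URL are hypothetical, and the Alertmanager CR must be configured to select the object (for example through its `alertmanagerConfigSelector`) before it takes effect.

```yaml
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: team-a-routing
  namespace: team-a
spec:
  route:
    receiver: team-a-webhook
    groupBy: ["alertname"]
  receivers:
    - name: team-a-webhook
      webhookConfigs:
        - url: http://alert-receiver.team-a.svc/hook   # hypothetical endpoint
```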

Reference: Alertmanager docs.

Grafana

What it is: Dashboards UI. The chart ships dashboards for kubelet, node, API server, etcd, Pods, namespaces, persistent volumes, networking, and more.

Deployed as: Subchart (grafana/grafana). The chart's own templates/grafana/ only ships dashboard ConfigMaps; the runtime is the subchart.

Values root: grafana.*

Common keys:

  • grafana.enabled
  • grafana.adminPassword (default prom-operator; change this)
  • grafana.persistence.enabled, grafana.persistence.size
  • grafana.ingress.enabled, grafana.ingress.hosts
  • grafana.defaultDashboardsEnabled — toggles all default dashboards
  • grafana.sidecar.dashboards.* — how the dashboard sidecar discovers ConfigMaps
  • grafana.additionalDataSources

Change the Grafana admin password

The default password is well-known. Either set grafana.adminPassword to a strong value, or set grafana.admin.existingSecret and reference a Kubernetes Secret that you manage out of band.
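
The Secret-based option can be sketched as below. The Secret name is hypothetical, and the Secret itself is assumed to exist already in the Grafana namespace with the two keys shown.

```yaml
grafana:
  admin:
    existingSecret: grafana-admin-credentials   # hypothetical Secret name
    userKey: admin-user                         # key holding the username
    passwordKey: admin-password                 # key holding the password
```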

kube-state-metrics

What it is: Exposes metrics about Kubernetes objects themselves (kube_pod_*, kube_deployment_*, kube_node_*, kube_persistentvolume_*, ...). It does not export Pod CPU/memory usage: container-level usage comes from the kubelet (cAdvisor), and host-level usage comes from node-exporter.

Deployed as: Subchart (prometheus-community/kube-state-metrics).

Values root: kubeStateMetrics.* (toggle), kube-state-metrics.* (pass-through to subchart).
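
The split between the two roots can be sketched as follows: the camelCase key toggles the component in this chart, while the hyphenated key is handed to the subchart unchanged. The resource figures are placeholder assumptions.

```yaml
kubeStateMetrics:
  enabled: true          # toggle, read by this chart

kube-state-metrics:      # pass-through, read by the subchart
  resources:
    requests:
      cpu: 10m
      memory: 32Mi
```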

node-exporter

What it is: Exposes Linux host metrics from each node: CPU, memory, disk, network, file system, hardware, ...

Deployed as: Subchart (prometheus-community/prometheus-node-exporter), runs as a DaemonSet.

Values root: nodeExporter.* (toggle), prometheus-node-exporter.* (pass-through).

Windows exporter (optional)

What it is: node-exporter equivalent for Windows nodes.

Toggle: windowsMonitoring.enabled — off by default.

Values root: prometheus-windows-exporter.*.

Control-plane ServiceMonitors

The chart provides ServiceMonitors for:

  • kubeApiServer
  • kubelet (and through it, cAdvisor metrics)
  • kubeControllerManager
  • kubeScheduler
  • kubeProxy
  • kubeEtcd
  • coreDns / kubeDns

Each of these has a top-level toggle in values.yaml. On managed clusters (EKS, GKE, AKS) some control-plane components are not reachable by users; set the corresponding *.enabled to false to avoid permanently DOWN targets.
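
For example, a managed-cluster override might look like the sketch below. Which components are actually unreachable depends on the provider and cluster version, so treat this list as an assumption and check the Prometheus targets page before disabling anything.

```yaml
kubeControllerManager:
  enabled: false
kubeScheduler:
  enabled: false
kubeEtcd:
  enabled: false
```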

Default PrometheusRules

The chart ships a curated bundle of recording and alerting rules covering:

  • General cluster health (KubeAPIDown, KubeNodeNotReady, etc.)
  • Resource saturation (CPU, memory, disk)
  • Workload state (KubePodCrashLooping, KubeDeploymentReplicasMismatch)
  • Storage (KubePersistentVolumeFillingUp)
  • etcd, scheduler, controller-manager rules
  • Prometheus and Alertmanager self-monitoring rules
  • node-exporter rules

Toggle root: defaultRules.* — toggle the entire bundle, or any individual rule group:

defaultRules:
  create: true
  rules:
    etcd: false           # if you do not have access to etcd metrics
    kubeProxy: false      # if you use a CNI that does not run kube-proxy
    kubeScheduler: false  # often disabled on managed clusters

Thanos Ruler (optional)

What it is: Evaluates Prometheus rules against a Thanos query layer. Lets you keep rule evaluation centralized when you have multiple Prometheus servers.

Toggle: thanosRuler.enabled — off by default. The chart only deploys the ThanosRuler CR; the rest of Thanos (sidecar, store gateway, querier, compactor) is out of scope.
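
A minimal sketch of enabling it is shown below. It assumes the chart exposes the ThanosRuler spec under `thanosRuler.thanosRulerSpec` and that a Thanos querier is already running; the endpoint address is hypothetical. Verify the key path against your chart version's values.yaml.

```yaml
thanosRuler:
  enabled: true
  thanosRulerSpec:
    queryEndpoints:
      - thanos-query.thanos.svc:9090   # hypothetical querier address
```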