Architecture

This page explains the moving parts that get created when you install the prometheus-stack chart, and how they talk to each other at runtime.

Layered view

flowchart TB
    subgraph Layer1[1. CRDs]
      direction LR
      CRD1[Prometheus]
      CRD2[Alertmanager]
      CRD3[ServiceMonitor]
      CRD4[PodMonitor]
      CRD5[PrometheusRule]
      CRD6[AlertmanagerConfig]
      CRD7[ThanosRuler]
      CRD8[Probe / ScrapeConfig]
    end

    subgraph Layer2[2. Controller]
      OP[prometheus-operator<br/>Deployment]
    end

    subgraph Layer3[3. Custom Resources from the chart]
      direction LR
      CRP[Prometheus CR]
      CRA[Alertmanager CR]
      CRR[PrometheusRule CRs<br/>default alerts and recording rules]
      CRSM[ServiceMonitor CRs<br/>kubelet / apiserver / etcd / coreDNS / ...]
    end

    subgraph Layer4[4. Workloads materialized by the operator]
      direction LR
      PS[Prometheus<br/>StatefulSet]
      AS[Alertmanager<br/>StatefulSet]
    end

    subgraph Layer5[5. Subchart workloads]
      direction LR
      G[Grafana Deployment]
      KSM[kube-state-metrics Deployment]
      NE[node-exporter DaemonSet]
      WE[windows-exporter DaemonSet<br/>optional]
    end

    Layer1 --> OP
    Layer3 --> OP
    OP -->|creates / reconciles| PS
    OP -->|creates / reconciles| AS
    PS -->|fires alerts| AS
    PS -->|scrapes| KSM
    PS -->|scrapes| NE
    PS -->|scrapes| WE
    G -->|datasource query| PS
    G -->|optional alert source| AS

What lives where in the chart

The chart is an umbrella chart with two layers of content:

1. Subcharts (in prometheus-stack/charts/)

Declared in Chart.yaml and toggled by top-level keys in values.yaml:

| Subchart | Purpose | Toggle |
| --- | --- | --- |
| crds | Installs the Prometheus Operator CRDs | crds.enabled |
| kube-state-metrics | Cluster object metrics (Deployments, Pods, ...) | kubeStateMetrics.enabled |
| prometheus-node-exporter | Linux node-level metrics | nodeExporter.enabled |
| grafana | Dashboards UI, pre-configured datasource | grafana.enabled |
| prometheus-windows-exporter | Node metrics for Windows nodes | windowsMonitoring.enabled |
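The same toggles appear as top-level keys in values.yaml. A minimal sketch — whether each subchart is on or off by default depends on your chart version, so treat the values below as illustrative:

```yaml
# values.yaml sketch: per-subchart toggles (defaults shown here are illustrative, not authoritative)
crds:
  enabled: true          # Prometheus Operator CRDs
kubeStateMetrics:
  enabled: true          # cluster object metrics
nodeExporter:
  enabled: true          # Linux node-level metrics
grafana:
  enabled: true          # dashboards UI + pre-configured datasource
windowsMonitoring:
  enabled: false         # windows-exporter DaemonSet for Windows nodes
```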

2. The chart's own templates (prometheus-stack/templates/)

| Folder | Resources |
| --- | --- |
| prometheus-operator/ | Operator Deployment, RBAC, Service, admission webhooks |
| prometheus/ | The Prometheus CR, control-plane ServiceMonitors, default PrometheusRules, ingress, network policy |
| alertmanager/ | The Alertmanager CR, config secret, Service, ingress, network policy |
| grafana/ | Dashboard ConfigMaps auto-discovered by the Grafana subchart |
| thanos-ruler/ | Optional ThanosRuler CR for HA / long-term rule evaluation |
| exporters/ | ServiceMonitors for kube-state-metrics, node-exporter, kubelet, etc. |
| extra-objects.yaml | Free-form manifests injected via extraManifests |
| _helpers.tpl | Shared Go-template helpers (naming, labels) |
| NOTES.txt | Post-install message |
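As an example of that last escape hatch, a sketch of extraManifests feeding extra-objects.yaml. The key name comes from the table above; the value shape assumed here (a list of raw Kubernetes objects) should be checked against the chart's values.yaml:

```yaml
# values.yaml sketch: inject an arbitrary manifest via extra-objects.yaml
# (assumes extraManifests is a list of raw Kubernetes objects; verify in the chart)
extraManifests:
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: my-extra-config        # hypothetical name
    data:
      note: "rendered next to the chart's own templates"
```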

Lifecycle of a helm install

  1. Helm applies the CRDs first (subchart crds with condition crds.enabled).
  2. Helm applies the rest of the chart:
     • prometheus-operator Deployment + RBAC.
     • Prometheus, Alertmanager, default PrometheusRules and ServiceMonitors as Custom Resources.
     • kube-state-metrics, node-exporter, grafana from subcharts.
  3. The Operator starts and watches the CRDs.
  4. Seeing the Prometheus and Alertmanager CRs, the Operator generates:
     • StatefulSets for Prometheus and Alertmanager.
     • The Prometheus configuration secret (built from all ServiceMonitor / PodMonitor / Probe / ScrapeConfig resources matching its selector; see the ServiceMonitor sketch after this list).
     • The Alertmanager configuration secret (from the alertmanager.config value and any AlertmanagerConfig CRs that match).
  5. Prometheus pods start, hot-reload their config when the secret changes, and begin scraping the targets.
  6. Grafana boots, mounts the Kubernetes dashboard ConfigMaps through its sidecar, and connects to Prometheus as a datasource.
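To make step 4 concrete, here is a hedged sketch of a ServiceMonitor that the Operator would fold into the Prometheus configuration secret. The name, the app label, and the release label the Prometheus CR selects on are illustrative and depend on your values:

```yaml
# Illustrative ServiceMonitor: the Operator translates resources like this into scrape config
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app                   # hypothetical name
  labels:
    release: prometheus-stack    # assumed to match the Prometheus CR's serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: my-app                # Services with this label become scrape targets
  endpoints:
    - port: http-metrics         # named Service port that exposes /metrics
      interval: 30s
```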

How alerts flow

sequenceDiagram
    autonumber
    participant Rule as PrometheusRule CR
    participant Op as Prometheus Operator
    participant Prom as Prometheus
    participant AM as Alertmanager
    participant Recv as Receiver<br/>(Slack, email, PagerDuty, ...)

    Rule->>Op: kubectl apply
    Op->>Prom: regenerate config secret
    Prom->>Prom: hot-reload, evaluate rule
    Prom->>AM: fire alert (HTTP)
    AM->>AM: route + group + dedupe
    AM->>Recv: send notification

You never write prometheus.yml directly. You author PrometheusRule, ServiceMonitor, and (for Alertmanager) AlertmanagerConfig resources, and the Operator regenerates the underlying config.
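For example, a minimal PrometheusRule sketch — the group, the expression, and the release label the Prometheus CR's ruleSelector matches on are illustrative:

```yaml
# Illustrative PrometheusRule: applied with kubectl, regenerated into Prometheus rule files by the Operator
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: my-app-alerts            # hypothetical name
  labels:
    release: prometheus-stack    # assumed to match the Prometheus CR's ruleSelector
spec:
  groups:
    - name: my-app.rules
      rules:
        - alert: MyAppDown
          expr: up{job="my-app"} == 0
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "my-app target has been down for 5 minutes"
```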

Storage

Storage is governed per component:

  • A PersistentVolumeClaim per Prometheus replica, configured via prometheus.prometheusSpec.storageSpec. If unset, Prometheus uses an emptyDir and loses its data on restart.
  • A PersistentVolumeClaim per Alertmanager replica, configured via alertmanager.alertmanagerSpec.storage.
  • Grafana storage, configured via the Grafana subchart values (grafana.persistence.*).

See Configuration » Storage for recommended sizing.
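A hedged storageSpec sketch, assuming the upstream prometheus-operator volumeClaimTemplate shape; the storage class and size are placeholders:

```yaml
# values.yaml sketch: persistent storage per Prometheus replica
prometheus:
  prometheusSpec:
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: standard      # placeholder; pick a class that exists in your cluster
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 50Gi               # placeholder size
```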

Networking

The chart creates a ClusterIP Service for each component. To expose the UIs:

  • Toggle *.ingress.enabled in values.yaml and supply hostnames + TLS, or
  • Port-forward for ad-hoc access — see Accessing services.

If your cluster enforces NetworkPolicies, the chart can render baseline policies via *.networkPolicy.enabled (per component).
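As an illustration of the ingress route, a sketch for the Prometheus UI. The host, ingress class, and TLS secret are placeholders, and the exact keys under prometheus.ingress should be checked against values.yaml:

```yaml
# values.yaml sketch: expose the Prometheus UI via Ingress
prometheus:
  ingress:
    enabled: true
    ingressClassName: nginx             # placeholder ingress class
    hosts:
      - prometheus.example.com          # placeholder hostname
    tls:
      - secretName: prometheus-tls      # placeholder TLS secret
        hosts:
          - prometheus.example.com
```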