Production-Ready Helm Charts: Templates, Values, Hooks, and Testing
Most Helm Charts Are Not Production-Ready
Here's the thing about Helm charts in the wild — the vast majority of them work on a developer's laptop and crumble in production. I've inherited charts that hardcoded replica counts, had no resource limits, used latest as the default image tag, and exposed secrets in plaintext through values files.
A production-ready Helm chart is one that another engineer can deploy to a live cluster with confidence, customize for their environment without forking the chart, and upgrade without downtime. That bar is higher than most people realize.
Let me tell you why these patterns matter, and walk through the practices I enforce on every chart that touches production.
Chart Structure That Scales
Start with a clean layout. Every chart I build follows this structure:
```
my-app/
├── Chart.yaml
├── Chart.lock
├── values.yaml
├── values-production.yaml
├── values-staging.yaml
├── templates/
│   ├── _helpers.tpl
│   ├── deployment.yaml
│   ├── service.yaml
│   ├── ingress.yaml
│   ├── hpa.yaml
│   ├── pdb.yaml
│   ├── serviceaccount.yaml
│   ├── configmap.yaml
│   ├── secret.yaml
│   ├── networkpolicy.yaml
│   └── tests/
│       └── test-connection.yaml
├── charts/                  # subcharts
└── ci/
    └── ci-values.yaml       # values used in CI testing
```
The `ci/` directory is something most people skip. It holds a values file specifically for automated testing in your pipeline. More on that later.
Values Design: The API of Your Chart
Your `values.yaml` is an API contract. Treat it like one. Here's how I structure values for a typical web service:
```yaml
# values.yaml

# -- Number of replicas. Override per environment.
replicaCount: 2

image:
  # -- Container image repository
  repository: ghcr.io/myorg/my-app
  # -- Image pull policy
  pullPolicy: IfNotPresent
  # -- Image tag. Defaults to chart appVersion.
  tag: ""

# -- Image pull secrets for private registries
imagePullSecrets: []

serviceAccount:
  # -- Create a service account
  create: true
  # -- Annotations for the service account (e.g., IRSA)
  annotations: {}
  # -- Service account name. Auto-generated if not set.
  name: ""

service:
  type: ClusterIP
  port: 80
  targetPort: 8080

ingress:
  enabled: false
  className: nginx
  annotations: {}
  hosts:
    - host: my-app.example.com
      paths:
        - path: /
          pathType: Prefix
  tls: []

resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi

autoscaling:
  enabled: false
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70

podDisruptionBudget:
  enabled: true
  minAvailable: 1

# -- Extra environment variables as key-value pairs
env: {}

# -- Extra environment variables from secrets/configmaps
envFrom: []

# -- Readiness probe configuration
readinessProbe:
  httpGet:
    path: /health
    port: http
  initialDelaySeconds: 5
  periodSeconds: 10

# -- Liveness probe configuration
livenessProbe:
  httpGet:
    path: /health
    port: http
  initialDelaySeconds: 15
  periodSeconds: 20

# -- Node selector constraints
nodeSelector: {}

# -- Tolerations for pod scheduling
tolerations: []

# -- Affinity rules for pod scheduling
affinity: {}
```
Every field carries a comment prefixed with `--`, and that's deliberate: the double-dash is the convention `helm-docs` picks up to auto-generate documentation. If you're not generating docs from your values file, you're asking every consumer of your chart to read your templates to understand what's configurable.
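You can also enforce the contract, not just document it. Helm validates a `values.schema.json` placed next to `values.yaml` (JSON Schema, draft-07) on install, upgrade, lint, and template. A minimal sketch covering a couple of the fields above:

```json
{
  "$schema": "https://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "replicaCount": {
      "type": "integer",
      "minimum": 1
    },
    "image": {
      "type": "object",
      "properties": {
        "repository": { "type": "string" },
        "pullPolicy": { "enum": ["Always", "IfNotPresent", "Never"] },
        "tag": { "type": "string" }
      },
      "required": ["repository"]
    }
  },
  "required": ["replicaCount", "image"]
}
```

With this in place, `helm install` fails fast on a typo'd `replicaCount: "two"` instead of producing a broken Deployment.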
Template Helpers That Prevent Disasters
Your `templates/_helpers.tpl` should define reusable named templates. Here's the foundation I use:
```
{{/* templates/_helpers.tpl */}}

{{/*
Expand the name of the chart.
*/}}
{{- define "my-app.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Create a fully qualified app name.
We truncate at 63 characters because Kubernetes name fields are limited.
*/}}
{{- define "my-app.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}

{{/*
Common labels applied to every resource.
*/}}
{{- define "my-app.labels" -}}
helm.sh/chart: {{ include "my-app.chart" . }}
{{ include "my-app.selectorLabels" . }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}

{{/*
Selector labels — used in deployments and services.
*/}}
{{- define "my-app.selectorLabels" -}}
app.kubernetes.io/name: {{ include "my-app.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}

{{/*
Chart name and version for the chart label.
*/}}
{{- define "my-app.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Service account name.
*/}}
{{- define "my-app.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "my-app.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}
```
The 63-character truncation is not optional. Kubernetes rejects names longer than 63 characters, and when your release name is `staging-my-long-application-name`, that limit comes fast. I've watched deployments fail in CI because nobody tested with long release names.
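You can see the clipping behavior without a cluster. This plain-shell stand-in for Helm's `trunc 63 | trimSuffix "-"` pipeline (release and chart names invented for illustration) shows what the cut does to a realistic fullname:

```shell
# Stand-in for Helm's `trunc 63 | trimSuffix "-"` pipeline, approximated
# with coreutils, to show how a long release/chart combination is clipped.
release="staging-my-long-application-name"
chart="my-app"
full="${release}-${chart}-some-extra-suffix-that-pushes-it-over-the-limit"

# Keep the first 63 characters, then strip a trailing hyphen if one remains.
clipped=$(printf '%s' "$full" | cut -c1-63 | sed 's/-$//')
printf '%s (%s chars)\n' "$clipped" "${#clipped}"
```

Everything past character 63 is simply gone, which is why two long release names can silently collide unless `fullname` stays unique within the first 63 characters.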
The Deployment Template Done Right
Here's a deployment template with the patterns I consider mandatory:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "my-app.fullname" . }}
  labels:
    {{- include "my-app.labels" . | nindent 4 }}
spec:
  {{- if not .Values.autoscaling.enabled }}
  replicas: {{ .Values.replicaCount }}
  {{- end }}
  selector:
    matchLabels:
      {{- include "my-app.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      annotations:
        checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
      labels:
        {{- include "my-app.labels" . | nindent 8 }}
    spec:
      {{- with .Values.imagePullSecrets }}
      imagePullSecrets:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      serviceAccountName: {{ include "my-app.serviceAccountName" . }}
      securityContext:
        runAsNonRoot: true
        fsGroup: 65534
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL
          ports:
            - name: http
              containerPort: {{ .Values.service.targetPort }}
              protocol: TCP
          {{- with .Values.readinessProbe }}
          readinessProbe:
            {{- toYaml . | nindent 12 }}
          {{- end }}
          {{- with .Values.livenessProbe }}
          livenessProbe:
            {{- toYaml . | nindent 12 }}
          {{- end }}
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
          {{- if .Values.env }}
          env:
            {{- range $key, $value := .Values.env }}
            - name: {{ $key }}
              value: {{ $value | quote }}
            {{- end }}
          {{- end }}
          {{- with .Values.envFrom }}
          envFrom:
            {{- toYaml . | nindent 12 }}
          {{- end }}
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.tolerations }}
      tolerations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.affinity }}
      affinity:
        {{- toYaml . | nindent 8 }}
      {{- end }}
```
Here's the thing about that `checksum/config` annotation: it forces a rolling restart when your ConfigMap changes. Without it, you update a config value, Helm reports success, and your pods keep running with the old config because the Deployment spec itself didn't change. I've seen this cause hours of confusion.
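The mechanism is nothing more than hashing the rendered ConfigMap. This plain-shell sketch (the `LOG_LEVEL` content is invented) mimics Helm's `sha256sum` template function to show why any config edit changes the pod template and triggers a rollout:

```shell
# Any change to the rendered ConfigMap body produces a different sha256,
# which changes the pod-template annotation, which makes the Deployment roll.
old=$(printf 'LOG_LEVEL: info\n'  | sha256sum | cut -d' ' -f1)
new=$(printf 'LOG_LEVEL: debug\n' | sha256sum | cut -d' ' -f1)
printf 'old checksum: %s\nnew checksum: %s\n' "$old" "$new"
```

Same idea, different hash: the kubelet never compares configs, it only sees that the pod template changed.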
Also note the security context: `runAsNonRoot`, `readOnlyRootFilesystem`, and dropping all capabilities. These should be defaults, not opt-in.
Helm Hooks for Lifecycle Management
Hooks let you run actions at specific points in the release lifecycle. Here's a database migration hook that runs before upgrades:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ include "my-app.fullname" . }}-migrate
  labels:
    {{- include "my-app.labels" . | nindent 4 }}
  annotations:
    "helm.sh/hook": pre-upgrade,pre-install
    "helm.sh/hook-weight": "-1"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  backoffLimit: 3
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          command: ["./migrate", "--direction", "up"]
          envFrom:
            - secretRef:
                name: {{ include "my-app.fullname" . }}-db-credentials
```
Let me tell you why `hook-delete-policy` is critical. Without `before-hook-creation`, if a previous migration Job still exists (maybe it failed), the new hook can't create a Job with the same name and the entire upgrade hangs. I've been paged for exactly this scenario.
The `hook-weight` annotation controls ordering when you have multiple hooks. Lower numbers run first. Use negative weights for migrations that must complete before other setup hooks.
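For instance, a hypothetical seed-data Job that must run after the migration would carry a higher weight (the name and weight here are illustrative, not from the chart above):

```yaml
metadata:
  name: {{ include "my-app.fullname" . }}-seed   # hypothetical second hook
  annotations:
    "helm.sh/hook": pre-upgrade,pre-install
    "helm.sh/hook-weight": "0"                   # runs after the migration's "-1"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
```

Hooks with equal weights run in no guaranteed relative order, so give every hook that has ordering requirements an explicit, distinct weight.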
Testing Your Charts
Helm has a built-in test framework that almost nobody uses. Add test pods in `templates/tests/`:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: "{{ include "my-app.fullname" . }}-test-connection"
  labels:
    {{- include "my-app.labels" . | nindent 4 }}
  annotations:
    "helm.sh/hook": test
spec:
  restartPolicy: Never
  containers:
    - name: wget
      image: busybox
      command: ['wget']
      args: ['{{ include "my-app.fullname" . }}:{{ .Values.service.port }}/health']
```
Run tests after install:
```shell
helm test my-release -n production
```
But in-cluster tests are only one layer. For CI, I also run:
```shell
# Lint the chart
helm lint ./my-app --values ./my-app/ci/ci-values.yaml

# Template rendering — catches syntax errors without a cluster
helm template test-release ./my-app --values ./my-app/ci/ci-values.yaml > rendered.yaml

# Validate rendered manifests against Kubernetes schemas
kubeconform -strict -kubernetes-version 1.29.0 rendered.yaml

# Policy checks with conftest
conftest test rendered.yaml --policy ./policies/
```
This pipeline catches the majority of issues before anything touches a cluster. The `ci-values.yaml` file should enable every feature toggle so your templates get fully rendered and tested.
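A sketch of what that `ci/ci-values.yaml` might look like for the chart above, with values chosen purely to exercise the conditional templates:

```yaml
# ci/ci-values.yaml: flip every toggle on so CI renders all templates
ingress:
  enabled: true
  hosts:
    - host: ci.example.com
      paths:
        - path: /
          pathType: Prefix
autoscaling:
  enabled: true        # renders hpa.yaml and drops the static replicas field
podDisruptionBudget:
  enabled: true        # renders pdb.yaml
env:
  LOG_LEVEL: debug     # exercises the env range loop
envFrom:
  - secretRef:
      name: ci-dummy-secret
```

If a template only renders when a flag is on and your CI never turns that flag on, the first person to find the syntax error is a user in production.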
Patterns to Avoid
After maintaining charts across many teams, these are the anti-patterns I push back on:
- **Defaulting the image tag to `latest`.** Use `.Chart.AppVersion` as the default. Pinned versions are non-negotiable for reproducible deployments.
- **Putting secrets in `values.yaml`.** Secrets belong in external secret managers (Vault, AWS Secrets Manager) referenced via `envFrom` or external-secrets-operator. Never check credentials into a chart.
- **Massive monolithic templates.** If a template file exceeds 150 lines, split it. Use named templates in `_helpers.tpl` for repeated blocks.
- **No resource requests or limits.** A chart without resource definitions will get scheduled on nodes that can't handle it, or worse, consume unbounded resources and starve other workloads.
- **Skipping PodDisruptionBudgets.** If you care about availability during node drains and cluster upgrades, a PDB is mandatory. Default to `minAvailable: 1` for any multi-replica workload.
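The chart layout above lists `templates/pdb.yaml` without showing it. A minimal sketch wired to the `podDisruptionBudget` values defined earlier:

```yaml
{{- if .Values.podDisruptionBudget.enabled }}
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: {{ include "my-app.fullname" . }}
  labels:
    {{- include "my-app.labels" . | nindent 4 }}
spec:
  minAvailable: {{ .Values.podDisruptionBudget.minAvailable }}
  selector:
    matchLabels:
      {{- include "my-app.selectorLabels" . | nindent 6 }}
{{- end }}
```

Keep `minAvailable` below `replicaCount`, or voluntary evictions (and therefore node drains) will block entirely.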
Final Thoughts
A Helm chart is the interface between your application and the cluster. It encodes your operational knowledge: how the app should be deployed, what resources it needs, how it scales, and what happens during upgrades.
Treat your charts with the same rigor as application code. Review them in PRs, test them in CI, version them properly. The chart that works on your laptop and the chart that survives a production node failure at 3 AM are very different things. Build for the 3 AM scenario, and the laptop scenario takes care of itself.