Part 4 of 6 in Security Hardening

HashiCorp Vault and Kubernetes: Secrets Management That Actually Works

Amara Okafor · 10 min read

Kubernetes Secrets Are Not Secret

Let me be blunt: Kubernetes Secrets are base64-encoded, not encrypted. Anyone with get secrets RBAC permission in a namespace can decode every credential in that namespace with a one-liner:

kubectl get secret db-credentials -o jsonpath='{.data.password}' | base64 -d

In the 2018 Tesla Kubernetes breach (uncovered by researchers at RedLock), an exposed, unauthenticated Kubernetes dashboard gave attackers access to a pod that contained plaintext AWS credentials. The credentials weren't rotated, because nobody had set up automated rotation. The blast radius was enormous.

If your secrets live as Kubernetes Secret objects checked into Git or created manually, you're operating on borrowed time. You need a secrets manager. HashiCorp Vault is the industry standard for a reason — and integrating it with Kubernetes is more straightforward than most teams think.

Architecture Overview

There are three primary patterns for Vault-Kubernetes integration:

| Pattern | How It Works | Best For |
| --- | --- | --- |
| Vault Agent Injector | Sidecar injects secrets into pod filesystem | Existing workloads, minimal code changes |
| Vault CSI Provider | Mounts secrets via CSI volume driver | Teams already using CSI, ephemeral secrets |
| Vault Secrets Operator | Syncs Vault secrets to K8s Secret objects | GitOps workflows, Helm-based deployments |

I recommend the Vault Agent Injector for most teams starting out — it requires zero application code changes and integrates cleanly with any language or framework.

Setting Up Vault With Kubernetes Auth

First, Vault needs to trust your Kubernetes cluster. The Kubernetes auth method lets pods authenticate to Vault using their ServiceAccount tokens.

Enable the auth method and configure it:

# Enable Kubernetes auth in Vault
vault auth enable kubernetes

# Configure it to talk to the K8s API
vault write auth/kubernetes/config \
  kubernetes_host="https://kubernetes.default.svc:443" \
  kubernetes_ca_cert=@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt

Create a Vault policy that scopes access tightly:

# vault-policy-app-db.hcl
path "secret/data/production/db-credentials" {
  capabilities = ["read"]
}

path "secret/data/production/api-keys" {
  capabilities = ["read"]
}

# Explicitly deny everything else under production/
path "secret/data/production/*" {
  capabilities = ["deny"]
}

# Deny listing too: in KV v2, list operations go through the metadata path
path "secret/metadata/production/*" {
  capabilities = ["deny"]
}

Notice the explicit denies. In Vault, a deny capability takes precedence when policies are merged, so these rules stop a compromised pod from reading or enumerating other secrets under production/ even if a sloppier policy later gets attached to the same token. Assume breach, and scope every policy to the exact paths the workload needs.
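The role you create next attaches this policy by name, so load the policy file into Vault under that name first. A quick sketch, assuming the file above is saved as vault-policy-app-db.hcl:

```shell
# Register the policy under the name the Vault role will reference
vault policy write app-db-readonly vault-policy-app-db.hcl

# Confirm what was stored
vault policy read app-db-readonly
```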

Now create a Vault role that maps a Kubernetes ServiceAccount to this policy:

vault write auth/kubernetes/role/webapp-production \
  bound_service_account_names=webapp-sa \
  bound_service_account_namespaces=production \
  policies=app-db-readonly \
  ttl=1h \
  max_ttl=4h

The ttl=1h is critical. Short-lived tokens mean that even if a token leaks, the window for exploitation is bounded. Compare this to a static Kubernetes Secret that never expires.
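Under the hood, the injector performs this login on the pod's behalf. You can reproduce it manually from inside a pod to debug authentication issues; this sketch assumes the vault CLI is available in the image and VAULT_ADDR points at your Vault server:

```shell
# Read the pod's projected ServiceAccount token
JWT=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)

# Exchange it for a Vault token scoped to the role's policies
vault write auth/kubernetes/login \
  role=webapp-production \
  jwt="$JWT"
```

If this fails, the error message usually points straight at the mismatch: wrong role name, wrong ServiceAccount, or a stale cluster CA in the auth config.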

Deploying the Vault Agent Injector

Install the Vault Agent Injector via Helm:

helm repo add hashicorp https://helm.releases.hashicorp.com
helm repo update

helm install vault hashicorp/vault \
  --namespace vault \
  --create-namespace \
  --set "injector.enabled=true" \
  --set "server.enabled=false" \
  --set "injector.externalVaultAddr=https://vault.internal.example.com:8200"

Setting server.enabled=false assumes you're running Vault externally (which you should in production — don't run your secrets manager inside the cluster it's protecting).
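After the Helm install, it's worth confirming the injector's mutating webhook actually registered; without it, annotations are silently ignored. The webhook name below is the chart's default and may differ if you customized the release:

```shell
# The injector pod should be Running
kubectl get pods -n vault

# The mutating webhook must exist for annotations to take effect
kubectl get mutatingwebhookconfiguration vault-agent-injector-cfg
```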

Injecting Secrets Into Pods

Annotate your deployment to have the injector sidecar automatically fetch secrets:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
      annotations:
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/role: "webapp-production"
        vault.hashicorp.com/agent-inject-secret-db-creds: "secret/data/production/db-credentials"
        vault.hashicorp.com/agent-inject-template-db-creds: |
          {{- with secret "secret/data/production/db-credentials" -}}
          export DB_HOST="{{ .Data.data.host }}"
          export DB_USER="{{ .Data.data.username }}"
          export DB_PASS="{{ .Data.data.password }}"
          {{- end }}
    spec:
      serviceAccountName: webapp-sa
      containers:
        - name: webapp
          image: registry.example.com/webapp:v2.4.1
          command: ["/bin/sh", "-c"]
          args: ["source /vault/secrets/db-creds && /app/start"]
          volumeMounts: []

The Vault Agent runs as an init container (to fetch secrets before app start) and a sidecar (to rotate them). Your application reads secrets from /vault/secrets/ — a tmpfs volume that never touches disk.
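To confirm injection worked, inspect the pod: the injector should have added the vault-agent-init and vault-agent containers, and the rendered template should exist under /vault/secrets/. These commands assume the Deployment above:

```shell
# List init containers and regular containers; expect vault-agent-init and vault-agent
kubectl get pod -n production -l app=webapp \
  -o jsonpath='{.items[0].spec.initContainers[*].name} {.items[0].spec.containers[*].name}'

# The rendered secret file should be present in the memory-backed volume
kubectl exec -n production deploy/webapp -c webapp -- cat /vault/secrets/db-creds
```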

Threat Model: Why This Matters

Let's walk through the attack chain with and without Vault:

Without Vault (static K8s Secrets):

  1. Attacker compromises a pod via an RCE or container-escape vulnerability (CVE-2024-21626, the runc leaked file descriptor bug, for example)
  2. Reads ServiceAccount token, calls K8s API
  3. kubectl get secrets — retrieves database password, API keys, TLS certs
  4. Credentials are static and valid indefinitely
  5. Attacker pivots to database, exfiltrates data over weeks

With Vault integration:

  1. Attacker compromises a pod via the same vulnerability
  2. Reads /vault/secrets/db-creds — gets current database password
  3. Credentials expire within the hour (Vault dynamic secrets can live for just minutes)
  4. Attacker cannot reach Vault directly (network policy blocks it)
  5. Attacker cannot enumerate other secrets (Vault policy denies list)
  6. Blast radius: one database, limited time window, detectable via audit logs

The difference isn't theoretical. It's the difference between a contained incident and a headline.

Dynamic Database Credentials

The real power of Vault is dynamic secrets. Instead of storing a static database password, Vault generates short-lived credentials on demand:

# Enable the database secrets engine
vault secrets enable database

# Configure a PostgreSQL connection
vault write database/config/production-db \
  plugin_name=postgresql-database-plugin \
  allowed_roles="webapp-role" \
  connection_url="postgresql://{{username}}:{{password}}@db.internal:5432/appdb?sslmode=require" \
  username="vault_admin" \
  password="initial-password"

# Immediately rotate the root credential so the bootstrap
# password above is no longer valid anywhere
vault write -f database/rotate-root/production-db

# Create a role that generates time-limited credentials
vault write database/roles/webapp-role \
  db_name=production-db \
  creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; \
    GRANT SELECT, INSERT, UPDATE ON ALL TABLES IN SCHEMA public TO \"{{name}}\";" \
  revocation_statements="REVOKE ALL PRIVILEGES ON ALL TABLES IN SCHEMA public FROM \"{{name}}\"; DROP ROLE IF EXISTS \"{{name}}\";" \
  default_ttl="30m" \
  max_ttl="1h"

Every pod gets unique database credentials that expire in 30 minutes. If one set leaks, you know exactly which pod was compromised (the username maps to a specific lease), and the credentials self-destruct.
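You can exercise the role from an operator workstation to see the generated credentials and their lease before wiring it into workloads (assumes an authenticated vault CLI; the lease ID placeholder comes from the read's output):

```shell
# Mint a fresh set of credentials; each call creates a new DB role and lease
vault read database/creds/webapp-role

# Inspect or revoke the lease returned by the read above
vault lease lookup database/creds/webapp-role/<lease_id>
vault lease revoke database/creds/webapp-role/<lease_id>
```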

Network Policies: Defense in Depth

Don't let arbitrary pods talk to Vault. Keep in mind that a NetworkPolicy only adds allow rules, so pair it with a default-deny egress policy in the namespace; and since Vault runs outside the cluster (per the externalVaultAddr setting above), the allow rule targets Vault's address rather than a pod selector:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: vault-access
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: webapp
  egress:
    - to:
        # Vault is external to the cluster, so allow egress by address.
        # Replace with the subnet hosting vault.internal.example.com.
        - ipBlock:
            cidr: 10.20.0.0/24
      ports:
        - port: 8200
          protocol: TCP
  policyTypes:
    - Egress

Pipeline Integration: Validating Vault Annotations

Add a CI check to ensure every Deployment in production namespaces has Vault annotations and doesn't use raw Kubernetes Secrets:

#!/bin/bash
# scripts/check-vault-annotations.sh
set -euo pipefail

FAILURES=0

for file in k8s/production/*.yaml; do
  if grep -q "kind: Deployment" "$file"; then
    if ! grep -q "vault.hashicorp.com/agent-inject" "$file"; then
      echo "FAIL: $file — missing Vault injection annotations"
      FAILURES=$((FAILURES + 1))
    fi
  fi

  if grep -qE '^kind: *Secret *$' "$file"; then
    echo "FAIL: $file — raw Kubernetes Secret detected. Use Vault instead."
    FAILURES=$((FAILURES + 1))
  fi
done

if [ "$FAILURES" -gt 0 ]; then
  echo "$FAILURES policy violations found."
  exit 1
fi

echo "All checks passed."
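You can sanity-check the script's grep logic locally with throwaway fixture files before wiring it into CI. Everything below is hypothetical scaffolding, not part of your repo:

```shell
# Throwaway workspace with hypothetical fixture manifests
workdir=$(mktemp -d)
mkdir -p "$workdir/k8s/production"
cd "$workdir"

# A Deployment with no Vault annotations: should be flagged
cat > k8s/production/bad-deploy.yaml <<'EOF'
kind: Deployment
metadata:
  name: legacy-app
EOF

# A raw Kubernetes Secret: should also be flagged
cat > k8s/production/raw-secret.yaml <<'EOF'
kind: Secret
metadata:
  name: db-password
EOF

# Run the same checks the CI script performs
FAILURES=0
for file in k8s/production/*.yaml; do
  if grep -q "kind: Deployment" "$file" && ! grep -q "vault.hashicorp.com/agent-inject" "$file"; then
    echo "FAIL: $file"
    FAILURES=$((FAILURES + 1))
  fi
  if grep -q "kind: Secret" "$file"; then
    echo "FAIL: $file"
    FAILURES=$((FAILURES + 1))
  fi
done
echo "violations: $FAILURES"
```

Both fixtures should be flagged, so the run ends with two violations.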

Monitoring Vault Access

Enable Vault audit logging and alert on anomalies:

vault audit enable file file_path=/vault/logs/audit.log

Key events to alert on:

  • Authentication failures from Kubernetes ServiceAccounts
  • Access to secrets outside normal patterns (time of day, frequency)
  • Any list operations on secret paths (potential enumeration)
  • Token renewals past expected thresholds

Pipe these into your SIEM. Correlate Vault audit logs with Kubernetes audit logs. When a pod authenticates to Vault, you should be able to trace the full chain: which node, which namespace, which ServiceAccount, which secret, and when.
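Even plain grep makes list operations stand out while you build proper SIEM rules. Here's the idea against a hypothetical, heavily abridged audit log excerpt (real entries carry auth metadata, client IPs, and timestamps):

```shell
# Hypothetical sample of Vault audit entries, trimmed to the fields that matter here
cat > /tmp/audit-sample.log <<'EOF'
{"type":"response","request":{"operation":"read","path":"secret/data/production/db-credentials"}}
{"type":"response","request":{"operation":"list","path":"secret/metadata/production/"}}
{"type":"response","request":{"operation":"read","path":"secret/data/production/api-keys"}}
EOF

# Any list operation on a secret path is worth an alert: possible enumeration
grep '"operation":"list"' /tmp/audit-sample.log
```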

Migration Checklist

Moving from static Kubernetes Secrets to Vault doesn't happen overnight. Here's a practical migration path:

  1. Week 1: Deploy Vault Agent Injector. Pick one non-critical workload and migrate its secrets.
  2. Week 2: Add the CI validation script. Block new raw Secrets from merging.
  3. Week 3-4: Migrate remaining workloads namespace by namespace. Start with staging, then production.
  4. Week 5: Enable dynamic database credentials for at least one database.
  5. Ongoing: Rotate the Vault unseal keys. Audit Vault policies quarterly. Monitor the audit log.

Troubleshooting Common Vault-Kubernetes Issues

After deploying Vault integration across dozens of clusters, these are the failure modes that bite teams most often.

Pod Stuck in Init

The Vault Agent init container runs before your app starts. If it can't authenticate to Vault, the pod hangs forever in Init:0/1. Debug it by checking the init container logs:

kubectl logs <pod-name> -c vault-agent-init -n production

Common causes and fixes:

# Cause 1: ServiceAccount doesn't match the Vault role's bound_service_account_names
# Fix: Verify the SA name matches exactly
vault read auth/kubernetes/role/webapp-production
# Check bound_service_account_names and bound_service_account_namespaces

# Cause 2: Vault can't validate the Kubernetes token reviewer
# This happens after cluster upgrades that rotate the API server CA
vault write auth/kubernetes/config \
  kubernetes_host="https://kubernetes.default.svc:443" \
  kubernetes_ca_cert=@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt

# Cause 3: The Vault policy doesn't allow reading the requested path
# Check the exact path — note the "data/" segment in KV v2
vault kv get secret/production/db-credentials           # CLI path
# actual API path: secret/data/production/db-credentials  # Policy path

The KV v2 data/ path prefix trips up nearly every team. The vault kv CLI uses the logical path (secret/production/db-credentials), while your Vault policy and your injector annotation must use the full API path (secret/data/production/db-credentials). If these don't align, the pod gets a 403 and the init container loops indefinitely.

Secrets Not Updating After Rotation

The Vault Agent sidecar periodically renews and re-fetches secrets, but your application might cache the old values in memory. You have three options:

  1. Watch the file for changes: many frameworks support this natively
  2. Use a signal-based reload: configure the Vault Agent to run a command after each template re-render, for example sending SIGHUP to your app process:

# Vault Agent annotation that triggers a reload after each re-render
vault.hashicorp.com/agent-inject-command-db-creds: "/bin/sh -c 'kill -HUP $(pidof myapp) || true'"

  3. Use a sidecar reload container: for apps that can't handle signals, a small sidecar that watches /vault/secrets/ and restarts the main process via a shared PID namespace.

Vault Lease Exhaustion

If your pods restart frequently (e.g., during a rolling update of 50 replicas), each pod creates a new Vault lease for its dynamic database credentials. With a 30-minute TTL and 50 pods restarting every 5 minutes, you can accumulate hundreds of orphaned leases. Watch for this:

# Check active lease count for a specific role
vault list sys/leases/lookup/database/creds/webapp-role | wc -l

# Revoke orphaned leases if they pile up
vault lease revoke -prefix database/creds/webapp-role

Keep your Vault role's default_ttl and max_ttl short relative to your rollout cadence, so orphaned leases expire quickly instead of piling up during aggressive rollouts.

Vault High Availability Considerations

Running Vault itself as a single instance is a reliability risk that undermines everything you've built. In production, deploy Vault in HA mode with integrated Raft storage or a Consul backend:

# vault-config.hcl — HA with Raft storage
storage "raft" {
  path    = "/vault/data"
  node_id = "vault-0"

  retry_join {
    leader_api_addr = "https://vault-0.vault-internal:8200"
  }
  retry_join {
    leader_api_addr = "https://vault-1.vault-internal:8200"
  }
  retry_join {
    leader_api_addr = "https://vault-2.vault-internal:8200"
  }
}

listener "tcp" {
  address     = "0.0.0.0:8200"
  tls_cert_file = "/vault/tls/tls.crt"
  tls_key_file  = "/vault/tls/tls.key"
}

seal "awskms" {
  region     = "us-east-1"
  kms_key_id = "alias/vault-unseal-key"
}

api_addr     = "https://vault-0.vault-internal:8200"
cluster_addr = "https://vault-0.vault-internal:8201"

The awskms seal block enables auto-unseal — Vault restarts without manual unseal key entry. Without auto-unseal, a Vault pod restart at 3 AM means your secrets manager is down until someone manually provides unseal keys. That's a pager-worthy gap in your security infrastructure.
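Once the cluster is up, you can confirm HA status and auto-unseal from any node (assumes an authenticated vault CLI pointed at the cluster):

```shell
# Seal type should report awskms and the node should show Sealed: false
vault status

# All three Raft peers should be listed, exactly one as leader
vault operator raft list-peers
```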

The Hard Truth

Every week you delay this migration, your static credentials sit in etcd, in CI environment variables, in Helm values files, in someone's local kubeconfig. Each one is a breach waiting to happen. Vault doesn't make you invulnerable — nothing does. But it shrinks the blast radius, limits the time window, and gives you an audit trail. That's the difference between a contained security incident and an existential one.

Stop storing secrets in plain text. Start today.

Amara Okafor

DevSecOps Lead

Security-first mindset in everything I ship. From zero-trust architectures to supply chain security, I make sure your pipeline doesn't become your weakest link.
