Scalable Log Aggregation with Grafana Loki and Promtail
Why Elasticsearch Isn't the Only Answer Anymore
Every team I've worked with that ran Elasticsearch for logs eventually hit the same wall: storage costs spiral, cluster management becomes a full-time job, and the JVM heap tuning alone requires a PhD. Elasticsearch is powerful, but it indexes everything by default — and most of that index is never queried.
Loki takes a fundamentally different approach. It indexes only metadata (labels), not log content. The actual log lines are stored compressed in object storage. This makes it dramatically cheaper to operate and pairs naturally with the Prometheus label model your team already understands.
As the Google SRE book reminds us, the cost of your observability stack should be proportional to the value it delivers. Loki gets that balance right for most teams.
Architecture Overview
┌──────────┐     ┌────────────┐     ┌───────────┐     ┌─────────────┐
│ App Pods │────▶│  Promtail  │────▶│   Loki    │────▶│   Grafana   │
│ (stdout) │     │ (DaemonSet)│     │ (Gateway) │     │ (LogQL UI)  │
└──────────┘     └────────────┘     └───────────┘     └─────────────┘
                                          │
                                     ┌────┴────┐
                                     │  S3 /   │
                                     │  MinIO  │
                                     └─────────┘
Promtail runs on every node, tails container log files, attaches Kubernetes labels, and pushes to Loki. Loki stores chunks in object storage and maintains a small index for label lookups. Grafana queries Loki using LogQL.
Deploying Loki with Helm
For production, use the Simple Scalable deployment mode. It separates read and write paths for independent scaling.
# values-loki.yaml
loki:
  auth_enabled: false
  commonConfig:
    replication_factor: 1
  storage:
    type: s3
    bucketNames:
      chunks: loki-chunks
      ruler: loki-ruler
    s3:
      endpoint: minio.storage:9000
      accessKeyId: ${MINIO_ACCESS_KEY}
      secretAccessKey: ${MINIO_SECRET_KEY}
      s3ForcePathStyle: true
      insecure: true
  schemaConfig:
    configs:
      - from: "2024-01-01"
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: loki_index_
          period: 24h
  limits_config:
    retention_period: 744h  # 31 days
    max_query_length: 721h
    max_query_parallelism: 32
    ingestion_rate_mb: 10
    ingestion_burst_size_mb: 20
    per_stream_rate_limit: 5MB
    per_stream_rate_limit_burst: 15MB
  compactor:
    retention_enabled: true
    working_directory: /tmp/compactor

write:
  replicas: 3
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: "2"
      memory: 2Gi

read:
  replicas: 2
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: "2"
      memory: 2Gi

gateway:
  replicas: 2
helm install loki grafana/loki -n observability -f values-loki.yaml
A few of these configuration choices matter more than others. The per_stream_rate_limit prevents a single noisy service from overwhelming the cluster. The retention_period of 31 days is enough for most operational use; if you need logs older than that, you probably need an audit system, not a log aggregator.
Deploying Promtail
Promtail runs as a DaemonSet, reading container logs from the node filesystem.
# values-promtail.yaml
config:
  clients:
    - url: http://loki-gateway.observability/loki/api/v1/push
      tenant_id: default
      batchwait: 1s
      batchsize: 1048576  # 1 MiB
  positions:
    filename: /run/promtail/positions.yaml
  scrape_configs:
    - job_name: kubernetes-pods
      kubernetes_sd_configs:
        - role: pod
      relabel_configs:
        # Only collect logs from pods with the opt-in annotation
        - source_labels: [__meta_kubernetes_pod_annotation_logging_enabled]
          action: keep
          regex: "true"
        # Set namespace label
        - source_labels: [__meta_kubernetes_namespace]
          target_label: namespace
        # Set pod name label
        - source_labels: [__meta_kubernetes_pod_name]
          target_label: pod
        # Set container name label
        - source_labels: [__meta_kubernetes_pod_container_name]
          target_label: container
        # Set app label from pod label
        - source_labels: [__meta_kubernetes_pod_label_app]
          target_label: app
      pipeline_stages:
        # Parse JSON logs. The timestamp field must be extracted here
        # so the timestamp stage below can find it.
        - json:
            expressions:
              level: level
              msg: message
              trace_id: trace_id
              timestamp: timestamp
        # Set level as a label for filtering
        - labels:
            level:
        # Drop debug logs in production
        - match:
            selector: '{level="debug"}'
            action: drop
        # Use the timestamp from the log line, not the ingest time
        - timestamp:
            source: timestamp
            format: RFC3339Nano
helm install promtail grafana/promtail -n observability -f values-promtail.yaml
Notice the drop stage for debug logs. In production, debug logs are almost never queried but account for 60-70% of log volume in most services. Dropping them at the agent saves storage, bandwidth, and money.
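To make that back-of-the-envelope math concrete (the 100 GB/day and 65% figures below are hypothetical, picked as a midpoint of the range above):

```shell
# Hypothetical numbers: 100 GB/day ingested, 65% of it debug-level
daily_gb=100
debug_pct=65
dropped=$((daily_gb * debug_pct / 100))
retained=$((daily_gb - dropped))
echo "dropped: ${dropped} GB/day, retained: ${retained} GB/day"
```

At those rates, dropping debug at the agent cuts roughly two-thirds of storage, bandwidth, and rate-limit budget before a single byte reaches Loki.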
The keep relabel on logging_enabled is equally important. Opt-in logging means new services don't accidentally flood your pipeline. Add the annotation when you're ready.
# Pod annotation to enable log collection
metadata:
  annotations:
    logging.enabled: "true"
Label Cardinality: The One Thing That Will Break Loki
Loki's index is label-based. Every unique combination of labels creates a stream. Too many streams and Loki grinds to a halt.
Good labels: namespace, app, container, level — low cardinality, stable values.
Bad labels: user_id, request_id, trace_id, ip_address — high cardinality, creates millions of streams.
# WRONG: this creates a stream per request ID
- source_labels: [__meta_kubernetes_pod_annotation_request_id]
  target_label: request_id

# RIGHT: keep request_id in the log line, not as a label,
# and query it with a LogQL filter instead:
#   {app="api"} |= "request_id=abc123"
If you need to search by trace_id, use a LogQL filter on the log content, not a label. Loki is designed for this — content filtering is fast because chunks are compressed, not indexed.
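The stream arithmetic makes the danger obvious. Loki creates one stream per unique label combination, so the stream count is the product of per-label cardinalities. A sketch with made-up but realistic counts:

```shell
# Hypothetical cluster: stream count = product of per-label cardinalities
namespaces=20; apps=50; containers=2; levels=4
good=$((namespaces * apps * containers * levels))
echo "streams with low-cardinality labels: ${good}"

# Add a user_id label with 100k distinct users and the count explodes
users=100000
bad=$((good * users))
echo "streams after adding user_id: ${bad}"
```

Eight thousand streams is comfortable; eight hundred million is a dead cluster. Watch the loki_ingester_memory_streams metric to catch this before it happens.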
Useful LogQL Queries
Once data flows, here's how to actually use it.
# All error logs for the api service in the last hour
{app="api", level="error"}
# Error logs containing a specific trace ID
{app="api", level="error"} |= "trace_id=abc123def456"
# Parse JSON and filter by status code
{app="api"} | json | status_code >= 500
# Count errors per minute by service
sum(count_over_time({level="error"}[1m])) by (app)
# Top 10 most frequent error messages over the last hour
topk(10, sum by (message) (count_over_time({level="error"} | json [1h])))
# Detect log volume spikes (useful for anomaly detection)
sum(rate({namespace="production"}[5m])) by (app) > 2 *
sum(rate({namespace="production"}[5m] offset 1h)) by (app)
Monitoring Loki Itself
Just like any observability component, Loki needs to be monitored. It exposes Prometheus metrics.
# Ingestion rate in bytes per second
sum(rate(loki_distributor_bytes_received_total[5m]))
# Ingestion failures — should be zero
sum(rate(loki_distributor_ingester_append_failures_total[5m]))
# Query latency P99
histogram_quantile(0.99,
sum(rate(loki_request_duration_seconds_bucket{route="loki_api_v1_query_range"}[5m])) by (le)
)
# Active streams count — watch for cardinality explosions
loki_ingester_memory_streams
Wire the critical ones into Prometheus alert rules:
groups:
  - name: loki
    rules:
      - alert: LokiIngestionFailures
        expr: sum(rate(loki_distributor_ingester_append_failures_total[5m])) > 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Loki is failing to ingest logs"
          runbook: "https://wiki.internal/runbooks/loki-ingestion-failure"
      - alert: LokiHighStreamCount
        expr: loki_ingester_memory_streams > 100000
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Loki stream count exceeding 100k, check for label cardinality issues"
Retention and Cost Control
The biggest operational win with Loki is controlling what you store.
- Drop debug logs at the agent — covered above with Promtail pipeline stages.
- Set per-tenant retention — different namespaces may have different compliance needs.
- Use lifecycle policies on your object storage — belt and suspenders with Loki's compactor.
- Monitor ingestion rate — set alerts when a service suddenly starts logging 10x more than usual.
Logs are the highest-volume telemetry signal. A single verbose Java service can generate more bytes per day than every metric in your Prometheus instance. Control the volume at the source, not at the storage layer.
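For the object-storage lifecycle policy mentioned above, a rule that expires chunks somewhat after Loki's own 31-day retention gives you the safety net without racing the compactor. A sketch (the 45-day window is an assumption; keep it comfortably longer than retention_period so the compactor, not the bucket, does the real deletion):

```json
{
  "Rules": [
    {
      "ID": "loki-chunks-safety-net",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Expiration": { "Days": 45 }
    }
  ]
}
```

Apply it with aws s3api put-bucket-lifecycle-configuration --bucket loki-chunks --lifecycle-configuration file://lifecycle.json; MinIO accepts the same S3 lifecycle API.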
The Practical Path
Start with Promtail collecting from a single namespace. Validate that logs arrive in Grafana and queries work. Then expand namespace by namespace, adding pipeline stages to parse and filter as needed.
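One way to scope that initial rollout (the staging namespace name is a placeholder) is an extra keep rule in Promtail's relabel_configs, alongside the annotation check:

```yaml
# Hypothetical rollout guard: only collect from one namespace at first
relabel_configs:
  - source_labels: [__meta_kubernetes_namespace]
    action: keep
    regex: staging
```

Widen the regex as teams onboard, then delete the rule once every namespace is in.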
Loki won't replace Elasticsearch for every use case — full-text search across months of data is still Elasticsearch's strength. But for the 90% of log queries that are "show me errors for this service in the last hour," Loki is faster to operate, cheaper to run, and fits naturally into your existing Grafana and Prometheus ecosystem.
Troubleshooting Common Loki Issues
Loki is simpler than Elasticsearch, but it has its own failure modes. Here's what to check when things go wrong.
Logs Not Appearing in Grafana
Work backwards from Grafana to the source:
# Step 1: Verify Promtail is running and tailing logs
kubectl get pods -n observability -l app.kubernetes.io/name=promtail
kubectl logs <promtail-pod> -n observability --tail=20
# Step 2: Check Promtail targets — are your pods being discovered?
# Port-forward to Promtail's HTTP endpoint
kubectl port-forward -n observability <promtail-pod> 3101:3101
curl -s http://localhost:3101/targets | jq '.[] | select(.labels.app == "your-app")'
# Step 3: Check if Loki is receiving data
curl -s http://loki-gateway.observability/loki/api/v1/labels | jq .
# If your app label isn't listed, Promtail isn't sending data for it
# Step 4: Check for rate limiting
kubectl logs -n observability -l app.kubernetes.io/name=loki-write --tail=50 | grep "rate limit"
Common root causes:
| Symptom | Cause | Fix |
|---|---|---|
| No targets in Promtail | Missing logging.enabled: "true" annotation | Add annotation to pod spec |
| Targets exist but no logs | Promtail can't read log files | Check volume mounts on DaemonSet |
| 429 Too Many Requests | Per-stream rate limit exceeded | Increase per_stream_rate_limit or reduce log volume |
| entry out of order | Timestamps are arriving non-sequentially | Enable unordered_writes: true in Loki config |
The Out-of-Order Entries Problem
By default, Loki rejects log entries that arrive with timestamps older than the most recent entry for that stream. This happens frequently with pods that buffer logs or when Promtail restarts and replays its position file. Enable unordered writes to fix it:
# Add to values-loki.yaml under loki.limits_config
loki:
  limits_config:
    unordered_writes: true
This adds a small performance overhead but eliminates the most common source of dropped logs in production.
Multi-Tenant Loki for Team Isolation
When multiple teams share a Loki cluster, tenant isolation prevents noisy services from one team degrading query performance for everyone. Enable multi-tenancy and configure per-tenant limits:
# values-loki.yaml
loki:
  auth_enabled: true
  limits_config:
    # Default limits for all tenants
    retention_period: 744h
    ingestion_rate_mb: 4
    ingestion_burst_size_mb: 8
    per_stream_rate_limit: 3MB
  runtime_config:
    overrides:
      # Team with high log volume gets higher limits
      platform-team:
        ingestion_rate_mb: 20
        ingestion_burst_size_mb: 40
        per_stream_rate_limit: 10MB
        max_query_parallelism: 64
      # Team with lower needs gets standard limits
      frontend-team:
        ingestion_rate_mb: 4
        per_stream_rate_limit: 3MB
        max_query_parallelism: 16
Configure Promtail to set the tenant ID based on the namespace:
# values-promtail.yaml — add to scrape_configs
pipeline_stages:
  - tenant:
      source: namespace
Now each namespace's logs are isolated. The platform team's verbose debug logging can't exhaust the frontend team's query budget. Grafana passes the X-Scope-OrgID header to query a specific tenant:
# Query logs for a specific tenant
curl -H "X-Scope-OrgID: platform-team" \
"http://loki-gateway.observability/loki/api/v1/query_range" \
--data-urlencode 'query={app="api", level="error"}' \
--data-urlencode 'start=1711000000000000000' \
--data-urlencode 'end=1711100000000000000'
In Grafana, configure separate Loki data sources per tenant, each with a custom HTTP header for the org ID. This gives teams self-service access to their own logs without stepping on each other.
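Those per-tenant data sources can be provisioned declaratively instead of clicked together in the UI. A sketch of a Grafana data source provisioning file (the name and file path are up to you; the httpHeaderName1/httpHeaderValue1 pair is Grafana's convention for custom headers):

```yaml
# grafana/provisioning/datasources/loki-platform.yaml
apiVersion: 1
datasources:
  - name: Loki (platform-team)
    type: loki
    access: proxy
    url: http://loki-gateway.observability
    jsonData:
      httpHeaderName1: X-Scope-OrgID
    secureJsonData:
      httpHeaderValue1: platform-team
```

One such file per tenant keeps the isolation boundary in version control rather than in someone's browser session.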
That's the kind of trade-off an SRE should be making: optimize for the common case, not the edge case.