Istio Traffic Management: Routing, Canary, and Circuit Breaking
Traffic management is where Istio delivers the most immediate, visible value. Instead of building routing logic into your application or relying on Kubernetes' basic round-robin service discovery, Istio lets you define sophisticated routing rules declaratively. You can split traffic between service versions, inject faults for chaos testing, configure circuit breakers, and roll out canary deployments, all without touching a single line of application code. What would traditionally require a complex API gateway configuration, custom load balancer rules, or application-level routing libraries becomes a set of YAML resources that any team member can understand and modify.
This guide covers the core traffic management resources in depth, walks through practical patterns you will use in production, and provides the operational context needed to implement these patterns safely at scale.
Traffic Management Architecture
Before diving into individual resources, understanding how Istio's traffic management components interact is essential.
```
External Traffic
        |
        v
[ Gateway ]            -- Configures which ports/protocols/hosts to accept
        |
        v
[ VirtualService ]     -- Routes traffic based on match rules (URI, headers, etc.)
        |
        v
[ DestinationRule ]    -- Applies policies after routing (load balancing, circuit breaking)
        |
        v
[ Envoy Proxy ]        -- Executes the compiled configuration
        |
        v
[ Kubernetes Service / Endpoints ] -- Final destination pods
```
Configuration evaluation order matters:
- Gateway decides whether to accept the connection
- VirtualService determines which destination receives the traffic
- DestinationRule applies traffic policies (load balancing, connection pools, outlier detection)
- Envoy proxy executes the compiled configuration
For mesh-internal traffic (service-to-service), the Gateway step is skipped. VirtualServices can operate on both gateway-bound and mesh-internal traffic.
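For example, a single VirtualService can be bound to both contexts through its gateways field; the reserved name mesh means "all sidecars in the mesh" (the api-gateway name below is illustrative):

```yaml
# Sketch: one VirtualService handling both ingress and in-mesh traffic.
# "api-gateway" is an illustrative Gateway name; "mesh" is Istio's
# reserved keyword for all sidecar proxies inside the mesh.
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: reviews-combined
spec:
  hosts:
  - reviews
  gateways:
  - api-gateway   # external traffic entering through the ingress gateway
  - mesh          # service-to-service traffic inside the mesh
  http:
  - route:
    - destination:
        host: reviews
```

If the gateways field is omitted entirely, the VirtualService defaults to mesh-only scope.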
VirtualService Deep Dive
A VirtualService defines how requests are routed to a destination service. It sits between the client and the Kubernetes Service, intercepting traffic and applying routing rules before the request reaches any pod.
Basic HTTP Routing
```yaml
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews  # Kubernetes service name
  http:
  - route:
    - destination:
        host: reviews
        subset: v2
```
The hosts field specifies which service this VirtualService applies to. It can be a short Kubernetes service name, a fully qualified name (reviews.default.svc.cluster.local), or a wildcard (*.example.com).
Header-Based Routing
Route specific users to a different version based on request headers:
```yaml
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  # Internal testers get v3
  - match:
    - headers:
        x-user-group:
          exact: internal-testers
    route:
    - destination:
        host: reviews
        subset: v3
  # Beta users get v2
  - match:
    - headers:
        x-user-group:
          exact: beta
    - headers:
        cookie:
          regex: ".*beta=true.*"
    route:
    - destination:
        host: reviews
        subset: v2
  # Everyone else gets v1
  - route:
    - destination:
        host: reviews
        subset: v1
```
Match rules within a single match block are ANDed together. Multiple match blocks under the same rule are ORed. Rules are evaluated top-to-bottom, and the first match wins.
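A sketch illustrating both semantics side by side (hosts and subsets reused from the examples above):

```yaml
# Sketch of the AND/OR semantics described above.
http:
- match:
  - headers:             # ONE match block: conditions are ANDed --
      x-user-group:      # the header AND the URI prefix must BOTH match
        exact: beta
    uri:
      prefix: /reviews
  route:
  - destination:
      host: reviews
      subset: v2
- match:                 # TWO match blocks: conditions are ORed --
  - uri:                 # EITHER prefix selects this rule
      prefix: /api/v1
  - uri:
      prefix: /legacy
  route:
  - destination:
      host: reviews
      subset: v1
```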
URI Matching
Route based on request path with prefix, exact, or regex matching:
```yaml
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: api-gateway
spec:
  hosts:
  - api.example.com
  gateways:
  - api-gateway
  http:
  # API v2 gets the new service
  - match:
    - uri:
        prefix: /api/v2/users
    route:
    - destination:
        host: users-service-v2
        port:
          number: 8080
    timeout: 10s
  # API v1 goes to legacy
  - match:
    - uri:
        prefix: /api/v1
    route:
    - destination:
        host: legacy-api
        port:
          number: 8080
    timeout: 30s  # Legacy is slower
  # Health checks
  - match:
    - uri:
        regex: "^/health(z)?$"
    route:
    - destination:
        host: health-checker
  # Static assets get long-lived cache headers
  - match:
    - uri:
        prefix: /static
    route:
    - destination:
        host: cdn-service
    headers:
      response:
        set:
          cache-control: "public, max-age=86400"
  # Default catch-all
  - route:
    - destination:
        host: frontend-service
        port:
          number: 3000
```
URL Rewriting
Rewrite the request path before forwarding to the upstream service:
```yaml
http:
- match:
  - uri:
      prefix: /api/v2/catalog
  rewrite:
    uri: /catalog
  route:
  - destination:
      host: catalog-service
      port:
        number: 8080
# Rewrite authority (Host header) for cross-cluster routing
- match:
  - uri:
      prefix: /external-api
  rewrite:
    uri: /api
    authority: external-service.partner.com
  route:
  - destination:
      host: external-service
```
Request Header Manipulation
```yaml
http:
- route:
  - destination:
      host: backend-service
  headers:
    request:
      set:
        x-forwarded-client-cert: "%DOWNSTREAM_PEER_CERT_V_START%"
        x-request-start: "%START_TIME(%s.%3f)%"
      add:
        x-custom-trace: "mesh-routed"
      remove:
      - x-internal-debug
    response:
      set:
        strict-transport-security: "max-age=31536000; includeSubDomains"
        x-content-type-options: "nosniff"
      remove:
      - server
      - x-powered-by
```
DestinationRule Deep Dive
A DestinationRule defines policies applied to traffic after routing has occurred. It configures subsets (versions), load balancing, connection pools, and outlier detection (circuit breaking).
Defining Subsets
Subsets map to different versions of a service using Kubernetes labels:
```yaml
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  subsets:
  - name: v1
    labels:
      version: v1
    trafficPolicy:
      connectionPool:
        tcp:
          maxConnections: 100
  - name: v2
    labels:
      version: v2
    trafficPolicy:
      connectionPool:
        tcp:
          maxConnections: 50  # New version gets lower limits initially
  - name: v3
    labels:
      version: v3
```
The corresponding Kubernetes deployments must have matching labels:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: reviews-v2
spec:
  replicas: 3
  selector:
    matchLabels:
      app: reviews
      version: v2
  template:
    metadata:
      labels:
        app: reviews
        version: v2
    spec:
      containers:
      - name: reviews
        image: reviews:2.0
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 256Mi
```
Both deployments serve the same Kubernetes Service (reviews), which selects pods by app: reviews. The version label differentiates them for Istio routing.
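For reference, the matching Service might look like the following sketch (the port number is assumed from the deployment above; note that the selector deliberately omits the version label so that all versions receive traffic):

```yaml
# Sketch of the shared Service; all fields assumed from the examples above.
apiVersion: v1
kind: Service
metadata:
  name: reviews
spec:
  selector:
    app: reviews      # selects v1, v2, and v3 pods alike
  ports:
  - name: http        # the port name (or appProtocol) tells Istio the protocol
    port: 8080
    targetPort: 8080
```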
Load Balancing Policies
```yaml
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  trafficPolicy:
    loadBalancer:
      simple: LEAST_REQUEST  # Default for the service
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
    trafficPolicy:
      loadBalancer:
        simple: ROUND_ROBIN  # Override for this subset
```
Available load balancing algorithms:
| Algorithm | Behavior | Best For |
|---|---|---|
| ROUND_ROBIN | Even distribution across all endpoints | Uniform workloads |
| LEAST_REQUEST | Routes to the endpoint with fewest active requests | Variable latency workloads |
| RANDOM | Random endpoint selection | Simple, stateless services |
| PASSTHROUGH | Direct connection to the original destination IP | When you need to bypass Envoy's load balancing |
Consistent Hashing (Session Affinity)
For services that benefit from sticky sessions:
```yaml
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: user-service
spec:
  host: user-service
  trafficPolicy:
    loadBalancer:
      consistentHash:
        httpHeaderName: x-user-id
        # Or use one of these alternatives:
        # httpCookie:
        #   name: session-id
        #   ttl: 3600s
        # useSourceIp: true
        # httpQueryParameterName: user_id
    connectionPool:
      http:
        http2MaxRequests: 1000
```
Connection Pool Settings
Control how many connections and requests can be made to a service. These settings are critical for preventing resource exhaustion:
```yaml
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100      # Max TCP connections
        connectTimeout: 30ms     # TCP connect timeout
        tcpKeepalive:
          time: 7200s            # Idle time before the first keepalive probe
          interval: 75s          # Time between keepalive probes
      http:
        h2UpgradePolicy: DEFAULT        # Upgrade HTTP/1.1 to HTTP/2 when possible
        http1MaxPendingRequests: 100    # Max pending HTTP/1.1 requests
        http2MaxRequests: 1000          # Max active HTTP/2 requests
        maxRequestsPerConnection: 10    # Reuse a connection for up to 10 requests
        maxRetries: 3                   # Max concurrent retries
        idleTimeout: 300s               # Close idle connections after 5 minutes
```
Gateway Configuration
The Gateway resource configures the Istio ingress gateway to accept external traffic:
```yaml
apiVersion: networking.istio.io/v1
kind: Gateway
metadata:
  name: production-gateway
spec:
  selector:
    istio: ingress
  servers:
  # HTTPS with TLS termination
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE
      credentialName: production-tls  # Secret in the gateway namespace
    hosts:
    - "app.example.com"
    - "api.example.com"
  # HTTP with redirect to HTTPS
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "app.example.com"
    - "api.example.com"
    tls:
      httpsRedirect: true
  # Mutual TLS for partner APIs
  - port:
      number: 443
      name: https-mtls
      protocol: HTTPS
    tls:
      mode: MUTUAL
      credentialName: partner-mtls-cert
    hosts:
    - "partner-api.example.com"
```
TLS Modes
| Mode | Description | Use Case |
|---|---|---|
| SIMPLE | Standard TLS (server cert only) | Public-facing websites and APIs |
| MUTUAL | Mutual TLS (server and client certs) | Partner APIs, B2B integrations |
| PASSTHROUGH | TLS not terminated, forwarded as-is | When the backend handles its own TLS |
| AUTO_PASSTHROUGH | Like PASSTHROUGH but with SNI-based routing | Multi-cluster gateways |
| ISTIO_MUTUAL | Istio internal mTLS | East-west gateways between clusters |
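As a sketch of PASSTHROUGH mode (hostnames and ports illustrative): the Gateway matches on SNI without terminating TLS, and the paired VirtualService routes with tls rules rather than http rules, since HTTP attributes are not visible inside the encrypted stream:

```yaml
# Sketch: TLS passthrough, assuming an illustrative "secure-backend" service
# that terminates its own TLS on port 8443.
apiVersion: networking.istio.io/v1
kind: Gateway
metadata:
  name: tls-passthrough-gateway
spec:
  selector:
    istio: ingress
  servers:
  - port:
      number: 443
      name: tls
      protocol: TLS
    tls:
      mode: PASSTHROUGH       # TLS is NOT terminated at the gateway
    hosts:
    - "secure.example.com"
---
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: secure-backend
spec:
  hosts:
  - secure.example.com
  gateways:
  - tls-passthrough-gateway
  tls:                        # tls routes match on SNI, not URIs or headers
  - match:
    - sniHosts:
      - secure.example.com
    route:
    - destination:
        host: secure-backend
        port:
          number: 8443
```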
Traffic Splitting for Canary Deployments
Canary deployments route a small percentage of traffic to a new version while the majority continues to the stable version. This is one of Istio's most powerful capabilities.
Weighted Routing
```yaml
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 90
    - destination:
        host: reviews
        subset: v2
      weight: 10
```
Progressive Canary Rollout with Automated Health Checks
```bash
#!/bin/bash
# canary-rollout.sh - Progressive canary with automated health monitoring
set -euo pipefail

SERVICE="reviews"
NAMESPACE="default"
PROMETHEUS_URL="http://prometheus:9090"
CANARY_WEIGHTS=(5 10 25 50 75 100)
OBSERVATION_PERIOD=300   # 5 minutes between steps
ERROR_THRESHOLD=0.01     # 1% error rate threshold
LATENCY_THRESHOLD=500    # 500ms P99 latency threshold

check_error_rate() {
  local error_rate
  error_rate=$(curl -s "${PROMETHEUS_URL}/api/v1/query" \
    --data-urlencode "query=sum(rate(istio_requests_total{destination_service='${SERVICE}.${NAMESPACE}.svc.cluster.local',response_code=~'5..'}[5m])) / sum(rate(istio_requests_total{destination_service='${SERVICE}.${NAMESPACE}.svc.cluster.local'}[5m]))" \
    | jq -r '.data.result[0].value[1] // "0"')
  echo "Current error rate: ${error_rate}"
  if (( $(echo "${error_rate} > ${ERROR_THRESHOLD}" | bc -l) )); then
    return 1
  fi
  return 0
}

check_latency() {
  local p99_latency
  p99_latency=$(curl -s "${PROMETHEUS_URL}/api/v1/query" \
    --data-urlencode "query=histogram_quantile(0.99, sum(rate(istio_request_duration_milliseconds_bucket{destination_service='${SERVICE}.${NAMESPACE}.svc.cluster.local',reporter='destination'}[5m])) by (le))" \
    | jq -r '.data.result[0].value[1] // "0"')
  echo "Current P99 latency: ${p99_latency}ms"
  if (( $(echo "${p99_latency} > ${LATENCY_THRESHOLD}" | bc -l) )); then
    return 1
  fi
  return 0
}

rollback() {
  echo "ROLLING BACK: Setting 100% traffic to v1"
  kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: ${SERVICE}
  namespace: ${NAMESPACE}
spec:
  hosts:
  - ${SERVICE}
  http:
  - route:
    - destination:
        host: ${SERVICE}
        subset: v1
      weight: 100
EOF
  echo "Rollback complete."
  exit 1
}

for weight in "${CANARY_WEIGHTS[@]}"; do
  stable_weight=$((100 - weight))
  echo "=== Setting canary to ${weight}% (stable: ${stable_weight}%) ==="
  kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: ${SERVICE}
  namespace: ${NAMESPACE}
spec:
  hosts:
  - ${SERVICE}
  http:
  - route:
    - destination:
        host: ${SERVICE}
        subset: v1
      weight: ${stable_weight}
    - destination:
        host: ${SERVICE}
        subset: v2
      weight: ${weight}
EOF

  if [ "${weight}" -eq 100 ]; then
    echo "Canary rollout complete. v2 is now receiving 100% of traffic."
    exit 0
  fi

  echo "Observing for ${OBSERVATION_PERIOD} seconds..."
  sleep "${OBSERVATION_PERIOD}"

  if ! check_error_rate; then
    echo "ERROR: Error rate exceeds threshold"
    rollback
  fi
  if ! check_latency; then
    echo "ERROR: Latency exceeds threshold"
    rollback
  fi

  echo "Health checks passed. Proceeding to next step."
done
```
Traffic Mirroring (Shadowing)
Mirror production traffic to a new version for testing without affecting users:
```yaml
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
    mirror:
      host: reviews
      subset: v2
    mirrorPercentage:
      value: 100.0
```
Mirrored requests are fire-and-forget: responses from the mirrored destination are discarded, and the copy is sent asynchronously, so mirroring adds no latency to the primary request path. Istio also appends -shadow to the Host/Authority header of mirrored requests, which makes them easy to distinguish in v2's access logs. This lets you validate that v2 handles production traffic correctly by checking its logs, metrics, and error rates.
Use cases for traffic mirroring:
| Scenario | Mirror Percentage | Duration |
|---|---|---|
| New database migration validation | 100% | Until all queries verified |
| Performance regression testing | 50% | 24-48 hours |
| Shadow deployment before canary | 100% | 1-2 hours |
| Ongoing A/B data collection | 10% | Indefinite |
Fault Injection
Test your application's resilience by injecting controlled failures. This is essential for chaos engineering and validating that your services handle degraded dependencies gracefully.
Delay Injection
```yaml
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: ratings
spec:
  hosts:
  - ratings
  http:
  - fault:
      delay:
        percentage:
          value: 50.0
        fixedDelay: 3s
    route:
    - destination:
        host: ratings
        subset: v1
```
Abort Injection
```yaml
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: ratings
spec:
  hosts:
  - ratings
  http:
  - fault:
      abort:
        percentage:
          value: 10.0
        httpStatus: 503
    route:
    - destination:
        host: ratings
        subset: v1
```
Combined Faults with Targeted Injection
Inject faults only for specific users or test traffic:
```yaml
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: ratings
spec:
  hosts:
  - ratings
  http:
  # Chaos testing traffic gets faults
  - match:
    - headers:
        x-chaos-test:
          exact: "true"
    fault:
      delay:
        percentage:
          value: 50.0
        fixedDelay: 5s
      abort:
        percentage:
          value: 10.0
        httpStatus: 500
    route:
    - destination:
        host: ratings
        subset: v1
  # Normal traffic is unaffected
  - route:
    - destination:
        host: ratings
        subset: v1
```
Use fault injection during chaos engineering exercises to verify that upstream services handle degraded dependencies gracefully with proper timeouts, fallbacks, and circuit breaking.
Retries and Timeouts
Configure automatic retries and request timeouts. These are among the most important traffic management settings for production reliability.
```yaml
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - timeout: 10s
    retries:
      attempts: 3
      perTryTimeout: 3s
      retryOn: gateway-error,connect-failure,refused-stream,503,retriable-4xx
      retryRemoteLocalities: true  # Try endpoints in other zones on retry
    route:
    - destination:
        host: reviews
        subset: v2
```
Key considerations for retry configuration:
| Setting | Recommendation | Reason |
|---|---|---|
| perTryTimeout | Less than timeout / (attempts + 1) | Leaves room for all retry attempts within the overall timeout |
| retryOn | Include gateway-error,connect-failure,refused-stream | These are safe to retry since the request was never processed |
| 503 in retryOn | Include for idempotent endpoints only | 503 may mean the request was partially processed |
| retriable-4xx | Only include if your service returns 409 on transient conflicts | Avoids retrying permanent client errors |
| Max attempts | 2-3 for synchronous requests | More retries increase total latency budget |
Critical warning: Be cautious with retries on non-idempotent operations (POST, DELETE, PATCH). If a request times out but was actually processed, a retry will execute the operation twice. Only enable retries on these methods when your service is idempotent or uses idempotency keys.
Per-Route Timeout Strategy
```yaml
http:
# Read operations: short timeout, aggressive retries
- match:
  - method:
      exact: GET
  timeout: 5s
  retries:
    attempts: 3
    perTryTimeout: 1500ms
    retryOn: gateway-error,connect-failure,refused-stream,503
  route:
  - destination:
      host: api-service
# Write operations: longer timeout, no retries
- match:
  - method:
      exact: POST
  - method:
      exact: PUT
  - method:
      exact: DELETE
  timeout: 30s
  retries:
    attempts: 0  # No retries for writes
  route:
  - destination:
      host: api-service
```
Circuit Breaking Configuration
Circuit breaking prevents cascading failures by stopping requests to an unhealthy service. Istio implements circuit breaking through connection pool limits and outlier detection.
```yaml
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
        connectTimeout: 100ms
      http:
        http1MaxPendingRequests: 50
        http2MaxRequests: 100
        maxRequestsPerConnection: 10
        maxRetries: 3
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 10s
      baseEjectionTime: 30s
      maxEjectionPercent: 50
      minHealthPercent: 30
      splitExternalLocalOriginErrors: true
      consecutiveLocalOriginFailures: 5
```
How Outlier Detection Works
| Parameter | Meaning | Recommended Value |
|---|---|---|
| consecutive5xxErrors | Eject a pod after N consecutive 5xx errors | 3-5 |
| interval | How often to evaluate endpoints for ejection | 10-30s |
| baseEjectionTime | Minimum time an ejected pod stays out | 30-60s |
| maxEjectionPercent | Never eject more than this % of pods | 30-50% |
| minHealthPercent | Disable ejection if fewer than this % are healthy | 30% |
| splitExternalLocalOriginErrors | Distinguish between upstream 5xx and local errors | true |
| consecutiveLocalOriginFailures | Eject on connection failures (not just 5xx) | 3-5 |
The ejection time grows with each consecutive ejection of the same pod: baseEjectionTime * number_of_ejections (with a 30s base: 30s, then 60s, then 90s, up to Envoy's maximum ejection time). This keeps a repeatedly failing pod out of rotation for progressively longer while still allowing quick recovery when the issue was transient.
Testing Circuit Breaking
Use a load testing tool to verify your circuit breaker configuration:
# Deploy a simple load generator
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
name: fortio
spec:
replicas: 1
selector:
matchLabels:
app: fortio
template:
metadata:
labels:
app: fortio
spec:
containers:
- name: fortio
image: fortio/fortio
ports:
- containerPort: 8080
EOF
# Generate load with many concurrent connections
kubectl exec deploy/fortio -c fortio -- \
fortio load -c 50 -qps 0 -n 1000 -loglevel Warning \
http://reviews:8080/api/reviews
# Check for circuit breaker triggers in proxy stats
kubectl exec deploy/fortio -c istio-proxy -- \
pilot-agent request GET stats | grep "upstream_rq_pending_overflow"
If you see upstream_rq_pending_overflow counts increasing, the circuit breaker is activating and rejecting excess requests. This protects the downstream service from overload.
Rate Limiting
Istio does not have a built-in rate limiting CRD, but you can configure local rate limiting via EnvoyFilter or use an external rate limit service.
Local Rate Limiting via EnvoyFilter
```yaml
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: rate-limit
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      istio: ingress
  configPatches:
  - applyTo: HTTP_FILTER
    match:
      context: GATEWAY
      listener:
        filterChain:
          filter:
            name: envoy.filters.network.http_connection_manager
            subFilter:
              name: envoy.filters.http.router
    patch:
      operation: INSERT_BEFORE
      value:
        name: envoy.filters.http.local_ratelimit
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
          stat_prefix: http_local_rate_limiter
          token_bucket:
            max_tokens: 1000
            tokens_per_fill: 1000
            fill_interval: 60s
          filter_enabled:
            runtime_key: local_rate_limit_enabled
            default_value:
              numerator: 100
              denominator: HUNDRED
          filter_enforced:
            runtime_key: local_rate_limit_enforced
            default_value:
              numerator: 100
              denominator: HUNDRED
          response_headers_to_add:
          - append_action: OVERWRITE_IF_EXISTS_OR_ADD
            header:
              key: x-rate-limited
              value: "true"
          status:
            code: TooManyRequests
```
External Rate Limit Service
For production rate limiting with per-user or per-API-key limits, deploy a dedicated rate limit service:
```yaml
# Deploy the rate limit service
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ratelimit
  namespace: istio-system
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ratelimit
  template:
    metadata:
      labels:
        app: ratelimit
    spec:
      containers:
      - name: ratelimit
        image: envoyproxy/ratelimit:latest
        ports:
        - containerPort: 8080  # HTTP
        - containerPort: 8081  # gRPC
        env:
        - name: REDIS_SOCKET_TYPE
          value: tcp
        - name: REDIS_URL
          value: redis:6379
        - name: RUNTIME_ROOT
          value: /data
        - name: RUNTIME_SUBDIRECTORY
          value: ratelimit
        volumeMounts:
        - name: config
          mountPath: /data/ratelimit/config
      volumes:
      - name: config
        configMap:
          name: ratelimit-config
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: ratelimit-config
  namespace: istio-system
data:
  config.yaml: |
    domain: production
    descriptors:
    - key: header_match
      value: api
      rate_limit:
        unit: minute
        requests_per_unit: 100
    - key: remote_address
      rate_limit:
        unit: second
        requests_per_unit: 10
```
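The deployment and ConfigMap do nothing by themselves: the gateway's Envoy must also be told to call the rate limit service. The sketch below shows the general shape of that wiring, assuming the ratelimit service from above; the cluster name follows Istio's outbound|port||host naming convention, and a second EnvoyFilter adding rate_limits actions at the VIRTUAL_HOST level is still needed to generate descriptors. Verify the field names against your Istio version's documentation before use.

```yaml
# Sketch: point the gateway's HTTP filter chain at the external
# rate limit service deployed above. Field values are assumptions
# based on the common Istio rate-limiting pattern.
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: filter-ratelimit
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      istio: ingress
  configPatches:
  - applyTo: HTTP_FILTER
    match:
      context: GATEWAY
      listener:
        filterChain:
          filter:
            name: envoy.filters.network.http_connection_manager
            subFilter:
              name: envoy.filters.http.router
    patch:
      operation: INSERT_BEFORE
      value:
        name: envoy.filters.http.ratelimit
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.http.ratelimit.v3.RateLimit
          domain: production          # must match the domain in ratelimit-config
          failure_mode_deny: false    # fail open if the service is unreachable
          rate_limit_service:
            grpc_service:
              envoy_grpc:
                cluster_name: outbound|8081||ratelimit.istio-system.svc.cluster.local
            transport_api_version: V3
```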
Practical Canary Deployment Walkthrough
Here is a complete end-to-end canary deployment with all the necessary resources.
Step 1: Ensure Both Versions Are Deployed
```bash
# Deploy v1 (already running)
kubectl get deployment reviews-v1

# Deploy v2 alongside v1
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: reviews-v2
spec:
  replicas: 2
  selector:
    matchLabels:
      app: reviews
      version: v2
  template:
    metadata:
      labels:
        app: reviews
        version: v2
    spec:
      containers:
      - name: reviews
        image: reviews:2.0
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 20
EOF

# Wait for v2 to be ready
kubectl rollout status deployment/reviews-v2 --timeout=120s
```
Step 2: Create DestinationRule with Subsets
```bash
kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        http1MaxPendingRequests: 50
        http2MaxRequests: 100
    outlierDetection:
      consecutive5xxErrors: 3
      interval: 10s
      baseEjectionTime: 30s
      maxEjectionPercent: 50
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
EOF
```
Step 3: Start with 100% to v1
```bash
kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 100
    - destination:
        host: reviews
        subset: v2
      weight: 0
EOF
```
Step 4: Shift Traffic Gradually
```bash
# 10% canary
kubectl patch virtualservice reviews --type merge -p '
spec:
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 90
    - destination:
        host: reviews
        subset: v2
      weight: 10'
```
Step 5: Monitor and Complete
Watch error rates and latency in your observability stack. If metrics look good, increase to 25%, 50%, 75%, then 100%. If anything goes wrong, set v1 back to 100%.
```bash
# Monitor canary health in real time
watch -n 5 'kubectl exec deploy/prometheus-server -n monitoring -- \
  wget -qO- "http://localhost:9090/api/v1/query?query=sum(rate(istio_requests_total{destination_app=\"reviews\",response_code=~\"5..\"}[1m]))by(destination_version)" 2>/dev/null | jq .'
```
Debugging Traffic Management Issues
```bash
# Check if VirtualService is being applied
istioctl analyze -n default

# View effective routes for a specific proxy
istioctl proxy-config routes deploy/reviews-v1 -o json

# Check if destination subsets match deployment labels
kubectl get pods -l app=reviews --show-labels

# View Envoy access logs for a specific pod
kubectl logs deploy/reviews-v1 -c istio-proxy --tail=50

# Test routing from within the mesh
kubectl exec deploy/sleep -- curl -s -H "x-user-group: internal-testers" http://reviews:8080/api/reviews
```
Summary
Istio's traffic management capabilities eliminate the need for application-level routing logic. Start with VirtualServices and DestinationRules for basic routing, use traffic splitting for safe canary deployments, configure circuit breakers to prevent cascading failures, and use fault injection to validate your resilience. Every rule is declarative, version-controlled, and takes effect without redeploying your applications. The key to success is starting simple: route all traffic to v1, then gradually introduce traffic splitting, and always have a rollback ready. Monitor error rates and latency at every step, and automate the canary progression with health checks. The combination of traffic mirroring for validation, canary deployment for safe rollout, and circuit breaking for protection gives you a robust deployment strategy that minimizes risk while maximizing deployment velocity.