Azure Core Services: The DevOps Engineer's Essential Guide
Azure is the cloud platform most DevOps engineers encounter when their organization runs Microsoft workloads, uses Active Directory, or has enterprise agreements in place. Even if you are primarily an AWS shop, understanding Azure is increasingly valuable as multi-cloud becomes the norm rather than the exception. Azure's hybrid cloud story, enterprise identity management, and deep integration with the Microsoft ecosystem give it strengths that other clouds struggle to match.
Azure Resource Model
Azure organizes resources differently from AWS, and understanding this hierarchy is critical before you deploy anything. Misunderstanding the resource model leads to billing surprises, access control gaps, and operational headaches.
The Hierarchy
Azure AD Tenant (Entra ID)
+-- Management Groups (optional, for governance at scale)
    +-- Subscriptions (billing boundary, like AWS accounts)
        +-- Resource Groups (logical container for related resources)
            +-- Resources (VMs, storage accounts, databases, etc.)
Key Concepts Explained
Management Groups sit at the top and let you organize subscriptions into a hierarchy. You can apply Azure Policies and RBAC assignments to management groups, and they cascade down to all child subscriptions. A company might have a root management group, then child groups for Production, Non-Production, and Sandbox, each containing the relevant subscriptions.
Subscriptions are the primary billing boundary. They map roughly to AWS accounts. Each subscription has a trust relationship with exactly one Entra ID tenant. Common patterns include one subscription per environment (dev, staging, prod), one per team, or one per application. Subscriptions carry resource quotas; many can be raised through a quota increase request to Azure support, but some are hard limits.
Resource Groups are mandatory -- every resource belongs to exactly one. They serve as deployment targets, access control boundaries, and lifecycle management units. Delete a resource group and everything inside it gets deleted. This is both powerful and dangerous.
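The hierarchy is visible in every Azure resource ID, which is why the CLI examples in this guide pass long `/subscriptions/.../resourceGroups/...` paths. The following is a minimal sketch of building and parsing such an ID. The `ResourceId` type and both helper functions are hypothetical names for illustration, and the parser assumes a top-level resource (nested resources such as subnets have extra segments).

```python
from typing import NamedTuple


class ResourceId(NamedTuple):
    subscription_id: str
    resource_group: str
    provider: str        # e.g. "Microsoft.Storage"
    resource_type: str   # e.g. "storageAccounts"
    name: str


def build_resource_id(rid: ResourceId) -> str:
    """Assemble the ID in hierarchy order: subscription, resource group, resource."""
    return (
        f"/subscriptions/{rid.subscription_id}"
        f"/resourceGroups/{rid.resource_group}"
        f"/providers/{rid.provider}/{rid.resource_type}/{rid.name}"
    )


def parse_resource_id(resource_id: str) -> ResourceId:
    """Split a top-level resource ID back into its parts."""
    parts = resource_id.strip("/").split("/")
    well_formed = (
        len(parts) == 8
        and parts[0].lower() == "subscriptions"
        and parts[2].lower() == "resourcegroups"
        and parts[4].lower() == "providers"
    )
    if not well_formed:
        raise ValueError(f"not a top-level resource ID: {resource_id!r}")
    return ResourceId(parts[1], parts[3], parts[5], parts[6], parts[7])
```

Parsing the ID is a quick way to answer "which subscription and resource group does this resource bill to and inherit access from?" when all you have is a log line or an alert payload.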
Cross-Cloud Resource Model Comparison
| Concept | Azure | AWS | GCP | Alibaba Cloud |
|---|---|---|---|---|
| Billing boundary | Subscription | Account | Project | Account |
| Logical grouping | Resource Group | Tags (no direct equivalent) | Labels, Folders | Resource Group |
| Identity provider | Entra ID (Azure AD) | IAM + Identity Center | Cloud Identity | IDaaS |
| Policy enforcement | Azure Policy | SCPs + Config Rules | Organization Policies | Config Rules |
| Region scoping | Per-resource | Per-resource | Per-resource | Per-resource |
| Top-level container | Management Group | Organization | Organization | Resource Directory |
# Create a resource group
az group create --name rg-production-web --location eastus
# List all resources in a group
az resource list --resource-group rg-production-web --output table
# Tag a resource group for cost tracking
az group update --name rg-production-web \
--tags Environment=production Team=platform CostCenter=CC-1234
# Delete a resource group (destructive - deletes everything inside)
az group delete --name rg-staging-test --yes --no-wait
Azure Policy
Azure Policy evaluates resources for compliance with organizational standards. Unlike SCPs in AWS (which only deny), Azure Policy can deny, audit, modify, and deploy resources.
# Assign a built-in policy to require tags on resource groups
az policy assignment create \
--name require-env-tag \
--display-name "Require Environment tag on resource groups" \
--policy "/providers/Microsoft.Authorization/policyDefinitions/96670d01-0a4d-4649-9c89-2d3abc0a5025" \
--scope /subscriptions/00000000-0000-0000-0000-000000000000 \
--params '{"tagName": {"value": "Environment"}}'
# List non-compliant resources
az policy state list \
--filter "complianceState eq 'NonCompliant'" \
--query "[].{Resource:resourceId, Policy:policyDefinitionName}" \
--output table
Common policies include enforcing resource tagging, restricting VM sizes (preventing expensive GPU instances in dev), requiring encryption on storage accounts, and restricting which regions resources can be deployed to.
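To make the deny/audit distinction concrete, here is a small sketch of how a "require a tag" rule could be evaluated against a resource group. It is a simplified model, not the Azure Policy engine: the function name and the dictionary shape are assumptions for illustration.

```python
def evaluate_required_tag(resource: dict, tag_name: str, effect: str = "deny") -> str:
    """Return 'compliant', or the configured effect when the tag is missing.

    Simplified model: real Azure Policy evaluates JSON rule definitions and
    supports more effects (modify, deployIfNotExists, and others).
    """
    tags = resource.get("tags") or {}
    # Azure tag names are case-insensitive, so compare in lower case.
    has_tag = any(key.lower() == tag_name.lower() for key in tags)
    return "compliant" if has_tag else effect
```

With `effect="deny"` the request would be rejected at deployment time; with `effect="audit"` the resource is created and merely reported as non-compliant, which is the safer way to roll out a new policy.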
Azure AD / Entra ID
Microsoft renamed Azure Active Directory to Microsoft Entra ID in 2023, but the functionality is the same. It is the identity backbone of Azure and integrates with Microsoft 365, on-premises Active Directory, and third-party applications via SAML and OIDC.
Key Concepts for DevOps
- Users and Groups -- human identities, organized into security groups for RBAC. Groups can be dynamic (membership based on attributes) or assigned.
- Service Principals -- application identities used by CI/CD pipelines and automation tools. Similar to AWS IAM roles for services. The portal creates one automatically when you register an application; with the CLI, az ad app create registers only the application object and az ad sp create adds the service principal.
- Managed Identities -- Azure-managed service principals that automatically handle credential rotation. System-assigned (tied to a resource lifecycle) or user-assigned (independent lifecycle, reusable across resources).
- App Registrations -- how you register applications that need to authenticate against Entra ID. A registration defines the application object; the service principal is its local representation in a given tenant.
- Conditional Access -- policies that enforce MFA, device compliance, or location-based restrictions before granting access. Enterprise feature critical for security.
Always prefer managed identities over service principals with client secrets. Managed identities eliminate credential management entirely -- no secrets to rotate, no keys to leak.
# Create a user-assigned managed identity
az identity create \
--name mi-webapp-production \
--resource-group rg-production-web
# Get the principal ID for role assignments
PRINCIPAL_ID=$(az identity show \
--name mi-webapp-production \
--resource-group rg-production-web \
--query principalId -o tsv)
# Assign a role to the managed identity
az role assignment create \
--assignee "$PRINCIPAL_ID" \
--role "Storage Blob Data Reader" \
--scope /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-production-web
# Create a service principal for CI/CD (when managed identity is not possible)
az ad sp create-for-rbac \
--name sp-github-deploy \
--role Contributor \
--scopes /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-production-web \
--years 1
Federated Identity Credentials
For GitHub Actions and other external CI/CD systems, use federated identity credentials instead of client secrets. This eliminates the need to store Azure credentials as secrets in your CI/CD system.
# Create federated credential for GitHub Actions
az ad app federated-credential create \
--id "00000000-0000-0000-0000-000000000000" \
--parameters '{
"name": "github-main-branch",
"issuer": "https://token.actions.githubusercontent.com",
"subject": "repo:myorg/myrepo:ref:refs/heads/main",
"audiences": ["api://AzureADTokenExchange"]
}'
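The `subject` claim has to match the token GitHub issues exactly, and it changes shape depending on what triggers the workflow. The sketch below generates the `--parameters` JSON for the common cases. The helper names are hypothetical; the subject formats follow GitHub's documented OIDC claim patterns for branches, environments, and pull requests.

```python
import json
from typing import Optional

GITHUB_ISSUER = "https://token.actions.githubusercontent.com"
AUDIENCE = "api://AzureADTokenExchange"


def github_subject(org: str, repo: str, *, branch: Optional[str] = None,
                   environment: Optional[str] = None,
                   pull_request: bool = False) -> str:
    """Build the subject claim for exactly one trigger type."""
    chosen = [branch is not None, environment is not None, pull_request]
    if sum(chosen) != 1:
        raise ValueError("choose exactly one of branch, environment, pull_request")
    prefix = f"repo:{org}/{repo}"
    if branch is not None:
        return f"{prefix}:ref:refs/heads/{branch}"
    if environment is not None:
        return f"{prefix}:environment:{environment}"
    return f"{prefix}:pull_request"


def federated_credential(name: str, subject: str) -> str:
    """Return JSON suitable for az ad app federated-credential create --parameters."""
    return json.dumps(
        {"name": name, "issuer": GITHUB_ISSUER, "subject": subject, "audiences": [AUDIENCE]},
        indent=2,
    )
```

A job that targets a GitHub environment presents the `environment` form of the subject rather than the branch form, so a workflow that deploys from `main` into a `production` environment needs a credential for the environment subject.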
RBAC (Role-Based Access Control)
Azure RBAC is built into every resource. Assignments consist of three elements: who (user, group, or service principal), what (role definition), and where (scope). Permissions are additive -- if you have two role assignments, you get the union of both.
Built-in roles you will use constantly:
| Role | What It Allows | Scope Level |
|---|---|---|
| Reader | View everything, change nothing | Any |
| Contributor | Create and manage resources, but not assign roles | Any |
| Owner | Full access including role assignments | Any |
| User Access Administrator | Manage role assignments only | Any |
| Azure Kubernetes Service RBAC Cluster Admin | Full access to all resources in an AKS cluster | AKS cluster |
| Storage Blob Data Contributor | Read/write blob data (not management plane) | Storage account |
| Key Vault Secrets User | Read secrets from Key Vault | Key Vault |
| Network Contributor | Manage networking resources | Any |
Important distinction: Contributor can manage resources but cannot grant access to others. Owner can do both. For CI/CD pipelines, Contributor scoped to the target resource group is usually sufficient; assigning it at subscription scope grants far more than most pipelines need.
# Assign Contributor role on a resource group
az role assignment create \
--assignee user@example.com \
--role "Contributor" \
--resource-group rg-production-web
# List all role assignments for a resource group
az role assignment list \
--resource-group rg-production-web \
--output table
# Create a custom role
az role definition create --role-definition '{
"Name": "VM Restart Operator",
"Description": "Can restart VMs but nothing else",
"Actions": [
"Microsoft.Compute/virtualMachines/restart/action",
"Microsoft.Compute/virtualMachines/read"
],
"AssignableScopes": ["/subscriptions/00000000-0000-0000-0000-000000000000"]
}'
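The additive model described above can be sketched in a few lines: each role grants its Actions minus its NotActions, and the effective permission set is the union across all assigned roles. This is an illustrative model only. The role table is abbreviated (Contributor's real NotActions list is longer) and the function names are assumptions.

```python
from fnmatch import fnmatchcase

# Abbreviated role definitions for illustration; not the full built-in definitions.
ROLES = {
    "Reader": {"actions": ["*/read"], "not_actions": []},
    "Contributor": {
        "actions": ["*"],
        "not_actions": ["Microsoft.Authorization/*/Write", "Microsoft.Authorization/*/Delete"],
    },
    "VM Restart Operator": {
        "actions": [
            "Microsoft.Compute/virtualMachines/restart/action",
            "Microsoft.Compute/virtualMachines/read",
        ],
        "not_actions": [],
    },
}


def role_allows(role: dict, action: str) -> bool:
    """A single role grants an action if it matches Actions and no NotActions."""
    action = action.lower()  # action strings are compared case-insensitively
    granted = any(fnmatchcase(action, p.lower()) for p in role["actions"])
    excluded = any(fnmatchcase(action, p.lower()) for p in role["not_actions"])
    return granted and not excluded


def is_allowed(assigned_roles: list, action: str) -> bool:
    """Permissions are additive: any one assigned role granting the action is enough."""
    return any(role_allows(ROLES[name], action) for name in assigned_roles)
```

Note that NotActions only subtracts from the role that declares it. It is not a deny: if a second assignment grants the same action, the principal still has it.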
Virtual Machines
Azure VMs are the compute foundation. The VM size naming convention tells you what you are getting, and understanding it saves time when selecting the right instance.
Understanding VM Size Names
The format is: [Family][Sub-family][vCPUs][Constrained vCPUs][Additive Features][Accelerator Type][Version]
Common families:
| Family | Optimized For | Example | Specs | Approx Cost/hr |
|---|---|---|---|---|
| B | Burstable (dev/test) | B2s | 2 vCPU, 4 GB RAM | ~$0.042 |
| D | General purpose | D4s_v5 | 4 vCPU, 16 GB RAM | ~$0.192 |
| E | Memory optimized | E8s_v5 | 8 vCPU, 64 GB RAM | ~$0.504 |
| F | Compute optimized | F4s_v2 | 4 vCPU, 8 GB RAM | ~$0.169 |
| L | Storage optimized | L8s_v3 | 8 vCPU, 64 GB, NVMe | ~$0.624 |
| N | GPU | NC6s_v3 | 6 vCPU, 112 GB, 1 GPU | ~$3.06 |
The s suffix means premium storage capable. The d suffix means local temp disk. Always use the latest version (v5, v6) for best price-performance. Older versions are not retired immediately, but they cost more per unit of performance.
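The naming format lends itself to a regular expression. The sketch below splits a size name into the fields listed above. It is a simplified parser that assumes modern names such as those in the table; the pattern and function name are illustrative, and some older or specialized sizes will not fit it cleanly.

```python
import re
from typing import Dict, Optional

_SIZE_PATTERN = re.compile(
    r"^(?:Standard_)?"                         # optional tier prefix used by the CLI
    r"(?P<family>[A-Z])"                       # D, E, F, N, ...
    r"(?P<subfamily>[A-Z]*)"                   # e.g. C in NC
    r"(?P<vcpus>\d+)"
    r"(?:-(?P<constrained>\d+))?"              # constrained vCPU count, e.g. M8-2ms
    r"(?P<features>[a-z]*)"                    # additive features: s, d, a, ...
    r"(?:_(?P<accelerator>[A-Z][A-Za-z0-9]*))?"  # e.g. T4, A100
    r"(?:_(?P<version>v\d+))?$"
)


def parse_vm_size(size: str) -> Dict[str, Optional[object]]:
    """Split a VM size name into its naming-convention fields."""
    match = _SIZE_PATTERN.match(size)
    if match is None:
        raise ValueError(f"unrecognized VM size name: {size!r}")
    parts: Dict[str, Optional[object]] = dict(match.groupdict())
    parts["vcpus"] = int(match.group("vcpus"))
    if match.group("constrained") is not None:
        parts["constrained"] = int(match.group("constrained"))
    return parts


def is_premium_storage_capable(size: str) -> bool:
    """The 's' additive feature marks premium storage support."""
    return "s" in str(parse_vm_size(size)["features"])
```

A check like `is_premium_storage_capable` is handy in pre-deployment validation, since pairing a Premium SSD with a size that lacks the `s` feature fails at create time.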
Pricing Options
| Option | Savings | Commitment | Best For |
|---|---|---|---|
| Pay-as-you-go | 0% (baseline) | None | Unpredictable workloads |
| Reserved Instances | 30-72% | 1 or 3 years | Steady-state production |
| Savings Plans | Up to 65% | 1 or 3 years | Flexible across sizes |
| Spot VMs | Up to 90% | None (can be evicted) | Batch, CI/CD, fault-tolerant |
| Azure Hybrid Benefit | Up to 85% | Existing Windows Server or SQL Server licenses | Bring your own license |
| Dev/Test Pricing | Up to 55% | Visual Studio subscription | Non-production environments |
Azure Hybrid Benefit is a major differentiator for organizations with existing Microsoft licensing. If you have Windows Server licenses with Software Assurance, you can use them in Azure and pay only for the compute infrastructure, saving up to 85% when combined with Reserved Instances.
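A quick back-of-the-envelope calculation shows what these percentages mean per month. The sketch below uses the approximate hourly rates from the VM table and treats each discount as a flat percentage, which is a simplification: actual discounts vary by size, region, and term. The 730-hour month and the function name are assumptions.

```python
HOURS_PER_MONTH = 730  # common billing approximation: 8,760 hours / 12


def monthly_cost(hourly_rate: float, discount: float = 0.0, instances: int = 1) -> float:
    """Estimate monthly cost in USD after a flat percentage discount (0.0 to <1.0)."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in the range [0.0, 1.0)")
    if instances < 0:
        raise ValueError("instances must not be negative")
    return hourly_rate * HOURS_PER_MONTH * (1.0 - discount) * instances


if __name__ == "__main__":
    d4s_v5 = 0.192  # approximate pay-as-you-go rate from the table above
    print(f"Pay-as-you-go:      ${monthly_cost(d4s_v5):.2f}")
    print(f"3-year RI (72%):    ${monthly_cost(d4s_v5, 0.72):.2f}")
    print(f"Spot (90% ceiling): ${monthly_cost(d4s_v5, 0.90):.2f}")
```

The upper-bound percentages are best cases, so treat the output as a floor on what you will pay rather than a forecast.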
Availability and Scaling
- Availability Sets -- distribute VMs across fault domains and update domains within a single datacenter. Older approach, still valid for legacy configurations.
- Availability Zones -- physically separate datacenters within a region (typically 3 zones). Use for production workloads. The 99.99% SLA applies when two or more VMs are deployed across two or more zones.
- Virtual Machine Scale Sets (VMSS) -- auto-scaling groups of identical VMs. The Azure equivalent of AWS Auto Scaling Groups. Supports both Uniform (identical instances) and Flexible (heterogeneous) orchestration modes.
# Create a VM with availability zone and managed identity
az vm create \
--resource-group rg-production-web \
--name vm-web-01 \
--image Ubuntu2204 \
--size Standard_D2s_v5 \
--zone 1 \
--admin-username azureops \
--ssh-key-values ~/.ssh/id_rsa.pub \
--nsg nsg-web-servers \
--vnet-name vnet-production \
--subnet snet-app \
--assign-identity mi-webapp-production \
--os-disk-size-gb 64 \
--storage-sku Premium_LRS \
--tags Environment=production Team=platform \
--custom-data cloud-init.yml
# Create a scale set with rolling upgrades
az vmss create \
--resource-group rg-production-web \
--name vmss-web \
--image Ubuntu2204 \
--instance-count 2 \
--vm-sku Standard_D2s_v5 \
--zones 1 2 3 \
--admin-username azureops \
--ssh-key-values ~/.ssh/id_rsa.pub \
--lb lb-web-frontend \
--upgrade-policy-mode Rolling \
--max-batch-instance-percent 20 \
--pause-time-between-batches PT2S \
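--health-probe probe-web-health
# Note: --health-probe takes the name of an existing probe on lb-web-frontend, not a URL path (probe-web-health is a placeholder)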
# Configure autoscaling for the scale set
az monitor autoscale create \
--resource-group rg-production-web \
--resource vmss-web \
--resource-type Microsoft.Compute/virtualMachineScaleSets \
--name autoscale-web \
--min-count 2 \
--max-count 10 \
--count 2
az monitor autoscale rule create \
--resource-group rg-production-web \
--autoscale-name autoscale-web \
--condition "Percentage CPU > 70 avg 5m" \
--scale out 2
az monitor autoscale rule create \
--resource-group rg-production-web \
--autoscale-name autoscale-web \
--condition "Percentage CPU < 30 avg 10m" \
--scale in 1
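The two rules above, together with the profile's minimum and maximum, amount to a small decision function. Here is a sketch of it under simplifying assumptions: it takes a single CPU average as input and ignores the different evaluation windows (5 and 10 minutes), cooldown periods, and the anti-flapping checks that the real autoscale engine applies. The function name is hypothetical.

```python
def next_instance_count(current: int, avg_cpu: float, *,
                        min_count: int = 2, max_count: int = 10) -> int:
    """Apply the scale-out and scale-in rules, then clamp to the profile limits."""
    if avg_cpu > 70:
        desired = current + 2   # scale out by 2 when CPU is high
    elif avg_cpu < 30:
        desired = current - 1   # scale in by 1 when CPU is low
    else:
        desired = current       # inside the 30-70% band: no change
    return max(min_count, min(max_count, desired))
```

Scaling out faster than scaling in, as these rules do, is a common choice: adding capacity late hurts users, while removing it late only costs a little extra.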
Managed Disks
| Disk Type | Max IOPS | Max Throughput | Use Case | Cost (per GB/mo) |
|---|---|---|---|---|
| Premium SSD v2 | Up to 80,000 | 1,200 MB/s | Mission-critical DBs | ~$0.082 + IOPS + throughput |
| Premium SSD | Up to 20,000 | 900 MB/s | Production workloads | ~$0.132 (P10 128GB) |
| Standard SSD | Up to 6,000 | 750 MB/s | Dev/test, web servers | ~$0.048 (E10 128GB) |
| Standard HDD | Up to 2,000 | 500 MB/s | Backups, infrequent access | ~$0.024 (S10 128GB) |
| Ultra Disk | Up to 160,000 | 4,000 MB/s | SAP HANA, top-tier DBs | ~$0.082 + IOPS + throughput |
Premium SSD v2 and Ultra Disks allow you to independently configure IOPS, throughput, and capacity -- you only pay for what you provision.
Virtual Networks (VNets)
VNets are the Azure networking foundation, equivalent to AWS VPCs. A VNet is regional -- it exists in one Azure region and cannot span regions (unlike GCP's global VPCs).
Standard Architecture
VNet: 10.0.0.0/16 (vnet-production)
|-- GatewaySubnet (10.0.0.0/24) -- VPN/ExpressRoute gateways (must use this exact name)
|-- snet-web (10.0.1.0/24) -- Application Gateway, frontends
|-- snet-app (10.0.2.0/24) -- Application VMs, containers
|-- snet-data (10.0.3.0/24) -- Databases, caches
|-- snet-aks (10.0.16.0/20) -- AKS nodes (large CIDR for pods)
|-- snet-pe (10.0.4.0/24) -- Private Endpoints
+-- AzureBastionSubnet (10.0.255.0/26) -- Bastion host (must use this exact name)
Azure reserves 5 IP addresses per subnet: the first four (network address, default gateway, and two for Azure DNS) and the last (broadcast). A /24 therefore gives you 251 usable addresses (256 - 5).
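The reserved-address arithmetic and the subnet plan above can be checked with Python's standard `ipaddress` module. This is a planning sketch: the function names and the short role keys in `LAYOUT` are illustrative, and the CIDRs are copied from the architecture listing.

```python
import ipaddress
from itertools import combinations
from typing import Dict, List

AZURE_RESERVED_PER_SUBNET = 5

# CIDRs from the standard architecture above, keyed by role.
LAYOUT = {
    "gateway": "10.0.0.0/24",
    "web": "10.0.1.0/24",
    "app": "10.0.2.0/24",
    "data": "10.0.3.0/24",
    "private-endpoints": "10.0.4.0/24",
    "aks": "10.0.16.0/20",
    "bastion": "10.0.255.0/26",
}


def usable_addresses(cidr: str) -> int:
    """Addresses left for workloads after Azure's five reserved ones."""
    return ipaddress.ip_network(cidr).num_addresses - AZURE_RESERVED_PER_SUBNET


def validate_layout(vnet_cidr: str, subnets: Dict[str, str]) -> List[str]:
    """Return a list of problems: subnets outside the VNet or overlapping each other."""
    vnet = ipaddress.ip_network(vnet_cidr)
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    problems = [
        f"{name} ({net}) is outside {vnet}"
        for name, net in nets.items()
        if not net.subnet_of(vnet)
    ]
    for (name_a, net_a), (name_b, net_b) in combinations(nets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{name_a} ({net_a}) overlaps {name_b} ({net_b})")
    return problems
```

Running `validate_layout("10.0.0.0/16", LAYOUT)` before creating anything catches overlap mistakes that are painful to fix once resources are attached to a subnet.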
Network Security Groups (NSGs)
NSGs are stateful firewalls applied to subnets or individual NICs. Unlike AWS where security groups and NACLs are separate concepts, NSGs combine both roles. Rules have priorities (100-4096); lower numbers are evaluated first, and evaluation stops at the first rule that matches.
# Create an NSG
az network nsg create \
--resource-group rg-production-web \
--name nsg-app-tier
# Allow HTTPS from the web tier
az network nsg rule create \
--resource-group rg-production-web \
--nsg-name nsg-app-tier \
--name AllowHTTPS \
--priority 100 \
--direction Inbound \
--access Allow \
--protocol Tcp \
--source-address-prefixes 10.0.1.0/24 \
--destination-port-ranges 443
# Allow SSH from Bastion subnet only
az network nsg rule create \
--resource-group rg-production-web \
--nsg-name nsg-app-tier \
--name AllowSSHFromBastion \
--priority 110 \
--direction Inbound \
--access Allow \
--protocol Tcp \
--source-address-prefixes 10.0.255.0/26 \
--destination-port-ranges 22
# Allow Azure Load Balancer health probes
az network nsg rule create \
--resource-group rg-production-web \
--nsg-name nsg-app-tier \
--name AllowAzureLBProbes \
--priority 120 \
--direction Inbound \
--access Allow \
--protocol Tcp \
--source-address-prefixes AzureLoadBalancer \
--destination-port-ranges '*'
# Deny all other inbound
az network nsg rule create \
--resource-group rg-production-web \
--nsg-name nsg-app-tier \
--name DenyAllInbound \
--priority 4096 \
--direction Inbound \
--access Deny \
--protocol '*' \
--source-address-prefixes '*' \
--destination-port-ranges '*'
# Associate NSG with subnet
az network vnet subnet update \
--resource-group rg-production-web \
--vnet-name vnet-production \
--name snet-app \
--network-security-group nsg-app-tier
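To see how priority ordering plays out, the sketch below replays the four inbound rules created above against sample traffic. It is a deliberately small model: it handles a single port or `*`, treats a service tag as a literal string, and falls back to Deny instead of modelling Azure's default rules. The `Rule` type and `evaluate` function are hypothetical.

```python
import ipaddress
from typing import List, NamedTuple


class Rule(NamedTuple):
    name: str
    priority: int
    access: str    # "Allow" or "Deny"
    protocol: str  # "Tcp", "Udp", or "*"
    source: str    # CIDR, "*", or a service tag such as "AzureLoadBalancer"
    port: str      # a single port as a string, or "*"


RULES = [
    Rule("AllowHTTPS", 100, "Allow", "Tcp", "10.0.1.0/24", "443"),
    Rule("AllowSSHFromBastion", 110, "Allow", "Tcp", "10.0.255.0/26", "22"),
    Rule("AllowAzureLBProbes", 120, "Allow", "Tcp", "AzureLoadBalancer", "*"),
    Rule("DenyAllInbound", 4096, "Deny", "*", "*", "*"),
]


def _source_matches(rule_source: str, source: str) -> bool:
    if rule_source == "*":
        return True
    try:
        network = ipaddress.ip_network(rule_source)
    except ValueError:
        # Not a CIDR: treat it as a service tag and compare literally.
        return rule_source == source
    try:
        return ipaddress.ip_address(source) in network
    except ValueError:
        return False  # source is a tag, rule is a CIDR


def evaluate(rules: List[Rule], source: str, protocol: str, port: int) -> str:
    """Return the access of the lowest-priority-number rule that matches."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.protocol not in ("*", protocol):
            continue
        if rule.port not in ("*", str(port)):
            continue
        if _source_matches(rule.source, source):
            return rule.access  # first match wins; later rules are ignored
    return "Deny"  # simplified stand-in for Azure's default rules
```

Because the first match wins, leaving gaps between priority numbers (100, 110, 120) makes it possible to slot in a new rule later without renumbering.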
Private Endpoints
Private Endpoints bring Azure PaaS services (Storage, SQL, Key Vault, etc.) into your VNet with a private IP address. Traffic stays on the Azure backbone and never traverses the public internet.
# Create a private endpoint for a storage account
az network private-endpoint create \
--name pe-storage-prod \
--resource-group rg-production-web \
--vnet-name vnet-production \
--subnet snet-pe \
--private-connection-resource-id /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-production-web/providers/Microsoft.Storage/storageAccounts/stprodwebdata \
--group-id blob \
--connection-name pe-storage-connection
# Create private DNS zone for automatic resolution
az network private-dns zone create \
--resource-group rg-production-web \
--name privatelink.blob.core.windows.net
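az network private-dns link vnet create \
--resource-group rg-production-web \
--zone-name privatelink.blob.core.windows.net \
--name link-vnet-prod \
--virtual-network vnet-production \
--registration-enabled false
# Attach the zone to the private endpoint so its A record is created and kept in sync
az network private-endpoint dns-zone-group create \
--resource-group rg-production-web \
--endpoint-name pe-storage-prod \
--name default \
--private-dns-zone privatelink.blob.core.windows.net \
--zone-name blob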
VNet Peering and Hub-Spoke Topology
For multi-VNet environments, Azure supports VNet Peering (direct connection) and Azure Virtual WAN (managed hub-and-spoke). The classic hub-spoke topology uses a central hub VNet for shared services (firewall, VPN gateway, DNS) and spoke VNets for workloads.
# Create VNet peering (must be done from both sides)
az network vnet peering create \
--name peer-prod-to-hub \
--resource-group rg-production-web \
--vnet-name vnet-production \
--remote-vnet /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-shared/providers/Microsoft.Network/virtualNetworks/vnet-hub \
--allow-vnet-access \
--allow-forwarded-traffic \
--use-remote-gateways
az network vnet peering create \
--name peer-hub-to-prod \
--resource-group rg-shared \
--vnet-name vnet-hub \
--remote-vnet /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-production-web/providers/Microsoft.Network/virtualNetworks/vnet-production \
--allow-vnet-access \
--allow-forwarded-traffic \
--allow-gateway-transit
Networking Comparison Across Clouds
| Feature | Azure VNet | AWS VPC | GCP VPC | Alibaba VPC |
|---|---|---|---|---|
| Scope | Regional | Regional | Global | Regional |
| Firewall | NSGs (subnet/NIC) | Security Groups + NACLs | Firewall Rules (VPC) | Security Groups |
| Private PaaS access | Private Endpoints | VPC Endpoints | Private Service Connect | PrivateLink |
| Load Balancer L4 | Azure Load Balancer | NLB | TCP/UDP Load Balancer | CLB/NLB |
| Load Balancer L7 | Application Gateway | ALB | HTTP(S) Load Balancer | ALB |
| WAF | Azure WAF | AWS WAF | Cloud Armor | WAF |
| DDoS protection | Azure DDoS Protection | AWS Shield | Cloud Armor | Anti-DDoS |
| Hybrid connectivity | ExpressRoute | Direct Connect | Cloud Interconnect | Express Connect |
Azure Storage
Azure Storage is a single service with four sub-services, all sharing a storage account. A storage account is the top-level namespace (globally unique name) and the billing entity.
| Service | Type | Use Case | AWS Equivalent |
|---|---|---|---|
| Blob Storage | Object storage | Files, backups, static assets | S3 |
| File Storage | SMB/NFS file shares | Shared storage for VMs, lift-and-shift | EFS / FSx |
| Table Storage | NoSQL key-value | Simple structured data | DynamoDB (basic) |
| Queue Storage | Message queue | Decoupling application components | SQS |
Storage Account Redundancy
| Redundancy | Copies | Scope | SLA | Use Case |
|---|---|---|---|---|
| LRS | 3 | Single datacenter | 99.9% | Dev/test, easily re-creatable data |
| ZRS | 3 | Three availability zones | 99.9% | Production data, primary region HA |
| GRS | 6 | Two regions (3+3) | 99.9% (primary) | DR with secondary region failover |
| GZRS | 6 | Three zones + secondary region | 99.9% | Maximum durability and availability |
| RA-GRS | 6 | Two regions, read-access secondary | 99.99% (reads) | Read-heavy with DR requirements |
| RA-GZRS | 6 | Three zones + readable secondary | 99.99% (reads) | Highest tier, mission-critical |
Storage Account Tiers and Pricing
| Tier | Access Pattern | Storage Cost (per GB/mo, LRS) | Access Cost |
|---|---|---|---|
| Hot | Frequent access | ~$0.018 | Low |
| Cool | Infrequent (30+ days) | ~$0.01 | Moderate |
| Cold | Rare (90+ days) | ~$0.0036 | Higher |
| Archive | Long-term (180+ days) | ~$0.002 | Highest, plus hours to rehydrate |
# Create a storage account with ZRS
az storage account create \
--name stprodwebdata \
--resource-group rg-production-web \
--location eastus \
--sku Standard_ZRS \
--kind StorageV2 \
--access-tier Hot \
--min-tls-version TLS1_2 \
--allow-blob-public-access false \
--require-infrastructure-encryption true
# Create a blob container
az storage container create \
--name app-assets \
--account-name stprodwebdata \
--auth-mode login
# Set lifecycle management policy
az storage account management-policy create \
--account-name stprodwebdata \
--resource-group rg-production-web \
--policy '{
"rules": [
{
"name": "archive-old-data",
"type": "Lifecycle",
"definition": {
"filters": { "blobTypes": ["blockBlob"], "prefixMatch": ["logs/"] },
"actions": {
"baseBlob": {
"tierToCool": { "daysAfterModificationGreaterThan": 30 },
"tierToCold": { "daysAfterModificationGreaterThan": 90 },
"tierToArchive": { "daysAfterModificationGreaterThan": 180 },
"delete": { "daysAfterModificationGreaterThan": 730 }
}
}
}
}
]
}'
# Upload with azcopy (much faster for large transfers)
azcopy copy './dist/*' \
'https://stprodwebdata.blob.core.windows.net/app-assets/' \
--recursive \
--put-md5
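The lifecycle rule above is easiest to reason about as a function of blob age. The sketch below mirrors its thresholds, assuming a blob that starts in the Hot tier and is never modified again. The function name is hypothetical, and the strict "greater than" comparisons follow the `daysAfterModificationGreaterThan` property names.

```python
def lifecycle_action(days_since_modified: int) -> str:
    """Return the tier (or 'delete') the archive-old-data rule leaves a blob in."""
    if days_since_modified < 0:
        raise ValueError("days_since_modified must not be negative")
    # Check the largest threshold first so the most advanced action applies.
    if days_since_modified > 730:
        return "delete"
    if days_since_modified > 180:
        return "archive"
    if days_since_modified > 90:
        return "cold"
    if days_since_modified > 30:
        return "cool"
    return "hot"
```

The thresholds line up with the tiers' intended access patterns in the pricing table (30, 90, and 180 days), which avoids moving data into a tier and out again before that window has passed.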
Immutable Storage
For compliance and audit requirements, Azure Blob Storage supports immutable policies that prevent modification or deletion for a specified period:
# Set a time-based retention policy (WORM)
az storage container immutability-policy create \
--account-name stprodauditlogs \
--container-name audit-trails \
--period 365
AKS: Azure Kubernetes Service
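AKS is the managed Kubernetes offering. On the Free tier, Azure runs the control plane at no charge and you pay only for worker nodes, a meaningful cost advantage over AWS EKS (about $73/month for the control plane). The Standard tier, which the production example below selects with --tier standard, adds a financially backed uptime SLA and bills roughly $0.10 per cluster per hour, which erases most of that difference.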
Cluster Architecture Decisions
| Feature | Decision | Recommendation |
|---|---|---|
| Network plugin | kubenet vs Azure CNI vs Azure CNI Overlay | Azure CNI Overlay for most (pod IPs from overlay, saves VNet IPs) |
| Network policy | Calico vs Azure NPM vs Cilium | Cilium for best performance and observability |
| Identity | Managed Identity vs Service Principal | Managed Identity always |
| Pod identity | Workload Identity vs Pod Identity (deprecated) | Workload Identity |
| Ingress | NGINX vs Application Gateway Ingress | AGIC for Azure-native, NGINX for portability |
| Node OS | Ubuntu vs Azure Linux (Mariner) | Azure Linux for smaller attack surface |
# Create an AKS cluster with best practices
# (check az aks get-versions --location eastus for currently supported Kubernetes versions)
az aks create \
--resource-group rg-production-web \
--name aks-production \
--node-count 3 \
--node-vm-size Standard_D4s_v5 \
--zones 1 2 3 \
--network-plugin azure \
--network-plugin-mode overlay \
--network-dataplane cilium \
--network-policy cilium \
--pod-cidr 192.168.0.0/16 \
--vnet-subnet-id /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-production-web/providers/Microsoft.Network/virtualNetworks/vnet-production/subnets/snet-aks \
--enable-managed-identity \
--enable-oidc-issuer \
--enable-workload-identity \
--enable-cluster-autoscaler \
--min-count 2 \
--max-count 10 \
--kubernetes-version 1.29 \
--os-sku AzureLinux \
--tier standard \
--enable-defender \
--enable-azure-monitor-metrics \
--generate-ssh-keys
# Get credentials
az aks get-credentials --resource-group rg-production-web --name aks-production
# Verify
kubectl get nodes -o wide
# Add a spot node pool for non-critical workloads
az aks nodepool add \
--resource-group rg-production-web \
--cluster-name aks-production \
--name spotnodes \
--priority Spot \
--eviction-policy Delete \
--spot-max-price -1 \
--node-vm-size Standard_D4s_v5 \
--enable-cluster-autoscaler \
--min-count 0 \
--max-count 10 \
--node-taints "kubernetes.azure.com/scalesetpriority=spot:NoSchedule" \
--labels workload-type=batch
AKS Workload Identity
Workload Identity is the recommended way for pods to authenticate to Azure services. It replaces the deprecated AAD Pod Identity.
# Create a user-assigned managed identity for the workload
az identity create \
--name mi-app-workload \
--resource-group rg-production-web
# Create a federated credential linking the Kubernetes service account
az identity federated-credential create \
--name fed-cred-app \
--identity-name mi-app-workload \
--resource-group rg-production-web \
--issuer "$(az aks show --name aks-production --resource-group rg-production-web --query oidcIssuerProfile.issuerUrl -o tsv)" \
--subject system:serviceaccount:production:app-backend \
--audiences api://AzureADTokenExchange
# Grant the identity access to a Key Vault (access policy model; for vaults using Azure RBAC, assign the Key Vault Secrets User role instead)
az keyvault set-policy \
--name kv-production \
--object-id "$(az identity show --name mi-app-workload --resource-group rg-production-web --query principalId -o tsv)" \
--secret-permissions get list
Then annotate the Kubernetes service account with the managed identity's client ID. Pods that use it must also carry the azure.workload.identity/use: "true" label in their pod template:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-backend
  namespace: production
  annotations:
    azure.workload.identity/client-id: "00000000-0000-0000-0000-000000000000"
  labels:
    azure.workload.identity/use: "true"
App Service
App Service is the PaaS compute option -- deploy web applications without managing VMs or containers directly. It supports .NET, Java, Node.js, Python, PHP, and custom containers. App Service is ideal for teams that want to focus on application code rather than infrastructure management.
App Service Plans and Pricing
| Tier | Features | Use Case | Starting Price |
|---|---|---|---|
| Free (F1) | 60 min/day CPU, 1 GB RAM | Exploration | $0 |
| Basic (B1) | Custom domains, manual scale | Dev/test | ~$13/mo |
| Standard (S1) | Autoscale, staging slots, daily backups | Production | ~$73/mo |
| Premium (P1v3) | Zone redundancy, more scale, better performance | High-traffic production | ~$138/mo |
| Isolated (I1v2) | Dedicated VNet, ASE | Compliance, isolation | ~$298/mo |
# Create an App Service plan and web app
az appservice plan create \
--name asp-production \
--resource-group rg-production-web \
--sku P1v3 \
--is-linux \
--zone-redundant
az webapp create \
--name webapp-production-001 \
--resource-group rg-production-web \
--plan asp-production \
--runtime "NODE:20-lts" \
--assign-identity mi-webapp-production
# Configure deployment slots for blue-green deployments
az webapp deployment slot create \
--name webapp-production-001 \
--resource-group rg-production-web \
--slot staging
# Deploy to staging slot
az webapp deployment source config-zip \
--name webapp-production-001 \
--resource-group rg-production-web \
--slot staging \
--src app.zip
# Swap staging to production (zero-downtime)
az webapp deployment slot swap \
--name webapp-production-001 \
--resource-group rg-production-web \
--slot staging \
--target-slot production
# Deploy from a container image
az webapp config container set \
--name webapp-production-001 \
--resource-group rg-production-web \
--container-image-name myregistry.azurecr.io/webapp:v1.2.3 \
--container-registry-url https://myregistry.azurecr.io
App Configuration and Secrets
# Set application settings (environment variables)
az webapp config appsettings set \
--name webapp-production-001 \
--resource-group rg-production-web \
--settings NODE_ENV=production LOG_LEVEL=info
# Reference Key Vault secrets in app settings
az webapp config appsettings set \
--name webapp-production-001 \
--resource-group rg-production-web \
--settings DB_PASSWORD="@Microsoft.KeyVault(SecretUri=https://kv-production.vault.azure.net/secrets/db-password/)"
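Key Vault references follow a fixed string format, so generating them avoids typos when a deployment script sets many secrets. This is a minimal sketch: the helper names are hypothetical, and it assumes the public-cloud `vault.azure.net` suffix used in the example above.

```python
from typing import Optional


def key_vault_reference(vault_name: str, secret_name: str,
                        version: Optional[str] = None) -> str:
    """Build an app-setting value that App Service resolves from Key Vault.

    Omitting the version leaves the trailing slash, which points at the
    latest version of the secret.
    """
    uri = f"https://{vault_name}.vault.azure.net/secrets/{secret_name}/"
    if version:
        uri += version
    return f"@Microsoft.KeyVault(SecretUri={uri})"


def app_setting(name: str, vault_name: str, secret_name: str) -> str:
    """Return a NAME=value pair suitable for az webapp config appsettings set --settings."""
    return f"{name}={key_vault_reference(vault_name, secret_name)}"
```

The web app's managed identity still needs read access to the vault's secrets; without it the setting resolves to the literal reference string instead of the secret value.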
Azure Functions
Azure Functions is the serverless compute offering, equivalent to AWS Lambda. It supports C#, JavaScript/TypeScript, Python, Java, and PowerShell natively, and other languages such as Go through custom handlers.
Hosting Plans Comparison
| Plan | Cold Starts | Max Execution | VNet Integration | Pricing |
|---|---|---|---|---|
| Consumption | Yes (seconds) | 10 min | No | Per-execution ($0.20/M + GB-s) |
| Flex Consumption | Reduced | Unlimited (30 min default) | Yes | Per-execution + always-ready |
| Premium (EP1) | No (pre-warmed) | Unlimited | Yes | ~$173/mo + scale |
| Dedicated | No | Unlimited | Yes | App Service Plan pricing |
The Consumption plan is cheapest for sporadic workloads. Premium fits workloads that need consistently low latency (pre-warmed instances) or long-running executions. Flex Consumption is the newest option, a middle ground that keeps per-execution pricing while adding VNet integration and reducing cold starts through always-ready instances.
# Create a Function App on Consumption plan
az functionapp create \
--name func-event-processor \
--resource-group rg-production-web \
--storage-account stprodwebdata \
--consumption-plan-location eastus \
--runtime node \
--runtime-version 20 \
--functions-version 4 \
--assign-identity mi-webapp-production
# Deploy function code
func azure functionapp publish func-event-processor
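Whether Consumption or Premium is cheaper depends on volume, and a rough estimate is easy to compute. The sketch below uses the per-execution price and the Premium starting price from the comparison table. The GB-second price and the monthly free grant are assumptions based on published list prices and should be checked for your region. Real billing also rounds memory and duration per execution, which this ignores.

```python
PRICE_PER_MILLION_EXECUTIONS = 0.20  # from the hosting plan table
PRICE_PER_GB_SECOND = 0.000016       # assumed list price; verify for your region
FREE_EXECUTIONS = 1_000_000          # assumed monthly free grant
FREE_GB_SECONDS = 400_000            # assumed monthly free grant
PREMIUM_EP1_MONTHLY = 173.0          # approximate starting price from the table


def consumption_monthly_cost(executions: int, avg_duration_s: float,
                             memory_gb: float) -> float:
    """Estimate the monthly Consumption plan bill in USD."""
    gb_seconds = executions * avg_duration_s * memory_gb
    billable_executions = max(0, executions - FREE_EXECUTIONS)
    billable_gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS)
    return (billable_executions / 1_000_000 * PRICE_PER_MILLION_EXECUTIONS
            + billable_gb_seconds * PRICE_PER_GB_SECOND)


def cheaper_plan(executions: int, avg_duration_s: float, memory_gb: float) -> str:
    """Compare the Consumption estimate against a single EP1 instance."""
    cost = consumption_monthly_cost(executions, avg_duration_s, memory_gb)
    return "Consumption" if cost < PREMIUM_EP1_MONTHLY else "Premium"
```

Cost is only one axis of the decision: a workload that is cheap on Consumption may still need Premium or Flex Consumption for latency or networking reasons.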
Azure DevOps
Azure DevOps is a complete platform covering the entire development lifecycle. It competes with GitHub (which Microsoft also owns), GitLab, and Jenkins.
- Azure Repos -- Git repositories with pull request workflows and branch policies.
- Azure Pipelines -- CI/CD pipelines with YAML definitions. Supports any language, platform, and cloud.
- Azure Boards -- work item tracking with Scrum, Kanban, and custom process templates.
- Azure Artifacts -- package management for npm, NuGet, Maven, pip, and universal packages.
- Azure Test Plans -- manual and exploratory test management.
Pipeline YAML with Best Practices
trigger:
  branches:
    include: [main]
  paths:
    exclude: ['docs/*', '*.md']
variables:
  - group: production-secrets
  - name: imageRepository
    value: 'webapp'
  - name: dockerRegistryServiceConnection
    value: 'acr-connection'
pool:
  vmImage: 'ubuntu-latest'
stages:
  - stage: Build
    jobs:
      - job: BuildAndTest
        steps:
          - task: NodeTool@0
            inputs:
              versionSpec: '20.x'
          - script: npm ci && npm run lint && npm test
            displayName: 'Install, Lint, Test'
          - task: Docker@2
            displayName: 'Build and Push Image'
            inputs:
              containerRegistry: '$(dockerRegistryServiceConnection)'
              repository: '$(imageRepository)'
              command: 'buildAndPush'
              tags: |
                $(Build.BuildId)
                latest
          - publish: $(Build.ArtifactStagingDirectory)
            artifact: drop
  - stage: DeployStaging
    dependsOn: Build
    jobs:
      - deployment: Staging
        environment: 'staging'
        strategy:
          runOnce:
            deploy:
              steps:
                - task: AzureWebAppContainer@1
                  inputs:
                    azureSubscription: 'staging-connection'
                    appName: 'webapp-staging-001'
                    containers: 'myregistry.azurecr.io/webapp:$(Build.BuildId)'
  - stage: DeployProduction
    dependsOn: DeployStaging
    condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/main'))
    jobs:
      - deployment: Production
        environment: 'production'
        strategy:
          runOnce:
            deploy:
              steps:
                - task: AzureWebAppContainer@1
                  inputs:
                    azureSubscription: 'production-connection'
                    appName: 'webapp-production-001'
                    containers: 'myregistry.azurecr.io/webapp:$(Build.BuildId)'
Azure DevOps vs GitHub Actions
| Feature | Azure DevOps Pipelines | GitHub Actions |
|---|---|---|
| Self-hosted agents | Yes (any OS) | Yes (any OS) |
| YAML pipelines | Yes | Yes |
| Marketplace extensions | 1,000+ | 20,000+ |
| Environments with approvals | Yes | Yes |
| Multi-stage pipelines | Native | Reusable workflows |
| Parallel jobs (free) | 1 (1,800 min/mo) | 2,000 min/mo |
| On-premises support | Azure DevOps Server | GitHub Enterprise Server |
| Integration with Azure | Deep, native | Good, via actions |
Azure CLI and Cloud Shell
The Azure CLI (az) is your primary automation interface. It follows a consistent az [service] [subcommand] pattern that is easy to learn.
# Login and set subscription
az login
az account set --subscription "Production"
az account show --output table
# Common queries
az vm list --resource-group rg-production-web --output table
az aks list --output table
# Use JMESPath queries for filtering
az vm list \
--query "[?tags.Environment=='production'].[name,resourceGroup,hardwareProfile.vmSize]" \
--output table
# Resource Graph queries (fast, cross-subscription; requires the resource-graph extension: az extension add --name resource-graph)
az graph query -q "
Resources
| where type =~ 'microsoft.compute/virtualmachines'
| where tags.Environment == 'production'
| project name, resourceGroup, location, properties.hardwareProfile.vmSize
| order by name asc
" --output table
# Cloud Shell (browser-based terminal)
# Pre-authenticated, pre-installed with az, kubectl, terraform, ansible
# Access at https://shell.azure.com
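To make the JMESPath filter in the az vm list example above more concrete, here is a minimal Python sketch of what that expression selects. The sample records are invented for illustration and only mimic the shape of the CLI's JSON output. The CLI evaluates the query with its own JMESPath engine, so treat this as an explanation of the semantics rather than how az works internally.

```python
# Hypothetical sample shaped like `az vm list` JSON output (all values are made up).
sample_vms = [
    {"name": "vm-web-01", "resourceGroup": "rg-production-web",
     "tags": {"Environment": "production"},
     "hardwareProfile": {"vmSize": "Standard_D2s_v5"}},
    {"name": "vm-dev-01", "resourceGroup": "rg-dev",
     "tags": {"Environment": "dev"},
     "hardwareProfile": {"vmSize": "Standard_B2s"}},
    {"name": "vm-untagged", "resourceGroup": "rg-dev",
     "tags": None,
     "hardwareProfile": {"vmSize": "Standard_B1s"}},
]


def production_rows(vms):
    """Rough Python equivalent of the query
    [?tags.Environment=='production'].[name,resourceGroup,hardwareProfile.vmSize]
    """
    rows = []
    for vm in vms:
        tags = vm.get("tags") or {}  # untagged VMs simply fail the filter
        if tags.get("Environment") == "production":
            rows.append([vm["name"], vm["resourceGroup"],
                         vm["hardwareProfile"]["vmSize"]])
    return rows


print(production_rows(sample_vms))
```

The filter part (`[?...]`) keeps matching items, and the trailing `.[...]` projects each kept item into a row, which is why `--output table` can render the result directly.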
Bicep and ARM Templates
While Terraform is the dominant multi-cloud IaC tool, Azure-native teams often use Bicep, which compiles to ARM templates:
// main.bicep - Simplified Azure resource definition
param location string = resourceGroup().location
param environmentName string = 'production'
// Storage account names must be globally unique: 3-24 lowercase letters and digits
resource storageAccount 'Microsoft.Storage/storageAccounts@2023-01-01' = {
  name: 'st${environmentName}data'
  location: location
  sku: { name: 'Standard_ZRS' }
  kind: 'StorageV2'
  properties: {
    minimumTlsVersion: 'TLS1_2'
    allowBlobPublicAccess: false
  }
}
# Deploy a Bicep file
az deployment group create \
--resource-group rg-production-web \
--template-file main.bicep \
--parameters environmentName=production
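The Bicep template builds the storage account name from the environmentName parameter. Storage account names must be 3 to 24 characters long and use only lowercase letters and digits, so some parameter values will fail at deployment time. The following Python sketch checks a candidate value before you deploy. The function names are hypothetical helpers, not part of any Azure SDK, and global uniqueness cannot be verified locally.

```python
import re

# Storage account naming rule: 3-24 characters, lowercase letters and digits only.
_STORAGE_NAME = re.compile(r"[a-z0-9]{3,24}")


def storage_account_name(environment_name: str) -> str:
    """Mirror the template's 'st${environmentName}data' interpolation."""
    return f"st{environment_name}data"


def is_valid_storage_name(name: str) -> bool:
    """Check length and character rules only; uniqueness needs a call to Azure."""
    return _STORAGE_NAME.fullmatch(name) is not None


for env in ("production", "staging", "Prod-EU", "nonproduction-westeurope"):
    name = storage_account_name(env)
    print(f"{env} -> {name} valid={is_valid_storage_name(name)}")
```

A check like this fits naturally in a pipeline step that runs before az deployment group create, so a bad parameter fails fast instead of halfway through a deployment.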
Cost Management
Azure Cost Management is built into the portal and exposed through the CLI, which many teams find more tightly integrated than AWS's native cost tooling.
# Create a monthly budget (alert thresholds and recipients are configured separately)
az consumption budget create \
--budget-name MonthlyBudget \
--amount 5000 \
--category Cost \
--time-grain Monthly \
--start-date 2026-03-01 \
--end-date 2027-03-01
# View current costs by resource group (requires the costmanagement extension)
az costmanagement query \
--type Usage \
--scope /subscriptions/00000000-0000-0000-0000-000000000000 \
--timeframe MonthToDate \
--dataset-grouping name=ResourceGroup type=Dimension
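A budget is most useful when you compare it against where spending is heading, not only where it stands today. The sketch below, in Python, projects month-end spend from a month-to-date figure using a naive linear run rate. The spend figure and reporting date are invented for illustration, and the linear assumption is a simplification. Azure's own forecasting is more sophisticated.

```python
import calendar
from datetime import date


def projected_month_end_spend(month_to_date: float, today: date) -> float:
    """Naive linear projection: assume the daily run rate so far continues."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return round(month_to_date / today.day * days_in_month, 2)


budget = 5000.0            # matches --amount in the budget command above
spend_so_far = 2100.0      # hypothetical month-to-date figure
as_of = date(2026, 3, 10)  # hypothetical reporting date

forecast = projected_month_end_spend(spend_so_far, as_of)
print(f"forecast={forecast} over_budget={forecast > budget}")
```

With these made-up numbers the run rate is 210 per day, so the 31-day month is projected to end well above the 5000 budget even though current spend is under half of it.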
Cost Optimization Strategies
- Azure Advisor -- flags underutilized VMs, idle resources, and potential savings from reserved instances. Check it weekly.
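- Azure Hybrid Benefit -- apply existing Windows Server and SQL Server licenses (with Software Assurance or a qualifying subscription) to Azure workloads. Microsoft cites savings of up to 85% over pay-as-you-go rates.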
- Reserved Instances -- 1-year (30-40% savings) or 3-year (55-72% savings) commitments for predictable workloads.
- Spot VMs -- up to 90% savings for interruptible workloads. VMSS supports mixed Spot/regular configurations.
- Right-sizing -- Azure Advisor identifies VMs with consistently low CPU. Downsize aggressively.
- Auto-shutdown -- schedule dev/test VMs to shut down outside business hours.
- Resource locks -- prevent accidental deletion of critical (expensive) resources.
- Storage lifecycle policies -- automatically transition infrequently accessed blobs to cheaper tiers.
- Dev/Test pricing -- Visual Studio subscribers get discounted rates for non-production workloads.
- Azure Savings Plans -- commit to a consistent hourly spend for compute across regions and instance types.
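The discount ranges in the list above are easier to compare as monthly figures. This Python sketch applies them to a single always-on VM. The hourly rate is a made-up number, and the percentages are simply the ranges quoted above, so treat the output as rough arithmetic rather than a pricing quote.

```python
HOURS_PER_MONTH = 730  # common approximation for monthly billing estimates

# Discount ranges quoted in the list above, as (low, high) fractions.
DISCOUNTS = {
    "reserved_1yr": (0.30, 0.40),
    "reserved_3yr": (0.55, 0.72),
    "spot": (0.0, 0.90),  # "up to 90%": no guaranteed minimum
}


def monthly_cost_range(hourly_rate: float, option: str) -> tuple[float, float]:
    """Return (best_case, worst_case) monthly cost for one always-on VM."""
    low, high = DISCOUNTS[option]
    base = hourly_rate * HOURS_PER_MONTH
    return (round(base * (1 - high), 2), round(base * (1 - low), 2))


hypothetical_rate = 0.20  # USD per hour, invented for illustration
print("pay-as-you-go:", round(hypothetical_rate * HOURS_PER_MONTH, 2))
for option in DISCOUNTS:
    print(option, monthly_cost_range(hypothetical_rate, option))
```

Reserved pricing only pays off if the VM really runs for most of the commitment period, so pair this kind of estimate with utilization data from Azure Advisor before committing.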
# Auto-shutdown a dev VM at 19:00 UTC (--time is interpreted as UTC, format hhmm)
az vm auto-shutdown \
--resource-group rg-dev \
--name vm-dev-01 \
--time 1900
# Apply a resource lock on a production resource group
az lock create \
--name DoNotDelete \
--resource-group rg-production-web \
--lock-type CanNotDelete
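One detail worth spelling out for the auto-shutdown command: the --time value is interpreted as UTC, so a team that wants VMs to stop at 7 PM local time needs to convert first. Here is a small Python sketch of that conversion. It assumes a fixed UTC offset and deliberately ignores daylight saving time, and the function name is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def shutdown_time_utc(local_hhmm: str, utc_offset_hours: float) -> str:
    """Convert a local wall-clock time (hhmm) to the UTC hhmm string for --time."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    # The date is arbitrary; only the time of day matters here.
    local = datetime(2026, 1, 15, int(local_hhmm[:2]), int(local_hhmm[2:]),
                     tzinfo=local_tz)
    return local.astimezone(timezone.utc).strftime("%H%M")


print(shutdown_time_utc("1900", -5))   # 7 PM at UTC-5
print(shutdown_time_utc("1900", 1))    # 7 PM at UTC+1
print(shutdown_time_utc("1900", 5.5))  # 7 PM at UTC+5:30
```

Because the offset is fixed, regions that observe daylight saving time will drift by an hour for part of the year unless the schedule is updated.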
Migration Considerations
When migrating to Azure from on-premises or another cloud:
- Azure Migrate -- the central hub for assessment, migration tracking, and modernization tools. Supports VMware, Hyper-V, physical servers, and databases.
- Azure Site Recovery (ASR) -- replicates VMs from on-premises or other clouds to Azure. Also provides disaster recovery between Azure regions.
- Azure Database Migration Service -- handles online (minimal downtime) and offline migrations for SQL Server, MySQL, PostgreSQL, and MongoDB.
- Azure Arc -- extends Azure management to on-premises servers, Kubernetes clusters, and other clouds. Deploy Azure services anywhere and manage them from the Azure portal.
- Azure File Sync -- syncs on-premises file servers with Azure Files. Supports cloud tiering to keep frequently accessed files local.
Azure's strength lies in its enterprise integration, hybrid cloud capabilities with Azure Arc, and deep Microsoft ecosystem support. The Entra ID integration alone is a deciding factor for many organizations. Even if it is not your primary cloud, knowing Azure makes you a more versatile DevOps engineer and opens doors in enterprise environments where Microsoft dominance is a reality.