DevOpsil

Alibaba Cloud for DevOps: ECS, ACK, and the China Cloud Ecosystem

Aareez Asif · 27 min read

If your infrastructure serves users in mainland China, Alibaba Cloud is not optional -- it is the default choice. China's internet regulations, data residency laws, and the Great Firewall create an environment where AWS, Azure, and GCP either cannot operate fully or deliver subpar performance. Alibaba Cloud is the largest cloud provider in Asia-Pacific and is consistently ranked among the largest globally. Understanding it makes you a more complete DevOps engineer, and for companies expanding into Asian markets, it is a requirement, not a nice-to-have.

Why Alibaba Cloud Matters

The China Factor

Running workloads in China is fundamentally different from running them anywhere else. The regulatory and technical landscape creates challenges that only a China-native cloud provider can fully address:

  • ICP Filing and License Requirement -- to host a website on servers in mainland China, you need an Internet Content Provider (ICP) filing registered with the Chinese government, and commercial sites need an ICP license as well. This is a legal requirement with no exceptions. Alibaba Cloud helps facilitate this process through their console, typically taking 2-4 weeks. Without it, a domain that points at servers in mainland China will be blocked.
  • Data Residency -- China's Cybersecurity Law (2017), Data Security Law (2021), and Personal Information Protection Law (PIPL, 2021) require certain data to remain within Chinese borders. Cross-border transfers of that data can require a security assessment. Alibaba Cloud has multiple regions within mainland China (Beijing, Shanghai, Shenzhen, Hangzhou, Zhangjiakou, Hohhot, and more).
  • The Great Firewall -- connections to services outside China (including AWS, GCP, Azure global regions) are unreliable and slow. DNS resolution, API calls, package downloads (npm, pip, Docker Hub), and even Git operations all suffer. Response times of 500ms-2000ms to services outside China are common. Within Alibaba Cloud's China regions, latency is typically 1-10ms.
  • Content Delivery -- CDNs must have Points of Presence (PoPs) within China to serve Chinese users effectively. International CDNs like CloudFront or Cloudflare perform poorly in China without a separate China configuration.
  • Payment Processing -- Alipay and WeChat Pay are the dominant payment methods. Alibaba Cloud integrates natively with these systems.

Beyond China

Alibaba Cloud also competes well in Southeast Asia, the Middle East, and other markets where it has invested heavily in regional infrastructure:

| Region | Locations | Strengths |
|---|---|---|
| China | Beijing, Shanghai, Shenzhen, Hangzhou, Zhangjiakou, Hohhot, Chengdu, Heyuan, Wulanchabu, Nanjing, Fuzhou | Most China regions of any provider |
| Asia-Pacific | Singapore, Jakarta, Mumbai, Hong Kong, Tokyo, Sydney, Kuala Lumpur, Manila | Strong presence, competitive pricing |
| Middle East | Dubai, Riyadh | Growing market, government partnerships |
| Europe | Frankfurt, London | EU data residency compliance |
| Americas | Silicon Valley, Virginia | For China-outbound traffic |

Alibaba Cloud Account Structure

Alibaba Cloud uses a Resource Directory for multi-account governance, similar to AWS Organizations:

Resource Directory
|-- Root Folder
|   |-- Folder: Production
|   |   |-- Account: prod-web (China regions)
|   |   +-- Account: prod-intl (International regions)
|   |-- Folder: Staging
|   |   +-- Account: staging-all
|   +-- Folder: Shared
|       +-- Account: shared-services

An important distinction: Alibaba Cloud China and Alibaba Cloud International are separate platforms with separate accounts. A China account (aliyun.com) accesses China regions, while an International account (alibabacloud.com) accesses international regions. You need both if you serve users in China and globally.

ECS: Elastic Compute Service

ECS is Alibaba Cloud's VM service, directly comparable to AWS EC2 or Azure VMs. The service is mature, well-documented, and follows familiar patterns if you have experience with other clouds.

Instance Families

| Family | Use Case | Example | Specs | Approx Cost (cn-shanghai) |
|---|---|---|---|---|
| ecs.t6 | Burstable, dev/test | ecs.t6-c1m2.large | 2 vCPU, 4 GB | ~CNY 0.15/hr |
| ecs.g7 | General purpose | ecs.g7.xlarge | 4 vCPU, 16 GB | ~CNY 0.90/hr |
| ecs.g8i | Latest gen Intel | ecs.g8i.xlarge | 4 vCPU, 16 GB | ~CNY 0.85/hr |
| ecs.c7 | Compute optimized | ecs.c7.2xlarge | 8 vCPU, 16 GB | ~CNY 1.20/hr |
| ecs.r7 | Memory optimized | ecs.r7.xlarge | 4 vCPU, 32 GB | ~CNY 1.10/hr |
| ecs.gn7 | GPU (inference) | ecs.gn7i-c8g1.2xlarge | 8 vCPU, 30 GB, 1 GPU | ~CNY 8.50/hr |
| ecs.ebm | Bare metal | ecs.ebmg7.32xlarge | 128 vCPU, 512 GB | ~CNY 22.00/hr |

The naming convention follows the pattern: ecs.[family][generation].[size]. The g prefix means general purpose, c is compute, r is memory, similar to AWS conventions. The generation number matters -- always choose the latest generation available for best price-performance.
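The pattern can be made concrete with a small parser. The Python sketch below is illustrative only: the regular expression is an assumption inferred from the example names in the table above, not an official grammar. It also tolerates the optional letter suffix and the hyphenated variant seen in names such as ecs.gn7i-c8g1.2xlarge.

```python
import re
from typing import NamedTuple, Optional


class InstanceType(NamedTuple):
    family: str             # letters before the generation, e.g. "g", "c", "r", "gn"
    generation: int         # e.g. 7
    suffix: str             # optional letters after the generation, e.g. "i"
    variant: Optional[str]  # optional part after "-", e.g. "c1m2"
    size: str               # e.g. "xlarge"


# Assumed shape: ecs.[family][generation][suffix][-variant].[size]
_PATTERN = re.compile(
    r"^ecs\.(?P<family>[a-z]+)(?P<generation>\d+)(?P<suffix>[a-z]*)"
    r"(?:-(?P<variant>[a-z0-9]+))?\.(?P<size>[a-z0-9]+)$"
)


def parse_instance_type(name: str) -> InstanceType:
    """Split an ECS instance type name into its parts."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized instance type: {name!r}")
    return InstanceType(
        family=match["family"],
        generation=int(match["generation"]),
        suffix=match["suffix"],
        variant=match["variant"],
        size=match["size"],
    )


if __name__ == "__main__":
    for example in ("ecs.g7.xlarge", "ecs.t6-c1m2.large", "ecs.gn7i-c8g1.2xlarge"):
        print(example, "->", parse_instance_type(example))
```

A parser like this is handy in cost reports or policy checks, for example to flag instances that are more than one generation behind.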

Pricing Options

| Option | Savings | Commitment | Best For |
|---|---|---|---|
| Pay-As-You-Go | 0% (baseline) | None | Unpredictable workloads |
| Subscription | 15-60% | 1 month to 3 years | Predictable production workloads |
| Preemptible Instances | Up to 90% | None (can be reclaimed) | Batch processing, CI/CD |
| Reserved Instances | Up to 55% | 1 or 3 years | Flexible commitment |
| Savings Plans | Up to 57% | 1 or 3 years | Cross-instance flexibility |

Subscription (prepaid) pricing is far more prominent on Alibaba Cloud than on Western clouds, and it is common across the China market. You prepay for a fixed period (monthly or yearly) and receive significant discounts. Most production workloads in China use subscription pricing.
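To see when prepaying pays off, a rough calculation helps. The Python sketch below assumes a 730-hour month and treats the discount as a flat fraction of the Pay-As-You-Go price. The function names are hypothetical, and the rate and discounts come from the approximate figures in the tables above.

```python
HOURS_PER_MONTH = 730  # assumption: average month (8760 hours / 12)


def pay_as_you_go_monthly(hourly_rate: float) -> float:
    """Cost of running one instance for a full month at the hourly rate."""
    return hourly_rate * HOURS_PER_MONTH


def subscription_monthly(hourly_rate: float, discount: float) -> float:
    """Prepaid monthly cost, modelled as a flat discount on the hourly price."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return pay_as_you_go_monthly(hourly_rate) * (1.0 - discount)


def break_even_hours(discount: float) -> float:
    """Monthly running hours above which the subscription is cheaper."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return HOURS_PER_MONTH * (1.0 - discount)


if __name__ == "__main__":
    rate = 0.90  # ~CNY/hr for ecs.g7.xlarge in the table above
    for discount in (0.15, 0.60):
        print(
            f"discount {discount:.0%}: "
            f"PAYG {pay_as_you_go_monthly(rate):.2f} CNY, "
            f"subscription {subscription_monthly(rate, discount):.2f} CNY, "
            f"break-even at {break_even_hours(discount):.1f} h/month"
        )
```

The break-even figure is the useful part: an instance that runs fewer hours per month than that is usually better left on Pay-As-You-Go.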

Creating and Managing ECS Instances

Alibaba Cloud provides the aliyun CLI (also called Alibaba Cloud CLI):

# Configure the CLI
aliyun configure set \
  --profile production \
  --mode AK \
  --region cn-shanghai \
  --access-key-id LTAI5tXXXXXXXXXXXX \
  --access-key-secret XXXXXXXXXXXXXXXXXXXXXXXX

# Create an ECS instance
aliyun ecs CreateInstance \
  --RegionId cn-shanghai \
  --ZoneId cn-shanghai-b \
  --InstanceType ecs.g7.xlarge \
  --ImageId ubuntu_22_04_x64_20G_alibase_20230907.vhd \
  --SecurityGroupId sg-bp1abc123def456 \
  --VSwitchId vsw-bp1abc123 \
  --InstanceName web-server-01 \
  --HostName web-server-01 \
  --InternetMaxBandwidthOut 10 \
  --SystemDiskCategory cloud_essd \
  --SystemDiskSize 50 \
  --KeyPairName my-key-pair \
  --Tag.1.Key Environment \
  --Tag.1.Value production \
  --Tag.2.Key Team \
  --Tag.2.Value platform

# Start the instance (CreateInstance leaves a new instance in the Stopped state)
aliyun ecs StartInstance --InstanceId i-bp1abc123def456

# List running instances
aliyun ecs DescribeInstances \
  --RegionId cn-shanghai \
  --Status Running \
  --output cols=InstanceId,InstanceName,Status,PublicIpAddress rows='Instances.Instance[]'

# Stop an instance
aliyun ecs StopInstance --InstanceId i-bp1abc123def456

# Describe instance details
aliyun ecs DescribeInstanceAttribute --InstanceId i-bp1abc123def456
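The numbered Tag.N.Key / Tag.N.Value flags get tedious to write by hand. As a minimal sketch, a script that wraps the CLI could generate them from a dictionary. The helper name tag_flags is hypothetical, and the sketch only builds the argument list without invoking the CLI.

```python
from typing import Dict, List


def tag_flags(tags: Dict[str, str]) -> List[str]:
    """Expand a tag dictionary into --Tag.N.Key / --Tag.N.Value CLI arguments."""
    flags: List[str] = []
    for index, (key, value) in enumerate(tags.items(), start=1):
        flags += [f"--Tag.{index}.Key", key, f"--Tag.{index}.Value", value]
    return flags


if __name__ == "__main__":
    args = ["aliyun", "ecs", "CreateInstance", "--RegionId", "cn-shanghai"]
    args += tag_flags({"Environment": "production", "Team": "platform"})
    # Print the command instead of running it; pass `args` to subprocess.run to execute.
    print(" ".join(args))
```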

Disk Types

| Disk Type | Max IOPS | Max Throughput | Use Case | Cost (per GB/mo, China) |
|---|---|---|---|---|
| ESSD PL0 | 10,000 | 180 MB/s | Dev/test | ~CNY 0.50 |
| ESSD PL1 | 50,000 | 350 MB/s | Most production workloads | ~CNY 1.00 |
| ESSD PL2 | 100,000 | 750 MB/s | Database workloads | ~CNY 2.00 |
| ESSD PL3 | 1,000,000 | 4,000 MB/s | High-performance databases | ~CNY 4.00 |
| Cloud SSD | 25,000 | 300 MB/s | Standard SSD | ~CNY 1.00 |
| Cloud Efficiency | 5,000 | 140 MB/s | Bulk storage | ~CNY 0.35 |

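ESSD (Enhanced SSD) uses NVMe technology backed by RDMA networking. For production workloads, ESSD PL1 is the standard choice. Alibaba Cloud's ESSD performance levels are a notable advantage -- you can move a disk to a higher performance level without changing its capacity (within each level's size limits), although the IOPS a disk actually delivers still grows with its size up to the level's ceiling.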
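Choosing a performance level is mostly a lookup against the table above. The Python sketch below picks the cheapest ESSD level whose IOPS ceiling covers a requirement and estimates the monthly cost. The figures are the approximate ones from the table, and the sketch ignores the fact that delivered IOPS also depends on disk size.

```python
from typing import List, Tuple

# (level, max IOPS, approx CNY per GB per month) taken from the table above
ESSD_LEVELS: List[Tuple[str, int, float]] = [
    ("PL0", 10_000, 0.50),
    ("PL1", 50_000, 1.00),
    ("PL2", 100_000, 2.00),
    ("PL3", 1_000_000, 4.00),
]


def cheapest_level(required_iops: int) -> str:
    """Return the lowest ESSD level whose IOPS ceiling meets the requirement."""
    for level, max_iops, _price in ESSD_LEVELS:
        if required_iops <= max_iops:
            return level
    raise ValueError(f"no ESSD level supports {required_iops} IOPS")


def monthly_cost(level: str, size_gb: int) -> float:
    """Approximate monthly cost in CNY for a disk of the given level and size."""
    for name, _max_iops, price in ESSD_LEVELS:
        if name == level:
            return size_gb * price
    raise ValueError(f"unknown level: {level}")


if __name__ == "__main__":
    level = cheapest_level(40_000)
    print(level, monthly_cost(level, 500))
```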

Terraform Support

Most DevOps teams will manage Alibaba Cloud resources through Terraform, which has a mature Alibaba Cloud provider with excellent coverage:

terraform {
  required_providers {
    alicloud = {
      source  = "aliyun/alicloud"
      version = "~> 1.220"
    }
  }

  backend "oss" {
    bucket   = "terraform-state-prod"
    prefix   = "web-app"
    region   = "cn-shanghai"
    encrypt  = true
  }
}

provider "alicloud" {
  region = "cn-shanghai"
}

resource "alicloud_instance" "web_server" {
  instance_name        = "web-server-01"
  instance_type        = "ecs.g7.xlarge"
  image_id             = "ubuntu_22_04_x64_20G_alibase_20230907.vhd"
  security_groups      = [alicloud_security_group.web.id]
  vswitch_id           = alicloud_vswitch.app_a.id
  system_disk_category = "cloud_essd"
  system_disk_size     = 50

  internet_max_bandwidth_out = 10

  key_name = alicloud_key_pair.deployer.key_name

  user_data = base64encode(file("${path.module}/scripts/bootstrap.sh"))

  tags = {
    Environment = "production"
    Team        = "platform"
    ManagedBy   = "terraform"
  }
}

# Auto Scaling Group
resource "alicloud_ess_scaling_group" "web" {
  scaling_group_name = "web-scaling-group"
  min_size           = 2
  max_size           = 10
  desired_capacity   = 3
  vswitch_ids        = [alicloud_vswitch.app_a.id, alicloud_vswitch.app_b.id]
  removal_policies   = ["OldestScalingConfiguration", "OldestInstance"]
  multi_az_policy    = "BALANCE"

  lifecycle {
    ignore_changes = [desired_capacity]
  }
}

resource "alicloud_ess_scaling_rule" "scale_out" {
  scaling_group_id  = alicloud_ess_scaling_group.web.id
  scaling_rule_name = "scale-out-cpu"
  scaling_rule_type = "TargetTrackingScalingRule"
  target_value      = 70.0
  metric_name       = "CpuUtilization"
}
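To build intuition for what a target-tracking rule does with target_value = 70, the proportional model below is a useful approximation. It is a simplified sketch, not the algorithm Auto Scaling actually runs, which also applies cooldowns and alarm evaluation periods. The function name is hypothetical.

```python
import math


def target_tracking_capacity(
    current_instances: int,
    current_metric: float,
    target_value: float,
    min_size: int,
    max_size: int,
) -> int:
    """Estimate the instance count that brings the average metric back to target.

    Assumes load spreads evenly, so the metric scales inversely with capacity.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    wanted = math.ceil(current_instances * current_metric / target_value)
    return max(min_size, min(max_size, wanted))


if __name__ == "__main__":
    # 3 instances at 90% CPU against a 70% target, within the group's 2..10 bounds
    print(target_tracking_capacity(3, 90.0, 70.0, min_size=2, max_size=10))
```

The min_size and max_size bounds clamp the result, which is why the scaling group above never drops below 2 or grows past 10 instances regardless of the metric.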

ACK: Container Service for Kubernetes

ACK (Alibaba Cloud Container Service for Kubernetes) is the managed Kubernetes offering. It is fully CNCF-certified and comes in three flavors, each suited to different operational models.

ACK Variants

| Variant | Control Plane | Worker Nodes | Best For | Cost |
|---|---|---|---|---|
| ACK Managed | Alibaba manages | You manage (ECS) | Standard production use | Free control plane + ECS nodes |
| ACK Pro | Alibaba manages (enhanced SLA, etcd backup) | You manage (ECS) | Large-scale, mission-critical | ~CNY 3,600/yr + ECS nodes |
| ACK Serverless | Alibaba manages | Elastic Container Instances | Variable workloads, no node ops | Per-pod pricing |

ACK Pro includes features that matter for production: managed etcd with automatic backups, enhanced monitoring, Sandboxed-Container support for stronger isolation, and 99.95% SLA on the control plane.

ACK Networking

ACK supports two CNI plugins:

  • Flannel -- simple overlay network. Pods get IPs from a separate CIDR. Lower performance, simpler setup. Good for small clusters.
  • Terway -- Alibaba Cloud's advanced CNI. Pods get real VPC IP addresses (like AWS VPC CNI). Supports network policies natively. Better performance and security. Recommended for production.
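The choice of CNI changes how you size address space. With Terway every pod consumes a VSwitch address, while with Flannel the pod CIDR is carved into per-node blocks. The Python sketch below estimates both limits. The reserved-address count and the per-node mask are assumptions to verify against your own cluster settings.

```python
import ipaddress

# Assumption: the first address and the last three addresses of a VSwitch are reserved.
RESERVED_PER_VSWITCH = 4


def terway_usable_ips(vswitch_cidr: str) -> int:
    """Addresses in a VSwitch that pods and nodes can share under Terway."""
    network = ipaddress.ip_network(vswitch_cidr)
    return max(network.num_addresses - RESERVED_PER_VSWITCH, 0)


def flannel_max_nodes(pod_cidr: str, node_mask: int) -> int:
    """Number of per-node blocks of size /node_mask that fit in the pod CIDR."""
    network = ipaddress.ip_network(pod_cidr)
    if node_mask < network.prefixlen:
        raise ValueError("node_mask must not be shorter than the pod CIDR prefix")
    return 2 ** (node_mask - network.prefixlen)


if __name__ == "__main__":
    print("Terway, /20 VSwitch:", terway_usable_ips("10.0.32.0/20"))
    print("Flannel, /16 pod CIDR with /24 per node:", flannel_max_nodes("172.20.0.0/16", 24))
```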

# Create a managed Kubernetes cluster via CLI
aliyun cs CreateCluster \
  --ClusterType ManagedKubernetes \
  --Name ack-production \
  --RegionId cn-shanghai \
  --ZoneId cn-shanghai-b \
  --VpcId vpc-bp1abc123 \
  --VSwitchIds '["vsw-bp1abc123"]' \
  --ContainerCidr 172.20.0.0/16 \
  --ServiceCidr 172.21.0.0/20 \
  --NumOfNodes 3 \
  --WorkerInstanceTypes '["ecs.g7.xlarge"]' \
  --WorkerSystemDiskCategory cloud_essd \
  --WorkerSystemDiskSize 120 \
  --KeyPair my-key-pair \
  --SnatEntry true \
  --Addons '[{"name":"terway-eniip"},{"name":"csi-plugin"},{"name":"csi-provisioner"},{"name":"nginx-ingress-controller","config":"{\"IngressSlbNetworkType\":\"intranet\"}"}]'

With Terraform:

resource "alicloud_cs_managed_kubernetes" "production" {
  name         = "ack-production"
  cluster_spec = "ack.pro.small"
  version      = "1.28.9-aliyun.1"

  pod_cidr       = "172.20.0.0/16"
  service_cidr   = "172.21.0.0/20"
  slb_internet_enabled = false

  worker_vswitch_ids = [
    alicloud_vswitch.app_a.id,
    alicloud_vswitch.app_b.id,
  ]

  dynamic "addons" {
    for_each = [
      { name = "terway-eniip", config = "" },
      { name = "csi-plugin", config = "" },
      { name = "csi-provisioner", config = "" },
      { name = "nginx-ingress-controller", config = jsonencode({ IngressSlbNetworkType = "intranet" }) },
      { name = "arms-prometheus", config = "" }
    ]
    content {
      name   = addons.value.name
      config = addons.value.config
    }
  }

  maintenance_window {
    enable            = true
    maintenance_time  = "04:00:00Z"
    duration          = "4h"
    weekly_period     = "Saturday"
  }
}

resource "alicloud_cs_kubernetes_node_pool" "workers" {
  cluster_id           = alicloud_cs_managed_kubernetes.production.id
  name                 = "worker-pool"
  vswitch_ids          = [alicloud_vswitch.app_a.id, alicloud_vswitch.app_b.id]
  instance_types       = ["ecs.g7.xlarge"]
  system_disk_category = "cloud_essd"
  system_disk_size     = 120
  desired_size         = 3
  key_name             = alicloud_key_pair.deployer.key_name

  scaling_config {
    min_size = 2
    max_size = 10
  }

  labels = {
    "workload-type" = "production"
  }

  taints {
    key    = "dedicated"
    value  = "production"
    effect = "NoSchedule"
  }

  management {
    auto_repair  = true
    auto_upgrade = true
    max_unavailable = 1
  }
}

# Spot node pool for batch workloads
resource "alicloud_cs_kubernetes_node_pool" "spot_workers" {
  cluster_id     = alicloud_cs_managed_kubernetes.production.id
  name           = "spot-pool"
  vswitch_ids    = [alicloud_vswitch.app_a.id]
  instance_types = ["ecs.g7.xlarge", "ecs.g7.2xlarge"]
  desired_size   = 0

  spot_strategy    = "SpotWithPriceLimit"
  spot_price_limit {
    instance_type = "ecs.g7.xlarge"
    price_limit   = "0.5"
  }

  scaling_config {
    min_size = 0
    max_size = 20
  }

  labels = {
    "workload-type" = "batch"
  }

  taints {
    key    = "spot"
    value  = "true"
    effect = "NoSchedule"
  }
}

Kubernetes Cross-Cloud Comparison

| Feature | ACK (Alibaba) | EKS (AWS) | AKS (Azure) | GKE (GCP) |
|---|---|---|---|---|
| Control plane cost | Free (Managed) / ~CNY 3,600/yr (Pro) | $73/mo | Free | Free (Autopilot) / $73/mo (Standard) |
| Pod networking | Terway (VPC IPs) or Flannel | VPC CNI | Azure CNI or Kubenet | GKE VPC-native |
| Serverless pods | ECI | Fargate | ACI | Autopilot |
| Max nodes | 5,000 | 5,000 | 5,000 | 15,000 |
| China regions | 10+ | 2 (Beijing, Ningxia via NWCD/Sinnet) | 3 (China East, North, East 2) | 0 |
| Container runtime | containerd | containerd | containerd | containerd |
| Sandboxed containers | Yes (runV) | No (native) | No (native) | GKE Sandbox (gVisor) |

OSS: Object Storage Service

OSS is Alibaba Cloud's object storage, equivalent to AWS S3. It supports the same concepts: buckets, objects, storage classes, lifecycle policies, and cross-region replication. OSS also provides an S3-compatible API, making migrations from AWS easier.

Storage Classes

| Class | Use Case | Minimum Storage Duration | Monthly Cost (per GB, cn-shanghai) | Retrieval Fee |
|---|---|---|---|---|
| Standard | Frequent access | None | ~CNY 0.12 | None |
| Infrequent Access | Monthly access | 30 days | ~CNY 0.08 | CNY 0.0325/GB |
| Archive | Quarterly access | 60 days | ~CNY 0.033 | CNY 0.06/GB (1 min to restore) |
| Cold Archive | Rare access | 180 days | ~CNY 0.015 | CNY 0.10/GB (1-5 hours to restore) |
| Deep Cold Archive | Extremely rare | 180 days | ~CNY 0.0075 | CNY 0.14/GB (12 hours to restore) |
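Retrieval fees mean a cheaper class is not always cheaper in practice. As a rough sketch using the approximate prices in the table above, the Python below compares Standard with Infrequent Access for a given volume of monthly reads. It deliberately ignores request fees and minimum storage duration charges.

```python
STANDARD_PER_GB = 0.12       # ~CNY per GB per month
IA_PER_GB = 0.08             # ~CNY per GB per month
IA_RETRIEVAL_PER_GB = 0.0325  # CNY per GB retrieved


def monthly_cost_standard(stored_gb: float) -> float:
    """Storage-only monthly cost in the Standard class."""
    return stored_gb * STANDARD_PER_GB


def monthly_cost_ia(stored_gb: float, retrieved_gb: float) -> float:
    """Monthly cost in Infrequent Access, including retrieval fees."""
    return stored_gb * IA_PER_GB + retrieved_gb * IA_RETRIEVAL_PER_GB


def ia_is_cheaper(stored_gb: float, retrieved_gb: float) -> bool:
    """True when Infrequent Access costs less than Standard for this usage."""
    return monthly_cost_ia(stored_gb, retrieved_gb) < monthly_cost_standard(stored_gb)


if __name__ == "__main__":
    print(ia_is_cheaper(stored_gb=1000, retrieved_gb=500))   # light reads
    print(ia_is_cheaper(stored_gb=1000, retrieved_gb=2000))  # heavy reads
```

With these prices the crossover sits at roughly 1.23 GB retrieved per GB stored each month, so Infrequent Access only loses out when data is read back more than about once a month.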

OSS Operations

# Create a bucket
aliyun oss mb oss://prod-app-data --region cn-shanghai --storage-class Standard

# Upload files
aliyun oss cp ./dist/ oss://prod-app-data/assets/ --recursive

# Sync a directory (like aws s3 sync)
aliyun oss sync ./build/ oss://prod-app-data/static/ --delete --include '*.js' --include '*.css'

# Download files
aliyun oss cp oss://prod-app-data/config/app.yaml ./config/

# Set lifecycle rules (the lifecycle command reads an XML configuration file)
cat > lifecycle.xml <<'EOF'
<LifecycleConfiguration>
  <Rule>
    <ID>ArchiveOldLogs</ID>
    <Prefix>logs/</Prefix>
    <Status>Enabled</Status>
    <Transition><Days>30</Days><StorageClass>IA</StorageClass></Transition>
    <Transition><Days>90</Days><StorageClass>Archive</StorageClass></Transition>
    <Transition><Days>365</Days><StorageClass>ColdArchive</StorageClass></Transition>
    <Expiration><Days>1095</Days></Expiration>
  </Rule>
</LifecycleConfiguration>
EOF
aliyun oss lifecycle --method put oss://prod-app-logs lifecycle.xml

# Enable versioning
aliyun oss bucket-versioning --method put oss://prod-terraform-state enabled

# Configure cross-region replication for DR (also an XML configuration file)
cat > replication.xml <<'EOF'
<ReplicationConfiguration>
  <Rule>
    <Action>ALL</Action>
    <Destination>
      <Bucket>prod-app-data-dr</Bucket>
      <Location>oss-cn-beijing</Location>
    </Destination>
  </Rule>
</ReplicationConfiguration>
EOF
aliyun oss replication --method put oss://prod-app-data replication.xml
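It can help to check what a lifecycle rule will do to an object of a given age before applying it. The Python sketch below models the ArchiveOldLogs rule (IA at 30 days, Archive at 90, Cold Archive at 365, deletion at 1095). It is an illustration of the rule's intent, since OSS applies lifecycle actions in periodic batches rather than at the exact day boundary.

```python
from typing import List, Optional, Tuple

# (age in days, storage class) mirroring the ArchiveOldLogs rule
TRANSITIONS: List[Tuple[int, str]] = [
    (30, "IA"),
    (90, "Archive"),
    (365, "ColdArchive"),
]
EXPIRATION_DAYS = 1095


def storage_class_at(age_days: int) -> Optional[str]:
    """Return the class an object should be in at this age, or None once expired."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days >= EXPIRATION_DAYS:
        return None
    current = "Standard"
    for threshold, storage_class in TRANSITIONS:
        if age_days >= threshold:
            current = storage_class
    return current


if __name__ == "__main__":
    for age in (0, 30, 100, 400, 1095):
        print(age, storage_class_at(age))
```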

With Terraform:

resource "alicloud_oss_bucket" "app_data" {
  bucket = "prod-app-data"
  acl    = "private"

  server_side_encryption_rule {
    sse_algorithm = "AES256"
  }

  lifecycle_rule {
    id      = "archive-old-data"
    enabled = true
    prefix  = "logs/"

    transitions {
      days          = 30
      storage_class = "IA"
    }
    transitions {
      days          = 90
      storage_class = "Archive"
    }
    expiration {
      days = 1095
    }
  }

  versioning {
    status = "Enabled"
  }

  cors_rule {
    allowed_origins = ["https://app.example.com"]
    allowed_methods = ["GET", "HEAD"]
    allowed_headers = ["*"]
    max_age_seconds = 3600
  }

  tags = {
    Environment = "production"
    Team        = "platform"
  }
}

S3 Compatibility

OSS provides an S3-compatible endpoint, which means tools like aws s3, boto3, and other S3 SDKs can work with OSS by changing the endpoint:

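# Use AWS CLI with OSS (S3-compatible endpoint).
# Credentials are your Alibaba Cloud AccessKey pair. OSS expects
# virtual-hosted-style requests, so set the addressing style first.
aws configure set default.s3.addressing_style virtual

aws s3 ls s3://prod-app-data \
  --endpoint-url https://oss-cn-shanghai.aliyuncs.com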

aws s3 cp ./file.txt s3://prod-app-data/uploads/ \
  --endpoint-url https://oss-cn-shanghai.aliyuncs.com

This compatibility simplifies migrations and allows teams to use familiar tools.
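When the same tooling targets several regions, it is convenient to derive the endpoint rather than hard-code it. The helper below is a hypothetical sketch that follows the oss-<region>.aliyuncs.com pattern used above. The "-internal" variant is the intranet endpoint, which is only reachable from resources inside that region.

```python
def oss_endpoint(region: str, internal: bool = False) -> str:
    """Build the OSS endpoint URL for a region such as "cn-shanghai"."""
    if not region:
        raise ValueError("region must not be empty")
    suffix = "-internal" if internal else ""
    return f"https://oss-{region}{suffix}.aliyuncs.com"


if __name__ == "__main__":
    print(oss_endpoint("cn-shanghai"))
    print(oss_endpoint("cn-beijing", internal=True))
```

The resulting URL can be passed as the endpoint of any S3-compatible client, in the same way as the --endpoint-url flag above.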

VPC and Networking

Alibaba Cloud VPCs follow the same regional model as AWS. A VPC contains VSwitches (their term for subnets), and security groups control traffic. Each VSwitch maps to a single availability zone.

Network Architecture

VPC: 10.0.0.0/8 (vpc-production, cn-shanghai)
|-- VSwitch: vsw-web-a   (10.0.1.0/24)  -- cn-shanghai-a
|-- VSwitch: vsw-web-b   (10.0.2.0/24)  -- cn-shanghai-b
|-- VSwitch: vsw-app-a   (10.0.11.0/24) -- cn-shanghai-a
|-- VSwitch: vsw-app-b   (10.0.12.0/24) -- cn-shanghai-b
|-- VSwitch: vsw-data-a  (10.0.21.0/24) -- cn-shanghai-a
|-- VSwitch: vsw-data-b  (10.0.22.0/24) -- cn-shanghai-b
+-- VSwitch: vsw-k8s     (10.0.32.0/20) -- cn-shanghai-b
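A layout like this is easy to get subtly wrong as it grows, so it is worth validating before handing it to Terraform. The Python sketch below checks that every VSwitch CIDR sits inside the VPC range and that no two VSwitches overlap. It uses only the standard library, and the function name is hypothetical.

```python
import ipaddress
from itertools import combinations
from typing import Dict, List

VPC_CIDR = "10.0.0.0/8"
VSWITCHES: Dict[str, str] = {
    "vsw-web-a": "10.0.1.0/24",
    "vsw-web-b": "10.0.2.0/24",
    "vsw-app-a": "10.0.11.0/24",
    "vsw-app-b": "10.0.12.0/24",
    "vsw-data-a": "10.0.21.0/24",
    "vsw-data-b": "10.0.22.0/24",
    "vsw-k8s": "10.0.32.0/20",
}


def validate_layout(vpc_cidr: str, vswitches: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the plan is consistent."""
    problems: List[str] = []
    vpc = ipaddress.ip_network(vpc_cidr)
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in vswitches.items()}

    for name, network in networks.items():
        if not network.subnet_of(vpc):
            problems.append(f"{name} ({network}) is outside the VPC range {vpc}")

    for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{name_a} ({net_a}) overlaps {name_b} ({net_b})")

    return problems


if __name__ == "__main__":
    issues = validate_layout(VPC_CIDR, VSWITCHES)
    print("layout OK" if not issues else "\n".join(issues))
```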

VPC and Security Groups with Terraform

resource "alicloud_vpc" "production" {
  vpc_name   = "vpc-production"
  cidr_block = "10.0.0.0/8"

  tags = {
    Environment = "production"
  }
}

resource "alicloud_vswitch" "app_a" {
  vswitch_name = "vsw-app-a"
  vpc_id       = alicloud_vpc.production.id
  cidr_block   = "10.0.11.0/24"
  zone_id      = "cn-shanghai-a"
}

resource "alicloud_vswitch" "app_b" {
  vswitch_name = "vsw-app-b"
  vpc_id       = alicloud_vpc.production.id
  cidr_block   = "10.0.12.0/24"
  zone_id      = "cn-shanghai-b"
}

resource "alicloud_security_group" "web" {
  name        = "sg-web-servers"
  vpc_id      = alicloud_vpc.production.id
  description = "Security group for web-facing servers"
}

resource "alicloud_security_group_rule" "allow_https" {
  type              = "ingress"
  ip_protocol       = "tcp"
  port_range        = "443/443"
  cidr_ip           = "0.0.0.0/0"
  security_group_id = alicloud_security_group.web.id
  description       = "Allow HTTPS from internet"
}

resource "alicloud_security_group_rule" "allow_http" {
  type              = "ingress"
  ip_protocol       = "tcp"
  port_range        = "80/80"
  cidr_ip           = "0.0.0.0/0"
  security_group_id = alicloud_security_group.web.id
  description       = "Allow HTTP from internet (redirect to HTTPS)"
}

resource "alicloud_security_group_rule" "allow_internal" {
  type              = "ingress"
  ip_protocol       = "tcp"
  port_range        = "1/65535"
  cidr_ip           = "10.0.0.0/8"
  security_group_id = alicloud_security_group.web.id
  description       = "Allow all TCP traffic from inside the VPC"
}

# NAT Gateway for private subnet internet access
resource "alicloud_nat_gateway" "production" {
  vpc_id           = alicloud_vpc.production.id
  nat_gateway_name = "nat-production"
  payment_type     = "PayAsYouGo"
  vswitch_id       = alicloud_vswitch.app_a.id
  nat_type         = "Enhanced"
}

resource "alicloud_eip_address" "nat" {
  address_name = "eip-nat-production"
  bandwidth    = 200
  payment_type = "PayAsYouGo"
}

resource "alicloud_eip_association" "nat" {
  allocation_id = alicloud_eip_address.nat.id
  instance_id   = alicloud_nat_gateway.production.id
  instance_type = "Nat"
}

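resource "alicloud_snat_entry" "app" {
  # Wait until the EIP is bound to the NAT gateway before using it for SNAT
  depends_on        = [alicloud_eip_association.nat]
  snat_table_id     = alicloud_nat_gateway.production.snat_table_ids
  source_vswitch_id = alicloud_vswitch.app_a.id
  snat_ip           = alicloud_eip_address.nat.ip_address
}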
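The ingress rules above are easier to reason about when you can test them against sample traffic. The Python sketch below is a simplified model of the three TCP rules on the web security group. Real security groups also support priorities, deny rules, and other protocols, none of which are modelled here.

```python
import ipaddress
from typing import List, Tuple

# (source CIDR, port range) for the three TCP ingress rules defined above
INGRESS_RULES: List[Tuple[str, str]] = [
    ("0.0.0.0/0", "443/443"),
    ("0.0.0.0/0", "80/80"),
    ("10.0.0.0/8", "1/65535"),
]


def parse_port_range(port_range: str) -> Tuple[int, int]:
    """Parse the "start/end" port range format used by security group rules."""
    low, high = port_range.split("/")
    return int(low), int(high)


def is_allowed(source_ip: str, port: int, rules: List[Tuple[str, str]] = INGRESS_RULES) -> bool:
    """True if any allow rule matches the source address and destination port."""
    address = ipaddress.ip_address(source_ip)
    for cidr, port_range in rules:
        low, high = parse_port_range(port_range)
        if address in ipaddress.ip_network(cidr) and low <= port <= high:
            return True
    return False


if __name__ == "__main__":
    print(is_allowed("198.51.100.7", 443))  # public HTTPS
    print(is_allowed("198.51.100.7", 22))   # public SSH
    print(is_allowed("10.0.11.5", 22))      # SSH from inside the VPC
```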

VPC Peering and CEN

For multi-VPC and multi-region connectivity, Alibaba Cloud offers Cloud Enterprise Network (CEN), equivalent to AWS Transit Gateway. CEN provides a global network mesh that connects VPCs across regions with automatic route distribution.

resource "alicloud_cen_instance" "global_network" {
  cen_instance_name = "cen-global"
  description       = "Global network for all production VPCs"
}

resource "alicloud_cen_instance_attachment" "shanghai" {
  instance_id              = alicloud_cen_instance.global_network.id
  child_instance_id        = alicloud_vpc.production.id
  child_instance_type      = "VPC"
  child_instance_region_id = "cn-shanghai"
}

# Assumes a second VPC, alicloud_vpc.production_beijing, is defined in cn-beijing (not shown)
resource "alicloud_cen_instance_attachment" "beijing" {
  instance_id              = alicloud_cen_instance.global_network.id
  child_instance_id        = alicloud_vpc.production_beijing.id
  child_instance_type      = "VPC"
  child_instance_region_id = "cn-beijing"
}

Networking Comparison

| Feature | Alibaba Cloud | AWS | Azure | GCP |
|---|---|---|---|---|
| Virtual network | VPC (regional) | VPC (regional) | VNet (regional) | VPC (global) |
| Subnet | VSwitch | Subnet | Subnet | Subnet (regional) |
| Firewall | Security Groups | Security Groups + NACLs | NSGs | Firewall Rules |
| NAT | NAT Gateway | NAT Gateway | NAT Gateway | Cloud NAT |
| Global transit | CEN | Transit Gateway | Virtual WAN | VPC (native global) |
| Private link | PrivateLink | PrivateLink | Private Endpoint | Private Service Connect |
| DDoS | Anti-DDoS Basic/Pro | AWS Shield | Azure DDoS Protection | Cloud Armor |
| DNS | Alibaba Cloud DNS | Route 53 | Azure DNS | Cloud DNS |

SLB: Server Load Balancer

SLB is Alibaba Cloud's load balancing service. The product line has evolved to include multiple options:

| Product | Layer | Scope | Use Case |
|---|---|---|---|
| CLB (Classic LB) | L4/L7 | Regional | Legacy, basic load balancing |
| ALB (Application LB) | L7 | Regional | Advanced HTTP routing, WAF integration |
| NLB (Network LB) | L4 | Regional | High-performance TCP/UDP |
| GA (Global Accelerator) | L4/L7 | Global | Cross-region acceleration, anycast |

For new deployments, use ALB for HTTP/HTTPS traffic and NLB for TCP/UDP. CLB is being superseded.

resource "alicloud_alb_load_balancer" "web" {
  vpc_id                 = alicloud_vpc.production.id
  address_type           = "Internet"
  address_allocated_mode = "Dynamic"
  load_balancer_name     = "alb-web-production"
  load_balancer_edition  = "Standard"

  load_balancer_billing_config {
    pay_type = "PayAsYouGo"
  }

  zone_mappings {
    vswitch_id = alicloud_vswitch.app_a.id
    zone_id    = "cn-shanghai-a"
  }

  zone_mappings {
    vswitch_id = alicloud_vswitch.app_b.id
    zone_id    = "cn-shanghai-b"
  }

  tags = {
    Environment = "production"
  }
}

resource "alicloud_alb_server_group" "web" {
  server_group_name = "sg-web-production"
  vpc_id            = alicloud_vpc.production.id
  protocol          = "HTTPS"

  health_check_config {
    health_check_enabled      = true
    health_check_path         = "/health"
    health_check_codes        = ["http_2xx", "http_3xx"]
    health_check_interval     = 5
    healthy_threshold         = 3
    unhealthy_threshold       = 3
  }

  sticky_session_config {
    sticky_session_enabled = true
    sticky_session_type    = "Insert"
    cookie_timeout         = 3600
  }
}

RAM: Resource Access Management

RAM is Alibaba Cloud's identity and access management service, equivalent to AWS IAM. The policy language is similar but simpler.

Key Concepts

| Concept | Alibaba Cloud RAM | AWS IAM Equivalent |
|---|---|---|
| User | RAM User | IAM User |
| Group | RAM Group | IAM Group |
| Role | RAM Role | IAM Role |
| Policy | RAM Policy | IAM Policy |
| Instance Role | Instance RAM Role | Instance Profile |
| STS | STS (Security Token Service) | STS |
| SSO | IDaaS / SAML 2.0 | IAM Identity Center |

# Create a RAM user for CI/CD
aliyun ram CreateUser --UserName cicd-deployer --DisplayName "CI/CD Deployer"

# Create an access key for the user
aliyun ram CreateAccessKey --UserName cicd-deployer

# Create a custom policy with least privilege
aliyun ram CreatePolicy \
  --PolicyName CustomDeployPolicy \
  --PolicyDocument '{
    "Version": "1",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": [
          "ecs:DescribeInstances",
          "ecs:StartInstance",
          "ecs:StopInstance",
          "oss:GetObject",
          "oss:PutObject",
          "oss:ListObjects",
          "cs:GetClusterById",
          "cs:GetUserClusterKubeConfig",
          "cr:GetRepository",
          "cr:PushRepository"
        ],
        "Resource": "*",
        "Condition": {
          "IpAddress": {
            "acs:SourceIp": ["203.0.113.0/24"]
          }
        }
      }
    ]
  }'

# Attach the policy to the user
aliyun ram AttachPolicyToUser \
  --PolicyType Custom \
  --PolicyName CustomDeployPolicy \
  --UserName cicd-deployer
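To make the effect of the policy concrete, the Python sketch below evaluates a trimmed copy of it against an action and a caller IP. It is a deliberately simplified model (Allow/Deny on actions plus the IpAddress condition on acs:SourceIp) and does not reproduce RAM's full evaluation logic. The function name is hypothetical.

```python
import ipaddress
from fnmatch import fnmatchcase
from typing import Any, Dict

# Trimmed copy of the policy document above
POLICY: Dict[str, Any] = {
    "Version": "1",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["ecs:DescribeInstances", "ecs:StartInstance", "oss:GetObject"],
            "Resource": "*",
            "Condition": {"IpAddress": {"acs:SourceIp": ["203.0.113.0/24"]}},
        }
    ],
}


def is_allowed(policy: Dict[str, Any], action: str, source_ip: str) -> bool:
    """Simplified check: an explicit Deny wins, otherwise at least one Allow must match."""
    address = ipaddress.ip_address(source_ip)
    allowed = False
    for statement in policy["Statement"]:
        # Action patterns may contain wildcards such as "ecs:Describe*"
        if not any(fnmatchcase(action, pattern) for pattern in statement["Action"]):
            continue
        cidrs = statement.get("Condition", {}).get("IpAddress", {}).get("acs:SourceIp", [])
        if cidrs and not any(address in ipaddress.ip_network(cidr) for cidr in cidrs):
            continue
        if statement["Effect"] == "Deny":
            return False
        allowed = True
    return allowed


if __name__ == "__main__":
    print(is_allowed(POLICY, "ecs:StartInstance", "203.0.113.10"))
    print(is_allowed(POLICY, "ecs:StartInstance", "198.51.100.1"))
```

The second call shows why the source IP condition matters for CI/CD keys: a leaked AccessKey is of little use outside the allowed range.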

Instance RAM Roles

For production workloads, use RAM Roles (instance roles) instead of access keys, just like AWS IAM roles for EC2:

resource "alicloud_ram_role" "ecs_role" {
  name        = "ECSAppServerRole"
  description = "Role for application servers"
  document    = jsonencode({
    Version   = "1"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = ["ecs.aliyuncs.com"]
        }
      }
    ]
  })
}

resource "alicloud_ram_policy" "app_policy" {
  policy_name = "AppServerPolicy"
  policy_document = jsonencode({
    Version   = "1"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "oss:GetObject",
          "oss:ListObjects",
          "kms:Decrypt",
          "log:PostLogStoreLogs"
        ]
        Resource = "*"
      }
    ]
  })
}

resource "alicloud_ram_role_policy_attachment" "app_attach" {
  role_name   = alicloud_ram_role.ecs_role.name
  policy_name = alicloud_ram_policy.app_policy.policy_name
  policy_type = "Custom"
}

STS for Temporary Credentials

# Assume a role for temporary credentials
aliyun sts AssumeRole \
  --RoleArn acs:ram::123456789012:role/CrossAccountRole \
  --RoleSessionName deploy-session \
  --DurationSeconds 3600
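Role ARNs follow a fixed shape, which makes them easy to validate before calling AssumeRole. The Python sketch below splits an ARN of the form shown above into its account ID and role name. The helper is hypothetical and only handles role ARNs.

```python
from typing import NamedTuple


class RoleArn(NamedTuple):
    account_id: str
    role_name: str


def parse_role_arn(arn: str) -> RoleArn:
    """Parse "acs:ram::<account-id>:role/<role-name>" into its parts."""
    parts = arn.split(":")
    # Expected pieces: ["acs", "ram", "", "<account-id>", "role/<role-name>"]
    if len(parts) != 5 or parts[:3] != ["acs", "ram", ""]:
        raise ValueError(f"not a RAM ARN: {arn!r}")
    account_id, resource = parts[3], parts[4]
    if not account_id.isdigit() or not resource.startswith("role/"):
        raise ValueError(f"not a RAM role ARN: {arn!r}")
    role_name = resource[len("role/"):]
    if not role_name:
        raise ValueError(f"missing role name: {arn!r}")
    return RoleArn(account_id=account_id, role_name=role_name)


if __name__ == "__main__":
    print(parse_role_arn("acs:ram::123456789012:role/CrossAccountRole"))
```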

ApsaraDB RDS

Alibaba Cloud's managed database service supports MySQL, PostgreSQL, SQL Server, and MariaDB. It is feature-rich and competitively priced, especially in China regions.

resource "alicloud_db_instance" "production" {
  engine               = "PostgreSQL"
  engine_version       = "15.0"
  instance_type        = "pg.x2.medium.2c"
  instance_storage     = 100
  instance_charge_type = "Postpaid"
  vswitch_id           = alicloud_vswitch.data_a.id
  security_ips         = ["10.0.0.0/8"]

  db_instance_storage_type = "cloud_essd"

  parameters {
    name  = "max_connections"
    value = "500"
  }

  tags = {
    Environment = "production"
    Team        = "platform"
  }
}

resource "alicloud_db_readonly_instance" "replica" {
  master_db_instance_id = alicloud_db_instance.production.id
  engine_version        = "15.0"
  instance_type         = "pg.x2.medium.2c"
  instance_storage      = 100
  vswitch_id            = alicloud_vswitch.data_b.id

  db_instance_storage_type = "cloud_essd"
}

Database Comparison

| Feature | ApsaraDB RDS | PolarDB | AnalyticDB |
|---|---|---|---|
| Type | Standard RDBMS | Cloud-native RDBMS | OLAP/HTAP |
| Engines | MySQL, PostgreSQL, SQL Server, MariaDB | MySQL, PostgreSQL, Oracle-compatible | MySQL, PostgreSQL |
| Max storage | 32 TB | 128 TB | Unlimited |
| Read replicas | Up to 5 | Up to 15 (shared storage) | N/A |
| Failover time | 30 seconds | Seconds (shared storage) | N/A |
| Best for | Standard workloads | High-performance, large-scale | Analytics, reporting |

Container Registry (ACR)

Alibaba Cloud Container Registry provides Docker image hosting, vulnerability scanning, and image signing. The Enterprise Edition includes geo-replication across regions -- critical for multi-region deployments in China.

ACR Editions

| Feature | Personal Edition | Enterprise Basic | Enterprise Standard | Enterprise Advanced |
|---|---|---|---|---|
| Private repos | 300 | 1,000 | 5,000 | Unlimited |
| Image scanning | No | Yes | Yes | Yes |
| Geo-replication | No | No | Yes (3 regions) | Yes (unlimited) |
| Image signing | No | No | No | Yes |
| Cost | Free | ~CNY 60/mo | ~CNY 200/mo | ~CNY 400/mo |

# Login to ACR
docker login --username=your-username registry.cn-shanghai.aliyuncs.com

# Tag and push
docker tag webapp:latest registry.cn-shanghai.aliyuncs.com/myorg/webapp:v1.2.3
docker push registry.cn-shanghai.aliyuncs.com/myorg/webapp:v1.2.3

# List images
aliyun cr GetRepoTags \
  --RepoNamespace myorg \
  --RepoName webapp \
  --output cols=tag,imageCreate,imageSize
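Image references for ACR follow a predictable pattern, so deployment scripts can build them instead of hard-coding strings. The Python sketch below assembles a reference in the registry.<region>.aliyuncs.com form used above. The vpc flag switches to the registry-vpc host for pulls from inside a VPC, which is an assumption to check against the endpoints shown in your ACR console.

```python
def image_ref(region: str, namespace: str, repo: str, tag: str, vpc: bool = False) -> str:
    """Build a full image reference for an ACR repository."""
    for label, value in (("region", region), ("namespace", namespace), ("repo", repo), ("tag", tag)):
        if not value:
            raise ValueError(f"{label} must not be empty")
    host_prefix = "registry-vpc" if vpc else "registry"
    return f"{host_prefix}.{region}.aliyuncs.com/{namespace}/{repo}:{tag}"


if __name__ == "__main__":
    print(image_ref("cn-shanghai", "myorg", "webapp", "v1.2.3"))
    print(image_ref("cn-shanghai", "myorg", "webapp", "v1.2.3", vpc=True))
```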

With Terraform (Enterprise Edition):

resource "alicloud_cr_ee_instance" "registry" {
  instance_type   = "Standard"
  instance_name   = "acr-production"
  payment_type    = "Subscription"
  renewal_status  = "AutoRenewal"
  period          = 12
}

resource "alicloud_cr_ee_namespace" "production" {
  instance_id        = alicloud_cr_ee_instance.registry.id
  name               = "production"
  auto_create        = true
  default_visibility = "PRIVATE"
}

resource "alicloud_cr_ee_repo" "webapp" {
  instance_id = alicloud_cr_ee_instance.registry.id
  namespace   = alicloud_cr_ee_namespace.production.name
  name        = "webapp"
  repo_type   = "PRIVATE"
  summary     = "Production web application"
}

DevOps Pipeline on Alibaba Cloud

Alibaba Cloud offers Flow (their native CI/CD platform), but most international teams use familiar tools with Alibaba Cloud integrations. The key challenge is network access -- builds running outside China will be slow pulling images and packages from Chinese registries.

GitHub Actions with Alibaba Cloud

name: Deploy to Alibaba Cloud

on:
  push:
    branches: [main]

env:
  REGION: cn-shanghai
  ACR_REGISTRY: registry.cn-shanghai.aliyuncs.com
  ACR_NAMESPACE: production

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci && npm run lint && npm test

  build-and-push:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Login to ACR
        uses: docker/login-action@v3
        with:
          registry: ${{ env.ACR_REGISTRY }}
          username: ${{ secrets.ACR_USERNAME }}
          password: ${{ secrets.ACR_PASSWORD }}

      - name: Build and Push
        uses: docker/build-push-action@v5
        with:
          push: true
          tags: |
            ${{ env.ACR_REGISTRY }}/${{ env.ACR_NAMESPACE }}/webapp:${{ github.sha }}
            ${{ env.ACR_REGISTRY }}/${{ env.ACR_NAMESPACE }}/webapp:latest
          cache-from: type=registry,ref=${{ env.ACR_REGISTRY }}/${{ env.ACR_NAMESPACE }}/webapp:latest
          cache-to: type=inline

  deploy:
    needs: build-and-push
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up aliyun CLI
        uses: aliyun/setup-cli@v1
        with:
          aliyun-access-key-id: ${{ secrets.ALICLOUD_ACCESS_KEY }}
          aliyun-access-key-secret: ${{ secrets.ALICLOUD_SECRET_KEY }}
          aliyun-region: ${{ env.REGION }}

      - name: Deploy to ACK
        run: |
          # Get kubeconfig
          aliyun cs GetUserClusterKubeConfig \
            --ClusterId ${{ secrets.ACK_CLUSTER_ID }} \
            | jq -r '.config' > kubeconfig
          export KUBECONFIG=./kubeconfig

          # Update deployment image
          kubectl set image deployment/webapp \
            webapp=${{ env.ACR_REGISTRY }}/${{ env.ACR_NAMESPACE }}/webapp:${{ github.sha }} \
            -n production

          # Wait for rollout
          kubectl rollout status deployment/webapp -n production --timeout=300s

Dealing with Network Challenges

When your CI/CD runs outside China (e.g., GitHub Actions), consider these strategies:

  1. Mirror npm/pip registries -- Use Alibaba Cloud's npm mirror (npmmirror.com) and pip mirror in your Dockerfile.
  2. Use ACR image acceleration -- Enterprise Edition ACR supports acceleration from international networks.
  3. Staged image delivery -- Build outside China, push to an international ACR region, then use geo-replication to sync to China regions.
  4. Self-hosted runners in China -- Run GitHub Actions runners on ECS instances in China for the fastest builds.
# Dockerfile optimized for China builds
FROM node:20-alpine

# Use Alibaba Cloud npm mirror for faster package installation
RUN npm config set registry https://registry.npmmirror.com

# Use Alibaba Cloud Alpine mirror
RUN sed -i 's/dl-cdn.alpinelinux.org/mirrors.aliyun.com/g' /etc/apk/repositories

WORKDIR /app
COPY package*.json ./
# Install production dependencies only (--production is deprecated in current npm)
RUN npm ci --omit=dev
COPY . .
EXPOSE 8080
CMD ["node", "server.js"]
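
The Dockerfile above covers npm and Alpine packages, but the list also mentions a pip mirror. A minimal sketch for Python builds follows, assuming the public Alibaba Cloud PyPI mirror at mirrors.aliyun.com. The `write_pip_mirror_conf` helper name and the default path are illustrative.

```shell
# Sketch: write a pip config that points at the Alibaba Cloud PyPI mirror.
# The target path is a parameter so it can be tried on a scratch file first.
write_pip_mirror_conf() {
  local target="${1:-$HOME/.config/pip/pip.conf}"
  mkdir -p "$(dirname "$target")"
  cat > "$target" <<'EOF'
[global]
index-url = https://mirrors.aliyun.com/pypi/simple/
EOF
}
```

Inside a Python Dockerfile, the same effect can usually be had with a single `RUN pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/` line before `pip install`.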

AWS to Alibaba Cloud Service Mapping

If you are coming from AWS, this mapping helps you navigate Alibaba Cloud:

| AWS Service | Alibaba Cloud Equivalent | Notes |
|---|---|---|
| EC2 | ECS | Very similar API model |
| S3 | OSS | S3-compatible API available |
| VPC | VPC | Nearly identical concepts (VSwitches instead of Subnets) |
| IAM | RAM | Similar but simpler policy language |
| EKS | ACK | ACK Pro adds enhanced SLA |
| RDS | ApsaraDB RDS | Supports MySQL, PostgreSQL, SQL Server |
| Aurora | PolarDB | Cloud-native distributed database |
| Lambda | Function Compute | Supports Node.js, Python, Java, Go, PHP, C# |
| CloudWatch | CloudMonitor + SLS (Log Service) | SLS is particularly powerful for log analytics |
| Route 53 | Alibaba Cloud DNS | ICP filing integrated |
| CloudFront | Alibaba Cloud CDN / DCDN | Essential for China delivery, DCDN adds edge compute |
| ELB/ALB | CLB/ALB/NLB | CLB (legacy), ALB for L7, NLB for L4 |
| ECR | ACR | Enterprise edition has geo-replication |
| Systems Manager | Cloud Assistant | Remote command execution on ECS |
| CloudFormation | ROS (Resource Orchestration) | Or just use Terraform (recommended) |
| CodePipeline | Flow | Most international teams use GitHub Actions |
| KMS | KMS | Key management and encryption |
| ElastiCache | Tair (ApsaraDB for Redis) | Redis-compatible, enhanced features |
| SQS | Message Queue (MQ) | Supports MQTT, RocketMQ, Kafka, RabbitMQ |
| API Gateway | API Gateway | Includes China-specific auth integrations |
| WAF | Web Application Firewall | Includes China-specific threat intelligence |
| Transit Gateway | CEN (Cloud Enterprise Network) | Global network mesh |

Cost Optimization on Alibaba Cloud

Pricing Differences from Western Clouds

Alibaba Cloud pricing in China regions is typically 20-40% lower than equivalent AWS services in the US. International regions, however, are priced on par with or slightly above AWS.

Key Strategies

  1. Use Subscription pricing for predictable workloads. 1-year subscriptions save 15-30%, 3-year saves 40-60%.
  2. Preemptible instances for batch and CI/CD. Up to 90% savings.
  3. Reserved Instances for flexible commitment across instance types.
  4. Storage tiering with OSS lifecycle policies. Cold data in Archive or Cold Archive is extremely cheap.
  5. Right-size ECS instances. CloudMonitor provides utilization reports.
  6. Use CEN instead of multiple EIPs for inter-region traffic. CEN pricing is more predictable.
  7. Serve content through Alibaba Cloud CDN instead of directly from the origin. Bandwidth in China is expensive, and a CDN offloads much of it.
  8. Preemptible (spot) instances for ACK node pools. Scale batch workloads on preemptible nodes.
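
Storage tiering (strategy 4) is configured through an OSS lifecycle rule. The sketch below generates one in the OSS lifecycle XML format, assuming a hypothetical `logs/` prefix and illustrative day thresholds. The `write_lifecycle_xml` helper is not an Alibaba Cloud tool, only a convenience for this example.

```shell
# Sketch: generate an OSS lifecycle rule that tiers objects under a prefix.
# Prefix and day thresholds are illustrative assumptions, not recommendations.
write_lifecycle_xml() {
  local out="$1" prefix="$2" ia_days="$3" archive_days="$4"
  cat > "$out" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<LifecycleConfiguration>
  <Rule>
    <ID>tier-${prefix%/}</ID>
    <Prefix>${prefix}</Prefix>
    <Status>Enabled</Status>
    <Transition>
      <Days>${ia_days}</Days>
      <StorageClass>IA</StorageClass>
    </Transition>
    <Transition>
      <Days>${archive_days}</Days>
      <StorageClass>Archive</StorageClass>
    </Transition>
  </Rule>
</LifecycleConfiguration>
EOF
}
```

For example, `write_lifecycle_xml lifecycle.xml logs/ 30 90` moves objects under `logs/` to Infrequent Access after 30 days and to Archive after 90. The file would then be applied to a bucket with the OSS tooling, likely something along the lines of `aliyun oss lifecycle --method put oss://my-bucket lifecycle.xml` (the bucket name is a placeholder, and the exact command should be checked against the current CLI documentation).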
# Check current costs, grouped by product
aliyun bssopenapi QueryAccountBill \
  --BillingCycle 2026-03 \
  --Granularity MONTHLY \
  --IsGroupByProduct true \
  --output cols=ProductCode,PretaxAmount,Currency rows=Data.Items.Item[]
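
To put the subscription discounts from the strategy list into numbers, here is a small worked calculation. The monthly figure of 1000 is a made-up example, and `discounted_cost` is a hypothetical helper, not an Alibaba Cloud command.

```shell
# Sketch: estimate a monthly cost after a percentage discount.
# Usage: discounted_cost <pay-as-you-go monthly cost> <discount percent>
discounted_cost() {
  awk -v base="$1" -v pct="$2" \
    'BEGIN { printf "%.2f\n", base * (100 - pct) / 100 }'
}

# 1-year subscription, 15-30% savings on a 1000/month workload
discounted_cost 1000 15   # prints 850.00
discounted_cost 1000 30   # prints 700.00

# 3-year subscription, 40-60% savings on the same workload
discounted_cost 1000 40   # prints 600.00
discounted_cost 1000 60   # prints 400.00
```

On these assumptions, a workload billed at 1000 per month on pay-as-you-go would land between 700 and 850 on a 1-year subscription and between 400 and 600 on a 3-year one. Actual discounts depend on instance family and region.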

When to Choose Alibaba Cloud

Choose Alibaba Cloud when:

  1. Your users are in mainland China. No other cloud provider can match the performance, compliance tooling, and regulatory support. This is the number one reason.
  2. You need ICP filing support. Alibaba Cloud streamlines the process and offers guidance through the bureaucratic requirements.
  3. Data must stay in China. Alibaba Cloud has more China regions and availability zones than any competitor, with full compliance tooling for PIPL and Cybersecurity Law.
  4. You serve the Asia-Pacific market. Strong presence in Singapore, Indonesia, Malaysia, and Hong Kong with competitive pricing.
  5. Your organization has an existing Alibaba ecosystem relationship (e.g., using DingTalk, Tmall, or other Alibaba services).
  6. You need global acceleration into China. Alibaba Cloud's Global Accelerator (GA) provides optimized routing from international users to China origins.

Be cautious when:

  • Your team has zero Chinese language capability -- some documentation and console sections are Chinese-first, especially for newer services. The English documentation has gaps.
  • You need deep integration with Western SaaS tools (Datadog, PagerDuty, etc.) -- integrations exist but are less mature. Consider using Alibaba Cloud's native monitoring (CloudMonitor, ARMS, SLS) instead.
  • You are running exclusively in North America or Europe with no APAC presence -- AWS or Azure will be more cost-effective and better supported.
  • You need cutting-edge AI/ML services -- while Alibaba Cloud has ML services (PAI), the ecosystem is smaller than AWS SageMaker or GCP Vertex AI for English-language users.

Migration Path to Alibaba Cloud

For teams migrating to Alibaba Cloud (typically for China market entry):

  1. Start with Terraform. The alicloud provider has excellent coverage. Write your infrastructure as code from day one.
  2. Mirror your container images to ACR before deploying. Docker Hub is unreliable from China.
  3. Set up a private npm/pip mirror using Alibaba Cloud's mirrors or self-hosted solutions.
  4. Plan your network architecture including CEN if you need multi-region. China bandwidth is expensive -- minimize cross-region data transfer.
  5. Apply for ICP filing early. It typically takes 2-4 weeks, and any delay pushes back your launch date.
  6. Test the Great Firewall impact on your application. If your app calls external APIs, those calls may fail or be very slow from China.
  7. Use China-specific CDN configuration. A global CDN configuration will not perform well in China.
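
Step 2 (mirroring container images to ACR) can be sketched as a pair of shell helpers. The registry host and the `myteam` namespace below are placeholders, and `acr_ref` and `mirror_image` are hypothetical helper names. The variable names `ACR_REGISTRY` and `ACR_NAMESPACE` follow the ones used in the GitHub Actions workflow earlier in this article.

```shell
# Sketch: map a Docker Hub image name to its mirrored location in ACR.
# The ACR_REGISTRY and ACR_NAMESPACE defaults are placeholders.
ACR_REGISTRY="${ACR_REGISTRY:-registry.cn-hangzhou.aliyuncs.com}"
ACR_NAMESPACE="${ACR_NAMESPACE:-myteam}"

acr_ref() {
  # Drop any Docker Hub organisation prefix (e.g. "library/"), keep name:tag.
  local image="${1##*/}"
  # Default to :latest when no tag is given, as Docker does.
  case "$image" in
    *:*) ;;
    *) image="${image}:latest" ;;
  esac
  printf '%s/%s/%s\n' "$ACR_REGISTRY" "$ACR_NAMESPACE" "$image"
}

mirror_image() {
  # Pull from Docker Hub, retag for ACR, and push (requires docker login to ACR).
  local src="$1" dst
  dst="$(acr_ref "$src")"
  docker pull "$src" && docker tag "$src" "$dst" && docker push "$dst"
}
```

Running `mirror_image nginx:1.25` from a machine outside China (or a CI runner) would then publish the image under the ACR namespace, and Kubernetes manifests can reference the output of `acr_ref` instead of the Docker Hub name. Because the organisation prefix is dropped, two images with the same short name from different publishers would collide, so a real setup may want to keep the prefix in the repository name.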

Alibaba Cloud is not a niche provider. It serves millions of businesses and handles the infrastructure behind Singles' Day (11.11), the world's largest online shopping event, which peaked at more than 580,000 orders per second in 2020. For DevOps engineers working with global infrastructure, understanding Alibaba Cloud is a genuine competitive advantage, especially as more companies expand into Asian markets. The skills transfer well from AWS and Azure, and the Terraform provider makes the transition manageable.

Aareez Asif

Senior Kubernetes Architect

10+ years orchestrating containers in production. Battle-tested opinions on everything from pod scheduling to service mesh. I've seen clusters burn and helped rebuild them better.
