Ansible Dynamic Inventory: Automating Cloud Infrastructure
Static vs Dynamic Inventory
A static inventory file works well when your infrastructure is stable: a handful of servers that rarely change, provisioned by hand and maintained for months or years. But in cloud environments, instances are ephemeral by design. Auto Scaling groups spin up and terminate EC2 instances based on demand. Kubernetes nodes come and go. Spot instances appear for hours and vanish. Blue-green deployments create entire fleets and destroy them. In this world, maintaining a static inventory is a losing battle.
You either automate the inventory file updates (fragile, prone to race conditions, and guaranteed to drift) or accept that your inventory is always stale (dangerous, because you might miss new servers or target terminated ones). Dynamic inventory solves this by querying your cloud provider's API at runtime. Every time you run a playbook, Ansible discovers the current state of your infrastructure. No manual updates, no drift, no stale hosts.
| Aspect | Static Inventory | Dynamic Inventory |
|---|---|---|
| Source of truth | Text file in repo | Cloud provider API |
| Update frequency | Manual edits | Every playbook run (or cached) |
| Scale | Tens of servers | Thousands of instances |
| Accuracy | Drifts over time | Always current |
| Setup effort | Minimal | Requires credentials and plugin config |
| Cloud integration | None | Native tags, regions, metadata |
| Multi-cloud | Manual aggregation | Automatic with multiple plugins |
| Cost | Free | API calls (usually negligible) |
Inventory Plugins vs Legacy Scripts
Ansible supports two approaches to dynamic inventory:
Legacy inventory scripts are standalone executables that output JSON when called with --list or --host. You still see these in older tutorials and some vendor documentation, but they are deprecated in favor of plugins and should not be used for new projects.
Inventory plugins (the modern approach) are configured with YAML files and integrated directly into Ansible's inventory system. They support caching, composable variables, filtering, constructed groups, and templated host variables. Always use plugins for new projects.
| Feature | Legacy Scripts | Inventory Plugins |
|---|---|---|
| Configuration | Command-line args or env vars | YAML configuration file |
| Caching | Must implement yourself | Built-in cache support |
| Composed groups | Not supported | keyed_groups and groups directives |
| Composed variables | Not supported | compose directive |
| Filtering | Must implement yourself | Built-in filter support |
| Maintenance | Deprecated | Actively maintained |
| Testing | Difficult | Can use ansible-inventory commands |
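For context, the contract a legacy script had to satisfy was small: emit JSON describing groups and hosts when invoked with --list, and optionally per-host variables with --host. A minimal sketch with made-up hostnames and addresses, shown only to illustrate the format plugins replace:

```python
#!/usr/bin/env python3
"""Minimal sketch of a (deprecated) legacy inventory script."""
import json
import sys

# Hard-coded for illustration; a real script would query an API here.
INVENTORY = {
    "webservers": {
        "hosts": ["web-01", "web-02"],
        "vars": {"http_port": 80},
    },
    # Populating _meta.hostvars spares Ansible one --host call per host
    "_meta": {
        "hostvars": {
            "web-01": {"ansible_host": "10.0.1.10"},
            "web-02": {"ansible_host": "10.0.1.11"},
        }
    },
}

if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "--host":
        print(json.dumps({}))  # per-host vars already served via _meta
    else:
        print(json.dumps(INVENTORY, indent=2))
```

Everything above, plus caching and error handling, is what a plugin's YAML configuration replaces.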
To see available inventory plugins:
ansible-doc -t inventory -l
To view documentation for a specific plugin:
ansible-doc -t inventory amazon.aws.aws_ec2
ansible-doc -t inventory azure.azcollection.azure_rm
ansible-doc -t inventory google.cloud.gcp_compute
AWS EC2 Inventory Plugin
Prerequisites
Install the AWS collection and Python dependencies:
ansible-galaxy collection install amazon.aws
pip install boto3 botocore
Configure AWS credentials. The plugin supports multiple authentication methods, evaluated in this order:
- Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
- AWS credentials file (~/.aws/credentials)
- AWS config file (~/.aws/config)
- IAM instance profile (when running on EC2)
- IAM role assumed via STS
For local development, use the AWS CLI to configure credentials:
aws configure
# or for named profiles:
aws configure --profile ansible
For CI/CD, use environment variables or IAM roles.
Minimal IAM Policy
The EC2 inventory plugin needs read access to EC2. Here is a minimal IAM policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ec2:DescribeInstances",
"ec2:DescribeRegions",
"ec2:DescribeTags",
"ec2:DescribeInstanceStatus"
],
"Resource": "*"
}
]
}
If you also use RDS, ElastiCache, or other services, add their Describe permissions as well.
Configuration
Create a file that ends in aws_ec2.yml or aws_ec2.yaml (the suffix is required for Ansible to recognize it as an EC2 inventory source):
# inventory/aws_ec2.yml
plugin: amazon.aws.aws_ec2
# AWS profile (optional, uses default if not specified)
# profile: ansible
regions:
- us-east-1
- us-west-2
- eu-west-1
# Only include instances matching these filters
filters:
instance-state-name: running
"tag:ManagedBy": ansible
"tag:Environment":
- production
- staging
# Create groups from instance attributes
keyed_groups:
# Group by AWS region
- key: placement.region
prefix: region
separator: "_"
# Group by availability zone
- key: placement.availability_zone
prefix: az
separator: "_"
# Group by instance type
- key: instance_type
prefix: type
separator: "_"
# Group by the "Role" tag
- key: tags.Role
prefix: role
separator: "_"
# Group by the "Environment" tag
- key: tags.Environment
prefix: env
separator: "_"
# Group by the "Team" tag
- key: tags.Team | default('unassigned')
prefix: team
separator: "_"
# Group by VPC ID
- key: vpc_id
prefix: vpc
separator: "_"
# Conditional grouping based on instance size
- key: "'large' if instance_type.split('.')[1] in ['large', 'xlarge', '2xlarge', '4xlarge'] else 'small'"
prefix: size
separator: "_"
# Create host variables from instance attributes
compose:
# Set ansible_host to private IP (for VPN/VPC access)
ansible_host: private_ip_address
# Or use public IP for direct SSH access:
# ansible_host: public_ip_address
# Set the SSH user based on the OS tag
ansible_user: "'ubuntu' if tags.get('OS', '') == 'Ubuntu' else 'ec2-user'"
# Set SSH key based on environment
ansible_ssh_private_key_file: "'~/.ssh/prod.pem' if tags.get('Environment') == 'production' else '~/.ssh/staging.pem'"
# Create custom variables from instance metadata
instance_name: tags.get('Name', instance_id)
cloud_provider: "'aws'"
datacenter: placement.availability_zone
instance_tags: tags
# Hostname preference order
hostnames:
# Use Name tag as the hostname, fall back to private DNS, then instance ID
- tag:Name
- private-dns-name
- instance-id
# Control how duplicate hostnames are handled
# strict: true means fail if a keyed_groups expression errors
strict: false
# Include only specific instance attributes to reduce memory
# include_extra_api_calls: true # Adds EBS and ENI details
Testing the Inventory
# List all discovered hosts as JSON
ansible-inventory -i inventory/aws_ec2.yml --list
# Show the inventory as a tree graph
ansible-inventory -i inventory/aws_ec2.yml --graph
# Show graph with all variables
ansible-inventory -i inventory/aws_ec2.yml --graph --vars
# Show variables for a specific host
ansible-inventory -i inventory/aws_ec2.yml --host web-server-1
# Ping all discovered hosts
ansible -i inventory/aws_ec2.yml all -m ping
# Ping only production web servers
ansible -i inventory/aws_ec2.yml role_web:&env_production -m ping
Sample --graph output:
@all:
|--@region_us_east_1:
| |--web-server-1
| |--web-server-2
| |--api-server-1
| |--db-primary
|--@region_us_west_2:
| |--web-server-3
| |--web-server-4
|--@role_web:
| |--web-server-1
| |--web-server-2
| |--web-server-3
| |--web-server-4
|--@role_api:
| |--api-server-1
|--@role_database:
| |--db-primary
|--@env_production:
| |--web-server-1
| |--web-server-2
| |--api-server-1
| |--db-primary
|--@env_staging:
| |--web-server-3
| |--web-server-4
|--@type_t3_medium:
| |--web-server-1
| |--web-server-2
| |--web-server-3
| |--web-server-4
|--@type_m5_large:
| |--api-server-1
|--@type_r5_2xlarge:
| |--db-primary
Advanced keyed_groups and compose Examples
The keyed_groups directive creates Ansible groups from instance attributes. The key is a Jinja2 expression evaluated against each host's variables.
keyed_groups:
# Group by a comma-separated tag (e.g., tag "Services" = "nginx,redis,monitoring")
- key: tags.Services.split(',') if tags.Services is defined else []
prefix: service
# Group by AMI ID (useful for tracking which image version hosts run)
- key: image_id
prefix: ami
# Boolean grouping: has_public_ip or no_public_ip
- key: "'has_public_ip' if public_ip_address else 'no_public_ip'"
prefix: network
# Group by launch time (month)
- key: launch_time[:7]
prefix: launched
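To make the resulting group names concrete: keyed_groups builds each name as prefix + separator + the evaluated key, then sanitizes characters that are not valid in group names. A simplified Python model of that construction (not the plugin's actual code; the default separator is an underscore):

```python
import re

def keyed_group_name(prefix, value, separator="_"):
    """Approximate how keyed_groups derives a group name:
    prefix + separator + value, with characters that are invalid
    in group names replaced (simplified model of the sanitization)."""
    name = f"{prefix}{separator}{value}" if prefix else str(value)
    return re.sub(r"[^A-Za-z0-9_]", "_", name)

# tags.Role == "web" with prefix "role"      -> "role_web"
print(keyed_group_name("role", "web"))
# availability zone "us-east-1a" with "az"   -> "az_us_east_1a"
print(keyed_group_name("az", "us-east-1a"))
# instance_type "t3.medium" with "type"      -> "type_t3_medium"
print(keyed_group_name("type", "t3.medium"))
```

This is why the sample --graph output earlier shows groups like type_t3_medium and az_us_east_1a: the dots and hyphens in the raw attribute values become underscores.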
The compose directive creates host variables from instance attributes:
compose:
# Network information
ansible_host: private_ip_address
public_ip: public_ip_address | default('')
subnet_id: subnet_id
vpc_id: vpc_id
# Instance metadata
instance_name: tags.get('Name', instance_id)
cloud_region: placement.region
cloud_az: placement.availability_zone
cloud_provider: "'aws'"
# Custom connection settings based on instance attributes
ansible_ssh_private_key_file: "'~/.ssh/' + tags.get('KeyName', 'default') + '.pem'"
ansible_python_interpreter: "'/usr/bin/python3'"
# Application-level variables derived from tags
app_version: tags.get('AppVersion', 'latest')
deploy_group: tags.get('DeployGroup', 'default')
Azure Inventory Plugin
Prerequisites
ansible-galaxy collection install azure.azcollection
pip install azure-identity azure-mgmt-compute azure-mgmt-network azure-mgmt-resource
Authenticate via one of these methods:
- Environment variables (Service Principal)
- Azure CLI (az login)
- Managed Identity (when running on Azure)
- Azure AD Workload Identity
Service Principal Authentication
export AZURE_SUBSCRIPTION_ID="your-subscription-id"
export AZURE_CLIENT_ID="your-client-id"
export AZURE_SECRET="your-client-secret"
export AZURE_TENANT="your-tenant-id"
Configuration
Create a file ending in azure_rm.yml:
# inventory/azure_rm.yml
plugin: azure.azcollection.azure_rm
auth_source: auto
# Include only specific resource groups
include_vm_resource_groups:
- production-rg
- staging-rg
- shared-services-rg
# Use simple hostnames (not fully qualified Azure resource IDs)
plain_host_names: true
# Include VMSS instances
include_vmss_resource_groups:
- production-rg
keyed_groups:
# Group by resource group
- key: resource_group | lower
prefix: rg
# Group by Azure location/region
- key: location
prefix: location
# Group by tags
- key: tags.Role | default('untagged')
prefix: role
# Group by OS type
- key: os_profile.system | default('unknown')
prefix: os
# Group by VM size
- key: virtual_machine_size | default('unknown')
prefix: vmsize
# Group by environment tag
- key: tags.Environment | default('untagged')
prefix: env
compose:
# Prefer public IP, fall back to private
ansible_host: public_ip_address | default(private_ip_address, true)
ansible_user: "'azureuser'"
# Custom variables from Azure metadata
azure_vm_size: virtual_machine_size
azure_resource_group: resource_group
azure_location: location
cloud_provider: "'azure'"
# Only include running VMs
conditional_groups:
running: powerstate == "running"
stopped: powerstate == "deallocated"
# Exclude specific VMs
exclude_host_filters:
- powerstate != 'running'
- name.startswith('test-')
Test it:
ansible-inventory -i inventory/azure_rm.yml --graph
ansible-inventory -i inventory/azure_rm.yml --list
GCP Inventory Plugin
Prerequisites
ansible-galaxy collection install google.cloud
pip install google-auth requests
Authenticate via:
- Application Default Credentials (gcloud auth application-default login)
- Service account JSON key file
- GCE metadata server (when running on GCP)
Configuration
Create a file ending in gcp.yml:
# inventory/gcp.yml
plugin: google.cloud.gcp_compute
projects:
- my-gcp-project-id
- my-other-project-id
zones:
- us-central1-a
- us-central1-b
- us-central1-c
- us-east1-b
- us-east1-c
- europe-west1-b
# Only include running instances
filters:
- status = RUNNING
# Authentication
auth_kind: application
# Or specify a service account file:
# auth_kind: serviceaccount
# service_account_file: /path/to/service-account.json
keyed_groups:
# Group by zone
- key: zone
prefix: zone
# Group by machine type (extract just the type name from the full URL)
- key: machine_type | regex_search('[^/]+$')
prefix: type
# Group by network tags
- key: tags.items | default([])
prefix: tag
# Group by labels
- key: labels.environment | default('unlabeled')
prefix: env
# Group by labels (role)
- key: labels.role | default('unassigned')
prefix: role
# Group by project
- key: project
prefix: project
# Group by status
- key: status | lower
prefix: status
compose:
# Use external NAT IP if available, otherwise internal IP
ansible_host: networkInterfaces[0].accessConfigs[0].natIP | default(networkInterfaces[0].networkIP)
ansible_user: "'deploy'"
# Custom variables
gcp_zone: zone
gcp_machine_type: machine_type | regex_search('[^/]+$')
gcp_project: project
cloud_provider: "'gcp'"
internal_ip: networkInterfaces[0].networkIP
external_ip: networkInterfaces[0].accessConfigs[0].natIP | default('')
hostnames:
# Use instance name
- name
# Fall back to internal IP
- networkInterfaces[0].networkIP
strict: false
Test:
ansible-inventory -i inventory/gcp.yml --graph
ansible -i inventory/gcp.yml all -m ping
Filtering Instances
Effective filtering reduces API calls, keeps your inventory focused, and prevents accidentally targeting the wrong instances. Each cloud plugin supports its own filter syntax.
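The AND/OR semantics shared by these filter blocks can be modeled in a few lines: multiple values for one key are ORed, while separate keys are ANDed. An illustrative sketch (the helper is hypothetical, not part of any plugin):

```python
def matches_filters(instance_attrs, filters):
    """Model cloud filter semantics: every filter key must match (AND);
    a list of values for a single key matches any one of them (OR)."""
    for key, wanted in filters.items():
        allowed = wanted if isinstance(wanted, list) else [wanted]
        if instance_attrs.get(key) not in allowed:
            return False
    return True

instance = {
    "instance-state-name": "running",
    "tag:Environment": "staging",
    "tag:ManagedBy": "ansible",
}
filters = {
    "instance-state-name": "running",
    "tag:Environment": ["production", "staging"],  # OR within the list
    "tag:ManagedBy": "ansible",                    # AND across keys
}
print(matches_filters(instance, filters))  # True
```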
AWS EC2 Filters
AWS uses the EC2 API filter syntax:
filters:
# Instance state
instance-state-name: running
# Single tag value
"tag:Environment": production
# Multiple tag values (OR logic)
"tag:Environment":
- production
- staging
# Multiple tags (AND logic between different tags)
"tag:ManagedBy": ansible
"tag:Environment": production
# Instance type filter
instance-type:
- t3.medium
- t3.large
- m5.large
- m5.xlarge
# VPC filter
vpc-id: vpc-0abc123def456
# Subnet filter
subnet-id:
- subnet-0abc123
- subnet-0def456
# Security group
"instance.group-name": web-servers
Azure Filters
Azure uses include_vm_resource_groups, exclude_host_filters, and conditional_groups:
include_vm_resource_groups:
- production-rg
- staging-rg
exclude_host_filters:
- powerstate != 'running'
- name.startswith('temp-')
- name.endswith('-dev')
conditional_groups:
production_running: powerstate == "running" and tags.get('Environment') == 'production'
needs_patching: tags.get('PatchGroup', '') == 'weekly'
GCP Filters
GCP uses the Compute API filter syntax:
filters:
- status = RUNNING
- labels.environment = production
- labels.managed-by = ansible
# Note: GCP label keys and values allow only lowercase letters, digits, underscores, and hyphens
Combining Static and Dynamic Inventory
One of Ansible's most powerful features is the ability to use multiple inventory sources simultaneously. You can combine static hosts (on-premises servers, network devices) with dynamic cloud discovery in a single inventory directory. Ansible merges them transparently.
Directory Structure
inventory/
01-static.yml # Static hosts (on-prem servers, network devices)
02-aws_ec2.yml # Dynamic AWS hosts
03-azure_rm.yml # Dynamic Azure hosts
04-gcp.yml # Dynamic GCP hosts
group_vars/
all/
vars.yml # Variables for all hosts
vault.yml # Encrypted secrets for all hosts
webservers/
vars.yml
role_web/
vars.yml # Variables for dynamically grouped web servers
env_production/
vars.yml
vault.yml
dbservers/
vars.yml
host_vars/
bastion.example.com.yml
legacy-db-01.yml
The static inventory file handles on-premises infrastructure:
# inventory/01-static.yml
all:
children:
onprem:
children:
onprem_webservers:
hosts:
onprem-web-01.example.com:
ansible_host: 10.100.1.10
onprem-web-02.example.com:
ansible_host: 10.100.1.11
onprem_dbservers:
hosts:
onprem-db-01.example.com:
ansible_host: 10.100.2.10
ansible_user: dbadmin
network_devices:
hosts:
core-switch-01:
ansible_host: 10.100.0.1
ansible_network_os: ios
ansible_connection: network_cli
firewall-01:
ansible_host: 10.100.0.2
ansible_network_os: asa
ansible_connection: network_cli
bastion:
hosts:
bastion.example.com:
ansible_host: 203.0.113.10
ansible_user: admin
Point Ansible at the directory:
ansible-playbook -i inventory/ site.yml
Ansible processes files in lexicographic order (which is why the numeric prefix is useful). Static variables in group_vars/ and host_vars/ apply to dynamically discovered hosts that match the group or hostname. This is powerful: you can tag an EC2 instance with Role: web, have the dynamic inventory create a role_web group, and define variables for that group in group_vars/role_web/vars.yml.
Cross-Cloud Group Variables
# inventory/group_vars/all/vars.yml
---
ntp_servers:
- 0.pool.ntp.org
- 1.pool.ntp.org
dns_servers:
- 8.8.8.8
- 8.8.4.4
monitoring_endpoint: https://monitoring.example.com/api/v1
log_aggregator: logs.example.com
# inventory/group_vars/role_web/vars.yml
---
nginx_worker_processes: auto
nginx_worker_connections: 4096
nginx_server_name: "{{ inventory_hostname }}"
app_document_root: /var/www/html
health_check_path: /health
health_check_port: 80
# inventory/group_vars/env_production/vars.yml
---
app_environment: production
log_level: warn
enable_debug: false
backup_enabled: true
backup_schedule: "0 2 * * *"
monitoring_interval: 30
Inventory Caching
Querying cloud APIs on every playbook run is slow, especially if you have thousands of instances across multiple regions. Enable caching to store the inventory locally and refresh it periodically.
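Conceptually, cache_timeout is just an age check on the stored result: the cached inventory is reused while it is younger than the timeout, and refreshed from the API otherwise. A toy model of that decision (not the actual cache plugin code):

```python
import time

def cache_is_fresh(cached_at, timeout_seconds, now=None):
    """Reuse a cached inventory only while its age is below
    cache_timeout (simplified model of Ansible's cache plugins)."""
    if now is None:
        now = time.time()
    return (now - cached_at) < timeout_seconds

# With cache_timeout: 300, a 4-minute-old cache is still served...
print(cache_is_fresh(cached_at=1000.0, timeout_seconds=300, now=1240.0))  # True
# ...while a 6-minute-old one forces a fresh API query
print(cache_is_fresh(cached_at=1000.0, timeout_seconds=300, now=1360.0))  # False
```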
File-Based Caching
# inventory/aws_ec2.yml
plugin: amazon.aws.aws_ec2
regions:
- us-east-1
- us-west-2
cache: true
cache_plugin: jsonfile
cache_connection: /tmp/ansible_inventory_cache
cache_timeout: 300 # seconds (5 minutes)
Redis-Based Caching (Shared Across Team)
plugin: amazon.aws.aws_ec2
regions:
- us-east-1
cache: true
cache_plugin: redis
cache_connection: redis://redis.internal.example.com:6379/0
cache_timeout: 600
cache_prefix: ansible_inventory_
Install the Redis Python client:
pip install redis
Cache Management Commands
# Force refresh the cache
ansible-inventory -i inventory/aws_ec2.yml --list --flush-cache
# List with cache (uses cached data if available)
ansible-inventory -i inventory/aws_ec2.yml --list
# Playbook with cache flush
ansible-playbook -i inventory/ site.yml --flush-cache
Cache Plugin Comparison
| Cache Plugin | Storage | Use Case | Shared |
|---|---|---|---|
| jsonfile | Local filesystem | Single user, simple setups | No |
| redis | Redis server | Shared cache across team | Yes |
| memcached | Memcached server | High-performance shared cache | Yes |
| mongodb | MongoDB | Persistent shared cache | Yes |
| yaml | Local YAML files | Human-readable cache for debugging | No |
For CI/CD pipelines, disable caching or set a very short timeout. Stale inventory in CI defeats the purpose of dynamic discovery:
# In CI/CD, either disable caching:
cache: false
# Or use a very short timeout:
cache: true
cache_timeout: 60
Constructed Inventory Plugin
The constructed inventory plugin creates groups and variables based on existing inventory data from any source. It acts as a second pass over your inventory, letting you create logical groups from any combination of attributes.
# inventory/05-constructed.yml
plugin: ansible.builtin.constructed
strict: false
# Create groups based on combined attributes
groups:
# All production web servers regardless of cloud provider
production_web: >-
('role_web' in group_names or 'onprem_webservers' in group_names)
and
('env_production' in group_names or 'production' in group_names)
# All databases across all providers
all_databases: >-
'role_database' in group_names or
'onprem_dbservers' in group_names or
'role_db' in group_names
  # Servers whose LastPatched tag date is in the past (patching candidates)
needs_patching: >-
tags is defined and
tags.get('LastPatched', '2000-01-01') < (now().strftime('%Y-%m-%d'))
# Large instances across all clouds
large_instances: >-
(instance_type is defined and instance_type.split('.')[1] in ['xlarge', '2xlarge', '4xlarge', '8xlarge'])
or
(virtual_machine_size is defined and 'Standard_D' in virtual_machine_size)
keyed_groups:
# Group by cloud provider (set via compose in each inventory plugin)
- key: cloud_provider | default('onprem')
prefix: cloud
separator: "_"
compose:
# Unified variables across all cloud providers
display_name: instance_name | default(inventory_hostname)
environment: >-
tags.get('Environment', '') if tags is defined
else 'unknown'
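The group expressions above are ordinary boolean logic evaluated against each host's group_names list. The production_web check, rewritten as a plain Python function purely for illustration:

```python
def in_production_web(group_names):
    """Mirror of the constructed `production_web` group expression:
    a host qualifies if it is in any web group AND any production group."""
    is_web = "role_web" in group_names or "onprem_webservers" in group_names
    is_prod = "env_production" in group_names or "production" in group_names
    return is_web and is_prod

print(in_production_web(["role_web", "env_production"]))       # True
print(in_production_web(["role_web", "env_staging"]))          # False
print(in_production_web(["onprem_webservers", "production"]))  # True
```

Because group_names spans every inventory source, this is how one constructed group can unify hosts discovered by the AWS plugin, the Azure plugin, and the static file.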
Writing Custom Inventory Plugins
When you need to pull inventory from an internal CMDB, a custom API, a database, or any non-standard source, write a custom inventory plugin.
Plugin File Structure
# plugins/inventory/custom_cmdb.py
from ansible.plugins.inventory import BaseInventoryPlugin, Constructable, Cacheable
from ansible.errors import AnsibleParserError
DOCUMENTATION = """
name: custom_cmdb
plugin_type: inventory
short_description: Pull inventory from internal CMDB API
description:
- Queries the internal CMDB REST API to discover hosts
- Supports caching for performance
- Creates groups based on CMDB roles and environments
extends_documentation_fragment:
  - inventory_cache
options:
api_url:
description: CMDB API endpoint URL
required: true
type: str
api_token:
description: Authentication token for the CMDB API
required: true
type: str
env:
- name: CMDB_API_TOKEN
verify_ssl:
description: Whether to verify SSL certificates
type: bool
default: true
timeout:
description: API request timeout in seconds
type: int
default: 30
"""
EXAMPLES = """
# inventory/cmdb.yml
plugin: custom_cmdb
api_url: https://cmdb.internal.example.com
verify_ssl: true
timeout: 30
"""
class InventoryModule(BaseInventoryPlugin, Constructable, Cacheable):
NAME = "custom_cmdb"
def verify_file(self, path):
"""Verify this is a valid inventory source file."""
if super().verify_file(path):
return path.endswith(("cmdb.yml", "cmdb.yaml"))
return False
def parse(self, inventory, loader, path, cache=True):
"""Parse the inventory source and populate inventory."""
super().parse(inventory, loader, path, cache)
self._read_config_data(path)
api_url = self.get_option("api_url")
api_token = self.get_option("api_token")
verify_ssl = self.get_option("verify_ssl")
timeout = self.get_option("timeout")
# Check cache first
cache_key = self.get_cache_key(path)
        use_cache = cache and self.get_option("cache")
servers = None
if use_cache:
try:
servers = self._cache[cache_key]
except KeyError:
pass
if servers is None:
servers = self._fetch_servers(api_url, api_token, verify_ssl, timeout)
if use_cache:
self._cache[cache_key] = servers
self._populate_inventory(servers)
def _fetch_servers(self, api_url, api_token, verify_ssl, timeout):
"""Fetch server list from the CMDB API."""
import requests
try:
response = requests.get(
f"{api_url}/api/v1/servers",
headers={
"Authorization": f"Bearer {api_token}",
"Accept": "application/json",
},
verify=verify_ssl,
timeout=timeout,
)
response.raise_for_status()
return response.json()
except requests.RequestException as e:
raise AnsibleParserError(f"Failed to fetch from CMDB: {e}")
def _populate_inventory(self, servers):
"""Populate the Ansible inventory from CMDB data."""
for server in servers:
hostname = server["hostname"]
self.inventory.add_host(hostname)
# Set connection variables
self.inventory.set_variable(hostname, "ansible_host", server["ip_address"])
self.inventory.set_variable(
hostname, "ansible_user", server.get("ssh_user", "deploy")
)
self.inventory.set_variable(
hostname, "ansible_port", server.get("ssh_port", 22)
)
# Set custom variables
for key, value in server.get("metadata", {}).items():
self.inventory.set_variable(hostname, f"cmdb_{key}", value)
# Group by role
role = server.get("role", "ungrouped")
self.inventory.add_group(role)
self.inventory.add_host(hostname, group=role)
# Group by environment
env = server.get("environment", "unknown")
self.inventory.add_group(env)
self.inventory.add_host(hostname, group=env)
# Group by datacenter/location
dc = server.get("datacenter", "unknown")
self.inventory.add_group(f"dc_{dc}")
self.inventory.add_host(hostname, group=f"dc_{dc}")
# Group by OS
os_name = server.get("os", "unknown")
self.inventory.add_group(f"os_{os_name}")
self.inventory.add_host(hostname, group=f"os_{os_name}")
Enable it in ansible.cfg:
[defaults]
inventory_plugins = plugins/inventory
[inventory]
enable_plugins = custom_cmdb, amazon.aws.aws_ec2, azure.azcollection.azure_rm, host_list, auto
Create the inventory source file:
# inventory/cmdb.yml
plugin: custom_cmdb
api_url: https://cmdb.internal.example.com
verify_ssl: true
timeout: 30
# api_token is read from CMDB_API_TOKEN environment variable
Integrating Dynamic Inventory with CI/CD
GitHub Actions
# .github/workflows/ansible-deploy.yml
name: Ansible Deploy
on:
push:
branches: [main]
workflow_dispatch:
inputs:
target_group:
description: "Inventory group to target"
required: true
default: "role_web"
jobs:
deploy:
runs-on: ubuntu-latest
permissions:
id-token: write
contents: read
steps:
- uses: actions/checkout@v4
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: arn:aws:iam::123456789012:role/ansible-deploy
aws-region: us-east-1
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.12"
- name: Install dependencies
run: |
pip install ansible boto3
ansible-galaxy collection install amazon.aws
ansible-galaxy install -r requirements.yml
- name: Verify inventory discovery
run: |
ansible-inventory -i inventory/ --graph
echo "---"
ansible-inventory -i inventory/ --list | python3 -c "
import sys, json
data = json.load(sys.stdin)
hosts = data.get('_meta', {}).get('hostvars', {})
print(f'Discovered {len(hosts)} hosts')
"
- name: Run playbook
env:
ANSIBLE_VAULT_PASSWORD: ${{ secrets.VAULT_PASSWORD }}
ANSIBLE_HOST_KEY_CHECKING: "False"
run: |
echo "$ANSIBLE_VAULT_PASSWORD" > .vault_pass
chmod 600 .vault_pass
ansible-playbook -i inventory/ site.yml \
--vault-password-file .vault_pass \
--limit "${{ inputs.target_group || 'role_web' }}"
rm -f .vault_pass
Terraform Integration
When using Terraform to provision infrastructure, you can generate Ansible inventory from Terraform state. However, the better approach with dynamic inventory is to simply tag your Terraform-managed resources:
resource "aws_instance" "web" {
count = 3
ami = "ami-0abcdef1234567890"
instance_type = "t3.medium"
subnet_id = aws_subnet.private[count.index].id
tags = {
Name = "web-server-${count.index + 1}"
Environment = var.environment
Role = "web"
ManagedBy = "ansible"
Team = "platform"
Project = "main-app"
}
}
The dynamic inventory plugin discovers these instances automatically based on the tags. No inventory file generation needed.
Practical Example: Multi-Cloud Web Server Fleet
Here is a complete, end-to-end example managing web servers across AWS and Azure with dynamic discovery, group variables, and a production playbook.
Inventory Directory
inventory/
01-static.yml
02-aws_ec2.yml
03-azure_rm.yml
05-constructed.yml
group_vars/
all/
vars.yml
production_web/
vars.yml
vault.yml
Constructed Groups
# inventory/05-constructed.yml
plugin: ansible.builtin.constructed
strict: false
groups:
production_web: >-
('role_web' in group_names or 'role_webserver' in group_names)
and 'env_production' in group_names
Group Variables for All Production Web Servers
# inventory/group_vars/production_web/vars.yml
---
nginx_port: 80
nginx_ssl_port: 443
nginx_worker_processes: auto
nginx_worker_connections: 4096
nginx_server_name: "{{ inventory_hostname }}"
app_document_root: /var/www/html
app_version: "{{ lookup('env', 'APP_VERSION') | default('latest', true) }}"
monitoring_enabled: true
log_level: warn
The Playbook
# configure-webservers.yml
---
- name: Configure web servers discovered from cloud providers
hosts: production_web
become: true
serial: "30%"
max_fail_percentage: 10
pre_tasks:
- name: Display discovered hosts
debug:
msg: >
Configuring {{ inventory_hostname }}
({{ ansible_host }})
from {{ cloud_provider | default('unknown') }}
in {{ datacenter | default('unknown') }}
- name: Wait for SSH to be available
wait_for_connection:
timeout: 120
sleep: 5
- name: Gather facts
setup:
- name: Validate host is in expected state
assert:
that:
- ansible_distribution in ['Ubuntu', 'Debian']
- ansible_memtotal_mb >= 1024
fail_msg: "Host does not meet minimum requirements"
tasks:
- name: Update apt cache
apt:
update_cache: true
cache_valid_time: 3600
- name: Install required packages
apt:
name:
- nginx
- curl
- jq
- unzip
state: present
- name: Deploy Nginx configuration
template:
src: templates/nginx.conf.j2
dest: /etc/nginx/sites-available/default
owner: root
group: root
mode: "0644"
validate: "nginx -t -c %s"
notify: Reload Nginx
- name: Create document root
file:
path: "{{ app_document_root }}"
state: directory
owner: www-data
group: www-data
mode: "0755"
- name: Ensure Nginx is running
service:
name: nginx
state: started
enabled: true
- name: Verify Nginx is responding
uri:
url: "http://localhost:{{ nginx_port }}/health"
status_code: 200
register: health_check
retries: 3
delay: 5
until: health_check.status == 200
handlers:
- name: Reload Nginx
service:
name: nginx
state: reloaded
Running It
# Preview which hosts will be targeted
ansible-inventory -i inventory/ --graph production_web
# Dry run
ansible-playbook -i inventory/ configure-webservers.yml --check --diff
# Apply
ansible-playbook -i inventory/ configure-webservers.yml
# Apply with fresh inventory (no cache)
ansible-playbook -i inventory/ configure-webservers.yml --flush-cache
# Limit to a specific cloud or region
ansible-playbook -i inventory/ configure-webservers.yml --limit cloud_aws
ansible-playbook -i inventory/ configure-webservers.yml --limit region_us_east_1
# Limit to a specific host for debugging
ansible-playbook -i inventory/ configure-webservers.yml --limit web-server-1 -vvv
Rolling Deployments
The serial and max_fail_percentage parameters in the playbook enable safe rolling deployments across your fleet:
| Parameter | Value | Effect |
|---|---|---|
| serial: "30%" | 30% of hosts at a time | Deploy to roughly a third of hosts, then the next third |
| serial: 5 | 5 hosts at a time | Fixed batch size |
| serial: [1, 5, "50%"] | Escalating | 1 host first, then 5, then half the remaining |
| max_fail_percentage: 10 | 10% | Abort if more than 10% of hosts in a batch fail |
This ensures that a bad deployment does not take down your entire fleet simultaneously. If the first batch fails, Ansible stops before touching the rest.
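The escalating form can be sketched in a few lines. This simplified model takes percentages against the total host count and repeats the last entry until every host is covered; it is an approximation for intuition, not Ansible's executor code:

```python
import math

def serial_batches(hosts, serial):
    """Split hosts into batches following an escalating serial spec.
    Percentages are taken against the total host count; the final
    spec entry repeats for any remaining hosts (simplified model)."""
    total = len(hosts)
    batches, start, idx = [], 0, 0
    while start < total:
        spec = serial[min(idx, len(serial) - 1)]
        if isinstance(spec, str) and spec.endswith("%"):
            size = max(1, math.floor(total * int(spec[:-1]) / 100))
        else:
            size = int(spec)
        batches.append(hosts[start:start + size])
        start += size
        idx += 1
    return batches

hosts = ["web-%02d" % n for n in range(1, 11)]  # 10 hosts
sizes = [len(b) for b in serial_batches(hosts, [1, 5, "50%"])]
print(sizes)  # [1, 5, 4]
```

With ten hosts, serial: [1, 5, "50%"] touches one canary host first, then five more, then the remaining four, which matches the escalation described in the table.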
Troubleshooting Dynamic Inventory
Common Issues and Solutions
"Unable to parse" error -- The inventory file suffix is wrong. AWS needs aws_ec2.yml, Azure needs azure_rm.yml, GCP needs gcp.yml.
No hosts discovered -- Check filters and credentials:
# Verbose inventory listing
ansible-inventory -i inventory/aws_ec2.yml --list -vvv
# Check if boto3 can connect
python3 -c "import boto3; print(boto3.client('ec2', region_name='us-east-1').describe_instances()['Reservations'])"
Slow inventory runs -- Enable caching or reduce the number of regions. Each region requires a separate API call.
Hosts in wrong groups -- Debug keyed_groups by examining host variables:
ansible-inventory -i inventory/aws_ec2.yml --host web-server-1
Permission denied -- Verify your IAM policy includes the necessary Describe permissions for the services you are inventorying.
Duplicate hostnames -- Multiple instances with the same Name tag cause collisions. Use hostnames with a fallback:
hostnames:
- tag:Name
- instance-id # guaranteed unique
When a new instance is launched with the appropriate tags, the next playbook run discovers it automatically and applies your configuration. No inventory file to update, no manual steps, no drift. That is the power of dynamic inventory.