Cloud automation with Ansible series: Dynamic inventory

June 2, 2021 - Words by Sašo Stanovnik

This post was originally published on the XLAB Steampunk blog.

If you have a decent number of servers in the cloud, you want to use automation to create and manage them. Even with few servers, having your infrastructure specified as code serves two much-needed purposes: it documents exactly what your setup requires, and it lets you deploy and redeploy your stack quickly and reliably.

Ansible is a great tool for this! You can provision your infrastructure and then deploy applications onto it with a single tool, at the same time. In a previous post, we showed how to provision your infrastructure on AWS using Ansible. In this post, the first in the Cloud automation with Ansible series, we’ll be looking at the glue between provisioning and deployment: Ansible’s dynamic inventory.

What is a dynamic inventory?

There are generally two steps to deploying an application: provisioning whatever resources the application needs and then deploying and configuring the application on top of them. What binds the two steps together is an enumeration of resources, which in Ansible is called an inventory.

The inventory in Ansible is dynamic, which means Ansible itself figures out what resources–servers–exist at runtime. This is in contrast to having a static inventory, sometimes called a local state, which is a single source of truth for everything infrastructure-related—if it isn’t there, it’s not real.

Straight off the bat, this tingles one of our arachno-senses. What if something changes out of our control? Well, we could argue no such changes should be allowed, since there should be a clear process of creating infrastructure resources. However, the real world is seldom this friendly and unexpected things happen all the time.

Let’s look at how a dynamic inventory can be just as practical as a static one, if not more so. With the usual caveat about static analysis not being available, of course.

In this blog post, we’re going to look at how we parse a set of machines on DigitalOcean and how we can use dynamic inventory configuration to select a subset of machines for deployment. We’ll be going through everything from scratch, so feel free to follow along to play around with different parameters!

What you’ll need to start

We’ll first install ansible-core into a new virtual environment to keep things clean. Then, we’ll install the community.digitalocean collection so we have access to its content. Finally, we export the DO_API_TOKEN environment variable so the playbooks can authenticate against the API and we’re not hardcoding secrets into our playbooks.

# this makes things easy to clean up
mkdir steampunk-trials && cd steampunk-trials/
python3 -m venv .venv && source .venv/bin/activate

pip install -U pip wheel
pip install ansible-core==2.11.0
ansible-galaxy collection install community.digitalocean

export DO_API_TOKEN=<YOUR API TOKEN>

With this, we’re ready to start!

Creating some machines

In order to do anything remotely useful with inventories, let’s create two droplets. We’ll be doing this through a playbook, of course. The Ansible way.

- hosts: localhost
  vars:
    your_pubkey: YOUR_PUBKEY
  tasks:
    - community.digitalocean.digital_ocean_sshkey:
        name: stempunk-pubkey
        state: present
        ssh_pub_key: "{{ your_pubkey }}"
      register: pubkey

    - community.digitalocean.digital_ocean_droplet:
        name: "steampunk-{{ item.type }}-{{ item.index }}"
        tags: "{{ item.tags }}"
        unique_name: true
        ssh_keys:
          - "{{ pubkey.data.ssh_key.fingerprint }}"
        size: s-1vcpu-1gb
        region: fra1
        image: centos-8-x64
        state: present
      loop:
        - type: appserver
          index: 1
          tags:
            - steampunk-test
            - app
        - type: dbserver
          index: 1
          tags:
            - steampunk-test
            - db

The playbook has two tasks. The first registers our SSH public key with DigitalOcean, and the second loops over two items to create two otherwise identical virtual machines. Since this is a proof of concept for an inventory, we won’t be running anything on them, so we don’t need to do anything special; we only need the machines to be accessible. The only differences between the two droplets are their names and tags. We pretend the first one is an application server and the second is a database server. This way we can show off grouping functionalities later. We also tag both with a “project name” so they’re easier to identify and delete later. Save the playbook as playbook.yml and run it using the usual:

$ ansible-playbook playbook.yml

Finally getting an inventory

To use a dynamic inventory in Ansible, there needs to be an inventory plugin written for the provider. Fortunately, the community DigitalOcean collection has the community.digitalocean.digitalocean inventory plugin!

We need to use a configuration file to both tell Ansible what to use and how to use it. Here is the configuration file we’ll be using, named digitalocean.yml.

plugin: community.digitalocean.digitalocean

The contents are quite simple. The first line, specifying the plugin, instructs Ansible what inventory plugin to load and use. All subsequent lines (we’ll add those later) are DigitalOcean-specific plugin options. For now, we’ll be leaving everything at its default settings.

We don’t need to add the API token here explicitly, as we’ve exported the DO_API_TOKEN variable, which the inventory plugin is able to grab and use.

Inventory configuration files choose inventory plugins using a combination of the filename and the plugin variable. For the community.digitalocean.digitalocean inventory plugin, the configuration file name must end with (do_hosts|digitalocean|digital_ocean).(yaml|yml).

To get an inventory without running any playbooks, we use ansible-inventory.

$ ansible-inventory -i digitalocean.yml --graph --vars

@all:
  |--@ungrouped:
  |  |--steampunk-appserver-1
  |  |  |--{do_id = 243276125}
  |  |  |--{do_name = steampunk-appserver-1}
  |  |  |--{do_networks = {'v4': [{'ip_add [...] ': []}}
  |  |  |--{do_region = {'name': 'Frankfur [...] 6gb']}}
  |  |  |--{do_size_slug = s-1vcpu-1gb}
  |  |--steampunk-dbserver-1
  |  |  |--{do_id = 249985641}
  |  |  |--{do_name = steampunk-dbserver-1}
  |  |  |--{do_networks = {'v4': [{'ip_add [...] ': []}}
  |  |  |--{do_region = {'name': 'Frankfur [...] 6gb']}}
  |  |  |--{do_size_slug = s-1vcpu-1gb}

The --graph and --vars flags are there to create an inventory graph instead of a simple list and to display host variables alongside the hosts. Because the account we’re using contains only the two droplets, we only get the two hosts in the output.

You can see the tags attached to the hosts, as well as their names. A very important thing to note here is that this inventory is nearly unusable as it is. Why? It’s because we have no way of connecting to the machines! We’re missing an ansible_host variable that would instruct Ansible on where the machine actually lives.

This is a bit of a complication at first glance, but we think it is perfect for making things explicit. Our example is simple in that we have two machines, each with an external IP definition, but this is not always the case, and Ansible (or, well, the collection developers) makes you decide how you are going to connect to the machines yourself. Imagine the two machines had no externally accessible IP address, but you had a VPN set up into the cloud network, so you could use their private addresses directly. The inventory plugin has no way of knowing those details. Similarly, when executing via jump hosts or directly from a machine in the cloud, you decide how to connect.

Connecting to the machines

So what we need to do is instruct the inventory plugin to set ansible_host to the droplets’ external IP addresses. The user to connect as can also be specified, along with anything else you would like. This flexibility makes the compose section rather verbose. We also limit the attributes from the defaults somewhat so our printouts are a bit more readable.

plugin: community.digitalocean.digitalocean

attributes:
  - id
  - name
  - tags
  - networks

compose:
  ansible_host: do_networks.v4 | selectattr('type','eq','public') | map(attribute='ip_address') | first
  ansible_user: "'centos'"
  ansible_ssh_common_args: "'-o StrictHostKeyChecking=no'"

The compose key adds variables to each host. Key names are variable names, while their values are Jinja2 expressions. This is also why the string values carry a second pair of quotes: without them, centos would be evaluated as a variable name instead of a literal string. For ansible_host, we start from the do_networks variable and chain filters to, in sequence,

select the IPv4 address definitions,
keep only those whose type is public,
extract the actual IPv4 address out of each complete definition and lastly
select the first address out of the bunch.

You can see how flexible this is: you can fit it to anything you want. If you need any other variables, you can set them in the compose section in the same way. We do exactly that with ansible_ssh_common_args to disable host key checking. This should not be done for real workloads, but as we’re only playing around, it’s completely fine.
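The VPN scenario mentioned earlier could be covered by the same mechanism. The following is only a sketch: it assumes every droplet has a private interface and that it appears in do_networks.v4 with the type private. We have not run it against the two droplets created above, and the rest of this post keeps using the public address.

```yaml
plugin: community.digitalocean.digitalocean

attributes:
  - id
  - name
  - tags
  - networks

compose:
  # Assumption: each droplet has a v4 entry whose type is 'private'.
  # On a droplet without one, `first` has nothing to pick from, so this
  # only makes sense when every machine sits in the private network.
  ansible_host: do_networks.v4 | selectattr('type','eq','private') | map(attribute='ip_address') | first
  ansible_user: "'centos'"
```

The only change from the configuration above is the value passed to selectattr, which is the point: the decision about how to reach a machine lives in one expression.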

Let’s try connecting.

$ ansible -m ping -i digitalocean.yml all

steampunk-dbserver-1 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/libexec/platform-python"
    },
    "changed": false,
    "ping": "pong"
}
steampunk-appserver-1 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/libexec/platform-python"
    },
    "changed": false,
    "ping": "pong"
}

It works!

Remember how we deployed an “application” server and a “database” server? We can have Ansible group them for us based on their tags.

plugin: community.digitalocean.digitalocean

attributes:
  - id
  - name
  - tags
  - networks

keyed_groups:
  - key: do_tags | lower
    prefix: ""
    separator: ""

compose:
  ansible_host: do_networks.v4 | selectattr('type','eq','public') | map(attribute='ip_address') | first
  ansible_user: "'centos'"
  ansible_ssh_common_args: "'-o StrictHostKeyChecking=no'"

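Running ansible-inventory now produces a slightly different output, where the hosts are grouped. Each tag becomes a group, and because the prefix and separator are empty, the group is named after the tag itself. The steampunk-test tag shows up as steampunk_test, since hyphens are replaced with underscores in group names.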

$ ansible-inventory -i digitalocean.yml --graph --vars

@all:
  |--@app:
  |  |--steampunk-appserver-1
  |  |  |--{ansible_host = 207.154.233.128}
  |  |  |--{ansible_ssh_common_args = -o StrictHostKeyChecking=no}
  |  |  |--{do_id = 243296785}
  |  |  |--{do_name = steampunk-appserver-1}
  |  |  |--{do_networks = {'v4': [{'ip_add [...] ': []}}
  |  |  |--{do_tags = ['steampunk-test', 'app']}
  |--@db:
  |  |--steampunk-dbserver-1
  |  |  |--{ansible_host = 46.101.215.219}
  |  |  |--{ansible_ssh_common_args = -o StrictHostKeyChecking=no}
  |  |  |--{do_id = 243296841}
  |  |  |--{do_name = steampunk-dbserver-1}
  |  |  |--{do_networks = {'v4': [{'ip_add [...] ': []}}
  |  |  |--{do_tags = ['steampunk-test', 'db']}
  |--@steampunk_test:
  |  |--steampunk-appserver-1
  |  |  |--{ansible_host = 207.154.233.128}
  |  |  |--{ansible_ssh_common_args = -o StrictHostKeyChecking=no}
  |  |  |--{do_id = 243296785}
  |  |  |--{do_name = steampunk-appserver-1}
  |  |  |--{do_networks = {'v4': [{'ip_add [...] ': []}}
  |  |  |--{do_tags = ['steampunk-test', 'app']}
  |  |--steampunk-dbserver-1
  |  |  |--{ansible_host = 46.101.215.219}
  |  |  |--{ansible_ssh_common_args = -o StrictHostKeyChecking=no}
  |  |  |--{do_id = 243296841}
  |  |  |--{do_name = steampunk-dbserver-1}
  |  |  |--{do_networks = {'v4': [{'ip_add [...] ': []}}
  |  |  |--{do_tags = ['steampunk-test', 'db']}
  |--@ungrouped:

Awesome! We can now execute Ansible commands, or playbooks, on a subset of hosts, defined completely by their tags. Let’s ping only the application servers. We’re using a simple ping because the point here is the inventory, not deploying applications.

$ ansible -m ping -i digitalocean.yml app

steampunk-appserver-1 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/libexec/platform-python"
    },
    "changed": false,
    "ping": "pong"
}
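The same group name works as a play target. A minimal sketch, assuming a file called ping-app.yml (a name made up for this example) placed next to the inventory configuration:

```yaml
# ping-app.yml (hypothetical file name)
- hosts: app            # group generated from the 'app' tag
  gather_facts: false   # nothing in this play needs facts
  tasks:
    - name: Check that the application servers answer
      ansible.builtin.ping:
```

It would be run with ansible-playbook -i digitalocean.yml ping-app.yml, and the play should only touch hosts carrying the app tag.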

Now, let’s modify the playbook we initially used to include one more application server.

...
      loop:
        - type: appserver
          index: 1
          tags:
            - steampunk-test
            - app
        - type: appserver
          index: 2
          tags:
            - steampunk-test
            - app
        - type: dbserver
          index: 1
          tags:
            - steampunk-test
            - db
...

And let’s now create the new droplet and run the same ping command against all application servers.

$ ansible-playbook playbook.yml
$ ansible -m ping -i digitalocean.yml app

steampunk-appserver-1 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/libexec/platform-python"
    },
    "changed": false,
    "ping": "pong"
}
steampunk-appserver-2 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/libexec/platform-python"
    },
    "changed": false,
    "ping": "pong"
}

The command was executed against both servers, without any inventory modification necessary!

Cleaning up

Cleaning up after ourselves is quite simple with Ansible. Change all occurrences of state: present in the playbook into state: absent and just run it again! In our case, we also have to remove the ssh_keys definition, since it includes a variable sourced from the previous task, and it won’t be available on deletion. That’s it!
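Put together, the droplet task would end up looking roughly like the sketch below. It follows the description above literally (state flipped, ssh_keys removed, everything else untouched) and includes the second application server added earlier. The SSH key task changes in the same way and is left out here. Treat it as an illustration rather than a tested teardown.

```yaml
- hosts: localhost
  tasks:
    - community.digitalocean.digital_ocean_droplet:
        name: "steampunk-{{ item.type }}-{{ item.index }}"
        tags: "{{ item.tags }}"
        unique_name: true
        # ssh_keys removed: the registered key is not available on deletion
        size: s-1vcpu-1gb
        region: fra1
        image: centos-8-x64
        state: absent
      loop:
        - type: appserver
          index: 1
          tags:
            - steampunk-test
            - app
        - type: appserver
          index: 2
          tags:
            - steampunk-test
            - app
        - type: dbserver
          index: 1
          tags:
            - steampunk-test
            - db
```

Running ansible-inventory -i digitalocean.yml --graph afterwards is a quick way to check that the hosts are gone.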

Ansible’s dynamic inventory is stateless

Ansible’s dynamic inventory operates on state, but that state is remote: it is what actually exists at the provider. Apart from local caching, that is always the ground truth. No matter how instances are created, modified or deleted, the inventory Ansible builds never goes out of sync.

We’ve shown a very simple example of using dynamic inventories. New possibilities arise when you have more machines, since you can codify complex scenarios. You can bind infrastructure provisioning with application deployment using framework-agnostic tags, so you could even mix and match the tools you use for both.

Ansible can be used for more than just infrastructure management! If you are interested in learning more, you can check out this post about why cloud automation is its forte.

This is the first part of a multi-part series on cloud automation with Ansible. Stay tuned for a writeup on inventory caching, very useful with complex workflows on many nodes.