This post was originally published on the XLAB Steampunk blog.
Module documentation is an entry point for Ansible playbook authors. So it is vitally important that we keep that documentation in sync with the module implementation or risk getting angry bug reports. But here lies the problem.
Maintaining the synchronization between documentation and implementation is not as straightforward as it could be. Why? Because each Ansible module contains two copies of parameter-related information. Module maintainers must first document each module parameter and then copy a slightly transformed description into the parameter validator.
Currently, there is no way to remove the information duplication from the Ansible modules. So we decided to do the next best thing: automate the heck out of the copy-to-the-validator part. Laziness for the win ;)
A quick note on terminology
Arguments, options, and parameters in this post all have the same meaning. They represent the data that the Ansible playbook author passes to the Ansible module.
So why use different words to describe the same thing? Because the rendered Ansible documentation uses the name parameters, the documentation block in Ansible modules uses options, and validation-related code calls them arguments. Because why not ;)
Where does it go wrong
Each Ansible module is composed of different sections, but today we are only interested in two of them: the documentation block and the AnsibleModule instantiation.
Documentation block
The documentation section is an inlined YAML document that, among other things, also contains an options key where all module parameter descriptions live. For example, this is how we would describe options for a simple module:
DOCUMENTATION = """
module: some.awesome.thing
short_description: Manage resources
options:
  name:
    description:
      - Resource name.
    type: str
    required: true
  state:
    description:
      - Resource's desired state.
    type: str
    choices: [ present, absent ]
    default: present
"""
If we were to render the previous parameter description into HTML, we would get back a table with a row for each parameter, listing its name, type, allowed choices, default value, and description.
And people who prefer the terminal over web browsers (hello brothers and sisters) can use the ansible-doc utility to print this information to the console.
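Since the documentation block is nothing more than a module-level string, we can also read it programmatically without importing the module. Here is a minimal sketch that uses Python's ast module. The MODULE_SOURCE sample and the extract_documentation helper are our own illustration and not part of Ansible.

```python
import ast

# Hypothetical module source, shortened from the example above.
MODULE_SOURCE = '''
DOCUMENTATION = """
module: some.awesome.thing
short_description: Manage resources
"""


def main():
    pass
'''


def extract_documentation(source):
    """Return the string assigned to DOCUMENTATION, or None if it is absent."""
    tree = ast.parse(source)
    for node in tree.body:
        # Only top-level assignments are of interest.
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == "DOCUMENTATION":
                # Safely evaluate the string literal without running the module.
                return ast.literal_eval(node.value)
    return None


doc = extract_documentation(MODULE_SOURCE)
print(doc)
```

The returned string is plain YAML, so a YAML parser (for example, yaml.safe_load from PyYAML) could turn it into a dictionary that contains the options key.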
In a perfect world, Ansible playbook authors would first read the API documentation and then write a task without making any mistakes. But unfortunately, we live in a world where mistakes do happen, and most people only read the documentation when something goes wrong. And it is hard to blame them for that because let us face it: most of the documentation that we, developers, write is crap ;)
AnsibleModule instantiation
Because we want to catch at least some mistakes in Ansible playbooks, Ansible modules validate their parameters. For our sample options from the documentation example above, we would instantiate AnsibleModule like this:
def main():
    argument_spec = dict(
        name=dict(
            type="str", required=True,
        ),
        state=dict(
            type="str", default="present", choices=["present", "absent"],
        ),
    )
    module = AnsibleModule(
        argument_spec=argument_spec,
    )
And just like that, we duplicated some of the information, broke the DRY principle, and set ourselves on a path of documentation and validation desynchronization.
Dealing with duplication - the Ansible way
So, how is Ansible currently dealing with the problem of information duplication? In short, it does not. What Ansible does offer is a way of detecting the desynchronization between the two copies of information.
The validate-modules sanity test will report the discrepancies between the API documentation and argument specification, but we still need to resolve them manually. For example, if we changed the default value for the state parameter to absent in the argument specification, Ansible’s sanity test would report this error:
$ ansible-test sanity --requirements --test validate-modules
ERROR: plugins/modules/thing.py:0:0: doc-default-does-not-match-spec:
Argument 'state' in argument_spec defines default as ('absent') but
documentation defines default as ('present')
This detect-and-fix approach works pretty well if we are working with existing Ansible modules. Parameter changes are usually small in such scenarios, which keeps the error message count low. But things start to go downhill if we are writing new Ansible modules. In cases where the Ansible module contains a non-trivial number of parameters, we can quickly end up with hundreds of errors.
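To picture what such a check has to do, we can think of it as a key-by-key comparison of the two copies. The sketch below is our own simplified illustration and not the actual validate-modules implementation. The find_mismatches helper and the COMPARED_KEYS subset are assumptions, and the documented options are written out by hand instead of being parsed from YAML.

```python
# Documented options, as they would look after parsing the documentation block.
doc_options = {
    "name": {"type": "str", "required": True},
    "state": {"type": "str", "choices": ["present", "absent"], "default": "present"},
}

# Argument specification with a deliberately changed default.
argument_spec = {
    "name": {"type": "str", "required": True},
    "state": {"type": "str", "choices": ["present", "absent"], "default": "absent"},
}

# Assumed subset of keys that both copies share.
COMPARED_KEYS = ("type", "required", "choices", "default")


def find_mismatches(doc_options, argument_spec):
    """Return a list of human-readable differences between the two copies."""
    problems = []
    for name in sorted(set(doc_options) | set(argument_spec)):
        if name not in doc_options:
            problems.append(f"{name}: missing from documentation")
            continue
        if name not in argument_spec:
            problems.append(f"{name}: missing from argument_spec")
            continue
        for key in COMPARED_KEYS:
            documented = doc_options[name].get(key)
            validated = argument_spec[name].get(key)
            if documented != validated:
                problems.append(
                    f"{name}.{key}: documentation says {documented!r}, "
                    f"argument_spec says {validated!r}"
                )
    return problems


mismatches = find_mismatches(doc_options, argument_spec)
for problem in mismatches:
    print(problem)
```

Every parameter that we describe in one place but forget to copy to the other adds at least one entry to that list, which is why a brand new module can produce so many errors at once.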
Introducing argument specification generator
Since we are adding new modules to existing Ansible collections quite often, we spend a considerable fraction of development time dealing with the initial desynchronization of information.
That prompted us to develop the ansible-argspec-gen tool that will generate argument specification directly from the module documentation. And yes, developers should not be allowed to name things ;)
Here is how the ansible-argspec-gen tool works:
- it starts by extracting the Ansible module’s documentation,
- then, it generates the argument specification from the extracted documentation,
- and finally, it updates the module’s source code between the markers.
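The last two steps can be sketched in a few lines of Python. This is only a rough illustration of the idea and not the code of ansible-argspec-gen itself. The helper names, the SPEC_KEYS subset, and the pprint-based formatting are our assumptions, the options are written out by hand instead of being parsed from the documentation, and the markers are assumed to be two identical comment lines that delimit the generated region.

```python
import pprint

MARKER = "# AUTOMATIC MODULE ARGUMENTS"

# Assumed subset of documentation keys that the validator understands.
SPEC_KEYS = ("type", "required", "choices", "default")


def options_to_argspec(options):
    """Keep only the validation-relevant keys of each documented option."""
    return {
        name: {key: value for key, value in sorted(option.items()) if key in SPEC_KEYS}
        for name, option in options.items()
    }


def replace_between_markers(source, new_body, marker=MARKER):
    """Replace everything between the two marker lines with new_body."""
    lines = source.splitlines()
    positions = [i for i, line in enumerate(lines) if line.strip() == marker]
    if len(positions) != 2:
        raise ValueError("expected exactly two marker lines")
    start, end = positions
    # Reuse the indentation of the first marker for the generated lines.
    indent = lines[start][: len(lines[start]) - len(lines[start].lstrip())]
    body = [indent + line for line in new_body.splitlines()]
    return "\n".join(lines[: start + 1] + body + lines[end:]) + "\n"


# Hand-written stand-in for the parsed documentation options.
options = {
    "name": {"description": ["Resource name."], "type": "str", "required": True},
    "state": {
        "description": ["Resource's desired state."],
        "type": "str",
        "choices": ["present", "absent"],
        "default": "present",
    },
}

# Hypothetical module source with an empty specification between the markers.
source = (
    "def main():\n"
    "    # AUTOMATIC MODULE ARGUMENTS\n"
    "    argument_spec = dict()\n"
    "    # AUTOMATIC MODULE ARGUMENTS\n"
    "    module = AnsibleModule(argument_spec=argument_spec)\n"
)

spec = options_to_argspec(options)
updated = replace_between_markers(source, "argument_spec = " + pprint.pformat(spec))
print(updated)
```

Keeping the generated code between markers means that the rest of the module stays untouched, so we can rerun the generation whenever the documentation changes.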
Let us see how this would work on our sample module. Once we add markers to the module’s source code, we will end up with something like this:
def main():
    # AUTOMATIC MODULE ARGUMENTS
    argument_spec = dict(
        name=dict(
            type="str", required=True,
        ),
        state=dict(
            type="str", default="present", choices=["present", "absent"],
        ),
    )
    # AUTOMATIC MODULE ARGUMENTS
    module = AnsibleModule(
        argument_spec=argument_spec,
    )
Note that we are using the default marker text in our example, but you can customize it via the --marker argument. Now we are ready to run the generator:
$ ansible-argspec-gen sample.py
Once the previous command terminates, our module will look like this:
def main():
    # AUTOMATIC MODULE ARGUMENTS
    argument_spec = {
        "name": {"required": True, "type": "str"},
        "state": {
            "choices": ["present", "absent"],
            "default": "present",
            "type": "str",
        },
    }
    # AUTOMATIC MODULE ARGUMENTS
    module = AnsibleModule(
        argument_spec=argument_spec,
    )
Magic ;) But there is more. If we supply the --diff switch to the tool, it will also print the module changes to the console:
$ ansible-argspec-gen --diff sample.py
--- sample.py.old
+++ sample.py.new
@@ -20,14 +20,14 @@
 def main():
     # AUTOMATIC MODULE ARGUMENTS
-    argument_spec = dict(
-        name=dict(
-            type="str", required=True,
-        ),
-        state=dict(
-            type="str", default="present", choices=["present", "absent"],
-        ),
-    )
+    argument_spec = {
+        "name": {"required": True, "type": "str"},
+        "state": {
+            "choices": ["present", "absent"],
+            "default": "present",
+            "type": "str",
+        },
+    }
     # AUTOMATIC MODULE ARGUMENTS
     module = AnsibleModule(
The return code of the program indicates what happened during the run:
- 0 means that nothing changed.
- 1 means that the tool updated at least one module.
- 2 means that an error occurred during the execution.
And what do we get if we combine the status code with the --dry-run switch? A check for our continuous integration pipeline that makes sure developers do not forget to run the tool. You are welcome ;)
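A pipeline step built around that idea could look something like the sketch below. It assumes that the ansible-argspec-gen executable is available on the PATH and that a dry run reports pending changes through the same return codes as a regular run. The helper names are ours.

```python
import subprocess

# Return codes as described above.
EXIT_MEANINGS = {
    0: "nothing changed",
    1: "at least one module updated",
    2: "error during execution",
}


def describe_exit_code(code):
    """Translate the tool's return code into a short message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def argspec_is_current(paths):
    """Run the generator in dry-run mode and report whether specs are in sync."""
    # Assumption: the tool is installed and accepts module paths as arguments.
    result = subprocess.run(["ansible-argspec-gen", "--dry-run", *paths])
    print(describe_exit_code(result.returncode))
    return result.returncode == 0
```

A continuous integration job would call argspec_is_current with the module paths and fail the build when it returns False.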
The tool can do a few other tricks, like extracting various constraints from the documentation. But we will leave this information for another post. In the meantime, you can play with a sample module that we put into a GitHub gist.
Where can you start
The proper way of solving the issue at hand would be to remove the duplication from the code. But this is not entirely trivial to implement since the information from documentation fragments is not available to modules at runtime.
Thus, no matter how ugly the argument specification generator may look at first sight, it is currently the best weapon against information desynchronization.
Does all this sound complicated? Avoid getting your hands dirty and reach out. Get a high-quality Ansible integration in a fraction of the time with the help of our team. We are ready to do the heavy lifting for you.
Cheers!