Introduction to configuration management tools

9 April 2021 - Author: Gregor Berginc

This post was originally published on the XLAB Steampunk blog.

In this article, you’ll learn the basics of Configuration Management tools so that you can decide which one you need. Whether you’re looking to automate your infrastructure management or are just browsing for future reference, you’ll walk away with a handful of useful takeaways and insights.

We’ll start with what configuration management is and what it’s used for. Once you understand its application, we’ll compare a few existing configuration management tools and discuss their differences and overlapping features.

What is Configuration Management?

Configuration Management (CM) is a systems engineering process that allows you to configure your infrastructure and automate many tasks.

Traditionally, system administrators ran scripts in batches to configure new environments and keep them in the desired state. This intricate process required manual documentation of every change, so human error was common.

Then CM came into the picture. Think of it as an up-to-date inventory for all of your assets, which include:

Databases
Applications
Shared cache servers
Load balancers
Monitoring tools

Now think about all of the config files on these servers (ini, xml, conf, yaml). What about the packages, dependencies, and third-party libraries that each server might require?

What if there were a tool that could aggregate all of that configuration, wrap it into a central config file, and stay vigilant for any changes?
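To make the idea of "staying vigilant" concrete, here is a minimal sketch of drift detection: compare the settings declared in one central place with what a server actually reports. The file names, keys, and the `find_drift` helper are all hypothetical and only illustrate the principle, not how any particular tool does it.

```python
# Hypothetical desired state, as a central config file might declare it.
DESIRED = {
    "nginx.conf": {"worker_processes": "4", "keepalive_timeout": "65"},
    "app.ini": {"debug": "false"},
}


def find_drift(desired, actual):
    """Return {file: {key: (expected, found)}} for every setting that differs."""
    drift = {}
    for filename, settings in desired.items():
        current = actual.get(filename, {})
        changed = {
            key: (expected, current.get(key))
            for key, expected in settings.items()
            if current.get(key) != expected
        }
        if changed:
            drift[filename] = changed
    return drift


# Simulated state reported by a server where someone edited a file by hand.
actual = {
    "nginx.conf": {"worker_processes": "4", "keepalive_timeout": "120"},
    "app.ini": {"debug": "false"},
}

print(find_drift(DESIRED, actual))
# {'nginx.conf': {'keepalive_timeout': ('65', '120')}}
```

A real CM tool would go one step further and correct the drift instead of only reporting it.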

Over the last decade, a plethora of configuration management tools have hit the market. Tools like Ansible, Puppet, Salt, Chef, Terraform and many more offer a wide range of capabilities to ease your infrastructure development.

While they have many overlapping features, each tool offers a unique set of capabilities.

Why is Configuration Management important?

In the era of microservices, we shouldn’t configure each service manually: it’s time-consuming and prone to human error. Configuration management allows you to centralize and automate your infrastructure configuration.

An average web application has at least one database, a shared cache server, and a few web app instances. When scaling horizontally, it’s critical to be able to change configuration quickly and revert in the case of disaster.

To minimize user impact, the architecture should be resilient to outages.

CM enables you to tame your infrastructure, reduce operational costs and cut downtime. Think of configuration management as the structural frame of a house: it provides a strong base for streamlined development.

Main benefits of Configuration Management

Organization. Instead of piecing together scattered scripts, store your configurations in a central location. This prevents firefighting and unnecessary downtime. Controlling your entire infrastructure from one place enables effective collaboration between teams. CM turns the most cumbersome infrastructures into flexible architectures. It tracks changes, monitors performance, and enforces consistency.

Cost efficiency. Reduce the risk of security breaches by tracking architectural changes. Cut costs by quickly scaling down when possible. CM allows you to avoid wasteful duplication and eliminate redundant assets in your cluster. Automating configuration can save you time and effort as your architecture grows. Focus on infrastructure improvement, not maintenance.

Rapid environment deployment. Repeatable configurations let you whip up a sandbox cluster in a matter of minutes. Test new configuration updates and roll them out in an incremental fashion without worrying about breaking something.

Stable infrastructure development. Infrastructure as Code (IaC) allows for incremental and trackable updates to your infrastructure. Set up robust environments with just a few lines of declarative code. Track your infrastructure changes as part of the Software Development Life Cycle (SDLC) to keep your system agile and your users happy.

Hand a chunk of operations work to developers. By definition, development seeks change while operations strives for stability. DevOps merges these two IT branches and helps to balance out their conflicting interests, and letting developers control their own architecture further streamlines overall development.
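The incremental rollout mentioned under rapid environment deployment can be sketched as follows. This is only an illustration under simple assumptions: the `rolling_update` helper and the host names are hypothetical, and the "change" is simulated by a function that returns True on success.

```python
def rolling_update(hosts, batch_size, apply_change):
    """Apply a change batch by batch; stop at the first batch with a failure."""
    updated = []
    for start in range(0, len(hosts), batch_size):
        batch = hosts[start:start + batch_size]
        results = [apply_change(host) for host in batch]
        if not all(results):
            # Hosts fully updated so far, plus the batch that needs attention.
            return updated, batch
        updated.extend(batch)
    return updated, []


hosts = ["web1", "web2", "web3", "web4", "web5"]

# Simulated change that fails on one host (hypothetical names).
done, failed_batch = rolling_update(hosts, 2, lambda host: host != "web4")
print(done)          # ['web1', 'web2']
print(failed_batch)  # ['web3', 'web4']
```

Because the rollout stops at the first failing batch, the remaining hosts keep the old, working configuration while you investigate.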

What are Configuration Management tools?

Now that we’ve discussed the software Configuration Management (CM) concept, let’s clarify the idea of CM tools.

These are the tools that ensure consistency across a cluster of physical and logical assets. Designed to automate deployment, they facilitate continuous integration and handle resource management across your infrastructure.

Back in the day, manual configuration management was fraught with risks like disruptions to application availability and data corruption. To alleviate these, sysadmins strived to automate as many processes as possible by writing scripts and executing them in batches. This is what led to the rise of Configuration Management tools.

The first versions of major CM tools came out in the 2000s. They mainly relied on an imperative approach of running batch scripts to bring environments to the desired state.

Within a decade, CM tools evolved into automation wizards and orchestration maestros. Batch scripts have now been organized into sets of configuration files that can spin up an entire environment with a single command.
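The difference between an imperative batch step and a declarative description of the desired state can be shown with a small sketch. Both helper functions and the sample configuration lines are hypothetical. The point is only that the declarative version can be run repeatedly without changing the result.

```python
def append_line(lines, line):
    """Imperative step: always appends, so repeated runs pile up duplicates."""
    return lines + [line]


def ensure_line(lines, line):
    """Declarative step: describes the end state; a no-op if already satisfied."""
    return lines if line in lines else lines + [line]


config = ["PermitRootLogin no"]

# Run each step twice, as would happen if a script were executed again.
imperative = append_line(append_line(config, "MaxAuthTries 3"), "MaxAuthTries 3")
declarative = ensure_line(ensure_line(config, "MaxAuthTries 3"), "MaxAuthTries 3")

print(imperative)   # ['PermitRootLogin no', 'MaxAuthTries 3', 'MaxAuthTries 3']
print(declarative)  # ['PermitRootLogin no', 'MaxAuthTries 3']
```

This property, usually called idempotence, is what makes it safe to re-apply the same configuration files to an environment at any time.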

Choosing a Configuration Management tool

The variety of CM tools can overwhelm even the most experienced sysadmin. To pick the tool that’s right for you, consider the following:

The community behind the technology. More established and mature technologies tend to have a larger community of supporters. If you don’t plan on buying enterprise support, this factor is essential - it might be your last and only line of defense when you run into problems. Although older technologies might appear more mature and time-proven, they tend to be more cumbersome.

Purpose. How many existing systems will the CM tool need to configure and integrate with? Will most of your environments serve operations or development purposes? A system designed for operations staff will differ from one designed for developers.

Deployment frequency. Are you an agile development team that needs to ship out changes several times a week? Or do you have more established applications that require stability?

User-friendly interface. Consider how comfortable your team members are with Linux and the command line. Do they need a rich Graphical User Interface or are they fine with a bare-bones terminal? What previous technologies have they worked with?

Organizational size. If your organization is large and has a complex network of applications, choose a CM tool more conservatively. A mature, widely supported and highly secure product like Ansible would be the top choice. It has a proven track record and a wide community of supporters.

There are many CM tools out there. However, as this is just an introduction to configuration management, we’ll limit our choices to the few that stand out the most (in our humble opinion).

Top 5 Configuration Management tools

Many CM tools have overlapping features, but each stands out in its own special way. We’ll list the most prominent ones here to help you make the right decision.

Ansible

With a reported market share of 27%, Red Hat’s Ansible has the biggest community to help you through times of architectural adversity. Being the most versatile CM tool on the market, Ansible is trusted by Intel, Atlassian, Cisco, Twitter, Verizon, and even NASA.

Best suited for automation, deployment and provisioning, this hefty CM tool can handle the most complex IT workflows. Many benefits of Ansible make it the top choice not only for large enterprises, but also for small organizations and even startups.

Agentless. Ansible’s push-based approach means nothing has to be installed or kept running on the machines it manages, which leaves more resources for your applications. The control node securely provisions all of its managed nodes through SSH and doesn’t need to run any agents on remote servers. This makes it convenient to bring existing resources under configuration management.

Readable syntax. Ansible’s Domain Specific Language is based on YAML and is extended with templating ability. It lets you parameterize your playbooks and keep sensitive data out of plain-text configuration until it hits its final destination. The inventory file, which Ansible uses as a map of the infrastructure, can use a syntax similar to INI. However, INI-style inventory is really only suitable for small-scale deployments and hello-world demos. In all other use cases, we would probably use either the YAML format or a dynamic inventory.

Wide selection of modules. When you install the Ansible community edition, you get access to over 4,000 modules that let you manage Windows and Linux servers. If you ever need additional functionality, one of these modules most likely offers it already.

Reusability. Ansible can run your existing shell scripts so that you won’t have to rewrite them. If you already have a semi-automated architecture, Ansible can streamline your configuration management.

Free. Ansible is free, open-source software backed by a large community. Many forums, Stack Overflow threads, and descriptive documentation will always be within your reach should you run into any problems along your configuration journey.
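The combination of an inventory and parameterized templates, described under readable syntax above, can be illustrated with a short sketch. Note the assumptions: the inventory layout, host names, and `render_group` helper are hypothetical, and Python’s standard `string.Template` stands in for the Jinja2 templating that Ansible itself uses.

```python
from string import Template

# Hypothetical inventory: groups map to hosts, hosts carry their own variables.
INVENTORY = {
    "web": {
        "web1.example.com": {"port": 8080},
        "web2.example.com": {"port": 8081},
    },
    "db": {
        "db1.example.com": {"port": 5432},
    },
}

# Stand-in for a config template; Ansible uses Jinja2, not string.Template.
CONFIG_TEMPLATE = Template("server_name=$host\nlisten=$port\n")


def render_group(inventory, group, template):
    """Render one config per host in the group, filling in that host's variables."""
    return {
        host: template.substitute(host=host, **variables)
        for host, variables in inventory[group].items()
    }


rendered = render_group(INVENTORY, "web", CONFIG_TEMPLATE)
print(rendered["web1.example.com"])
```

One template plus per-host variables yields a tailored config for every machine in the group, which is the core of what parameterized playbooks give you.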

If you want to get up and running quickly while still ensuring consistency and reliability, there’s a good chance that Ansible is the tool for you.

Puppet

Before Ansible hit the market a little less than a decade ago, Puppet was the king of configuration management.

Unlike Ansible, Puppet takes a master-agent approach, requiring an agent on every managed node. The master stores the code to define the desired state of the entire infrastructure and delivers the individual configuration catalogs to each agent. The node then brings itself to the desired state by applying the configuration shipped from the master node.
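The master-agent flow described above can be sketched in a few lines. This is a simplified model, not Puppet’s actual implementation: the `SITE` data, the `role` fact, and both functions are hypothetical names chosen for illustration.

```python
# Hypothetical desired-state code held by the master, keyed by node role.
SITE = {
    "web": {"package:nginx": "installed", "service:nginx": "running"},
    "db": {"package:postgresql": "installed", "service:postgresql": "running"},
}


def compile_catalog(facts):
    """Master side: pick the resources that apply to the node reporting these facts."""
    return dict(SITE.get(facts["role"], {}))


def apply_catalog(state, catalog):
    """Agent side: change only the resources that differ; return what was changed."""
    changes = {name: value for name, value in catalog.items() if state.get(name) != value}
    state.update(changes)
    return changes


# A node that already has the package but whose service is not running yet.
node_state = {"package:nginx": "installed"}
catalog = compile_catalog({"hostname": "web1", "role": "web"})

first_run = apply_catalog(node_state, catalog)
second_run = apply_catalog(node_state, catalog)
print(first_run)   # {'service:nginx': 'running'}
print(second_run)  # {}
```

The second run changes nothing because the node has already converged to the state described in its catalog.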

It’s worth mentioning that the domain-specific language used to describe Puppet’s system configuration is more complex than the one used by Ansible, so it does take some time to grasp. Advanced and real-time tasks require the use of Ruby and input from the CLI.

In addition to the server and agent, Puppet has:

PuppetDB - stores all of the data generated by Puppet (for example, facts, catalogs, and reports).
Hiera - a tool used to separate the data from the code and place it in a centralized location. This allows you to specify variations, make your code testable, and validate all edge cases of your parameters.
Facter - Puppet’s inventory tool, which gathers facts about the agent node (hostname, IP, OS) and reports them back to the master.
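To show how separating data from code lets you specify variations, here is a minimal sketch of a hierarchical lookup driven by node facts. The layer names, keys, and the `lookup` function are hypothetical. The sketch only models the general idea that the most specific layer defining a key wins.

```python
# Hypothetical data layers, ordered from most specific to most general.
DATA = {
    "nodes/web1.example.com": {"ntp_server": "10.0.0.5"},
    "os/Debian": {"ntp_server": "debian.pool.ntp.org", "ssh_service": "ssh"},
    "common": {"ntp_server": "pool.ntp.org", "ssh_service": "sshd"},
}


def lookup(key, facts):
    """Return the value from the first layer that defines the key."""
    hierarchy = [f"nodes/{facts['hostname']}", f"os/{facts['os']}", "common"]
    for layer in hierarchy:
        if key in DATA.get(layer, {}):
            return DATA[layer][key]
    raise KeyError(key)


web1 = {"hostname": "web1.example.com", "os": "Debian"}
db1 = {"hostname": "db1.example.com", "os": "RedHat"}

print(lookup("ntp_server", web1))   # 10.0.0.5
print(lookup("ssh_service", web1))  # ssh
print(lookup("ntp_server", db1))    # pool.ntp.org
```

The code that consumes `ntp_server` never changes. Only the data layers differ per node or operating system.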

Puppet requires a bit more manual configuration, including specifying DNS entries to resolve IP addresses. Additionally, restoring previous states is a bit harder with Puppet because it doesn’t have an intuitive way of reverting config changes.

Chef

In the last decade, Chef has become the tool of choice for many startups and small organizations. These are the companies that often lack the dedicated DevOps and operations teams that larger businesses have access to. Unlike Ansible and Puppet, Chef uses Ruby to write configuration files called “recipes”. This empowers developers to code the desired state. The ability to write configurations with Ruby makes Chef the developer’s choice.

Like Puppet, Chef’s architecture follows a master-agent pattern. However, besides running agent software on each managed node, Chef also introduces another component called Chef Workstation. This acts as a hub for the infrastructure code and stores all of the configuration before pushing it to the central server.

Although Chef is more distributed and has a powerful way of writing configurations, it is slower to set up and more complex than Puppet. But while Puppet lets you describe your configuration, Chef lets you code it.

If you have more experience with system administration, consider Puppet. However, if your team consists mainly of developers, Chef might suit you better.

Salt

Salt is Python-based, open-source software for event-driven IT automation and configuration management.

Just like Ansible, Salt was originally built as an execution engine, so it can execute commands on remote machines using modules. By default, however, Salt is not agentless: it requires agents (called minions) running on remote hosts.

These minions are intelligent agents that pull their configuration from the master node. They also fire change events and post them to the message queue. Once posted, the event becomes available for the master to pick up and deal with. This event-driven approach has given Salt a unique reputation for being a fast and reactive tool.
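The event flow described above can be modeled with a plain in-process queue. This is only a sketch of the pattern: Python’s `queue.Queue` stands in for the real message bus, and the event tags, handler table, and function names are hypothetical rather than part of Salt’s API.

```python
import queue

event_bus = queue.Queue()  # stand-in for the message bus between minions and master


def minion_report(minion_id, tag, data):
    """Minion side: post a change event to the bus."""
    event_bus.put({"id": minion_id, "tag": tag, "data": data})


# Hypothetical reactions the master runs for a given event tag.
REACTIONS = {
    "file/changed": lambda event: f"re-apply state on {event['id']}",
}


def master_drain():
    """Master side: handle every queued event that has a registered reaction."""
    actions = []
    while not event_bus.empty():
        event = event_bus.get()
        handler = REACTIONS.get(event["tag"])
        if handler:
            actions.append(handler(event))
    return actions


minion_report("web1", "file/changed", {"path": "/etc/nginx/nginx.conf"})
minion_report("web2", "minion/start", {})

actions = master_drain()
print(actions)  # ['re-apply state on web1']
```

The master does not poll each host for changes. It reacts only when a minion reports something, which is where the "fast and reactive" reputation comes from.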

Salt uses human-readable syntax, so there’s no need to learn a new domain-specific language or maintain cumbersome Ruby code. Like Ansible, Salt is Python-based and uses YAML to describe configurations.

Salt’s fast communication (thanks to the ZeroMQ message bus), in tandem with its reactive architecture, makes it well suited to detecting configuration drift and unexpected changes.

Every feature in Salt is pluggable and extensible, meaning that you can develop your own changes without needing to maintain a parallel fork of the main project.

Terraform

Even though Terraform has many overlapping features with the tools mentioned above, it’s not a pure CM tool, but rather an infrastructure provisioner and orchestrator. Because of this, Terraform provisions your infrastructure and leaves the configuration of individual servers to CM tools.

One of the main advantages of using Terraform is that it provides support for immutable infrastructure. This means that any updates will produce a new version of the base image, preventing any possibility of configuration drift (subtle differences between servers that are hard to diagnose or reproduce).

A perfect use case for Terraform would be deploying fully-configured Docker containers. Since each Docker image contains all of the necessary dependencies and libraries, you could tell Terraform to simply place it on a virtual machine instance without worrying about breaking anything.

Terraform can be used in collaboration with pure CM tools rather than replacing them. For example, you could use Terraform to ‘wire up’ the infrastructure and then use Ansible to deploy your applications to the existing servers.

When individual components fail or drift, the next Terraform run detects the difference and restores the desired state of the entire architecture. With this in mind, it’s best suited for environments that need consistent configuration and an invariable state.
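Restoring the desired state boils down to comparing what is declared with what exists and deriving the actions that close the gap. The sketch below models that comparison in the spirit of immutable infrastructure, where a changed resource is replaced instead of modified in place. The resource names, image tags, and the `plan` function are hypothetical and do not reflect Terraform’s internals.

```python
def plan(desired, actual):
    """Compare desired and actual resources; return the actions needed to converge."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            # Immutable approach: swap the resource for a new one, don't patch it.
            actions.append(("replace", name))
    for name in actual:
        if name not in desired:
            actions.append(("destroy", name))
    return actions


desired = {
    "vm-web": {"image": "app-v2"},
    "vm-db": {"image": "db-v1"},
}
actual = {
    "vm-web": {"image": "app-v1"},
    "vm-old": {"image": "legacy"},
}

print(plan(desired, actual))
# [('replace', 'vm-web'), ('create', 'vm-db'), ('destroy', 'vm-old')]
```

Once every action has been carried out, running the same comparison again yields an empty list, meaning the infrastructure matches its description.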

Summary

All of the listed tools perform configuration management and orchestration to varying degrees. CM tools install and manage software on existing servers, while orchestrators focus on infrastructure.

Puppet, Chef, and Salt manage individual nodes by running agents on remote hosts. Ansible is completely agentless and pushes configurations through SSH.

Meanwhile, Terraform is more of an orchestrator; it has some configuration management capabilities, but it is primarily meant for orchestration. Therefore, it often complements one of the CM tools that we’ve discussed above.

Choosing the right tool for the job is always a challenge. However, if you’re not using any configuration management tool yet, it’s best to start with Ansible and leverage its immense community for fast implementation.
