Tidy Cloud AWS issue #27 - infrastructure management and automation
Welcome to the next issue of the Tidy Cloud AWS bulletin! In this issue, I will focus a bit on infrastructure as code management and automation concerns.
Managing infrastructure as code
When we talk about practices around infrastructure as code, we include version control. It is a foundational part of these practices. But beyond that, it is often less clear what we should use and do to manage any non-trivial infrastructure-as-code setup.
How do we manage the state?
How do we manage configurations and releases of our infrastructure?
How do we manage access to our infrastructure definitions and state?
How do we keep the infrastructure consistent and follow policies defined?
How do we get visibility and understanding of our infrastructure?
How do we empower users of the infrastructure services in a good way?
These are questions we may ask when we are working with our infrastructure, in AWS or elsewhere.
If you use CloudFormation, or any tool that uses CloudFormation under the hood, then state storage is handled for you as a stack. CloudFormation keeps track of the state itself and can also detect state drift. You do not have any options here.
If you use other tools such as Terraform, Pulumi, or Crossplane, then state is something you have to care about a bit more and you have options on how to handle it.
For example, with Terraform, you can store the state in a local file (bad idea!). You can also store it in an S3 bucket or various other storage options. In the simplest case, it is just a storage location. In other cases, the solution adds additional management options, which may address some of the other concerns as well.
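To make the S3 option a bit more concrete, here is a minimal sketch that renders a Terraform S3 backend block in Terraform's JSON configuration syntax. The function name, the bucket name, the state key, and the region are all made-up values for illustration, and the bucket is assumed to exist already. Treat it as one possible starting point, not a recommended setup.

```python
import json


def s3_backend_config(bucket: str, key: str, region: str) -> str:
    """Render a Terraform S3 backend block in Terraform's JSON syntax."""
    config = {
        "terraform": {
            "backend": {
                "s3": {
                    "bucket": bucket,  # hypothetical bucket, assumed to exist already
                    "key": key,        # path of the state object inside the bucket
                    "region": region,
                    "encrypt": True,   # request server-side encryption of the state
                }
            }
        }
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Example values only; the output could be saved as e.g. backend.tf.json
    print(s3_backend_config("example-tf-state", "network/terraform.tfstate", "eu-north-1"))
```

Using a separate key per group of infrastructure (here "network/...") is one simple way to keep state objects apart in the same bucket.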
When you manage infrastructure, you also have a state to manage. You want your infrastructure state to be modular, to reduce blast radius in case something happens.
Keeping everything, or even everything in one environment, in a single stack/state object is not a good idea.
Instead, group infrastructure based on lifecycles. For example, keep the networking infrastructure separate from application services.
Responsibilities and dependencies also become clearer with more modular infrastructure.
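As a rough illustration of why lifecycle-based grouping helps, the sketch below models a few stacks and their dependencies, then derives a deployment order and the set of stacks affected when one of them changes. The stack names and the helper functions are invented for this example; real tools track dependencies in their own ways.

```python
from graphlib import TopologicalSorter

# Hypothetical stacks grouped by lifecycle; each maps to the stacks it depends on.
STACKS = {
    "network": set(),
    "database": {"network"},
    "app-service": {"network", "database"},
    "monitoring": {"app-service"},
}


def deploy_order(stacks):
    """Return an order in which every stack comes after its dependencies."""
    return list(TopologicalSorter(stacks).static_order())


def blast_radius(stacks, changed):
    """Return the stacks that directly or indirectly depend on `changed`."""
    affected = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for name, deps in stacks.items():
            if current in deps and name not in affected:
                affected.add(name)
                frontier.append(name)
    return affected


if __name__ == "__main__":
    print(deploy_order(STACKS))
    print(blast_radius(STACKS, "database"))
```

With everything in a single stack, every change has the whole environment as its blast radius. With separate stacks, a change to "monitoring" in this example affects nothing else.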
AWS themselves have limited support in this area, although there are emerging services, such as AWS Proton, that try to address part of that problem.
Doing infrastructure tasks manually is error-prone and time-consuming. When processes are in place, you can automate them.
You can use regular CI/CD tools, the same as for application software. They will need to be adapted to handle the stateful nature of infrastructure, though.
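One possible adaptation is a pipeline step that inspects a plan before applying it and asks for manual approval when existing resources would be destroyed. The sketch below assumes a Terraform plan exported as JSON (for example with `terraform show -json` on a saved plan file) and reads its `resource_changes` entries. The function names and the sample plan are made up for illustration, and the approval policy itself is only an assumption about what a team might want.

```python
def destructive_changes(plan: dict) -> list[str]:
    """List addresses of resources the plan would delete or replace."""
    flagged = []
    for change in plan.get("resource_changes", []):
        actions = change.get("change", {}).get("actions", [])
        # A replacement shows up as a delete combined with a create.
        if "delete" in actions:
            flagged.append(change["address"])
    return flagged


def needs_manual_approval(plan: dict) -> bool:
    """Hypothetical policy: any delete or replace requires a human to approve."""
    return bool(destructive_changes(plan))


# Hand-written sample resembling the structure of a JSON plan; not real output.
SAMPLE_PLAN = {
    "resource_changes": [
        {"address": "aws_s3_bucket.logs", "change": {"actions": ["create"]}},
        {"address": "aws_db_instance.main", "change": {"actions": ["delete", "create"]}},
        {"address": "aws_vpc.main", "change": {"actions": ["no-op"]}},
    ]
}

if __name__ == "__main__":
    print(destructive_changes(SAMPLE_PLAN))
    print(needs_manual_approval(SAMPLE_PLAN))
```

A stateless application deployment can usually be rolled back by redeploying the old version. A deleted database cannot, which is why a gate like this is more relevant for infrastructure pipelines.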
Testing infrastructure as code is more challenging than testing application software. Many tests can only be done by provisioning the infrastructure, which is time-consuming.
The permissions needed to deploy infrastructure also add to the security concerns.
There is quite a nice article on the Gruntwork website about challenges with infrastructure pipelines. Most of it is relevant regardless of what tools are used.
Infrastructure management platforms and tools
Various tools have emerged to help address these challenges. Some of these tools are just focused on Terraform, while others may cover multiple IaC tools.
HashiCorp offers Terraform Cloud and Terraform Enterprise, and Pulumi has its corresponding Pulumi Service. Some 3rd party solutions include Spacelift, Env0, Scalr, and Cloudify.
There is a nice blog post comparing some of these tools in Four Great Alternatives to Terraform Cloud.
AWS themselves have room for improvement in this area. Also, Amazon has an organisational model with service teams that own everything, from networking elements to the application code. You can see this model reflected in some of their tools.
AWS tools that try to address some parts of this include AWS Copilot, AWS CDK, and AWS CDK Pipelines. There are also 3rd party tools such as Sceptre.
These more modern AWS tools address some of these issues, and the experience is less painful than the default experience with CloudFormation.
Many solutions try to address this need in various ways, and I think it’s worthwhile to investigate them before deciding to build everything from scratch.
You can find older bulletins and more at Tidy Cloud AWS. You will also find other useful articles around AWS automation and infrastructure-as-software.
Until next time,