Is AWS Cloud Development Kit (CDK) right for you?
If you are looking to automate your AWS infrastructure provisioning, you may have heard of AWS Cloud Development Kit, or AWS CDK for short. It is one of many tools for automating the creation and management of infrastructure when AWS is your cloud provider.
I will present the reasons for when I think AWS CDK is a good choice and to what extent it may be used. This will not be a comparison of features between different solutions; it is more about organization, way of working, and philosophy. Will AWS CDK fit the way you work and organize today, or the way you are striving to?
Other tools in this space that you may have looked at, or are using already, include AWS CloudFormation, Terraform, and Pulumi. These are general tools to provision and manage cloud infrastructure. AWS CloudFormation is focused on AWS cloud infrastructure, while both Terraform and Pulumi are more agnostic in terms of which providers they support.
You may also have looked at more specialized infrastructure provisioning tools, such as Serverless Framework or AWS Copilot, and wondered whether a more general tool is preferable, given that the more general tool may be more complex to use.
But first, a summary of what AWS CDK is.
What is AWS Cloud Development Kit?
AWS Cloud Development Kit (CDK) is both a command-line tool and a framework for provisioning and managing cloud infrastructure, primarily AWS cloud infrastructure. It accomplishes this by describing the desired state of cloud infrastructure by using general programming languages. This means that program statements will describe your networks, servers, databases, load balancers, or any other infrastructure entities or services that you may use as part of your (AWS) cloud-based solution.
This follows the idea of infrastructure as code, which is a practice where infrastructure can be managed and provisioned in the same way as software applications. The concept does not require the “code” to be written in a programming language, only that it can be read and processed by machines or software, and that it is managed with the same kind of tools used in the software development lifecycle, such as version control systems and testing tools.
This also makes it possible to automate provisioning, to make that process consistently repeatable, and to keep the documentation up to date - the “code” is part of the documentation.
AWS Cloud Development Kit is a framework built on top of AWS CloudFormation. When you run the CDK code, you generate CloudFormation templates and/or deploy these templates as CloudFormation stacks (depending on what you do through the CDK command-line tool). As such, AWS CDK interoperates quite well with AWS CloudFormation.
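To make the idea of "code that generates a template" concrete, here is a minimal sketch in plain Python. It deliberately does not use the AWS CDK library: the helper names bucket and synthesize are hypothetical, and the real framework does far more. It only illustrates how ordinary language features (functions, loops) can produce a CloudFormation-style document.

```python
import json


def bucket(logical_id: str, versioned: bool = False) -> dict:
    """Return a CloudFormation-style resource entry for an S3 bucket (hypothetical helper)."""
    properties = {}
    if versioned:
        properties["VersioningConfiguration"] = {"Status": "Enabled"}
    return {logical_id: {"Type": "AWS::S3::Bucket", "Properties": properties}}


def synthesize(*resources: dict) -> dict:
    """Merge resource entries into one CloudFormation-style template (hypothetical helper)."""
    template = {"AWSTemplateFormatVersion": "2010-09-09", "Resources": {}}
    for resource in resources:
        template["Resources"].update(resource)
    return template


# A loop describes one bucket per environment instead of repeating template text.
template = synthesize(
    *(bucket(f"Data{env.capitalize()}", versioned=True) for env in ("dev", "prod"))
)
print(json.dumps(template, indent=2))
```

The printed JSON is the kind of declarative description that CloudFormation consumes; with AWS CDK, the framework produces it for you from your program.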
AWS CDK supports multiple programming languages, including TypeScript, Python, .NET/C#, and Go (in developer preview at the time of writing).
The ideal organization for AWS CDK
What kind of organization fits well with AWS CDK? It may not come as a surprise that the type of organization that AWS (and Amazon) themselves run fits well with the philosophy of AWS CDK. Within AWS and Amazon, the concept of two-pizza teams is in place, and they embrace the idea of "you build it, you run it".
This means that the organization responsible for the software solutions consists of reasonably small teams that are responsible for both building and operating the software solutions they are responsible for. This is part of embracing the DevOps culture within the organization - shared responsibility and increased collaboration between development and operations.
The idea with AWS CDK is that a team should own, manage, and operate not only the application code, but also the infrastructure that supports it, directly or indirectly. Cloud infrastructure makes it entirely possible to treat infrastructure in the same way as application software.
In this context, directly supporting infrastructure means servers, databases, and other infrastructure components that are used by the application software. Indirectly supporting infrastructure includes, for example, the infrastructure used to run continuous integration and continuous delivery pipelines.
In the AWS CDK philosophy, all of this can be programming language code, and all of it can be maintained in the same place, by the same team. This has a few implications which may not be obvious at first (they certainly were not for me) but should explain why some examples and features in AWS CDK are constructed the way they are. If you see something in AWS CDK that seems a bit stupid, try to see it in the light of this philosophy.
Configuration as actual code
One of these areas that did not make sense to me at first was that there were multiple examples where different environments had explicit code segments in the code itself. It looked like hard-coded values - and these are bad. Why would they do it like that?
But the practice of configuration files comes from situations where developers are not the same as the ones that run the solution and those who run the code may not know or understand the code used. If the same team runs and operates both infrastructure and application code, then the configuration may be represented as code as well. After all, we should treat configurations with the same rigor as the actual code itself.
If we treat configuration as actual code, then the same type of tooling used for software development may be used for configuration as well. It also means there is less code needed to translate and validate configuration stored in formats such as YAML or TOML.
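As a sketch of what configuration as code can look like, the snippet below keeps per-environment settings in a typed Python structure instead of a YAML file. The class name, field names, and values are all hypothetical and chosen only for illustration; they are not taken from AWS CDK or its examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentConfig:
    """Settings that differ between environments (hypothetical fields)."""

    name: str
    instance_count: int
    enable_deletion_protection: bool

    def __post_init__(self) -> None:
        # Validation lives next to the values instead of in a separate schema.
        if self.instance_count < 1:
            raise ValueError(f"{self.name}: instance_count must be at least 1")


# The "hard-coded" environments are explicit, reviewable, and version controlled.
ENVIRONMENTS = {
    "dev": EnvironmentConfig("dev", instance_count=1, enable_deletion_protection=False),
    "prod": EnvironmentConfig("prod", instance_count=3, enable_deletion_protection=True),
}


def config_for(environment: str) -> EnvironmentConfig:
    """Look up the configuration for an environment, failing loudly on typos."""
    try:
        return ENVIRONMENTS[environment]
    except KeyError:
        raise ValueError(f"unknown environment: {environment!r}") from None
```

A misspelled field or a wrong type is then caught by the editor, a type checker, or a unit test, in the same way as any other programming mistake.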
CI/CD as actual code
Another area here is the continuous integration and continuous delivery pipelines that a team may use to automate testing, delivery, and deployment of the software solution. AWS CDK has support for defining such pipelines in code, and that code is part of the same source code repository as the application infrastructure (and application) code itself. This feature is referred to as CDK Pipelines. At the time of writing, this particular feature is still in developer preview.
Again, this is something that can work well if the solution team itself is responsible for their CI/CD pipeline and this is not delegated to a different part of the organization.
Test your infrastructure
One fairly common complaint with infrastructure-as-actual-code approaches is that it is more difficult to understand exactly what infrastructure may be provisioned or changed, just by looking at the code. By that argument, tools such as AWS CloudFormation and Terraform are preferable, since their declarative configuration descriptions are easier to understand.
This is true.
General programming language code is much more versatile in expressing different constructs than a plain configuration file format. At a high level, programming language code can describe something more succinctly, but it may be harder to understand the low-level implications of that code. A configuration file format may have more difficulty expressing high-level constructs, but may be clearer about the low-level details of what the desired state is or will be.
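The trade-off can be illustrated with a small sketch in plain Python (not the AWS CDK API). The helper private_bucket is hypothetical: a single, succinct call expands into two CloudFormation-style resources whose details are not visible at the call site.

```python
def private_bucket(logical_id: str) -> dict:
    """Hypothetical high-level helper: one call expands into two resources."""
    return {
        logical_id: {
            "Type": "AWS::S3::Bucket",
            "Properties": {
                "PublicAccessBlockConfiguration": {
                    "BlockPublicAcls": True,
                    "BlockPublicPolicy": True,
                    "IgnorePublicAcls": True,
                    "RestrictPublicBuckets": True,
                },
            },
        },
        f"{logical_id}Policy": {
            "Type": "AWS::S3::BucketPolicy",
            "Properties": {
                "Bucket": {"Ref": logical_id},
                "PolicyDocument": {
                    "Version": "2012-10-17",
                    "Statement": [
                        {
                            # Reject any request that does not use TLS.
                            "Sid": "DenyInsecureTransport",
                            "Effect": "Deny",
                            "Principal": "*",
                            "Action": "s3:*",
                            "Resource": [
                                {"Fn::GetAtt": [logical_id, "Arn"]},
                                {"Fn::Sub": f"${{{logical_id}.Arn}}/*"},
                            ],
                            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
                        }
                    ],
                },
            },
        },
    }


# The call site is one line; the low-level result is considerably longer.
resources = private_bucket("Uploads")
for name, resource in resources.items():
    print(name, resource["Type"])
```

The call site is easy to read, but you have to inspect the generated output (or test it) to know exactly what will be provisioned.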
For any non-trivial amount of infrastructure, it will become difficult, if not impossible, to hold a mental image of all the implications of a change. We need to test our infrastructure “code”, regardless of whether it is actual code or some configuration format. But this is an area where I think many of us may still be pretty bad.
For CloudFormation and Terraform, this pretty much requires dedicated software tools for these particular formats. AWS CDK and other tools that use actual programming languages can leverage the testing tools that already exist for application software development. This does not eliminate the need for specific tools for infrastructure code testing, but there are more tools to consider in that regard.
AWS CDK comes with some tools for CDK testing, although the support varies with the programming languages supported. Again, performing testing of the infrastructure with AWS CDK is in line with a team working on both infrastructure and application code.
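As a rough illustration of reusing ordinary test tooling, the sketch below runs Python's standard unittest module against a CloudFormation-style template. The function build_template is a hypothetical stand-in for whatever produces your template; it is not part of AWS CDK, and the CDK's own testing helpers are not shown here.

```python
import unittest


def build_template(environment: str) -> dict:
    """Hypothetical stand-in for synthesizing a stack into a template dict."""
    return {
        "Resources": {
            "Database": {
                "Type": "AWS::RDS::DBInstance",
                "Properties": {
                    "StorageEncrypted": True,
                    "DeletionProtection": environment == "prod",
                },
            }
        }
    }


def database_properties(environment: str) -> dict:
    """Convenience accessor used by the tests below."""
    return build_template(environment)["Resources"]["Database"]["Properties"]


class DatabaseTemplateTest(unittest.TestCase):
    def test_storage_is_always_encrypted(self):
        for environment in ("dev", "prod"):
            self.assertTrue(database_properties(environment)["StorageEncrypted"])

    def test_only_production_has_deletion_protection(self):
        self.assertTrue(database_properties("prod")["DeletionProtection"])
        self.assertFalse(database_properties("dev")["DeletionProtection"])


# Run the tests explicitly so the snippet also works when pasted into a REPL.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(DatabaseTemplateTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Tests like these encode the low-level expectations (encryption, deletion protection) that are hard to see from high-level code alone.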
What if my organization does not consist of DevOps-oriented two-pizza teams?
If your organization does not fit in with the ideal model and philosophy around AWS CDK, should you still consider it?
If your infrastructure is (or will be) mainly AWS cloud infrastructure and you at least strive for a DevOps-type way of working, then yes. But you need to be careful about how you adopt it - there will likely be more work if separate teams are managing the infrastructure and the application code.
Any infrastructure-as-code provisioning tool may be a bit of a struggle in an organization that is very far from a DevOps culture. In that context, a tool that uses actual programming language code may even increase the opportunities to shoot yourself in the foot.
If the people that work with the infrastructure are not comfortable or familiar with writing software on a daily/weekly basis, another option will likely work out better.
AWS CDK and AWS CloudFormation vs Terraform and Pulumi
For provisioning infrastructure in AWS, there are other tools besides those provided by AWS themselves. These include Terraform and Pulumi. Neither is tied to a particular public cloud provider, or even to public cloud providers at all. Any kind of Software-as-a-Service (SaaS) provider that exposes its services or infrastructure via programming interfaces can, in theory, be provisioned by these tools. Terraform has a long list of providers, and Pulumi can use Terraform providers in addition to its own.
If you do not have a significant majority of your (cloud/SaaS) infrastructure on AWS, then these other options may be worth considering if it is important to automate the non-AWS infrastructure provisioning. You can technically build custom resources with AWS CDK/CloudFormation to handle that, and AWS provides for example the CloudFormation CLI to facilitate that kind of development. But more cloud-agnostic tools like Terraform and Pulumi may already have support available.
Another key difference is that the provisioning engine and the state of the deployment are stored within AWS by CloudFormation when you use AWS CDK and/or AWS CloudFormation. This is all taken care of there. For Terraform and Pulumi you either use some storage that you have set up yourself for the state, or you can use their (paid) services (Terraform Cloud/Enterprise, Pulumi for Teams). Again, this will be an issue that becomes important if you want to manage a significant amount of non-AWS infrastructure as code.
While Terraform is similar to AWS CloudFormation in that it largely uses a configuration file format (HCL), there is also ongoing work to develop a CDK for Terraform, which takes the very same idea as AWS CDK - even some of the underlying code is the same. It is still in the alpha stage at the time of writing, though, and it is too early to consider it for production use.
AWS has some good material about AWS CDK, which includes:
- Working backwards: The story behind the AWS Cloud Development Kit
- Best practices for developing Cloud applications with AWS CDK
- AWS CDK documentation
- CDK Workshop - online training/workshop to get familiar with AWS CDK
Also check out community-based resources:
- CDK Day - 2021 is the second time for the one-day event for everything CDK
- Awesome CDK
- cdk.dev - Community-driven website, newsletter and Slack for all things CDK
In this text, I have touched on a few areas where AWS CDK may be an appropriate choice for an infrastructure provisioning tool. This largely boils down to organization and way of working, and less to specific features of AWS CDK. Only when there seems to be a reasonably good fit in terms of (current or near-future) organization and philosophy should more in-depth work be done.
I love using AWS CDK and have had projects where it fits very well. But there have also been cases where, in hindsight, other options would have been better - this has always largely been a question of organization and philosophy.