In a previous post, we looked at how to use CloudFormation Macros to provide a simpler DSL around CloudFormation or to provide company-wide defaults around particular resources.
However, sometimes you need more than what CloudFormation currently offers. Perhaps CloudFormation doesn't have support for a resource that you need. Or maybe you want to use a third-party resource, like Auth0 or Algolia, in your application.
In this post, we'll learn about CloudFormation custom resources. Custom resources greatly expand what you can do with CloudFormation as you can run custom logic as part of your CloudFormation deployment.
And with custom logic, you can do anything you want. *twirls moustache*
But custom resources can be complicated, and using them incorrectly can wreak havoc on your CloudFormation stack. In this post, we'll learn when, why, and how to use custom resources.
This post covers:
But background information will only take you so far. There's no substitute for hands-on learning, so this post also includes two walkthroughs of creating custom resources:
Adding third-party resources to CloudFormation by provisioning a Github webhook
Extending AWS offerings and handling slow resources by provisioning an ACM certificate
This is a heavy one, so let's get started!
What are CloudFormation custom resources and when should I use them?
CloudFormation custom resources are bits of logic to run during the provisioning phase of your CloudFormation template. They allow you to extend CloudFormation to do things it could not normally do.
CloudFormation custom resources work by firing a webhook while processing your CloudFormation template. Your handler will receive this webhook and run any logic you want.
Because you are in charge of writing the logic in your custom resource handler, you have significant power in what you can do with CloudFormation custom resources.
Generally, CloudFormation custom resource behavior falls into one of the following buckets:
Provisioning AWS resources that are not supported by CloudFormation.
While CloudFormation coverage is pretty good, there are still gaps in support for resources. You can use custom resources to add in support for missing resources, allowing you to maintain infrastructure-as-code even where AWS doesn't allow it.
A few examples in this bucket are:
Tip: If you want to see other AWS resources that are unsupported in CloudFormation, check the responses to this Twitter thread. I'm particularly grateful for Ben Bridt's CloudFormation Gaps repository on Github.
Provisioning non-AWS resources with CloudFormation.
The second reason to use custom resources is to add infrastructure-as-code properties to non-AWS resources.
AWS is the Wal-Mart of the cloud, offering you a wide selection of resources in a single place. However, there are times when you need to use non-AWS solutions in your architecture. This is usually for one of two reasons.
First, AWS may not offer a solution that you need. Examples here include an incident response platform, such as PagerDuty, or certain types of database offerings, such as a time-series database (while Timestream is still in preview).
Second, AWS may offer a solution in a category but perhaps a third-party solution better fits your needs. Examples here include:
Auth0 over AWS Cognito for identity;
Algolia instead of AWS's hosted Elasticsearch for search;
GitHub vs. AWS CodeCommit for source code repositories.
Using custom resources in this way nudges CloudFormation a little closer to Terraform. Like Terraform, you can provision resources across providers. However, you still retain the service-based nature of CloudFormation.
Performing provisioning steps not related to infrastructure.
A third category is to perform provisioning steps that aren't strictly infrastructure-related.
The core example here is running relational database initialization or migration scripts. When deploying a new version of your application, you want to ensure that your database tables are created or that any recent migrations have been applied. This is a one-time operation on each deployment, but there's not a native AWS::Database::Script resource in CloudFormation. With custom resources, you could write a script in a Lambda function that is triggered after your RDS database is configured to execute any migration scripts needed.
A second option in this category could be to bust a cache on the deployment of new code.
Any. Thing. You. Want.
The beauty (and danger) of custom resources is that you control the code, so you can do anything you please.
Want to record a successful deployment in your deployment management system? You can do it.
Want to use the ApproveAPI to require manual approval before starting a deploy? No problem.
One of my favorite examples of innovative custom resource usage is from Chase Douglas at Stackery where he mentions running a smoke test in a custom resource as the very last step in a deploy. If the smoke test fails, it rolls back the entire deployment.
These use cases are neat but remember that with great power comes great responsibility. Think carefully about how far you want to extend CloudFormation's capabilities.
How to use CloudFormation custom resources
Now that we know what custom resources are and when you might use them, let's see how to use custom resources.
We'll break this section into two parts. First, we'll see the overall architecture of custom resources and how they interact with other CloudFormation stacks. Then we'll do a deeper dive into the mechanics of writing a custom resource handler.
CloudFormation custom resource architecture
To use a CloudFormation custom resource, you'll need to do three things:
Write the logic for your custom resource;
Make your custom resource logic available by deploying to an AWS Lambda function or by subscribing to an SNS topic.
Use the custom resource in your CloudFormation template that references the Lambda function or SNS topic.
To use a custom resource in a CloudFormation stack, you need to create a resource of either type AWS::CloudFormation::CustomResource or Custom::<YourName>. I prefer using the latter as it helps to identify the type of custom resource you're using.
Here's an example use of a custom resource:
Resources:
  GithubWebhook:
    Type: "Custom::GithubWebhook"
    Version: "1.0"
    Properties:
      ServiceToken: arn:aws:lambda:us-east-1:123456789012:function:GithubCustomResource
      Repo: alexdebrie/test-repo
      Events: "push, pull_request"
      Endpoint: https://webhook.api.com
Notice that the resource type is Custom::GithubWebhook, which is not a resource type provided natively by CloudFormation.
As inputs to your custom resource, you must provide a ServiceToken property. The ServiceToken is an ARN of either an AWS Lambda function or an SNS Topic that will receive your custom resource request. You may also include additional properties to send into your custom resource for configuration.
Writing a custom resource handler
Most of the tricky bits around custom resources are in actually writing the handler. There are a few "gotchas" which can leave your CloudFormation stack in a bad state.
In this section, we'll cover the custom resource programming model, the three event types for custom resources, and the inputs and outputs to your invocations.
Custom resource programming model
Custom resources are implemented in an asynchronous, callback-style programming model. It's important to understand what that means for your custom resource and its failure modes.
When CloudFormation invokes your custom resource, it won't hang around waiting for a response. Instead, the payload sent to your custom resource includes a presigned S3 URL. When your custom resource is done processing, it should use the presigned S3 URL to upload a JSON object containing the output of the custom resource.
This asynchronous model makes it easier and faster for CloudFormation to provision many resources in a stack in parallel, but it also adds complexity. Rather than returning a simple response in your Lambda function, you need to save your output to S3. Forgetting to do so or saving the data incorrectly will cause CloudFormation to hang until it times out.
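To make the callback concrete, here is a minimal sketch of that upload using only the Python standard library. The function names build_callback_request and send_callback are hypothetical and exist only for illustration. The empty Content-Type header is an assumption: urllib would otherwise add a form-encoded default, which may not match what the presigned URL was signed for.

```python
import json
import urllib.request


def build_callback_request(response_url, body):
    """Build (but do not send) the PUT request for the presigned S3 URL."""
    payload = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        response_url,
        data=payload,
        method="PUT",
        # An explicit empty Content-Type stops urllib from adding its
        # form-encoded default to the upload.
        headers={"Content-Type": ""},
    )


def send_callback(response_url, body, timeout=10):
    """Upload the JSON result so CloudFormation can move on."""
    request = build_callback_request(response_url, body)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

In a real handler you would call send_callback(event["ResponseURL"], body) as the very last step, whether the work succeeded or failed.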
Event types
In writing a custom resource handler, you'll need to handle three different actions:
Create: A Create event is invoked whenever a resource is being provisioned for the first time, either because a new stack is being deployed or because it was added to an existing stack;
Update: An Update event is invoked when the custom resource itself has a property that has changed as part of a CloudFormation deploy.
Delete: A Delete event is invoked when the custom resource is being deleted, either because it was removed from the template as part of a deploy or because the entire stack is being removed.
Your handler function must be able to handle each of these event types and know how to return a proper response to avoid hanging your deployment.
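One simple way to cover all three request types is a small routing function keyed on RequestType. This is only a sketch: route_request and the placeholder handlers are hypothetical names, and a real handler would do actual provisioning work and then report the result back to CloudFormation.

```python
def route_request(event, handlers):
    """Call the handler registered for the event's RequestType."""
    request_type = event.get("RequestType")
    if request_type not in ("Create", "Update", "Delete"):
        raise ValueError(f"Unexpected RequestType: {request_type!r}")
    return handlers[request_type](event)


# Placeholder handlers; each returns a label so the routing is visible.
handlers = {
    "Create": lambda event: "created",
    "Update": lambda event: "updated",
    "Delete": lambda event: "deleted",
}

print(route_request({"RequestType": "Update"}, handlers))
```

Rejecting unknown request types loudly is a deliberate choice here: a silent no-op would leave the stack waiting for a response that never arrives.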
Custom resources inputs and outputs
When your custom resource is invoked, it will include a payload similar to the following:
{
  "RequestType": "Create",
  "RequestId": "9db53695-b0a0-47d6-908a-ea2d8a3ab5d7",
  "ResponseURL": "https://...",
  "ResourceType": "Custom::GithubWebhook",
  "LogicalResourceId": "GithubWebhook",
  "StackId": "arn:aws:cloudformation:us-east-1:955617200811:stack/github-webhook-test-3/1351a360-4fd0-11e9-b201-0a20b68b404c",
  "ResourceProperties": {
    "Repo": "alexdebrie/test-repo",
    "Events": ["push", "pull_request"],
    "Endpoint": "https://webhook.api.com"
  }
}
A few notable points:
The request type -- Create, Update, or Delete -- is shown in the RequestType parameter.
The ResponseURL parameter includes the presigned S3 URL for you to send your output.
The ResourceProperties parameter includes all of the properties passed into your resource in the template.
If the request type is Update or Delete, the payload will also include a PhysicalResourceId parameter. This is an identifier for the resource you create and is particularly important in Update scenarios. Check out the Tips and Tricks section below for more information on the PhysicalResourceId.
For the output that you write to the presigned S3 URL, it should look similar to the following:
{
  "Status": "SUCCESS",
  "RequestId": "9db53695-b0a0-47d6-908a-ea2d8a3ab5d7",
  "LogicalResourceId": "GithubWebhook",
  "StackId": "arn:aws:cloudformation:us-east-1:955617200811:stack/github-webhook-test-3/1351a360-4fd0-11e9-b201-0a20b68b404c",
  "PhysicalResourceId": "GitHubWebhookZZ97363670ZZalexdebrie/alexdebrie.com",
  "Data": {
    "Id": "97363670"
  }
}
Two important notes here:
The Status property indicates whether the custom resource succeeded or failed. You should provide SUCCESS for a successful run or FAILED for an unsuccessful run. If the run was unsuccessful, you should also include a reason in the Reason property.
The Data property allows you to return outputs that can be referenced by other resources using the Fn::GetAtt function in CloudFormation.
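Assembling that response by hand is mostly a matter of echoing a few fields from the request. The sketch below shows one way to do it. build_response is a hypothetical helper, and the fallback Reason text is an assumption added so that a FAILED response never goes out without an explanation.

```python
def build_response(event, status, physical_resource_id, data=None, reason=None):
    """Assemble the JSON body to upload to the presigned S3 URL."""
    if status not in ("SUCCESS", "FAILED"):
        raise ValueError("status must be SUCCESS or FAILED")
    body = {
        "Status": status,
        # These three values are copied straight from the request.
        "RequestId": event["RequestId"],
        "LogicalResourceId": event["LogicalResourceId"],
        "StackId": event["StackId"],
        "PhysicalResourceId": physical_resource_id,
        # Keys in Data become attributes readable with Fn::GetAtt.
        "Data": data or {},
    }
    if status == "FAILED":
        body["Reason"] = reason or "See the handler's logs for details"
    return body
```

With the output above, another resource in the same template could read the webhook Id through the Id attribute of the GithubWebhook resource.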
There's a lot to take in with the custom resources, so check out the two examples below for a more complete walkthrough.
Tips and Tricks for writing Custom Resources
Below are a few key tips for writing resilient custom resources:
Catch every exception to prevent hanging CloudFormation stacks
Remember that custom resources use an asynchronous, callback-driven model. If your custom resource handler has an uncaught error that prevents it from writing a result to S3, your CloudFormation stack will remain in the CREATE_IN_PROGRESS stage until it times out.
Use a helper library
Managing a custom resource can be tricky, both due to the exception problem noted above and because you need to write your data to S3 using a presigned S3 URL.
Fortunately, there are a number of libraries that ease the burden of writing custom resources. A few of them are:
custom-resource-helper: a Python-based library provided by AWS that uses decorators;
cfn-wrapper-python: another Python-based library that was the inspiration for custom-resource-helper. Written by Ryan Scott Brown, an all-around AWS wizard.
cfn-lambda: For our Node.js friends, cfn-lambda provides an easy way to build custom resources with JavaScript.
cfn-custom-resource: Another Python-based library, this one uses classes over decorators. Created by Ben Kehoe, robot hacker and the Godfather of serverless architecture.
While all of these libraries are solid, the two examples below use the custom-resource-helper library.
Understand how the Physical Resource Id works
After creating or updating your custom resource, you'll need to return a PhysicalResourceId property. This property is important, as it can be used to identify a created resource apart from its input properties.
In the Github webhook example below, we use the Physical Resource Id to encode the Id of the GitHub webhook. You cannot look up a GitHub webhook without the Id, so it would be difficult to perform an update operation on an existing webhook without that Id.
Encoding the webhook Id into the Physical Resource Id allows us to identify and update an existing webhook when its input properties change.
Use AWS Lambda for your handler
While you can use an SNS topic as the ingest mechanism for custom resource requests, I recommend using Lambda functions unless you have a strong need otherwise.
A custom resource is basically a webhook, and webhooks are one of the core use cases for AWS Lambda. You won't have any management burden associated with it, and your custom resource is essentially free given Lambda's pricing structure.
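If you do end up writing a handler without a helper library, the "catch every exception" tip above roughly translates to wrapping all of your work in a single try/except and reporting an outcome either way. Here is a minimal sketch. The names run_safely, work, and send are hypothetical; send stands in for whatever uploads the result to the presigned S3 URL, and the "failed-to-create" placeholder id is an assumption.

```python
import traceback


def run_safely(event, work, send):
    """Run work(event) and always report an outcome through send."""
    try:
        physical_id, data = work(event)
        outcome = {
            "Status": "SUCCESS",
            "PhysicalResourceId": physical_id,
            "Data": data,
        }
    except Exception as exc:  # deliberately broad: every failure must be reported
        traceback.print_exc()
        outcome = {
            "Status": "FAILED",
            # Reuse the existing id when there is one (Update and Delete requests).
            "PhysicalResourceId": event.get("PhysicalResourceId", "failed-to-create"),
            "Reason": f"{type(exc).__name__}: {exc}",
        }
    send(event, outcome)
    return outcome
```

The important property is that send runs on both paths, so CloudFormation hears back promptly instead of waiting for a timeout.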
Walkthrough: Provisioning a Github Webhook with CloudFormation
We've done a lot of background on custom resources, but there's no substitute for actually walking through some examples.
In this first example, we'll use CloudFormation to provision a Github webhook. This falls into the second use case we discussed for when to use custom resources -- Provisioning non-AWS resources with CloudFormation. A custom resource gives us the same infrastructure-as-code mechanics that we love even with non-AWS resources.
Custom resource logic
We will use the custom-resource-helper library to assist in building our logic. It helps with a few things:
Capturing errors and handling failures gracefully;
Writing output to the S3 presigned URL;
Logging output for easier debugging;
Easy polling for long-running provisioning tasks.
A skeleton file for starting with the custom-resource-helper is as follows:
import logging

from crhelper import CfnResource

# The skeleton logs through a module-level logger.
logger = logging.getLogger(__name__)

helper = CfnResource(
    json_logging=False,
    log_level='DEBUG',
    boto_level='CRITICAL'
)


def handler(event, context):
    # Lambda entrypoint: hand everything to the helper for routing.
    helper(event, context)


@helper.create
def create(event, context):
    logger.info("Got Create")
    # Items stored in helper.Data will be saved
    # as outputs in your resource in CloudFormation
    helper.Data.update({"test": "testdata"})
    return "MyResourceId"


@helper.update
def update(event, context):
    logger.info("Got Update")
    return "MyNewResourceId"


@helper.delete
def delete(event, context):
    logger.info("Got Delete")
You'll create a CfnResource object with some options. In your Lambda's entrypoint handler() function, you pass the event and context to the CfnResource for handling all control flow.
Then, for each of the Create, Update, and Delete request types, you make a function wrapped with a decorator to handle the request. The custom-resource-helper library will call the proper function depending on the request type.
Posting the full logic here would get a little verbose, so I'll spare your eyeballs. You can see the handler logic here, and it's fairly basic -- around 120 lines of code.
I do want to call out one aspect. A Github webhook is tied to a particular repo and is identified by a unique Id provided by Github. Thus, there's a little bit of state involved with maintaining this resource to ensure proper updates and deletes.
To handle this state, I used the PhysicalResourceId property that is returned by the custom resource to our CloudFormation template. This will be passed in for future updates and deletes, so I can tell if the resource has fundamentally changed (e.g. by changing the repository to which it applies). I can also use it to store the Id for updating or deleting a particular webhook.
For now, I'm just encoding the data as GithubWebhookZZ{Id}ZZ{Repo}. I use ZZ as a cheap separator, partly because I initially misread the instructions on what characters were allowed in a Physical Resource Id. 😁 A more standard approach might use other characters as separators (e.g. $, _, or -).
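As a rough illustration of that encoding scheme (a sketch, not the code from the linked handler), the helpers below build and parse an Id in the GithubWebhookZZ{Id}ZZ{Repo} format and use it to spot an update that points at a different repository. All three function names are hypothetical.

```python
SEPARATOR = "ZZ"
PREFIX = "GithubWebhook"


def encode_physical_id(webhook_id, repo):
    """Pack the webhook Id and repo into a single Physical Resource Id."""
    return SEPARATOR.join([PREFIX, str(webhook_id), repo])


def decode_physical_id(physical_id):
    """Recover (webhook_id, repo) from an encoded Physical Resource Id."""
    # Split at most twice so a repo name that contains "ZZ" stays intact.
    parts = physical_id.split(SEPARATOR, 2)
    if len(parts) != 3 or parts[0] != PREFIX:
        raise ValueError(f"Not a webhook Physical Resource Id: {physical_id!r}")
    return parts[1], parts[2]


def targets_different_repo(event):
    """True when an Update or Delete event names a repo other than the stored one."""
    _, current_repo = decode_physical_id(event["PhysicalResourceId"])
    return current_repo != event["ResourceProperties"]["Repo"]


physical_id = encode_physical_id(97363670, "alexdebrie/test-repo")
print(physical_id)
print(decode_physical_id(physical_id))
```

Limiting the split to two separators is what keeps this cheap scheme workable even when the repository name itself happens to contain the separator.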
Deploying the custom resource
To deploy the custom resource, I use the Serverless Framework. My serverless.yml file looks as follows:
service: gh-custom-resource

provider:
  name: aws
  runtime: python3.7
  stage: dev
  region: us-east-1
  environment:
    GITHUB_TOKEN: "" # <-- Add your token here!

functions:
  githubWebhook:
    handler: handler.handler

resources:
  Outputs:
    GitHubWebhookFunction:
      Description: "ARN for Github Webhook custom resource function"
      Value: !GetAtt GithubWebhookLambdaFunction.Arn
      Export:
        Name: "GithubWebhookFunction"

plugins:
  - serverless-python-requirements
It deploys a single function, then registers the ARN of that function as a CloudFormation export so that I can import the value into another CloudFormation stack in my account.
Note that you'll need to provision your own Github token before deploying.
Using the custom resource in another template
Once the custom resource is deployed and exported, we can easily use it in another template.
Here's an example CloudFormation template for using our custom webhook:
AWSTemplateFormatVersion: "2010-09-09"
Description: Example template for using the Github Webhook custom resource

Parameters:
  REPO:
    Type: String
    Description: The Github repository for which the webhook is configured
  EVENTS:
    Type: CommaDelimitedList
    Description: Events for which you want to subscribe
    Default: "push, pull_request"
  ENDPOINT:
    Type: String
    Description: The endpoint to which events will be sent

Resources:
  GithubWebhook:
    Type: "Custom::GithubWebhook"
    Version: "1.0"
    Properties:
      ServiceToken: !ImportValue GithubWebhookFunction
      Repo: !Ref REPO
      Events: !Ref EVENTS
      Endpoint: !Ref ENDPOINT
Note that we are provisioning a single resource in the Resources section. The ServiceToken is the only required property, and we use the ImportValue CloudFormation function to use the exported value from our other stack.
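One detail worth handling on the receiving side: the Events property shows up in two shapes in this post, as a comma-separated string in the first template and as a list in the sample payload. A handler that wants to accept either could normalize the value before using it. This is a minimal sketch, and normalize_events is a hypothetical helper rather than part of the linked handler.

```python
def normalize_events(raw_events):
    """Return a clean list of event names from a string or a list."""
    if isinstance(raw_events, str):
        raw_events = raw_events.split(",")
    # Trim stray whitespace and drop empty entries such as a trailing comma.
    return [item.strip() for item in raw_events if item.strip()]


print(normalize_events("push, pull_request"))
print(normalize_events(["push", "pull_request"]))
```

Both calls produce the same two-item list, so the rest of the handler only ever has to deal with one shape.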
We can deploy this template using the following command:
aws cloudformation deploy \
--template-file template.yaml \
--stack-name github-webhook-test \
--parameter-overrides REPO=alexdebrie/alexdebrie.com ENDPOINT=http://requestbin.fullcontact.com/z0azobz0
Make sure you paste in your own unique values for REPO and ENDPOINT in the parameter overrides.
After a few minutes, you should see the webhook configured in your repository:
Boom! 💥 Github webhooks infrastructure-as-code!
Walkthrough: Provisioning and Validating an ACM Certificate
Hat tip to Richard Boyd for his assistance with this example. Check out his blog here.
One example isn't quite enough, so let's do another. In this second example, we're going to use a custom resource to provision and validate an SSL certificate with AWS Certificate Manager.
This use case fits more into either the first or third bucket mentioned above. This could be considered provisioning an AWS resource for which there is not CloudFormation support (first bucket), but there is CloudFormation support for creating an ACM Certificate. There's just not support for validating that certificate. That might put it more in the third bucket -- performing provisioning steps not related to infrastructure.
Tomato, to-mah-to -- the important thing is that we can automate something that was previously manual.
In this example, we also see how the custom-resource-helper helps us with long-running provisioning steps that may rely on waiting for other systems to complete a task.
Let's get started.
Custom resource logic -- polling for slower resources
I'm only going to highlight the important parts of the logic here. Feel free to check out all the custom resource code here.
In our create() function for our custom resource, we'll be doing the following things:
Requesting an ACM certificate and specifying DNS validation;
Creating the DNS record in Route53 to validate our certificate;
Waiting for the certificate to be marked verified in ACM.
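The first two steps map onto a handful of AWS API calls. As a rough sketch (not the code from the linked repository): requesting the certificate means calling ACM's request_certificate with ValidationMethod set to DNS, after which ACM reports the CNAME record it expects in each domain's ResourceRecord. The hypothetical helper below turns such a record into the change batch that Route53's change_resource_record_sets accepts. The 300-second TTL and the sample record values are assumptions for illustration.

```python
def build_validation_change(resource_record, ttl=300):
    """Build a Route53 change batch from an ACM validation record.

    resource_record has the shape ACM returns under
    DomainValidationOptions[n]["ResourceRecord"].
    """
    return {
        "Comment": "ACM DNS validation record",
        "Changes": [
            {
                # UPSERT keeps the call safe to repeat if the record already exists.
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": resource_record["Name"],
                    "Type": resource_record["Type"],
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": resource_record["Value"]}],
                },
            }
        ],
    }


# Made-up sample values in the shape ACM uses.
sample_record = {
    "Name": "_abc123.api.my-app.com.",
    "Type": "CNAME",
    "Value": "_def456.acm-validations.aws.",
}
change_batch = build_validation_change(sample_record)
print(change_batch["Changes"][0]["ResourceRecordSet"]["Name"])
```

With a boto3 Route53 client, the result would be passed as the ChangeBatch argument alongside the HostedZoneId of your domain.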
Notably, there's a potentially large gap between steps 2 and 3. ACM states it can take up to 30 minutes for the DNS record to propagate and for the certificate to be verified.
With Lambda, this is a problem. The max duration for Lambda is only 15 minutes. 😱 How can we handle this?
Fortunately, the custom-resource-helper library makes it easy. In addition to the normal create() function, you can add an optional poll_create() function. The syntax is as follows:
@helper.create
def create(event, context):
    # All your normal create logic here

    # Add the certificate arn to the
    # Data object on the helper.
    helper.Data.update({"Arn": cert_arn})
    return


# In the poll_create function, check
# to see if the certificate is validated.
@helper.poll_create
def poll_create(event, context):
    cert_arn = event['CrHelperData']['Arn']
    acm = _client(event, "acm")
    validated = _await_validation(cert_arn, acm)
    if validated:
        return True
    return False
I have both a create() function and a poll_create() function. The create() function will be run first when I get a Create request for my custom resource. In addition to running its logic, it will also create a CloudWatch Scheduled Event that will re-trigger my function in two minutes.
That re-trigger will run the poll_create() function. If I return a truthy value from that function, it will tear down the CloudWatch Scheduled Event and write the custom resource output to the presigned S3 URL.
If I return a falsey value, the function will be retriggered in 2 minutes to try again.
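The check inside poll_create() can be quite small. The sketch below is one way to write it and is not the _await_validation helper from the linked repository: it asks ACM for the certificate and reports whether its status is ISSUED. The function name is_certificate_issued and the FakeAcm stand-in are hypothetical; the fake client only exists so the example runs without AWS credentials.

```python
def is_certificate_issued(acm_client, cert_arn):
    """Return True once ACM reports the certificate as issued."""
    response = acm_client.describe_certificate(CertificateArn=cert_arn)
    return response["Certificate"]["Status"] == "ISSUED"


class FakeAcm:
    """Stand-in for a boto3 ACM client so the sketch runs without AWS access."""

    def __init__(self, status):
        self.status = status

    def describe_certificate(self, CertificateArn):
        return {"Certificate": {"CertificateArn": CertificateArn, "Status": self.status}}


arn = "arn:aws:acm:us-east-1:123456789012:certificate/example"
print(is_certificate_issued(FakeAcm("PENDING_VALIDATION"), arn))
print(is_certificate_issued(FakeAcm("ISSUED"), arn))
```

Because the function makes a single call and returns right away, each poll stays short and the waiting happens between invocations rather than inside them.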
Let's walk through an example.
Imagine that validating my certificate takes 5 minutes. The flow would look as follows:
The initial request would come in and run the create() logic. This makes the request to create the certificate and add the DNS record. Additionally, the custom-resource-helper library configures a CloudWatch Scheduled Event to trigger this function in two minutes with the same input. After all this happens, the function finishes while the CloudFormation stack is still awaiting a response.
Two minutes later, the function is triggered again. This time it runs the poll_create() function. The certificate still isn't validated, so the function completes without writing a result to S3.
Two minutes later, the function is triggered a third time. It runs the poll_create() logic again but the certificate still isn't ready.
Two minutes later, the function is triggered a fourth time. It runs the poll_create() logic again. This time, the certificate is ready. The custom-resource-helper tears down the CloudWatch Scheduled Event so that it won't trigger again, then it writes the custom resource's output to the presigned S3 URL.
This polling logic is extremely helpful. You won't be paying for idle compute in your Lambda function, and you don't need to worry about hitting the Lambda timeout. Hurrah!
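If it helps to see the timeline as data, here is a small simulation of the flow described above. It is a toy model under stated assumptions (a fixed two-minute interval and a certificate that becomes ready at a known minute), and simulate_polling is a hypothetical name; it does not call AWS or the helper library.

```python
def simulate_polling(validation_minutes, poll_interval_minutes=2):
    """Return (minute, function, result) tuples for each invocation."""
    if poll_interval_minutes <= 0:
        raise ValueError("poll_interval_minutes must be positive")
    # The first invocation runs create() and schedules the re-trigger.
    timeline = [(0, "create", "started")]
    minute = poll_interval_minutes
    while True:
        done = minute >= validation_minutes
        timeline.append((minute, "poll_create", "done" if done else "waiting"))
        if done:
            return timeline
        minute += poll_interval_minutes


for step in simulate_polling(validation_minutes=5):
    print(step)
```

For a five-minute validation this produces four invocations, at minutes 0, 2, 4, and 6, with only the last one reporting that the certificate is ready.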
Deploying and usage
There are instructions in the Github repo for deploying the custom resource. It's using AWS SAM to deploy the stack, but the principles are similar -- deploy a Lambda function and register the function's ARN as an Export.
Once your function is deployed and registered, you can use the following stack to test it out:
AWSTemplateFormatVersion: "2010-09-09"
Description: Example template for using the ACM custom resource

Parameters:
  DOMAIN:
    Type: String
    Description: Domain used for certificate
  RECORD:
    Type: String
    Description: Record used for certificate

Resources:
  ACMCertificate:
    Type: "Custom::ACMCertificate"
    Version: "1.0"
    Properties:
      ServiceToken: !ImportValue ACMRegisterFunction
      Region: !Ref "AWS::Region"
      HostedZoneName: !Ref DOMAIN
      RecordName: !Ref RECORD
It takes DOMAIN and RECORD parameters to indicate the certificate you want to provision.
You can deploy the template using the following command:
aws cloudformation deploy \
--template-file template.yaml \
--stack-name acm-register-test \
--parameter-overrides DOMAIN=<DOMAIN> RECORD=<RECORD>
Make sure to use your own values for DOMAIN and RECORD.
For example, if you wanted to create a certificate for api.my-app.com, you would use:
aws cloudformation deploy \
--template-file template.yaml \
--stack-name acm-register-test \
--parameter-overrides DOMAIN=my-app.com RECORD=api
Your stack will likely take about 5-10 minutes to complete. After that, you should see a verified ACM certificate in the AWS console!
Conclusion
CloudFormation custom resources are awesome for filling gaps in the CloudFormation ecosystem or for bringing third-party resources under the CloudFormation umbrella.
In this post, we learned what custom resources are and when you would want to use them. Then, we learned about the workflow for creating and using CloudFormation custom resources, as well as some tips and tricks. Finally, we walked through two examples of custom resources.