It’s that time of year again to reflect and think about what’s coming up in the future.
This year was an interesting year, both personally and professionally.
Personal
We’re approaching our fourth anniversary of moving to Yakushima (it’ll be exactly four years this coming February), and we’ve been able to make more friends and acquaintances on the island. I’ve also taken up running as a hobby and have gotten involved in the local running community as well.
I’ve written about how to host a single page application (SPA) on AWS using CloudFront and S3 before, using the CloudFront “rewrite not found errors as a 200 response with index.html” trick.
Recently, working on a few serverless apps, I’ve realized that this trick, while quick, isn’t perfect. The specific case where it broke down was when the API is configured as a behavior on CloudFront (I usually scope the API to /api on the same domain as the frontend, so CORS and OPTIONS requests aren’t necessary). If the API returned a 404 Not Found response, CloudFront would rewrite it to 200 OK index.html, and the front-end application would get confused. Unfortunately, CloudFront doesn’t support customized error responses per behavior, so the only way to fix this was to use Lambda@Edge instead.
There are a few ways to run WordPress “serverless” on AWS. I’m going to talk about running WordPress on Lambda for this article. If you’re interested in how you can run WordPress serverless-ly on Fargate, I’m working on a post about that too.
Throughout these past 4 years since AWS ECS became generally available, I’ve had the opportunity to manage 4 major ECS cluster deployments.
Across these deployments, I’ve built up knowledge and tools to help manage them, make them safer, more reliable, and cheaper to run. This article has a bunch of tips and tricks I’ve learned along the way.
Note that most of these tips are rendered useless if you use Fargate! I usually use Fargate these days, but there are still valid reasons for managing your own cluster.
The bots should announce, “I’m not a person, or if I am, I’m not allowed to act like one.”
Or, if there’s no room or time for that sentence, perhaps a simple bot at the top of the conversation. That way, we can save our human emotions for the humans who will appreciate them.
I’ve been working on setting up autoscaling settings for ECS services recently, and here are a couple notes from managing auto-scaling for ECS services using Terraform.
I’m currently working on migrating a Rails application to ECS at work. The current system uses a heavily customized Capistrano setup that’s showing its signs, especially when deploying to more than 10 instances at once.
While patiently waiting for EKS, I decided to use ECS over manage my own Kubernetes cluster on AWS using something like kops. I was initially planning on using Lambda to create the required task definitions and update ECS services, but native CodePipeline deploy support for ECS was announced right before I started planning the project, which greatly simplified the deploy step.
A few years ago when I was doing client work, we would regularly host clients’ sites and apps for them. During this time, I was responsible for both development and keeping them up and running as much as possible. Most of the money being in new development, it was difficult to assign priority to improving the operations of existing applications. In this period, I wanted an “operations person” to teach me how to make new applications that would need minimal operations support from the beginning. Failing this, I decided to become “the operations person” myself.
We use fluentd to process and route log events from our various applications. It’s simple, safe, and flexible. With at-least-once delivery by default, log events are buffered at every step before they’re sent off to the various storage backends. However, there are some caveats with using Elasticsearch as a backend.
Currently, our setup looks something like this:
The general flow of data is from the application, to the fluentd aggregators, then to the backends – mainly Elasticsearch and S3. If a log event warrants a notification, it’s published to a SNS topic, which in turn triggers a Lambda function that sends the notification to Slack.
Don’t forget to update the KMS Key Policy, too. I spent a bit of time trying to figure out why it wasn’t working, until CloudTrail helpfully told me that the kms:GenerateDataKey permission was also required. Turn it on today, even if you don’t need the auditing. It’s an excellent permissions debugging tool.