Categories
AWS

Rails on AWS: Do you need nginx between Puma and ALB?

When I set up Rails on AWS, I usually use the following pattern:

(CloudFront) → ALB → Puma

I was wondering: Is it always necessary to put nginx between the ALB and Puma server?

My theory behind not using nginx is this: because nginx has its own queue (the Classic Load Balancer had a very limited “surge queue”, while the ALB has no such queue), it helps get responses back to the user at the cost of increased latency, but it hinders the metrics used for autoscaling and for choosing which backend to route a request to (such as Rejected Connection Count).

I couldn’t find any in-depth articles about this, so I decided to prove my theory (in)correct by myself.

In this test, the application servers will be running using ECS on Fargate (platform version 1.4.0). It’s a very simple “hello world” app, but I’ll give it a bit of room to breathe with each instance having 1 vCPU and 2GB of RAM. I’ll be using Gatling on a single c5n.large instance (“up to 25 gigabits” should be enough for this test).

In this test, I wanted to try out a few configurations that mimic characteristics of applications I’ve worked on: short and long requests, usually IO-bound. A short request is defined as just rendering a simple HTML template. A long request is 300ms. The requests are ramped from 1 request/sec to 1000 requests/sec over 5 minutes.

Response Time Percentiles over Time (OK responses), simple render — 4 instances, 20 threads each, connected directly to the ALB.
Response Time Percentiles over Time (OK responses), simple render — 4 instances, 20 threads each, using Nginx.

As you can see, for the simple render scenario, Nginx and Puma were mostly the same. As load approached 1000 requests/sec, latency started to get worse, but all requests were completed with an OK status.

The 300ms scenario was a little more grim.

Number of responses per second (green OK, red error), 300ms response — 4 instances, 20 threads each, connected directly to the ALB.
Number of responses per second (green OK, red error), 300ms response — 4 instances, 20 threads each, using Nginx.

My theory that Puma would fail fast and return an error status to the ALB when it reached capacity was right. The theoretical maximum throughput is 4 instances * 20 threads * (1000ms in 1 second / 300ms) ≈ 267 requests/sec. Puma handles about 200 requests/sec before returning errors; Nginx starts returning error statuses at around 275 requests/sec, but at that point requests are already queueing and the response time is spiking.
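The capacity arithmetic above can be checked with a few lines of Python. This is a back-of-the-envelope sketch that assumes every thread is busy for the full 300ms of each request and ignores queueing and overhead; the function name is made up for illustration.

```python
def max_throughput(instances: int, threads_per_instance: int, request_seconds: float) -> float:
    """Upper bound on requests/sec when each thread serves one request at a time."""
    return instances * threads_per_instance / request_seconds


# 4 Fargate tasks, 20 Puma threads each, 300 ms per request.
ceiling = max_throughput(instances=4, threads_per_instance=20, request_seconds=0.3)
print(f"theoretical ceiling: {ceiling:.1f} requests/sec")
```

Doubling the thread count or halving the request time doubles the ceiling, which is why the short-render scenario never got close to it.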

Remember, these results are for this specific use case, and results for a test specific to your use case probably will be different, so it’s always important to do load testing tailored to your environment, especially for performance critical areas.

Categories
日本語

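Product development like home cooking

I like cooking. I especially like baking bread, and the first recipe I published on Cookpad the other day was for focaccia. I am also a freelance engineer who builds systems, and my specialty is taking a project from a zero start all the way to operation. Combining these two elements to make something new is a hobby of mine, but I haven’t written much about it on this blog until now. Building this app reminded me of the article “An app can be a home-cooked meal”, and I wanted to share a similar way of thinking.

The おうち時間割 (“home timetable”) app

With the COVID-19 pandemic, our nursery school was temporarily closed for a long period, and I had to look after my children while working. In the middle of that, I thought about how technology could help me balance childcare and work, and came up with the おうち時間割 app.

I have thought this for a while: a schedule matters for adults as well as for children, and once it becomes a habit you can move through the day as a matter of course without thinking about it. I built おうち時間割 imagining something that lets children guide themselves through the question “what are we supposed to be doing right now?”

Over the past month or so, it has gone from the first minimal working prototype to something other people can use too, so I decided this was the right time to announce it.

How to use it

First, set up the timetable. A sample is registered by default. Once you have finished, press the “表示する” (Display) button to show the current schedule item and the next one.

We keep it on permanent display on an Amazon Fire 7 tablet that we bought on sale a while ago and had not used much, but any device with internet access and a screen will do. I think it is a handy way to reuse an old iPad or phone. By the way, on Android it seems that the option to stay awake while plugged in can only be set in developer mode. On iOS you can set it under Settings → Display & Brightness → Auto-Lock.

Building apps for ourselves

I finished the prototype in one day and immediately put it on the tablet and started using it. While using it, I tried changing the background colors, added emoji so that even a 3-year-old could understand it, and kept adding features.

Development like home cooking resembles the process of cooking for your family or yourself: through trial and error you refine the recipe, and you get to taste the joy of arriving at a flavor that is all your own.

With the stay-at-home period looking like it will continue for a while yet, at least in Kanto, why not give development a try once you get tired of cooking?

Whether it is about おうち時間割 or not, if you would like to talk about spending the stay-at-home period juggling work and kids, I look forward to hearing from you via the contact form or Twitter.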

Categories
Code Snippets English

A quick shortcut to open a Ruby gem in VS Code

While working on a Ruby project, I often find myself referring to the code of various libraries when it’s easier than looking up the documentation. For this, I used to use code (bundle show GEM_NAME), but recently I’ve been getting this warning:

[DEPRECATED] use `bundle info $GEM_NAME` instead of `bundle show $GEM_NAME`

Okay, that’s fine, but bundle info returns a bunch of stuff that would confuse VS Code:

> bundle info devise
  * devise (4.7.1)
	Summary: Flexible authentication solution for Rails with Warden
	Homepage: https://github.com/plataformatec/devise
	Path: /Users/keita/.asdf/installs/ruby/2.7.0/lib/ruby/gems/2.7.0/gems/devise-4.7.1

Luckily there’s bundle info $GEM_NAME --path. code (bundle info devise --path) is kind of long to type out every time, though, so I decided to make an alias.

I use the Fish shell, so the code here is written for that shell. Adapt it to your shell as required. You’ll also need VS Code’s command-line launcher (the code command) installed for this to work.

function bundlecode
  if test -e ./Gemfile
    code (bundle info $argv[1] --path)
  else
    set_color -o red
    echo "Couldn't find `Gemfile`. Try again in a directory with a `Gemfile`."
    set_color normal
  end
end

Usage:

> bundlecode devise
# VS Code opens!
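
To see why the plain bundle info output needs filtering before it can be handed to VS Code, here is a small sketch that pulls the Path: line out of it. It is written in Python rather than Fish, extract_gem_path is a hypothetical helper (not part of Bundler), and it only assumes the output format shown earlier in this post.

```python
from typing import Optional


def extract_gem_path(bundle_info_output: str) -> Optional[str]:
    """Return the value of the 'Path:' line from `bundle info GEM` output, if any."""
    for line in bundle_info_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Path:"):
            return stripped[len("Path:"):].strip()
    return None


# The sample output from this post, with the tab indentation preserved.
sample = """\
  * devise (4.7.1)
\tSummary: Flexible authentication solution for Rails with Warden
\tHomepage: https://github.com/plataformatec/devise
\tPath: /Users/keita/.asdf/installs/ruby/2.7.0/lib/ruby/gems/2.7.0/gems/devise-4.7.1
"""

print(extract_gem_path(sample))
```

In practice the --path flag makes this unnecessary, which is why the Fish function above uses it.
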
Categories
English

How I use Git

I’ve been using Git at work for around 10 years now. I started using Git with a GUI (Tower — back when I was eligible for the student discount!), but now I use the CLI for everything except complicated diffs and merges, where I use Kaleidoscope.

A question I get asked by my coworkers often is: “how in the world do you manage using Git without a GUI?”. This blog post is supposed to answer this question.

First, I use the Fish shell. It fits with the way I think. A lot of you probably use bash or zsh, and that’s fine: there is lots of documentation on how to integrate Git with those shells. This is the relevant part of .config/fish/config.fish:

set __fish_git_prompt_show_informative_status 'yes'
set __fish_git_prompt_color_branch magenta
set __fish_git_prompt_color_cleanstate green
set __fish_git_prompt_color_stagedstate red
set __fish_git_prompt_color_invalidstate red
set __fish_git_prompt_color_untrackedfiles cyan
set __fish_git_prompt_color_dirtystate blue

function fish_prompt
  # ... 
  set_color normal
  printf ' %s' (prompt_pwd)
  printf '%s' (__fish_git_prompt)
  printf ' > '
end

I’ve omitted the irrelevant portions (status checking, # prompt when root, etc.). If you want to see the full file, I’ve posted it as a gist.

On a clean working directory (that is, no changed files that haven’t been committed to the repository), this looks like this:

When updating some files, it will change to something like this:

This prompt doesn’t change in real time, so changes from other terminals won’t automatically change this prompt. I have a habit of tapping the “return” key to update the prompt.

To commit these changes:

The commands I use most often:

  • git add (if you only want to add a portion of a file, git add -p is your friend) / git commit
  • git push / git pull (git pull --rebase for feature branches being shared with other devs)
  • git diff @: show all changes, staged or not, between the working directory and the latest commit of the branch you are on (@ is an alias for HEAD)
  • git diff --cached: show only changes that are staged for the next commit
  • git status
  • git difftool / git mergetool (this will open Kaleidoscope)

This was obviously a very cursory, high-level look at how I use Git, but I hope it was useful. It’s been a long time since I’ve used a Git GUI full time, but whenever I do use one (for example, when helping a coworker), it feels clunky compared to using the CLI (that’s not saying I don’t have my complaints about the CLI — that’s another blog post 😇).

If you have any more questions, leave a comment or contact me on Twitter, and I’ll update this post with the answers.

Categories
English Tools Useful Utilities

“Logging in” to AWS ECS Fargate

I’m a big fan of AWS ECS Fargate. I’ve written in the past about managing ECS clusters, and with Fargate — all of that work disappears and is managed by AWS instead. I like to refer to this as quasi-serverless. Sorta-serverless? Almost-serverless? I’m open to better suggestions. 😂

There are a few limitations of running in Fargate, and this blog post will focus on working around one limitation: there’s no easy way to get an interactive command line shell within a running Fargate container.

The way I’m going to establish an interactive session inside Fargate is similar to how CircleCI or Heroku does this: start an SSH server in the container. This requires two components: the SSH server itself, which will be running in Fargate, and a tool to automate launching the SSH server. Most of this blog post will be about the tool to automate launching the server, called ecs-fargate-login.

If you want to skip to the code, I’ve made it available on GitHub using the MIT license, so feel free to use it as you wish.

How it works

This is what ecs-fargate-login does for you, in order:

  1. Generate a temporary SSH key pair.
  2. Use the ECS API to start a one-time task, setting the public key as an environment variable.
    • When the SSH server boots, it reads this environment variable and adds it to the list of authorized keys.
  3. Poll the ECS API for the IP address of the running task. ecs-fargate-login supports both public and private IPs.
  4. Start the ssh command and connect to the server.

When the SSH session finishes, ecs-fargate-login will make sure the ECS task is stopped.
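
Step 3 can be sketched as a small polling loop. This is not the actual ecs-fargate-login code (which is written in Go); it is a Python illustration in which describe_task is a hypothetical callable standing in for the ECS DescribeTasks call. The dictionary layout follows the shape of that API’s response as I understand it (an ElasticNetworkInterface attachment carrying a privateIPv4Address detail), so treat it as an assumption.

```python
import time
from typing import Callable, Optional


def private_ip_from_task(task: dict) -> Optional[str]:
    """Pull the private IPv4 address out of a task description, if it is attached yet."""
    for attachment in task.get("attachments", []):
        if attachment.get("type") != "ElasticNetworkInterface":
            continue
        for detail in attachment.get("details", []):
            if detail.get("name") == "privateIPv4Address":
                return detail.get("value")
    return None


def wait_for_task_ip(
    describe_task: Callable[[], dict],
    attempts: int = 30,
    delay_seconds: float = 2.0,
) -> str:
    """Poll until the task is RUNNING and has an IP, or give up after `attempts` polls."""
    for _ in range(attempts):
        task = describe_task()
        if task.get("lastStatus") == "STOPPED":
            raise RuntimeError("task stopped before it became reachable")
        ip = private_ip_from_task(task)
        if task.get("lastStatus") == "RUNNING" and ip:
            return ip
        time.sleep(delay_seconds)
    raise TimeoutError("task did not get an IP address in time")


# Fake responses standing in for successive DescribeTasks calls.
responses = iter([
    {"lastStatus": "PROVISIONING", "attachments": []},
    {"lastStatus": "PENDING", "attachments": [{"type": "ElasticNetworkInterface", "details": []}]},
    {
        "lastStatus": "RUNNING",
        "attachments": [{
            "type": "ElasticNetworkInterface",
            "details": [{"name": "privateIPv4Address", "value": "10.0.1.23"}],
        }],
    },
])

print(wait_for_task_ip(lambda: next(responses), delay_seconds=0))
```

Passing the API call in as a function keeps the loop testable without AWS credentials. A public IP would need an extra lookup on the network interface, which this sketch leaves out.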

Use cases

Most of my clients use Rails, and Rails provides an interactive REPL (read-eval-print loop) within the Rails environment. This REPL is useful for running one-off commands like creating new users or fixing some data in the database, checking and/or clearing cache items, to mention a few common tasks. Rails developers are accustomed to using the REPL, so while not entirely necessary (in the past, I usually recommended fixing data using direct database access or with one-time scripts in the application repository), it is a nice-to-have feature.

In conclusion

I don’t use this tool daily, but probably a few times a week. A few clients of mine use it as well, and they’re generally happy with how it works. However, if you have any recommendations about how it could be improved, or how the way the tool itself is architected could be improved, I’m always open to discussion. This was my first serious attempt at writing Golang code, so there are probably quite a few beginner mistakes in the code, but it should work as expected.

Categories
日本語

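I attended Serverless Meetup Tokyo #13

I attended the 13th Serverless Meetup Tokyo.

The venue was Speee Lounge.

ServerlessDays Tokyo 2019

There was an announcement promoting this event (calling for both attendees and speakers). I’m thinking of attending too. As for speaking... I’ll consider it (lol).

Azure Serverless 2019 Summer Edition

三宅 和之 (株式会社ゼンアーキテクツ)

I’m usually immersed in the AWS world, so Azure felt fresh. In Azure Functions, C# and Node.js are the mainstream, but Java support has recently started. Apparently all the runtimes are OSS.

Azure Functions v2 is recommended! Unlike v1, v2 apparently became more lightweight by using gRPC to separate the underlying .NET Core host from the workers. I see.

With the Premium Plan you can use a feature called “Pre-warmed instances”. This is something AWS Lambda doesn’t offer even if you pay for it. (Though apparently they are always trying to improve startup time...)

Official TypeScript support! That’s great. By default? I wonder where it gets compiled?

Durable Functions: stateful functions. Would that be Step Functions on AWS? From reading the overview, being able to define them in code (C#, F#, JavaScript) is interesting.

KEDA: “Kubernetes-based Event Driven Autoscaling”.

Azure Cosmos DB: something like DynamoDB. It looks like it has more features. Change Feed is the counterpart of DynamoDB Streams. You can trigger an Azure Function from the Change Feed. Combined with something called SignalR, it looks like you could publish to WebSockets. SignalR looks considerably simpler to implement than DynamoDB Streams + Lambda + API Gateway WebSocket!

SQL DB Serverless. It seems to behave much like AWS Aurora Serverless. It would be nice if someday there were a SQL database billed entirely per request.

There are many OSS products in the Azure world, so you can contribute to them.

But honestly, it didn’t make me feel like “I’m migrating away from AWS!”. Personally I’m interested in TypeScript, but I have no interest at all in C# and the like (before studying C#, I’d like to dig a little deeper into Rust...). Still, as I said at the start, I rarely touch Azure, so this was a very good opportunity.

Serverless from a sales perspective

“It doesn’t fit into our existing development menu”

“How do you sell serverless to clients?”

Personally, I think that rather than selling “serverless” directly, you come out ahead if serverless makes operations easier while you receive the same operations fee. (From the end user’s point of view, it is exactly the same arrangement.) However, if serverless lets you achieve something that wasn’t possible before, I think you can charge for it as added value. For example, serverless makes it easy to design for sudden bursts, so you can pitch “building an app that easily withstands sudden bursts”. Beyond that, I think it is just one means of optimization.

An IoT backend built with Lambda and DynamoDB

岡本 忠浩 (株式会社MMM)

They use AWS SAM. What SAM can’t manage lives in CloudFormation in a separate repository.

More than 100 Lambda functions. That’s a lot! It seems two engineers have been building it over six months. What a complex project...

DynamoDB table design

The starting point looks about right: draw an RDB ER diagram, enumerate the access patterns, then design the DynamoDB tables. The slide deck “Serverlessを極めるためにDynamoDBデータモデリングを極めよう” (“Master DynamoDB data modeling to master serverless”) is a useful reference.

Well, it feels like as long as you follow best practices, there’s no problem.

If there’s anything unusual, it’s that they capture the definitions in Go.

DynamoDB transactions are limited to 10 items. “You exceed that easily”: no, that happens when you use DynamoDB like an RDBMS; it really has to be a NoSQL-first design. (See the “Serverlessを極めるために...” deck mentioned earlier.)

After all, NoSQL (not just DynamoDB) is “technology suited to large volumes of data”, so I think there are still many cases where an RDBMS is the more appropriate choice. Especially in the startup world where I work, when requirements suddenly change, the access patterns change considerably. If you had to change the database design every time the access patterns were updated, that would be quite tough. It was said that the best approach is to build a mechanism that hides DynamoDB from you, but that is something you must never do. If you do, you end up trying to do the RDBMS-like things you’re used to.

They seem to be using AWS Step Functions. I wanted to use it too, and previously made do with a simple Lambda -> SQS -> Lambda -> SQS chain myself, but for more complex problems Step Functions seems more appropriate.

A transpiler for writing code on FaaS more simply

木村 功作 (富士通研究所)

https://github.com/fujitsulaboratories/escapin

Honestly, my question is: how is this different from async/await? I think the technology itself is quite impressive! It might be fun to write a transpiler someday.

Summary

It was fun! I got exposure to a lot of technologies and topics I don’t usually touch, so I learned a great deal.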

Categories
AWS English

Hosting a Single Page Application with an API with CloudFront and S3

I’ve written about how to host a single page application (SPA) on AWS using CloudFront and S3 before, using the CloudFront “rewrite not found errors as a 200 response with index.html” trick.

Recently, working on a few serverless apps, I’ve realized that this trick, while quick, isn’t perfect. The specific case where it broke down was when the API is configured as a behavior on CloudFront (I usually scope the API to /api on the same domain as the frontend, so CORS and OPTIONS requests aren’t necessary). If the API returned a 404 Not Found response, CloudFront would rewrite it to 200 OK index.html, and the front-end application would get confused. Unfortunately, CloudFront doesn’t support customized error responses per behavior, so the only way to fix this was to use Lambda@Edge instead.

Here’s the code for the Lambda function:

'use strict'

const path = require('path')

exports.handler = (evt, context, cb) => {
  const { request } = evt.Records[0].cf

  const uriParts = request.uri.split("/")

  if (
    // Root resource with a file extension.
    (
      uriParts.length === 2 && path.extname(uriParts[1]) !== ""
    ) ||
    // Anything inside the "static" directory.
    uriParts[1] === "static"
  ) {
    // serve the original request to S3
  } else {
    // change the request to index.html
    request.uri = '/index.html'
  }

  cb(null, request)
}

This code assumes that any request for a root-level resource with a file extension, or for anything in the /static/ directory, is for a static file that should be served from S3. All other requests will be rewritten to index.html. These are the defaults for create-react-app, but you’ll probably need to change them to meet your requirements. (Remember, Lambda@Edge functions need to be created in us-east-1.)
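
To make the routing rules concrete, here is the same decision ported to Python so it can be exercised locally. This is only an illustration (Lambda@Edge would still run the JavaScript version above), and rewrite_uri is a name I made up; os.path.splitext stands in for Node’s path.extname.

```python
import os.path


def rewrite_uri(uri: str) -> str:
    """Return the URI that should be requested from S3 for an incoming request URI."""
    parts = uri.split("/")
    # A root-level resource with a file extension, e.g. /favicon.ico.
    is_root_file = len(parts) == 2 and os.path.splitext(parts[1])[1] != ""
    # Anything inside the "static" directory.
    is_static = len(parts) > 1 and parts[1] == "static"
    if is_root_file or is_static:
        return uri  # serve the original object from S3
    return "/index.html"  # let the SPA router handle everything else


for uri in ["/", "/favicon.ico", "/static/js/main.js", "/users/42", "/users/42.json"]:
    print(f"{uri} -> {rewrite_uri(uri)}")
```

Note that a nested path with an extension, such as /users/42.json, is still rewritten to index.html, because only root-level files and /static/ are treated as static assets.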

Attach this Lambda function to the CloudFront behavior responsible for serving from the S3 origin as origin-request, and you should be good to go. Don’t forget to remove the 404-to-200 rewrite.

Categories
AWS English WordPress

Serverless WordPress on AWS Lambda

There are a few ways to run WordPress “serverless” on AWS. I’m going to talk about running WordPress on Lambda for this article. If you’re interested in how you can run WordPress serverless-ly on Fargate, I’m working on a post about that too.

Keep in mind that while it is possible to do this, it’s not for everyone. It’s probably not for me. Probably not for you. Use at your own risk!

Before we start, there is a core feature of Lambda that makes running WordPress in Lambda quite troublesome: a read-only file system. WordPress expects a writable, persistent, local file system. We’ll be using the S3 Uploads plugin by Human Made to handle media uploads. However, core and plugin updates will not work. There’s no workaround for this, so to install / update files, we’ll need to make a new Lambda deployment.

So: let’s go! First, you’ll want to clone my boilerplate repository. I’ve prepared a WordPress installation and a simple glue script to actually boot WordPress.

$ git clone https://github.com/keichan34/wordpress-on-lambda

My plan of attack is: run WordPress in the Lambda function using a PHP custom runtime, make uploads work with S3 instead of the local filesystem, and wire up the database. In the repository above, I’ve configured static assets to be served from S3 as well.

Now, let’s prepare the database. Lambda has two networking modes: public and VPC mode. In public mode, the Lambda has default access to the public internet, but nothing else. In VPC mode, the Lambda is booted inside the VPC, and doesn’t have public internet access by default. Because WordPress requires public internet access we have to either run it in public mode, or run it in VPC mode and prepare a NAT gateway (about $30 to $50 a month, depending on the region). If Lambda runs in public mode, the database must also be publicly accessible — something that is frowned upon from a security standpoint. You should choose the option that fits your risk and price profile. In my case, I’m going with the NAT gateway route.

Now we’ve got the messy stuff out of the way, we’ll have to assemble the Lambda runtime. AWS has an article on their blog detailing how to make a PHP custom runtime, but Stackery provides a batteries-included PHP layer. It includes everything you need to make a PHP application that assumes it’s running in a traditional server environment run in AWS Lambda.

# Replace "km-wordpress-on-lambda-deployment-201906" with something that makes sense for you. It's globally unique, so copying and pasting this will result in an error.
# Make sure you're in the same region as your database!

$ DEPLOY_BUCKET="km-wordpress-on-lambda-deployment-201906"
$ aws s3 mb "s3://$DEPLOY_BUCKET"
$ cd <the directory you cloned the GitHub repository to>

Now, it’s time to install WordPress! We’ll add the WordPress files to the deployment package. As usual, copy wp-config-example.php to wp-config.php. Enter your database details. If you have a hostname that you’re going to use with CloudFront, enter it now. If not, you’ll have to wait until after the CloudFront distribution is created, then try again.

Now, let’s deploy. This will create a new CloudFront distribution and S3 bucket for public assets, so maybe it’s a good time to make a cup of coffee. If you haven’t installed the SAM CLI, do that before the next block.

$ sam package --template-file template.yaml --output-template-file serverless-output.yaml --s3-bucket "$DEPLOY_BUCKET"
$ sam deploy --template-file serverless-output.yaml --stack-name wordpress-on-lambda --capabilities CAPABILITY_IAM
$ aws s3 sync ./src/php s3://deploy-bucket-XXXXX --exclude "*.php" --exclude "*.ini"

I’ll be using the default CloudFront domain for this demo. If you’re going to be using your own domain, you need to modify the template.yaml file to add an alias to the CloudFront distribution. Use the following command to show the CloudFront domain name.

$ aws cloudformation describe-stacks --stack-name wordpress-on-lambda | jq '.Stacks[0].Outputs'

OK! Now, you should be able to access the CloudFront URL, and you’ll get redirected to the friendly WordPress installer! If you’ve set up your wp-config.php correctly, the installation should go smoothly.

The site I set up for this post is available here: https://dskhgdbzphjkm.cloudfront.net/

Lessons Learned

This is for almost no-one. I think the only valid use case (in this current form) for running WordPress in AWS Lambda is a site that gets periodic, unpredictable spikes of intense traffic, where Lambda’s scalability and pricing model pay off. This is also a use case where, presumably, the benefits of the scalability trump the inconvenience of not being able to use the online updaters and installers (also, I’m assuming the database will be able to keep up with the load).

However, if updating and installing themes or plugins could be managed outside of the Lambda environment (say, with wp-cli), with deployments automated… Then, it may be a little more applicable to a larger audience.

If you’re looking for a cheap solution to host your personal blog (like me!), you might just want to bite the bullet and check out any of the hosted WordPress solutions out there.

If you liked this post, or you’d like to provide some input, please do so in the comments. My favorite AWS service is Lambda, and I like pushing it a bit, so look forward to similar posts in the future. If you find bugs in the boilerplate, or you can make improvements, please open an issue or PR!

Miscellaneous Tidbits

  • Aurora Serverless sounds like it would be the best match for this setup. It probably is. Just keep in mind that Aurora Serverless doesn’t support publicly accessible clusters. To use it, you’ll need to go the Lambda-in-VPC, NAT gateway route.
  • Regarding public / private access and NAT gateways, if you’re like me and believe in the future of IPv6 and think that you can just use an egress-only internet gateway – you’re wrong! Lambda doesn’t seem to support IPv6 at this time.
  • You can actually use a NAT instance if the NAT gateway is overkill. However, I would recommend using the NAT gateway if you can. It comes with automatic scalability and redundancy, so you don’t have to babysit your NAT instance. (If you need more than one NAT instance, use the gateway. Seriously.)
  • At time of writing, my patches to php-lambda-layer haven’t been merged yet, so you can use my patched version (the boilerplate repository has this applied already).
  • If you’re really going all-in, consider using an Application Load Balancer rather than API Gateway to save money. API Gateway has zero fixed costs, but there is a point where ALB will become cheaper than API Gateway.
  • Doing some crude calculations, you should be able to handle an average of a few hundred users per day under the perpetual free tier. Your highest bill may be data transfer to the user.
Categories
AWS English

Managing ECS clusters, 4 years in.

Throughout these past 4 years since AWS ECS became generally available, I’ve had the opportunity to manage 4 major ECS cluster deployments.

Across these deployments, I’ve built up knowledge and tools to help manage them, make them safer, more reliable, and cheaper to run. This article has a bunch of tips and tricks I’ve learned along the way.

Note that most of these tips are rendered useless if you use Fargate! I usually use Fargate these days, but there are still valid reasons for managing your own cluster.

Spot Instances

ECS clusters are great places to use spot instances, especially when managed by a Spot Fleet. As long as you handle the “spot instance is about to be terminated” event, and set the container instance to draining status, it works pretty well. When ECS is told to drain a container instance, it will stop the tasks cleanly on the instance and run them somewhere else. I’ve made the source code for this Lambda function available on GitHub.

Just make sure your app is able to stop itself and boot another instance in 2 minutes (the warning time you have before the spot instance is terminated). I’ve experienced overall savings of around 60% when using a cluster exclusively comprised of spot instances (EBS is not discounted).

Autoscaling Group Lifecycle Hooks

If you need to use on-demand instances for your ECS cluster, or you’re using a mixed spot/on-demand cluster, I recommend using an Autoscaling Group to manage your cluster instances.

To prevent the ASG from stopping instances with tasks currently running, you have to write your own integration. AWS provides some sample code, which I’ve modified and published on GitHub.

The basic gist of this integration is:

  1. When an instance is scheduled for termination, the Autoscaling Group sends a message to an SNS topic.
  2. Lambda is subscribed to this topic, and receives the message.
  3. Lambda tells the ECS API to drain the instance that is scheduled to be terminated.
  4. If the instance has zero running tasks, Lambda tells the Autoscaling Group to continue with termination. The Autoscaling Group terminates the instance at this point.
  5. If the instance has more than zero running tasks, Lambda waits for some time and sends the same message to the topic, returning to step (2).
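
The loop above can be sketched as a single handler. This is not the code published on GitHub or the AWS sample; it is a Python illustration in which count_running_tasks, drain_instance, complete_lifecycle_action, and republish are hypothetical callables standing in for the ECS, Auto Scaling, and SNS API calls. The wait before re-sending the message is left out to keep the sketch short.

```python
from typing import Callable


def handle_termination_message(
    instance_id: str,
    count_running_tasks: Callable[[str], int],
    drain_instance: Callable[[str], None],
    complete_lifecycle_action: Callable[[str], None],
    republish: Callable[[str], None],
) -> str:
    """Handle one SNS message about an instance that is scheduled for termination."""
    # Step 3: ask ECS to drain the instance (assumed safe to repeat in this sketch).
    drain_instance(instance_id)

    # Step 4: nothing left running, so let the Auto Scaling Group terminate it.
    if count_running_tasks(instance_id) == 0:
        complete_lifecycle_action(instance_id)
        return "terminated"

    # Step 5: tasks are still running; send the message again and check later.
    republish(instance_id)
    return "waiting"


# Simulate an instance whose two tasks stop one at a time.
running = {"i-XXX": 2}
log = []


def fake_republish(instance_id: str) -> None:
    running[instance_id] -= 1  # pretend one task finished before the next check
    log.append("republish")


result = "waiting"
while result == "waiting":
    result = handle_termination_message(
        "i-XXX",
        count_running_tasks=lambda i: running[i],
        drain_instance=lambda i: log.append("drain"),
        complete_lifecycle_action=lambda i: log.append("complete"),
        republish=fake_republish,
    )

print(log)
```

In the real integration each pass through the loop is a separate Lambda invocation triggered by the re-sent SNS message, so no single invocation has to run for the whole drain period.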

By default, I set the timeout for this operation to 15 minutes. This value depends on the specific application. If your applications require more than 15 minutes to cleanly shut down and relocate to another container instance, you’ll have to set this value accordingly. (Also, you’ll have to change the default ECS StopTask SIGTERM timeout — look for the “ECS_CONTAINER_STOP_TIMEOUT” environment variable)

Cluster Instance Scaling

Cluster instance scale-out is pretty easy. Set some CloudWatch alarms on the ECS CPUReservation and MemoryReservation metrics, and you can scale out according to those. Scaling in is a little more tricky.

I originally used those same metrics to scale in. Now, I use a Lambda script that runs every 30 minutes, cleaning up unused resources until a certain threshold of available CPU and memory is reached. This technique further reduces service disruption. I’ll post this on GitHub sometime in the near future.

Application Deployment

I’ve gone through a few application deployment strategies.

  1. Hosted CI + Deploy Shell Script
    • Pros: simple.
    • Cons: you need somewhere to run it, easily becomes a mess. Shell scripts are a pain to debug and test.
  2. Hosted CI + Deploy Python Script (I might put this on GitHub sometime)
    • Pros: powerful, easier to test than using a bunch of shell scripts.
    • Cons: be careful about extending the script. It can quickly become spaghetti code.
  3. Jenkins
    • Pros: powerful.
    • Cons: Jenkins.
  4. CodeBuild + CodePipeline
    • Pros: simple; ECS deployment was recently added; can be managed with Terraform.
    • Cons: Subject to the limitations of CodePipeline (which is pretty limited). In our use case, the sticking point is not being able to deploy an arbitrary Git branch (you have to deploy the branch specified in the CodePipeline definition).

Grab-bag

Other tips and tricks

  • Docker stdout logging is not cheap (also, performance is highly variable across log drivers — I recently had a major problem with the fluentd driver blocking all writes). If your application blocks on logging (looking at you, Ruby), performance will suffer.
  • Having a few large instances yields more performance than many small instances (with the added benefit of having the layer cache when performing deploys).
  • The default placement strategy should be: binpack on the resource that is most important to your application (CPU or memory), AZ-balanced
  • Applications that can’t be safely shut down in less than 1 minute do not work well with Spot instances. Use a placement constraint to make sure these tasks don’t get scheduled on a Spot instance (you’ll have to set the attribute yourself, probably using the EC2 user data)
  • Spot Fleet + ECS = ❤️
  • aws ecs update-service help for service administration commands. I use --force-new-deployment and --desired-count quite often.
  • If you manage your own EC2 instances with Auto Scaling Groups: aws autoscaling terminate-instance-in-auto-scaling-group --instance-id "i-XXX" --no-should-decrement-desired-capacity will start a new EC2 instance and perform termination lifecycle hooks on it. This is what I use to switch out old EC2 instances with new launch configurations.
Categories
English

“Truth in bots”

The bots should announce, “I’m not a person, or if I am, I’m not allowed to act like one.”

Or, if there’s no room or time for that sentence, perhaps a simple bot at the top of the conversation. That way, we can save our human emotions for the humans who will appreciate them.

Truth in bots | Seth’s Blog

“If you can’t tell the difference, does it matter?”

Quacking like ducks, et cetera.

The point of the post is a bit different (it’s predicated on being able to tell the difference: “… only a minute or two into the interaction that you realize you’re being fooled by an AI, not a caring human”), but what happens when you can’t tell the difference? Should AIs always announce themselves as AIs if they are indistinguishable from a human? Why?