Categories
English

Shipping Events from Fluentd to Elasticsearch

We use fluentd to process and route log events from our various applications. It’s simple, safe, and flexible. With at-least-once delivery by default, log events are buffered at every step before they’re sent off to the various storage backends. However, there are some caveats with using Elasticsearch as a backend.

Currently, our setup looks something like this:

The general flow of data is from the application to the fluentd aggregators, then on to the backends, mainly Elasticsearch and S3. If a log event warrants a notification, it’s published to an SNS topic, which in turn triggers a Lambda function that sends the notification to Slack.

The fluentd aggregators are launched by an auto-scaling group, but they are not behind a load balancer. Instead, a Lambda function subscribed to the auto-scaling group’s lifecycle notifications updates a DNS round-robin entry with the private IP addresses of the fluentd aggregator instances.
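
Conceptually, that Lambda performs a DNS UPSERT with the current set of aggregator IPs. Assuming the zone lives in Route 53, the equivalent AWS CLI call looks something like this (the zone ID, record name, and addresses are placeholders):

$ aws route53 change-resource-record-sets \
    --hosted-zone-id ZEXAMPLE123 \
    --change-batch '{
      "Changes": [{
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "fluentd.internal.example.com",
          "Type": "A",
          "TTL": 60,
          "ResourceRecords": [
            {"Value": "10.0.1.10"},
            {"Value": "10.0.2.11"}
          ]
        }
      }]
    }'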

We use the fluent-plugin-elasticsearch plugin to output log events to Elasticsearch. However, because this plugin uses the bulk insert API and does not validate whether events were actually inserted into the cluster successfully, it is dangerous to rely on it exclusively (hence the S3 backup).
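
Sending each event to both Elasticsearch and S3 boils down to a copy output on the aggregators. A minimal sketch (the tag pattern, host, and bucket are placeholders, not our actual configuration):

<match app.**>
  @type copy
  # primary store: Elasticsearch via fluent-plugin-elasticsearch
  <store>
    @type elasticsearch
    host elasticsearch.internal
    port 9200
    logstash_format true
    buffer_type file
    buffer_path /var/log/td-agent/buffer/es
    flush_interval 10s
  </store>
  # backup store: raw events to S3
  <store>
    @type s3
    s3_bucket example-log-backup
    s3_region ap-northeast-1
    path logs/
    buffer_path /var/log/td-agent/buffer/s3
    time_slice_format %Y%m%d%H
  </store>
</match>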

Categories
AWS

IAM Policy for KMS-Encrypted Remote Terraform State in S3

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::<bucket name>/*",
                "arn:aws:s3:::<bucket name>"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "kms:Encrypt",
                "kms:Decrypt",
                "kms:GenerateDataKey"
            ],
            "Resource": [
                "<arn of KMS key>"
            ]
        }
    ]
}

Don’t forget to update the KMS key policy, too. I spent a bit of time trying to figure out why it wasn’t working until CloudTrail helpfully told me that the kms:GenerateDataKey permission was also required. Turn CloudTrail on today, even if you don’t need the auditing; it’s an excellent permissions-debugging tool.
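
Concretely, the key policy needs a statement that allows the IAM principal running Terraform to use the key. A sketch (the Sid and principal ARN are placeholders):

{
    "Sid": "AllowTerraformStateEncryption",
    "Effect": "Allow",
    "Principal": {
        "AWS": "arn:aws:iam::<account id>:user/<terraform user>"
    },
    "Action": [
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:GenerateDataKey"
    ],
    "Resource": "*"
}

(In a key policy, "Resource": "*" refers to the key itself.)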

Categories
Linux

Authenticating Linux logins via LDAP (Samba / Active Directory)

I’ve been working on the infrastructure for a fleet of a few dozen Amazon EC2 instances for the past week, and with a rapidly growing team, we decided it was time to set up a central authentication / authorization service.

So, that meant setting up some sort of LDAP server.

I was a bit intimidated at first (the closest I’d come was watching people manage, and complain about, Active Directory), but I finally got it set up. Here are the components:

  • AWS Directory Service (Simple Directory) is used as the directory server.
  • A t2.large Windows Server instance used to administer the directory (usually kept stopped).
  • A bunch of VPC settings to make the directory service the default DNS resolver of the VPC.
  • An Ansible play I made to:
    • Join the instance to the directory.
    • Configure sshd to pull public keys from the directory (see the sketch just after this list).
    • Add an access filter to allow access for users who are members of the appropriate groups.
    • Update sudoers to allow sudo for users who are members of the appropriate groups.
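
For the sshd part, a common approach with an SSSD-joined machine (which is what realm join sets up) is to store each user’s public key in the directory and let sshd ask SSSD for it. This is a sketch of that approach, not necessarily exactly what my play does:

# /etc/ssh/sshd_config (excerpt)
# Ask SSSD for public keys stored in the directory
AuthorizedKeysCommand /usr/bin/sss_ssh_authorizedkeys
AuthorizedKeysCommandUser nobody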

The first three weren’t too hard; Amazon’s documentation and tutorials cover them well. I recommend reading them in this order:

  1. Tutorial: Create a Simple AD Directory.
  2. Create a DHCP Options Set.
  3. Joining a Windows Instance to an AWS Directory Service Domain — read the limitations and prerequisites (you’ll need a special EC2 IAM Role) — then skip to “Joining a Domain Using the Amazon EC2 Launch Wizard”.
  4. Delegating Directory Join Privileges — this is important for security.
  5. Manually Add a Linux Instance.

On step 5, the realm join command will prompt for a password. I spent a few days trying to figure out the best way to automate this. I tried creating a Kerberos keytab and using it for authentication, but I wasn’t getting consistent results (for some reason that is probably clear to someone who knows a lot about Kerberos, the realm join would work, but after a realm leave, Kerberos would complain that the join account no longer existed, even though I couldn’t find any differences in the AD admin tools). I eventually decided to encrypt the directory join account’s password in an Ansible vault and use the Ansible expect module to automate the password entry.
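
The join task ends up looking roughly like this (a sketch; the account, domain, and variable name are placeholders, and the expect module needs pexpect on the target host):

- name: Join the instance to the directory
  become: true
  expect:
    command: realm join -U join_account example.corp
    responses:
      "Password for .*": "{{ vaulted_join_password }}"
  no_log: true

Since the response contains the join password, no_log keeps it out of the Ansible output.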

To do

I’m currently using the Active Directory Users and Computers administration tool to manage users, but this involves booting a Windows instance every time a change to the directory is made. Ideally, I want a simple web-based tool to add, remove, and change users, their SSH public keys, and groups. There are a few web-based tools out there already, but the ones I’ve come across are either too complicated or don’t handle SSH keys well.

Categories
English

Upgrading PostgreSQL on Ubuntu

I recently started using Ubuntu Linux on my main development machine. That means that my PostgreSQL database is running under Ubuntu, as well. I’ve written guides to upgrading PostgreSQL using Homebrew in the past, but the upgrade process under Ubuntu was much smoother.

These steps assume you’re running Ubuntu 16.04 LTS with an existing PostgreSQL 9.5 cluster, and that PostgreSQL 9.6 has just been installed via apt (which creates a fresh, empty 9.6/main cluster).

  1. Stop the postgresql service.
    $ sudo service postgresql stop
    
  2. Rename the newly created (empty) PostgreSQL 9.6 cluster out of the way.
    $ sudo pg_renamecluster 9.6 main main_pristine
    
  3. Upgrade the 9.5 cluster.
    $ sudo pg_upgradecluster 9.5 main
    
  4. Start the postgresql service.
    $ sudo service postgresql start
    

Now, when running pg_lsclusters, you should see something like the following:

Ver Cluster       Port Status Owner    Data directory                        Log file
9.5 main          5434 online postgres /var/lib/postgresql/9.5/main          /var/log/postgresql/postgresql-9.5-main.log
9.6 main          5432 online postgres /var/lib/postgresql/9.6/main          /var/log/postgresql/postgresql-9.6-main.log
9.6 main_pristine 5433 online postgres /var/lib/postgresql/9.6/main_pristine /var/log/postgresql/postgresql-9.6-main_pristine.log

Verify everything is working as expected, then feel free to remove the 9.5/main and 9.6/main_pristine clusters (pg_dropcluster).
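
For example, once you’re confident the upgraded 9.6/main cluster has everything:

$ sudo pg_dropcluster --stop 9.5 main
$ sudo pg_dropcluster --stop 9.6 main_pristine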

These cluster commands may be available in other distros, but I haven’t been able to check them. YMMV. Good luck!

Categories
English

macOS Sierra

Here are the headline features of Sierra, and my thoughts about them.

Siri

Don’t use it on the phone, won’t use it on the Mac. It would be nice if I could use it by typing, though.

Universal Clipboard

This seems like a massive security risk. It works via iCloud, so if someone (like my daughter) is using my iPad, then it will sync the clipboard to that, too. How do I turn it off?

Auto Unlock


I can’t use it, because “two-factor authentication” is apparently completely different from “two-step verification”, and the two are completely incompatible. I can switch to 2FA, but that means I have to log out of and back into everything, right? Will that help my iMessages, which already aren’t syncing between my Mac and my iPhone?

(Side rant: I can’t use any of the iMessage features because I’m completely on Telegram now, because of the aforementioned troubles with iMessages being unreliable. Sometimes they come to the Mac. Sometimes the iPhone. Sometimes both. Who knows! I need some kind of cloud forecast app to know. Or learn how to read the stars. Or something. Is this the kind of stuff they put in horoscopes?)

Apple Pay

Not available in Japan yet (but will be coming soon, to the “major” credit card brands: JCB and MasterCard. VISA isn’t “major”, I suppose).

Photos

For some reason, the photos themselves sync between machines, but the facial recognition metadata doesn’t? Progress should mean reducing the number of things we have to babysit, not adding more. I don’t want to babysit my photos by correcting facial recognition guesses on both machines, across 5,000+ photos.

Messages

Hmm.

iCloud Drive

Is it better than Dropbox yet?

Optimized Storage

All of a sudden, the voice of one of my podcast panelists simply vanished from the mix. I quit and re-launched Logic, only to be told that the file in question was missing. Sure enough, a visit to Finder revealed that Sierra had “optimized” my storage and removed that file from my local drive.

macOS Sierra Review by Jason Snell at sixcolors.com

No thanks.

Picture in Picture

Okay, this might be useful sometimes.

Tabs

“Finally”.

Conclusion

2/10

Look, I don’t mean to disparage the excellent work of the team of engineers responsible for macOS at Apple. They’re doing a good job, and the upgrade to Sierra was smooth. There are just a lot of features in this release that don’t excite me. I wish they had fixed other things instead, like that silly facial recognition sync issue. Maybe they will in the future. I really hope so.

Also: Don’t say “then just use Linux”, because I already do. It’s not as elegant as macOS, but it’s reliable for the work I do (web programming) and when something does go wrong, I know how to fix it.

If you want to get into an argument with me about how I’m wrong, please make an App.net account and contact me there; I’m @keita.

Categories
AWS English

Convox

I stumbled upon Convox a couple weeks ago, and found it pretty interesting. It’s led by a few people formerly from Heroku, and it certainly feels like it. A simple command-line interface to manage your applications on AWS, with almost no AWS-specific configuration required.

An example of how simple it is to deploy a new application:

$ cd ~/my-new-application
$ convox apps create
$ convox apps info
Name       my-new-application
Status     creating
Release    (none)
Processes  (none)
Endpoints  
$ convox deploy
Deploying my-new-application
Creating tarball... OK
Uploading... 911 B / 911 B  100.00 % 0       
RUNNING: tar xz
...
... wait 5-10 minutes for the ELB to be registered ...
$ convox apps info
Name       my-new-application
Status     running
Release    RIIDWNBBXKL
Processes  web
Endpoints  my-new-application-web-L7URZLD-XXXXXXX.ap-northeast-1.elb.amazonaws.com:80 (web)

Now, you can access your application at the ELB endpoint listed in the “Endpoints” section.

I haven’t used Convox with more complex applications, but it definitely looks interesting. It uses a little more infrastructure than I’d like for small personal projects (a dedicated ELB for the service manager, for example). However, when you’re managing multiple large deployments of complex applications, the time saved by letting Convox do the infrastructure work for you seems like it would pay for itself.

The philosophy behind Convox:

The Convox team and the Rack project have a strong philosophy about how to manage cloud services. Some choices we frequently consider:

  • Open over Closed
  • Integration over Invention
  • Services over Software
  • Robots over Humans
  • Shared Expertise vs Bespoke
  • Porcelain over Plumbing

I want to focus on “Robots over Humans” here. One of AWS’s greatest strengths is that almost every single thing can be automated via an API. However, I feel like its greatest weakness is that the APIs are not very user-friendly; they’re disjointed and inconsistent between services. The AWS-provided GUI “service dashboard” is packed with features, and you can see some visual similarity in the UI elements, but the consistency basically stops there. Look at the Route 53, ElastiCache, and EC2 dashboards; they’re completely different.

Convox, in my limited experience, abstracts all of this unfriendliness away and presents you with a simple command line interface to allow you to focus on your core competency — being an application developer.

I, personally, am an application / infrastructure developer (some may call me DevOps, but I’m not particularly attached to that title), and Convox excites me because it has the potential to eliminate half of the work necessary to get a secure, private application cluster running on AWS.

Categories
English Linux

A month with Linux on the Desktop

It’s been a bit over a month since I installed Linux as my main desktop OS, on a PC I built to replace a (cylinder) Mac Pro running OS X. I installed Ubuntu MATE 16.04.

Here are my general thoughts:

  • Linux has come a long way in the 6 years since I last used it full-time on the desktop.
  • There are Linux versions of the popular software that’s vital to my workflows: Firefox, Chrome, Dropbox, Slack, Sublime Text, etc.
    • When there isn’t a direct equivalent, there’s usually a clone that gets the job done (Zeal and Meld, for example).
  • It still is definitely not for the casual user.
  • Btrfs.
  • Lacks the behavioral consistency of OS X.
  • Some keyboard shortcuts take some getting used to (but most of the time, they’re completely configurable).
  • Steam is available for Linux! (10 of the 11 titles in my library run on Linux. Does that say something about the games I play, or are Linux ports popular these days?)
  • If something is broken, it can be fixed*.

(*) maybe, probably. Sometimes. It depends.

Some thoughts specific to the development work I do:

  • Docker is as easy to use as it is on a Linux server. Because the kernel is exactly the same. 🙂
  • I can quickly reproduce server environments locally with minimal effort.
  • Configuration files are in the same place as any Ubuntu 16.04 server.

Some things really surprised me. For example, I plugged my iPhone into a USB port to charge it, and it automatically launched the photo importer and started a tethering connection. I did not expect that on a clean install.

It hasn’t been all peaches and roses, though. I have some specific complaints about the file browser (Caja, a Nautilus fork) and the MATE Terminal, so much so that I’ve replaced the MATE Terminal with GNOME 3’s terminal emulator. I haven’t gotten around to trying other file browsers because when I’m browsing files, I’m usually in the terminal anyway.

Other nice-to-have things that relate less to the OS itself and more to building your own PC (I’m aware of Hackintoshing, but my issues were mainly with software, not hardware):

  • The particular case I’m using has space for 2 large (optical drive-sized) bays and 8 3.5 inch hard drive bays. That’s a lot of storage. It currently holds 2 SATA SSDs (and one M.2 SSD, but that doesn’t take up any room in the case).
  • Access to equipment that is much newer and faster than anything you can get via the Apple Store. (I’m planning on getting an Nvidia GTX 1080 at some point; for now I’m using an i7-6700K quad-core CPU at 4.0GHz.)

Conclusion: I’m enjoying it. I realize that I’m a special case, and I strongly discourage anyone from using Linux on the desktop unless they really know what they’re doing. In my case, I regularly manage Linux servers professionally, so I know how to fix things when they go wrong (most of the time). I still use a MacBook Pro running OS X when I’m on the go or need something Mac-specific, but it stays asleep most of the time.

Categories
Elixir English

Elixir anonymous function shorthand

Elixir’s Getting Started guides cover the &Module.function/arity and &(...) anonymous function shorthand, but there are a couple of neat tricks about this shorthand that aren’t immediately apparent.

For example, you can do something like &"...".

iex> hello_fun = &"Hello, #{&1}"
iex> hello_fun.("Keita")
"Hello, Keita"

Let’s have some more fun.

iex> fun = &~r/hello #{&1}/
iex> fun.("world")
~r/hello world/
iex> fun = &~w(hello #{&1})
iex> fun.("world")
["hello", "world"]
iex> fun.("world moon mars")
["hello", "world", "moon", "mars"]
iex> fun = &if(&1, do: "ok", else: "not ok")
iex> fun.(true)
"ok"
iex> fun.(false)
"not ok"

You can even use defmodule to create an anonymous function that defines a new module.

iex> fun = &defmodule(&1, do: def(hello, do: unquote(&2)))
iex> fun.(Hello, "hello there")
{:module, Hello, <<...>>, {:hello, 0}}
iex> Hello.hello
"hello there"

(Note that I don’t recommend overusing it like I did here! The only one that has really been useful to me is the first example, &"...".)

Categories
Elixir English

Elixir: A year (and a few months) in

At the beginning of 2015, I wrote a blog post about how my then-current programming language of choice (Ruby) was showing itself to be less future-proof than I would have liked.

A lot has changed since then, but a lot has remained the same.

First: I have started a few open-source Elixir projects:

  • Exfile — a file upload handling, persistence, and processing library. Extracted from an image upload service I’m working on (also in Elixir).
  • multistream-downloader — a quick tool to monitor and download HTTP-based live streams.
  • runroller — a redirect “unroller” API, see the blog post about it.

The initial push that got me into Elixir was indeed its performance, but that’s not what kept me. Around the same time, I also tried learning Go, and more recently Rust has caught my attention.

In most cases, languages like Go and Rust can push more raw performance out of a simple “Hello World” benchmark, which is exactly the test I initially used to compare Ruby on Rails to Elixir / Phoenix. The more I used Elixir, though, the more I gained an appreciation for what I now regard as critical language features (yes, most of these apply to Erlang and other functional languages too).

Immutability

This is a big one. Manipulating a mutable data structure means changing the data in memory directly, which makes it susceptible to the classic race condition bug programmers encounter when writing multithreaded programs:

  • Thread 1 reads “1” from “a”
  • Thread 2 reads “1” from “a”
  • Thread 1 increments “1” to “2” and writes it to “a”
  • Thread 2 increments “1” to “2” and writes it to “a”

In this case, the intended value is “3” because the programmer incremented “a” two times, but the actual value in memory is “2” due to the race condition.

In systems with immutable variables, this class of bug doesn’t exist. It forces the programmer to be explicit about shared state.
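
In Elixir, shared state has to live in a process that owns it. As a minimal sketch, wrapping the counter from the example above in an Agent serializes the two increments, so the result is always 3:

iex> {:ok, counter} = Agent.start_link(fn -> 1 end)
iex> Agent.update(counter, &(&1 + 1))
iex> Agent.update(counter, &(&1 + 1))
iex> Agent.get(counter, fn count -> count end)
3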

Lightweight Processes

A typical Erlang (Elixir) node can have millions of processes running on it. These are not your traditional OS processes, and they are not threads either; the Erlang VM uses its own scheduler and thread pool to execute code. The memory overhead of a single process is usually very light; on my machine, it’s about 2.6 KB (SMP and HiPE enabled).
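
You can measure this yourself with Process.info/2, which reports a process’s memory footprint in bytes (the figure below is just an example; the exact number depends on your VM build and flags):

iex> pid = spawn(fn ->
...>   receive do
...>     :never -> :ok
...>   end
...> end)
iex> Process.info(pid, :memory)
{:memory, 2688}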

Inter-Process Communication

Another big one.

Want to send a message to another process?

send(pid, :hello)

Want to handle a message from another process?

receive do
  :hello -> IO.puts "I received :hello"
end

This works for processes running on the current node, but it also works across nodes, with exactly the same syntax. The “pid” variable in the example can refer to a process anywhere in the cluster of nodes. The other node doesn’t even have to be running Erlang; it just needs to speak the same language: the distribution protocol and the external term format.
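
For example, assuming a second, hypothetical node named :"other@10.0.0.2" was started with a matching cookie, the exact same send/2 call works on a pid living there:

iex> Node.connect(:"other@10.0.0.2")
true
iex> pid = Node.spawn(:"other@10.0.0.2", fn ->
...>   receive do
...>     :hello -> IO.puts("got :hello on #{Node.self()}")
...>   end
...> end)
iex> send(pid, :hello)
:hello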

OTP

You can’t talk about Erlang or Elixir without bringing up OTP. OTP stands for “Open Telecom Platform” (Erlang was originally developed by Ericsson for use in telecom systems). OTP is a framework of battle-tested tools to help you build your application, for example:

  • gen_server – an abstraction of the client-server model
  • gen_fsm – a finite state machine
  • supervisor – a supervisor process that automates recovery from temporary failures

OTP is, for all intents and purposes, part of the Erlang standard library, so it is automatically available in any Elixir application as well. The nuts and bolts of OTP are out of the scope of this blog post, but having such a rich toolbox is a breath of fresh air coming from Ruby (I thought the same thing when switching full-time from PHP to Ruby).
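
As a small taste, here is a minimal, hypothetical counter built on gen_server via Elixir’s GenServer module (a sketch, not production code):

defmodule Counter do
  use GenServer

  # Client API
  def start_link(initial \\ 0) do
    GenServer.start_link(__MODULE__, initial, name: __MODULE__)
  end

  def increment do
    GenServer.call(__MODULE__, :increment)
  end

  # Server callbacks
  def init(initial), do: {:ok, initial}

  def handle_call(:increment, _from, count) do
    {:reply, count + 1, count + 1}
  end
end

All calls to Counter.increment are serialized through the Counter process, and a supervisor can restart the process automatically if it crashes.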

Conclusion

I’ve learned a lot in this past year, and yet I feel like I’ve only scratched the surface. Thanks for reading!

Categories
Elixir English

tokyo.ex #1

I attended my first Elixir-related meetup yesterday, tokyo.ex #1.

(If the slides don’t work here, I have also uploaded them to YouTube.)

In my 5-minute lightning talk, I talked about the basics of using Exfile. Exfile is a file upload persistence and processing library for Elixir with integrations for Ecto and Phoenix.