Categories: AWS, English

Hosting a Single-Page App on S3, with proper URLs

Note (2019/07/05): I’ve posted a follow-up about the limitations of the technique used here, especially when hosting an API on the same domain.

Amazon S3 is a great place to store static files. You might want to even serve a single-page application (SPA) written in JavaScript there.

When you’re writing a single-page app, there are a couple ways to handle URLs:

A) http://example.com/#!/path/of/resource
B) http://example.com/path/of/resource

A is easy to serve from S3. The part after the # is never sent to the server, so the server only sees http://example.com/ and serves the same index.html to everyone.

B, however, is a little tricky. Single-page apps usually use pushState or replaceState to change the current URL without reloading, but once you reload (or give the URL to someone else) — BAM! S3 has no object at that path, so you’re greeted with a 404 Not Found error.

So why don’t we just use A? There are quite a few advantages to using B beyond it simply being more elegant than putting that pesky #! in there. In my opinion, the biggest advantage of B is that you’ll be able to make backend changes in the future without having to redirect URLs. For example, as your app gets bigger, you may want to render some (or all) components server-side (see isomorphic or universal JavaScript).

To implement the B strategy, we need to serve the same index.html file to any URL requested by the client. As I mentioned earlier, we can’t do this with S3 itself, so we’ll enlist the help of CloudFront.

First, create a CloudFront distribution for the S3 bucket. Since CloudFront caches items for quite a long time, you might want to either set Cache-Control headers on your S3 files, or set the default TTL to something short, like a few seconds, in the CloudFront distribution settings. Once everything is set up (and you can access index.html by itself), click the “Error Pages” tab.
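
If you go the Cache-Control route, a minimal sketch with the AWS SDK for Ruby looks like this (the bucket name, file path, and region are placeholders, not values from this post):

require "aws-sdk-s3"

s3 = Aws::S3::Client.new(region: "us-east-1")

# Upload index.html with a short max-age so CloudFront re-checks it frequently.
s3.put_object(
  bucket: "my-spa-bucket",
  key: "index.html",
  body: File.read("build/index.html"),
  content_type: "text/html",
  cache_control: "max-age=60"
)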

[Screenshot: the CloudFront distribution settings, showing the “Error Pages” tab]

Click the big blue button, “Create Custom Error Response”:

[Screenshot: the “Create Custom Error Response” button]

I think you can tell what I’m up to now. Creating a custom error response allows you to turn a 404 from the backend (in this case, S3) into a 200 that serves /index.html. Note that S3 will return a 403 instead of a 404 if you use the “S3 Origin” option rather than the S3 static website hosting origin — if you’re getting a 403 from S3, customize the 403 error as well.

[Screenshot: the custom error response settings, mapping the 404 error code to the response page /index.html with a 200 response code]
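
The same change can be made from code. Here’s a rough sketch with the AWS SDK for Ruby (the distribution ID is a placeholder; the console steps above are the simpler route):

require "aws-sdk-cloudfront"

cf = Aws::CloudFront::Client.new(region: "us-east-1")

# Fetch the current distribution config (the ETag is needed to update it).
resp   = cf.get_distribution_config(id: "EDFDVBD6EXAMPLE")
config = resp.distribution_config.to_h

# Map 404s (and 403s, for the non-website S3 origin) to a 200 serving /index.html.
config[:custom_error_responses] = {
  quantity: 2,
  items: [404, 403].map { |code|
    {
      error_code: code,
      response_page_path: "/index.html",
      response_code: "200",
      error_caching_min_ttl: 0
    }
  }
}

cf.update_distribution(id: "EDFDVBD6EXAMPLE", if_match: resp.etag, distribution_config: config)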

You can try out this setup below:

https://d3qxx6yxxvp94v.cloudfront.net/
https://d3qxx6yxxvp94v.cloudfront.net/test
https://d3qxx6yxxvp94v.cloudfront.net/l87v3

These all serve the same index.html. If you inspect the headers, the first link should return X-Cache: RefreshHit from cloudfront or Miss from cloudfront, while the other two return X-Cache: Error from cloudfront. The status, however, is 200 — just as we wanted.
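
If you’d rather check those headers from code than from the browser’s developer tools, here’s a quick sketch using Ruby’s standard library:

require "net/http"
require "uri"

# Print the HTTP status and the X-Cache header for each test URL.
%w[/ /test /l87v3].each do |path|
  res = Net::HTTP.get_response(URI("https://d3qxx6yxxvp94v.cloudfront.net#{path}"))
  puts "#{path}: #{res.code} (X-Cache: #{res["x-cache"]})"
end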

Any questions? Contact me or leave a comment in the box below.

Categories: English, Uncategorized

Podcasts I’m Listening To (November 2015 Edition)

My wife Naoko wrote a reply to this post. It was fun comparing how different the podcasts we listen to are. 🙂

First, I’d like to plug techsTalking(5417), a podcast I’m a semi-regular guest on, where technology people just talk about whatever is on our minds.

Here are some other podcasts that I’m currently subscribed to:

  • The Incomparable — a podcast about anything geeky. Star Wars? Check. Star Trek? Check. Silly drafts? Check. Crazy movies? Check.
  • The Incomparable Game Show — born from The Incomparable proper, regular panelists play crazy games for your entertainment. On the podcast.
  • Incomparable Radio Theater — The Incomparable podcast, once upon a time, liked to do funny things on April Fools’. Like, say: release a full-length episode in the format of an old-time radio drama, including equally funny sponsors (some fake, some real). Now, they’ve spun it off into a separate podcast.
  • Random Trek — Incomparable regular Scott McNulty hosts a podcast with non-random guests talking about random episodes of Star Trek.
  • Robot or Not? — Is it a robot? Or not?
  • Astronomy Cast — A weekly “facts-based journey through the cosmos”.
  • Reconcilable Differences — Two of my favorite podcasters, John Siracusa and Merlin Mann, get together on one podcast.

A few other podcasts I listen to occasionally:

And assorted programming-specific podcasts.

Categories: English, Open Source Projects

Runroller UI

I recently released a simple API to un-shorten URLs. A few people wanted a super-simple interface to this, so I whipped one up: https://keita.blog/unroll/. Enjoy!

Some notes about the tools I’ve used:

  • React — I’ve used React in portions of sites before, but this is the first, albeit simple, full-page React app I’ve made.
  • Brunch — used by default in Phoenix apps, it’s just what I’m used to these days.

Just like the service that runs the API, the UI is also open-source. Hack away!

Categories: Elixir, English, Open Source Projects

Link Unroller Service

As a small side project, I recently launched a “link unroller” service. This is a very simple service. You give it a URI, and it follows any redirect chain for you. Then it spits out the final URI via a friendly JSON API.

Give it a spin:

https://unroll.kbys.me/unroll?uri=http://bit.ly/1QZ6acT

Basically, all you do is send a GET request to:

https://unroll.kbys.me/unroll?uri=<URI to unroll>

Done. If there are no problems, you will get a JSON response:

{
  "uri":"http://bit.ly/1QZ6acT",
  "unrolled_uri":"https://keita.blog/",
  "redirect_path":[
    "http://bit.ly/1QZ6acT",
    "http://keita.blog/"
  ],
  "error":false
}

The unrolled_uri field is the final link in the chain, and redirect_path is an array of the links that were traversed.
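
As an example of consuming the API, here’s a minimal Ruby client using only the standard library (the unroll helper is just for illustration; the comments reflect the response shown above):

require "net/http"
require "json"
require "uri"

# Ask the service to follow the redirect chain for a given URI.
def unroll(uri)
  endpoint = URI("https://unroll.kbys.me/unroll")
  endpoint.query = URI.encode_www_form(uri: uri)
  JSON.parse(Net::HTTP.get(endpoint))
end

result = unroll("http://bit.ly/1QZ6acT")
puts result["unrolled_uri"]   # the final link in the chain
puts result["redirect_path"]  # every link that was traversed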

If you’d like to take a look at the code, make some contributions, or submit some bugs, please head over to the GitHub page.

Technical details:

  • The server is in Tokyo.
  • Written in Elixir.
  • Backend responses are ~ 600 microseconds on a cache hit.

Policy details:

  • Up to 7 redirects will be followed.
  • The request will time out after 20 seconds and return an error.
  • 301 redirects are cached forever, regardless of Cache-Control or Expires headers present in the response.
  • 302 redirects will honor caching headers, with a minimum TTL of 1 minute (this is for DoS protection on my side).
  • 200 responses are cached for 1 hour.

Categories: English

bundler gotcha

So, this is a thing:

bundle install --without development:test
...
...
Bundle complete! XX Gemfile dependencies, XX gems now installed.
Gems in the groups development and test were not installed.

Now,

bundle install
...
...
Bundle complete! XX Gemfile dependencies, XX gems now installed.
Gems in the groups development and test were not installed.

Basically — you run bundle install --without <group> once, and that choice is saved in .bundle/config. So the next time you run bundle install without any arguments, it still won’t install gems in the groups you previously excluded. (You can clear the setting with bundle config --delete without, or by editing .bundle/config directly.)

It looks like it is fixed in Bundler 2.0, though.

Categories: AWS, English

Heroku + SSL = Expensive?

Note: This blog post covers the legacy SSL Endpoint. Heroku now recommends the use of Heroku SSL, which can provide you with a free certificate and HTTPS (provided you are using the Hobby tier or higher).


If you use Heroku, you probably know a couple things:

  1. You can’t use an apex domain for your site (unless you use a DNS service that emulates ALIAS / ANAME records).
  2. Using your own SSL certificate costs $20/month.

I’m going to solve both of these problems with one stone: AWS CloudFront.

AWS CloudFront is a content delivery network — think of it as a proxy, distributed around the globe. It’s usually used for static content that can be cached for long periods of time, but we’re going to use it for dynamic (or semi-dynamic) content today.

Point CloudFront at your *.herokuapp.com domain, then assign CloudFront the domain of your choice (in the “Alternate Domain Names (CNAMEs)” field). You can then upload your SSL certificate. CloudFront supports Server Name Indication — SNI for short. Remember the days when each server serving a separate SSL certificate had to be on a different IP address? No more — SNI lets the same server on the same IP serve multiple SSL certificates at the same time.

Now, add the “Host” header to “Whitelist Headers”, and you’re ready to go.

Wait. How do you wire CloudFront, which uses dynamic IPs, up to the apex domain? Fortunately, AWS has you covered. Route 53, the DNS service that AWS provides, has an ALIAS feature for resources in your AWS account (CloudFront, Elastic Load Balancer, S3, etc).
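
If you manage DNS from code rather than the console, creating that ALIAS record looks roughly like this with the AWS SDK for Ruby (the hosted zone ID, domain, and distribution hostname are placeholders; Z2FDTNDATAQYW2 is the fixed zone ID CloudFront uses for alias targets):

require "aws-sdk-route53"

r53 = Aws::Route53::Client.new(region: "us-east-1")

# UPSERT an A record on the apex domain that aliases the CloudFront distribution.
r53.change_resource_record_sets(
  hosted_zone_id: "Z0EXAMPLE",
  change_batch: {
    changes: [{
      action: "UPSERT",
      resource_record_set: {
        name: "example.com.",
        type: "A",
        alias_target: {
          hosted_zone_id: "Z2FDTNDATAQYW2",
          dns_name: "d1234abcdexample.cloudfront.net.",
          evaluate_target_health: false
        }
      }
    }]
  }
)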

For low-traffic sites, this will almost always work out to be less than $20/month.

My preliminary experience with this setup has been quite good for a mostly static site, but performance suffered with user sessions. CloudFront excels at serving cached content — it’s not so fast on cache misses. YMMV.

Categories: Elixir, English

My Great Language Hunt — Elixir

Edit 2016/4/29 I have written a follow-up piece to this blog post.

As many of you probably know, I am a professional programmer. I started my professional career with WordPress and PHP development, and now I find myself doing a lot of Ruby work. I am still in the very early stages of my professional career — I have only been doing this for about 5 years. There are people who are much more experienced than I am, and there is a whole world of things that I have yet to learn and experience.

Until recently, I’ve been relatively reluctant to change. I just wanted to get something done, using the tools I’m most familiar with. In the first few years of my career, that was PHP. After learning and getting used to Ruby and Rails, that’s been my go-to language and framework.

However, I’ve decided to throw caution to the wind and choose a new language that I’m going to be using going forward: Elixir.

Every programming language has something that irks somebody. The thing that irks me about Ruby is its lack of a really robust concurrency model. There have been experiments: EventMachine (evented I/O, like Node), Rubinius and JRuby (real threads), and Unicorn (forking!). But none of these solutions are particularly elegant. Relying on forking for concurrency, even with copy-on-write, is not efficient at all. (Read more about concurrency and Ruby: Matz is not a threading guy)

Enter Elixir. Elixir is a language that runs on top of the Erlang VM. The Erlang environment is something special, I think. It’s built from the ground up with concurrency and reliability in mind (it was developed for use in telecom systems). Originally a proprietary language at Ericsson for a little over 10 years, it was open-sourced in 1998. Elixir builds on this very mature ecosystem, and makes it accessible enough for “mere mortals” to build high-performance applications.

When I first tried Elixir out to build a web app, like you would with Rails, I was pretty surprised. If you’re used to Rails, you’ve seen this log message:

[Screenshot: a Rails development log, showing a request completed in roughly 170ms]

This is the Elixir / Phoenix framework equivalent:

[Screenshot: the equivalent Phoenix development log, showing the request completed in roughly 5ms]

170ms down to 5ms. And that’s in development mode. Switch it to production mode, and this is what you see:

[Screenshot: the Phoenix production log, showing the request completed in roughly 300 microseconds]

5ms down to 300 microseconds.

Granted, this is a very simple page. It’s basically a “Hello World” benchmark, but it certainly piqued my interest. This amount of performance, coupled with the ability to maintain it even under immense load, is one of the factors going into my choice of Elixir.

The other factor is what I believe to be the direction of the Internet. We’re moving away from servers rendering static HTML and sending it over the wire, and toward more dynamic content powered by client-side JavaScript or native code on mobile devices. Elixir and Erlang handle this beautifully — dutifully keeping thousands and thousands of open WebSocket sessions alive without breaking a sweat. Take a look at how WhatsApp runs its chat backend on Erlang and services more than 2 million simultaneous connections on a single machine.

By investing my time now in Elixir and Erlang, I believe that I’m essentially getting ready for the next 20 years of the Internet. And I’m excited.

Categories: English, Uncategorized

Homebrew and PostgreSQL 9.4

Edit 2016/1/9 I have updated these instructions for upgrading from PostgreSQL 9.4 to 9.5.

As you may know, I am a big PostgreSQL user and fan. I also use Homebrew to manage 3rd party software packages on my Mac. PostgreSQL 9.4 was just released a couple days ago with some really cool features — a binary-format JSON datatype for speed and flexibility (indexes on JSON keys? Of course.), and some really good performance improvements. Read the release blog post and release notes for more information.

However, if you’ve used PostgreSQL before, you know that upgrading can be a little difficult. Here’s what you have to do to upgrade your Homebrew-installed PostgreSQL 9.3 to 9.4. Keep in mind, these steps are for a standard Homebrew installation — as long as you haven’t configured custom data directory paths, it should work.

  1. Turn PostgreSQL off first, just in case:
    $ launchctl unload ~/Library/LaunchAgents/homebrew.mxcl.postgresql.plist
    
  2. Update PostgreSQL itself:
    $ brew update && brew upgrade postgresql
    
  3. Make a new, pristine 9.4 database:
    $ initdb /usr/local/var/postgres9.4 -E utf8
    
  4. Migrate the data to the new 9.4 database:
    $ pg_upgrade \
      -d /usr/local/var/postgres \
      -D /usr/local/var/postgres9.4 \
      -b /usr/local/Cellar/postgresql/9.3.5_1/bin/ \
      -B /usr/local/Cellar/postgresql/9.4.0/bin/ \
      -v
    
  5. Move 9.4 data directory back to where PostgreSQL expects it to be:
    $ mv /usr/local/var/postgres /usr/local/var/postgres9.3
    $ mv /usr/local/var/postgres9.4 /usr/local/var/postgres
    
  6. Start PostgreSQL back up!
    $ launchctl load ~/Library/LaunchAgents/homebrew.mxcl.postgresql.plist
    

Note: If you’re using the pg gem for Rails, you should recompile:

$ gem uninstall pg
$ gem install pg

Categories: English, Optimization & Speed

Web App Development and Caching

Any web developer who works with external services or databases (that’s probably almost every web developer) has probably run into performance problems. The problem is that running code by itself is pretty fast. Databases and external services / APIs are very slow. Waiting on an external API to load is basically the computer equivalent of waiting for a brontosaurus to walk a kilometer.

As web developers, we have a very powerful tool called caching. It’s at work in the computer you’re reading this sentence on every microsecond, with various levels of caching happening between the CPU, memory, and hard drive (or SSD). The act of caching is saving the result of a slow operation in an easily accessible place. In this case, we will be talking about caching database results and API results.

There are only two hard things in computer science: cache invalidation and naming things.

-- Phil Karlton (Adapted)

Cache invalidation is a hard problem. Let me illustrate:

  1. Server A fetches Record A and associated records from the database, and caches it.
  2. Server B updates Record A.
  3. Server A continues serving its cached copy from step 1 (until it expires).

There are a few ways to solve this problem.

You can manually invalidate caches:

  1. Server A fetches Record A and associated records from the database, and caches it.
  2. Server B updates Record A, notifying all servers to remove cached copies of Record A.
  3. Server A’s cached copy has been purged, so its next request fetches a fresh copy of Record A.

This is tenable with one or two cache servers, but clearly not scalable — you’ll need to send cache purge requests to all your cache servers.

Then there’s my favorite — key-based cache invalidation:

  1. Server A fetches Record A, and looks up the cache key “Record A [timestamp when Record A was updated]”. It doesn’t exist, so it fetches associated records and stores everything in the cache.
  2. Server B updates Record A.
  3. Server A fetches Record A, and looks up the cache key “Record A [timestamp when Record A was updated]”. It exists, so it serves the cached copy.

This method has some drawbacks – it still requires one query to the canonical data store, and you need to remember to update the updated_at attribute of your record when any associated records change. If you’re using Rails, this is trivial:

class MyRecord < ActiveRecord::Base
  has_many :associated_records
end

class AssociatedRecord < ActiveRecord::Base
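  # touch: true bumps my_record’s updated_at (and therefore its cache key) whenever this record is saved or destroyed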
  belongs_to :my_record, touch: true
end
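
On the lookup side, the record’s updated_at becomes part of the cache key, so touching the record effectively invalidates the old entry. A rough sketch (the helper name and cached payload are made up for illustration):

# Rails expands record.cache_key to something like "my_records/1-20151124093000",
# which already embeds the updated_at timestamp.
def record_with_associations(record)
  Rails.cache.fetch([record.cache_key, "with_associations"]) do
    # Only runs on a cache miss; the result is stored under the key above.
    { record: record.as_json, associated: record.associated_records.as_json }
  end
end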

Another drawback is that your cache is going to fill up with stale keys as records are updated. Luckily, there are caches that already deal with this! LRU, or Least Recently Used, is a cache eviction policy that removes the least recently used records first, making room for new ones. Redis can be used as an LRU cache, and Memcached is sort of LRU. The Rails Memory Cache Store also uses an LRU algorithm.

“Caching sounds great! How do I use it?”

Caching is not something that you should “tack on” to an app. There are awesome tools, such as Varnish, that are built around this concept, but bolting caching on after the fact is not ideal. The ideal web application is designed from the ground up with caching in mind — even in the development environment. If you’re writing tests, make sure your test environment is connected to a cache, then test cache invalidation and lookup. Ideally, you should use the same cache you use in production in both the development and test environments.
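
In a Rails app, that can be as simple as pointing every environment at the same kind of store. A minimal sketch (the memcached host is a placeholder, and :mem_cache_store assumes the dalli gem):

# config/environments/development.rb, test.rb, and production.rb
config.cache_store = :mem_cache_store, "localhost:11211"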

Categories: App.net, English, Open Source Projects

App.net Object Sync

I recently released my first app on the Mac App Store, Toki, and I decided that talking about the inner workings of the sync mechanism I’m using would not only be interesting, but helpful for me to think about some of the problems I’m having. (This post is inspired by the “Vesper Sync Diary” series of blog posts by Brent Simmons.)

Toki uses an App.net sync mechanism that I’ve been thinking about and working on for a while — I thought that Toki would be a good low-profile app to test out some ideas in real-world use.

Although Toki is a production app, the sync portion is not feature-complete yet. I’m working on ironing out the basic bugs (planned for the 1.0.1 and 1.0.2 releases) before moving on to the more advanced features. My “end goal” for the App.net sync mechanism is an open-source, multi-platform library that anyone can use in their own app to keep their objects synchronized via the App.net backend. (Some work has already been done on a Ruby-based command-line client.) From now on (and until I think of a better name), I will call this sync mechanism “App.net Object Sync”, or “ADNOS”.

ADNOS uses the private message / channel feature of the App.net API, using a channel for the app namespace, and a message representing each object.

There are two fundamental steps of the sync protocol (as of version 1.0.2, still in development):

  1. Pulling remote changes: ADNOS pulls all new messages from the channel, processing them one by one and creating or updating local records as necessary. If a record is deemed a “duplicate”, it will overwrite local changes. (Conflict resolution is not applicable in the context of Toki, but it is high on the list of features I want to implement.) If the record was newly created in the local database, ADNOS also records the App.net Message ID in the local database.
  2. Pushing local changes: ADNOS makes a new message post for each record that does not have an App.net Message ID recorded in the local database. When the message is successfully created, the message ID is set (see the sketch below).
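
A very rough sketch of the push step, with made-up model and client methods (this is not the actual ADNOS code, just the shape of the idea):

# Push every record that has never been mirrored to the channel.
def push_local_changes(adn_client, channel_id, records)
  records.each do |record|
    next if record.adn_message_id                 # already synced
    message = adn_client.create_message(channel_id, record.to_sync_payload)
    record.update(adn_message_id: message["id"])  # remember the App.net Message ID
  end
end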

This has some limits, some inherent in the protocol, and some inherent in the App.net backend:

  1. Message creation is rate-limited to 20 messages per minute. (This is also why Toki works with such low-resolution data.)
  2. Each message is limited to 8192 bytes. Less, actually, because of serialization and metadata.
  3. Recreating the database requires a full playback of messages — this takes time, especially on databases with more than 200 objects.

These limits are pretty much unavoidable. I work around the rate limiting by queueing sync requests and pausing the queue when X-RateLimit-Remaining gets dangerously low; the queue is restarted after X-RateLimit-Reset elapses. Any app that plans on using ADNOS should be aware of these limitations before implementing it.
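
In pseudocode-ish Ruby (the queue and client helpers are made up; the real code lives inside Toki):

# Drain the sync queue, backing off when the App.net rate limit is nearly exhausted.
def drain_sync_queue(queue, adn_client)
  until queue.empty?
    response  = adn_client.post(queue.pop)
    remaining = response.headers["X-RateLimit-Remaining"].to_i
    if remaining <= 2
      # Dangerously low: pause until the rate-limit window resets.
      sleep response.headers["X-RateLimit-Reset"].to_i
    end
  end
end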

And, now for the features I’m planning on implementing (in no particular order):

  • Proper conflict resolution.
  • Streaming support for pulling new changes.
  • Updating and deleting previously synced items. This isn’t implemented yet, but should be relatively easy.

I’ll be writing more as I progress. I hope to get some sort of open-source library out there in the near future.