Hosting a Single-Page App on S3, with proper URLs

Note (2019/07/05): I’ve posted a follow-up to this post about the limitations of the technique used here, especially when hosting an API on the same domain.

Amazon S3 is a great place to store static files. You might even want to serve a single-page application (SPA) written in JavaScript from there.

When you’re writing a single-page app, there are a couple of ways to handle URLs:

A) http://example.com/#!/path/of/resource
B) http://example.com/path/of/resource

A is easy to serve from S3. The browser never sends the part after the # to the server, so the server only sees http://example.com/ and serves the same index.html to everyone.

B, however, is a little tricky. Single-page apps usually use pushState or replaceState to change the current URL without reloading, but once you reload (or give the URL to someone else) — BAM! You’ll get presented with a 404 Not Found error.

So why don’t we just use A? There are quite a few advantages to using B, beyond just being more elegant than putting that pesky #! in there. In my opinion, the biggest advantage of using B is that you’ll be able to make backend changes in the future without having to redirect URLs. For example, as your app gets bigger, you may want to render some (or all) components server-side (see Isomorphic or Universal JavaScript).

To implement the B strategy, we need to serve the same index.html file to any URL requested by the client. As I mentioned earlier, we can’t do this with S3 itself, so we’ll enlist the help of CloudFront.
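
Because every URL now receives the same index.html, the app itself has to decide what to show for a given path. The following is only a minimal sketch of that idea: the route table, the view names, and the `resolveRoute` helper are hypothetical and do not come from any particular router library.

```javascript
// Hypothetical route table: pattern -> view name (illustrative only).
const routes = [
  { pattern: /^\/$/, view: "home" },
  { pattern: /^\/path\/of\/([^/]+)$/, view: "resource" },
];

// Map a pathname (e.g. window.location.pathname) to a view.
// Unknown paths fall through to a client-side "not-found" view,
// because the server answers every path with index.html.
function resolveRoute(pathname) {
  for (const { pattern, view } of routes) {
    const match = pattern.exec(pathname);
    if (match) {
      return { view, params: match.slice(1) };
    }
  }
  return { view: "not-found", params: [] };
}

// In a browser you would call this on load and after pushState/popstate:
//   const { view, params } = resolveRoute(window.location.pathname);
```

Keeping the lookup a pure function of the pathname means the same code handles a fresh page load and an in-app navigation.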

First, create a CloudFront distribution for the S3 bucket. Since CloudFront caches items for quite a long time, you might want to either set Cache-Control headers on your S3 files, or set the default TTL to something short, like a few seconds, in the CloudFront distribution settings. Once everything is set up (and you can access index.html by itself), click the “Error Pages” tab.
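
One way to approach the Cache-Control option is to pick the header value per file when uploading. This is a sketch under assumptions: the `cacheControlFor` helper, the hashed file name pattern, and the specific max-age values are illustrative and not part of the original setup.

```javascript
// Assumption: the build output uses content-hashed file names such as
// "app.3f9a1c2b.js"; adjust the pattern to match your own build tool.
const HASHED_ASSET = /\.[0-9a-f]{8,}\.(js|css|png|jpg|svg|woff2?)$/;

// Pick a Cache-Control value for an object key before uploading it to S3.
function cacheControlFor(key) {
  if (HASHED_ASSET.test(key)) {
    // The file name changes whenever the content changes, so cache for a year.
    return "public, max-age=31536000, immutable";
  }
  // index.html and other unhashed files should expire after a few seconds.
  return "public, max-age=10";
}

console.log(cacheControlFor("index.html"));
console.log(cacheControlFor("assets/app.3f9a1c2b.js"));
```

The returned string would be set as the Cache-Control metadata of each object at upload time, so CloudFront and browsers pick up a new index.html quickly while hashed assets stay cached.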

Click the big blue button, “Create Custom Error Response”.

I think you can tell what I’m up to now. Enabling “Customize Error Response” allows you to change a 404 from the backend (in this case, S3) into a 200! Note that S3 will return a 403 response instead of a 404 if you use the “S3 Origin” option rather than the S3 static website hosting endpoint as the origin. If you’re getting a 403 error from S3, customize the 403 error as well.
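
If you would rather script this than click through the console, the same settings correspond to the CustomErrorResponses section of a distribution config. The sketch below only builds that fragment as a plain object. The `spaErrorResponses` helper and the TTL of 0 are assumptions for illustration, and how you submit the config (CLI, SDK, or a template) is left out.

```javascript
// Build the CustomErrorResponses fragment of a CloudFront distribution config.
// Field names follow the CloudFront API; the caching TTL is an assumed value.
function spaErrorResponses(errorCodes = [404, 403]) {
  const items = errorCodes.map((code) => ({
    ErrorCode: code,                 // status returned by the origin (S3)
    ResponsePagePath: "/index.html", // page served in its place
    ResponseCode: "200",             // status sent to the viewer (a string in the API)
    ErrorCachingMinTTL: 0,           // assumption: do not cache the error mapping
  }));
  return { Quantity: items.length, Items: items };
}

console.log(JSON.stringify(spaErrorResponses(), null, 2));
```

Pass `[404]` alone if your origin is the S3 website endpoint and you do not need the 403 mapping.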

You can try out this setup below:

https://d3qxx6yxxvp94v.cloudfront.net/
https://d3qxx6yxxvp94v.cloudfront.net/test
https://d3qxx6yxxvp94v.cloudfront.net/l87v3

These all serve the same index.html. If you inspect the response headers, the first link should return X-Cache: RefreshHit from cloudfront or Miss from cloudfront. The other two requests, however, return X-Cache: Error from cloudfront, because S3 had no object at that path. The status returned is still 200, just as we wanted.
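
To check this without opening the browser's developer tools, something like the following could work. It is a rough sketch: `servedViaErrorPage` and `inspect` are hypothetical helper names, and `inspect` assumes Node 18 or later for the global `fetch`.

```javascript
// Decide whether a response came from the custom error response:
// the status is 200, but X-Cache reports an error from the origin.
function servedViaErrorPage(status, xCache) {
  return status === 200 && /^Error from cloudfront$/i.test((xCache || "").trim());
}

// Fetch a URL and report its status and X-Cache header.
// Requires Node 18+ (global fetch); it is defined here but not called.
async function inspect(url) {
  const res = await fetch(url);
  const xCache = res.headers.get("x-cache");
  return {
    status: res.status,
    xCache,
    viaErrorPage: servedViaErrorPage(res.status, xCache),
  };
}

// Usage (run manually): inspect("https://<your-distribution>/test").then(console.log);
```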

Any questions? Contact me or leave a comment in the box below.

20 replies on “Hosting a Single-Page App on S3, with proper URLs”

Hi.

In my case, I just had to fill both document fields: “Index document” as index.html and “Error document” as index.html in the “Static website hosting” area.

Cheers,

That gets you most of the way there. If you inspect the requests, you’ll find that when the error document is being hit, you don’t get a 200 status code — which means that search engine crawlers will assume that the link is broken. If that’s okay for you, it isn’t a problem. 🙂

It looks like the cloudfront workaround is not needed anymore as you can set this now directly in S3 (Properties/Static website hosting/Error document).

S3’s error document was available when writing this post — the problem with using S3’s error document function is that the HTTP code returned is not 200 — which is not good for search engines and crawlers.

How to ignore backend API endpoints? For example /user/123 responds with 404. In this case 404 should not be served as 200 and index.html.

Unfortunately, this is not possible with this setup. If you want to use the same domain for the API and frontend, you’ll have to set up 2 CloudFront distributions. The first has an origin for the API and an origin for the web endpoints. You’ll probably want to prefix your API with something like /api so you can forward those requests to the API. In the second, you can use the trick in this blog entry.

Thanks @Keita.
Does this mean though that now any VALID 404 for say an asset is going to come back to the browser as “200 OK”?
Or similarly, any valid 404 route would come back as 200 OK ?

If you go the route that @Maksim suggests then ALL your traffic gets at LEAST one redirect. Two if you count the naked -> www redirect that you’re likely to also want to do.

Seems like there is no silver bullet. If you end up going SPA you end up losing some HTTP status clarity.

I’ve thought about what it might look like to add an API Gateway and Lambda layer that could push all requests thru and let the Lambda handle the simple naked->www and single page redirects. But not sure what that would look like.

Correct. 404 status is hard in the SPA world — how could you differentiate an invalid route and a valid route, but a nonexistent object ID in the URL, without server rendering? This guide was meant to be a quick-and-dirty hack — if it’s not enough, doing something with API Gateway and Lambda may be interesting, but I feel like that’s not ideal either. Google’s Firebase website hosting is a lot more powerful these days, and I probably would recommend that for most cases.

I have a problem when the path is multipart, for example
mydomain.com/bar/foo. My index.html has relative paths to javascript etc, and when the above type path is entered, it cannot find them as it tries to find them in mydomain.com/bar/ as opposed to mydomain.com. Is there any way to fix this kind of behavior?

Unfortunately, it sounds like you can’t fix it with this. Consider switching to absolute paths or you’ll have to use different index.html files for each directory.
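
The difference between the two kinds of path can be seen by resolving them the way a browser would. This is just an illustration: `js/app.js` is a made-up asset name, and the scheme is added so the URL parser accepts the address from the question.

```javascript
// The page URL from the question (scheme added for the URL parser).
const page = "http://mydomain.com/bar/foo";

// A relative reference is resolved against the page's "directory" (/bar/).
const relative = new URL("js/app.js", page).href;

// A root-relative reference always resolves from the domain root.
const absolute = new URL("/js/app.js", page).href;

console.log(relative); // http://mydomain.com/bar/js/app.js
console.log(absolute); // http://mydomain.com/js/app.js
```

So writing the references in index.html with a leading slash keeps them pointing at the same files no matter how deep the requested path is.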

Try these redirection rules:

<RoutingRules>
  <RoutingRule>
    <Condition>
      <HttpErrorCodeReturnedEquals>404</HttpErrorCodeReturnedEquals>
    </Condition>
    <Redirect>
      <HostName>example.com</HostName>
      <ReplaceKeyPrefixWith>#!/</ReplaceKeyPrefixWith>
    </Redirect>
  </RoutingRule>
</RoutingRules>

and in your app, you need to remove the “#!” from the location,

something like this:

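// Run this once on page load, before the router starts.
if (location.hash.startsWith('#!')) {
  // #!/signup/confirm -> /signup/confirm
  // Uses the browser History API; with a router's history object, call its replace method instead.
  window.history.replaceState(null, '', location.hash.substring(2))
}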

Great post, thanks.

I’ve tried to implement this but I must have not set some permission on the S3 bucket or misconfigured CloudFront. I’m able to retrieve my index.html by going to the CloudFront URL, but when I add a path I get an AccessDenied error.

Been unsuccessful trying to figure this one out. Any idea what might be going wrong?

You will need to set up a public bucket policy.

In the target bucket click the ‘Permissions’ dropdown, click ‘Add bucket policy’.

Enter a policy document like:

{
    "Version": "2008-10-17",
    "Statement": [
        {
            "Sid": "AllowPublicRead",
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": [
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::{YOUR BUCKET NAME}/*"
            ]
        }
    ]
}

Make sure to replace {YOUR BUCKET NAME} with the name of your bucket and save the policy. Should be working!

Also try this if that doesn’t work for you:

{
    "Version": "2008-10-17",
    "Statement": [
        {
            "Sid": "AllowPublicRead",
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": [
                "s3:GetObject"
            ],
            "Resource": "arn:aws:s3:::{YOUR BUCKET NAME}/*"
        },
        {
            "Sid": "AllowPublicList",
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::{YOUR BUCKET NAME}"
            ]
        }
    ]
}

Thanks Prachetas 🙂 (I fixed the formatting of your comments for clarity — it’s just Markdown so you can use ``` code blocks)

Thanks for the help and the examples, they are much appreciated. I did also manage to solve it by putting a redirect on 403 errors to serve the index as well. This might be better if you don’t want someone to access the bucket directly? I think I read in the AWS documentation somewhere you might not want that if you are using signed cookies.

Thanks again.

I was having the same problem even though my bucket has a public bucket policy. Setting up a 403 redirect, in addition to the 404 redirect described in the post, was the only way I could solve it.

Thanks for this, spent a bunch of time trying to set up a behavior to do exactly this but setting up the error was all that I needed.
