In my last post I compared using Amazon S3 as a CDN to other low cost alternatives. What was clear was that S3 performed badly when compared to the true CDNs. Not such a surprising outcome; S3 was not designed to be a CDN.
The real surprise to me was that CacheFly, the lowest cost CDN I tested, performed better than the much higher priced options.
My performance tests were pure HTTP GET based and used the monitoring service Pingdom. What performance monitoring services don’t measure is how fast a website feels to the end user as it loads.
A lot can be done to make a website “feel” faster. Most of these techniques are outlined in the book High Performance Web Sites by Steve Souders of Yahoo!. Most of the book’s content can be read for free on the Yahoo! Developer Network.
The simplest way to increase performance is to minimize requests to your server by setting proper Expires and Cache-Control headers so that your static content can be cached.
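As a rough illustration of what those two headers look like, here is a minimal Python sketch that builds them for a given lifetime. The function name `cache_headers`, the one-year default, and the `public` directive are my own assumptions for the example, not something any of the tested services require.

```python
import time
from email.utils import formatdate

ONE_YEAR = 365 * 24 * 60 * 60  # assumed "far future" lifetime, in seconds


def cache_headers(max_age=ONE_YEAR, now=None):
    """Build Expires and Cache-Control headers for a cacheable static file.

    `now` is a Unix timestamp; it defaults to the current time and is only
    a parameter so the output can be reproduced.
    """
    if now is None:
        now = time.time()
    return {
        # HTTP/1.1 caches read max-age (a lifetime in seconds).
        "Cache-Control": f"public, max-age={max_age}",
        # Older HTTP/1.0 caches read Expires (an absolute date in GMT).
        "Expires": formatdate(now + max_age, usegmt=True),
    }


if __name__ == "__main__":
    for name, value in cache_headers().items():
        print(f"{name}: {value}")
```

Sending both headers is a belt-and-braces choice: when a cache understands both, max-age takes precedence over Expires.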
When I tested the CDN options for cacheability, only Panther Express and my DIY Nginx web servers on EC2 returned proper “cacheable” headers.
S3 and CacheFly returned no “cache-friendly” headers. EdgeCast returned proper headers, but its server time was inaccurate, which could lead to caching issues.
I found no way to add Expires headers to files hosted on CacheFly. With S3, however, you can add custom headers to your files.
By setting the Expires headers on your static files far into the future, you can create a cheap CDN with S3 that can “feel” as fast as a traditional CDN. You’ll need to automate this and make it part of your deployment process, which ideally renames the deployed files to contain revision numbers so that you can safely set those Expires headers far in the future.
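A check along these lines can be scripted. The sketch below is not the test I ran; it only shows the two conditions described above (missing cache headers, and a Date header that disagrees with the real time). The 60-second tolerance and the function name `audit_cache_headers` are assumptions for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

MAX_SKEW_SECONDS = 60  # assumed tolerance for clock drift


def audit_cache_headers(headers, now=None):
    """Return a list of cacheability problems found in response headers.

    `headers` is a plain dict of header name -> value; `now` is a
    timezone-aware datetime and defaults to the current UTC time.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    h = {name.lower(): value for name, value in headers.items()}
    problems = []

    # A response is "cache-friendly" here if it has Expires or max-age.
    if "expires" not in h and "max-age" not in h.get("cache-control", "").lower():
        problems.append("no Expires or Cache-Control max-age")

    # Expires is an absolute date, so a wrong server clock can make
    # content look stale (or fresh) when it is not.
    if "date" in h:
        server_time = parsedate_to_datetime(h["date"])
        if server_time.tzinfo is None:
            server_time = server_time.replace(tzinfo=timezone.utc)
        skew = abs((server_time - now).total_seconds())
        if skew > MAX_SKEW_SECONDS:
            problems.append(f"server clock off by {skew:.0f}s")
    return problems
```

The dict of headers could come from any HTTP client; the function itself makes no network requests.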
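The renaming step can be sketched as follows. This is only one possible naming scheme: the `.r<revision>` marker and the function names `revisioned_key` and `build_manifest` are hypothetical, and the actual upload to S3 (with the far-future headers attached) is left out.

```python
import posixpath


def revisioned_key(key, revision):
    """Insert a revision marker before the file extension.

    Example: "css/site.css" with revision 42 becomes "css/site.r42.css".
    """
    stem, ext = posixpath.splitext(key)
    return f"{stem}.r{revision}{ext}"


def build_manifest(keys, revision):
    """Map each original key to the revisioned key it is deployed under.

    Templates can look names up in this mapping so that pages always
    reference the current revision of each static file.
    """
    return {key: revisioned_key(key, revision) for key in keys}


if __name__ == "__main__":
    manifest = build_manifest(["css/site.css", "js/app.js"], 42)
    for original, deployed in manifest.items():
        print(original, "->", deployed)
```

Because every deploy produces new file names, a browser holding an old cached copy never sees stale content: the page simply points at a URL it has not fetched before.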

{ 10 comments… read them below or add one }
That's good to know. I was potentially looking at CacheFly for interface images (based on your previous article), but no Expires headers will kill that deal for me.
As a side note, I wrote up an FTP-style method of setting far-future Expires headers in S3 last year: Setting Far Future Expires Headers For Images In Amazon S3
Rob, see above. I had linked to the post you highlighted in the original post.
Oh! Cool. I'll get the hang of this reading thing at some point
FYI, CacheFly *does* support setting max-age and Expires headers, however it's currently not a control panel option; you just need to drop a support ticket.
David,
Another issue with S3 you should be aware of is that you can't really host gzipped content on it. You can upload gzipped files, but S3 will not serve a non-gzipped version to browsers that do not support gzipped content properly (like IE 5.5, some mobile browsers, etc.).
I was trying to use S3 as an edge node for our in-house CDN, but that was the dealbreaker for me (we just have too much JS that has to be gzipped for most of the users, but not for all). The second dealbreaker was that I couldn't get any decent download speed out of it (max download speed was ~12Mbit/sec). And it was very unstable (which usually means that their pipes are very much saturated).
For me right now the cheapest and most reliable option remains to push the “hot” content to a network of dedicated servers, located at different ISPs. I pay ~$100 per such server, but I get a real 100Mbit pipe and a few TB of transfer (the whole thing is around 3-4 times cheaper than S3 and much faster).
Cheers,
Lenkov
SiteKreator.com
Lenkov,
Here's a quick blog post I wrote detailing how to serve gzipped content from S3 only to users whose browser supports it. http://clickontyler.com/blog/2008/05/using-amaz…
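For readers who just want the general idea, here is a minimal Python sketch of one way to pick a URL per request. It is not the code from the linked post: it assumes the decision is made from the request's Accept-Encoding header, and that a gzipped copy of each file has been uploaded under a hypothetical `gzip/` prefix. The names `accepts_gzip` and `asset_url` and the bucket URL are made up for the example.

```python
def accepts_gzip(accept_encoding):
    """Return True if an Accept-Encoding header value allows gzip.

    Handles lists like "gzip, deflate" and q-values like "gzip;q=0".
    """
    for part in accept_encoding.split(","):
        token, _, params = part.strip().partition(";")
        if token.strip().lower() != "gzip":
            continue
        params = params.strip().lower()
        quality = 1.0
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # q=0 means the client explicitly refuses this encoding.
        return quality > 0
    return False


def asset_url(base_url, key, accept_encoding):
    """Point gzip-capable clients at the pre-compressed copy of a file."""
    prefix = "gzip/" if accepts_gzip(accept_encoding or "") else ""
    return f"{base_url}/{prefix}{key}"


if __name__ == "__main__":
    base = "https://example-bucket.s3.amazonaws.com"  # placeholder bucket
    print(asset_url(base, "js/app.js", "gzip, deflate"))
    print(asset_url(base, "js/app.js", None))
```

The page that references the asset has to be generated dynamically for this to work, since the choice is made when the HTML is rendered, not when S3 serves the file.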
Tyler,
That's a very good approach (rewriting the URLs based on the user agent) to work around the S3 limitations. Unfortunately, it won't make S3 run any faster. I just can't believe that I can't get more than 10-12Mbit/sec of sustained download speed. I will wait a few more months before considering this seriously. Hopefully they will resolve these issues by then.
Regards,
Lenkov
SiteKreator.com
One other point to consider is that for text content, e.g., large-ish XML, HTML, etc., Amazon's S3 or EC2 might end up being faster than CacheFly, for the simple reason that with S3 or EC2 you can gzip your content and set the Content-Encoding header so that the browser will know it's getting gzip, but with CacheFly you can't do it.
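The compress-then-label step described here can be sketched in a few lines of Python. The function name `gzip_for_upload` is hypothetical, and the actual upload call is omitted because it depends on the client library in use; the point is only that the stored bytes are gzip data and the stored metadata says so.

```python
import gzip


def gzip_for_upload(body, content_type):
    """Compress a text asset and return it with the headers to store.

    Content-Type still describes the original document; Content-Encoding
    tells the browser to gunzip the bytes before using them.
    """
    compressed = gzip.compress(body)
    headers = {
        "Content-Type": content_type,
        "Content-Encoding": "gzip",
    }
    return compressed, headers


if __name__ == "__main__":
    xml = b"<items>" + b"<item>hello</item>" * 500 + b"</items>"
    data, headers = gzip_for_upload(xml, "application/xml")
    print(len(xml), "bytes ->", len(data), "bytes", headers)
```

Repetitive text such as XML or HTML usually shrinks a great deal, which is why this can outweigh a faster network for large text files.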
I've been asking CacheFly to add this “simple” feature (being able to upload gzipped content and have the Content-Encoding header set correctly when it gets downloaded) for years, but they're just uninterested I guess.
Interesting, I have also done some testing and found cachefly gave very good performance compared to S3 and slightly better than 2 other CDNs (averaged over about 10 locations).
I presume all the CDNs support expiry headers when they are correctly set at the source and you are talking about setting expiry information on files actually hosted by the CDN or adding headers that were missing on the source.
I would expect a caching solution like CacheFly to use the headers set at the source location, not to add its own headers or modify headers. Strictly speaking, that breaks the HTTP rules by ignoring the headers from the source server and adding its own.
S3 has support for setting headers, because it's not a cache – it is the actual source of the files.
Where the CDNs support hosting files as well as caching, the issue gets more blurred. One would want to set headers where the CDN is acting as the source, but not if it is just caching.
Thanks, very interesting and useful post.