Using Amazon S3 as a CDN?

May 29, 2008

At Lookery, our JavaScript analytics tracker is now pushing more than 250 GB of bandwidth per month. The JavaScript file has grown a bit but is still only about 6 KB.
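To put those two numbers in perspective, here is a back-of-the-envelope sketch of how many requests they imply. It assumes binary units, a 30-day month, and no HTTP header or compression effects, none of which are stated in the post, so treat the result as an order-of-magnitude estimate only.

```python
# Rough request-volume estimate from the two figures quoted above.
MONTHLY_BANDWIDTH_GB = 250  # "more than 250GB" per month (from the post)
FILE_SIZE_KB = 6            # "about 6kb" (from the post)

# Assumption: binary units (1 GB = 1024 * 1024 KB), no header overhead.
KB_PER_GB = 1024 * 1024
# Assumption: a 30-day month.
SECONDS_PER_MONTH = 30 * 24 * 3600


def estimate_requests(bandwidth_gb: float, file_size_kb: float) -> float:
    """Number of file downloads needed to add up to the given bandwidth."""
    return bandwidth_gb * KB_PER_GB / file_size_kb


requests_per_month = estimate_requests(MONTHLY_BANDWIDTH_GB, FILE_SIZE_KB)
requests_per_second = requests_per_month / SECONDS_PER_MONTH

print(f"{requests_per_month:,.0f} requests/month")
print(f"{requests_per_second:.1f} requests/second on average")
```

Under these assumptions that works out to roughly 44 million requests a month, or about 17 per second on average, which is why per-request latency matters more here than raw throughput.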

We’ve been serving that file from an Nginx web server running on Amazon EC2 instances. Amazon currently has three EC2 data centers, but strangely they are all located on the East Coast of the US (Virginia). Since a lot of our users are international, we needed to move that file to a CDN to reduce latency.

Less than a month ago we moved that single file to CacheFly, a low-priced CDN. I thought CacheFly would be an interim solution, but based on our performance testing they look like a good longer-term option.

For this round of testing I tested the following serving options:

  1. CacheFly
  2. EdgeCast
  3. Amazon S3
  4. Nginx running on an Amazon EC2 Instance (DIY option)

For the next round of tests I’ll add Panther Express to the mix.

Performance testing Amazon S3 as a CDN is a bit unfair, since it is a storage service rather than a CDN, but so many people are using it as a cheap solution that I thought I would test it myself. The performance tests show that you’re much better off serving your static content from a lightweight server like Nginx or using an inexpensive option like CacheFly.

For the performance tests I used Pingdom, a third-party performance monitoring service that we’ve been quite happy with.
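For readers who want a rough local approximation of this kind of test, the sketch below times repeated downloads of a URL and reports the minimum, median, and maximum. It is not how Pingdom works and not the setup used for the results in this post: it measures from a single machine, and the function names (`fetch_url`, `time_fetches`, `summarize`, `compare_hosts`) are hypothetical names made up for illustration.

```python
import statistics
import time
import urllib.request
from typing import Callable, Dict, List


def fetch_url(url: str) -> None:
    """Download the resource completely and discard the body."""
    with urllib.request.urlopen(url, timeout=10) as response:
        response.read()


def time_fetches(fetch: Callable[[], None], runs: int = 5) -> List[float]:
    """Call fetch() `runs` times; return each wall-clock time in milliseconds."""
    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fetch()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return timings_ms


def summarize(timings_ms: List[float]) -> Dict[str, float]:
    """Reduce a list of timings to min / median / max."""
    return {
        "min": min(timings_ms),
        "median": statistics.median(timings_ms),
        "max": max(timings_ms),
    }


def compare_hosts(urls: Dict[str, str], runs: int = 5) -> Dict[str, Dict[str, float]]:
    """Time the same file on several hosts (the URLs are supplied by the caller)."""
    return {
        name: summarize(time_fetches(lambda: fetch_url(url), runs))
        for name, url in urls.items()
    }
```

Calling `compare_hosts` with a dictionary that maps a label to the URL of the same file on each host returns one summary per host. Because DNS lookup and connection setup are included in every timing, and everything runs from one location, the numbers are only useful for relative comparison from that location.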

The monitoring servers were geographically distributed as follows:

[Figure: geographic distribution of the monitoring servers]

Summary

[Figure: summary of the CDN test results]

CacheFly performed the best, but only slightly better than EdgeCast. S3 was the worst, with the Nginx/DIY option coming in just over 100 ms faster than S3.

Details

Below are the details for a single day. I ran these tests for over two weeks, and the results were identical to this single day’s.

[Figure: CacheFly CDN]

[Figure: Nginx on an Amazon EC2 Instance (DIY)]

[Figure: EdgeCast CDN]

[Figure: Amazon S3 used as a CDN]

Notes

  • I also tested a second DIY option, running a Varnish cache on an Amazon EC2 instance, but for static content Nginx performed much better, so I omitted the results.
  • EdgeCast has an option that allows frequently used content to be served directly from RAM. My trial account did not allow me to test this option. It would allow for even better performance, possibly matching or beating CacheFly’s.

So far we’re sticking with CacheFly and testing a few other options. I’ll post the Panther Express performance after our tests are complete.

Let me know if you’ve found similar results or if I should be testing any other solutions.

{ 15 comments… read them below or add one }

Tim May 30, 2008 at 3:09 am

What about NGINX using the memcache plugin (serving content from RAM)?

http://wiki.codemongers.com/NginxHttpMemcachedM

David Cancel May 30, 2008 at 3:36 am

Tim,

Good point. I didn't try the Nginx Memcached module for this test, but I have used it before. I think it would give only a slight bump, since memcached still requires a socket connection. A better bump might be had by caching the file in RAM, similar to the way the 1×1.gif pixel module does, http://wiki.codemongers.com/NginxHttpEmptyGifMo….

What I found in looking at our other performance tests was that most of the latency we saw in the EC2 tests was due to network connectivity. I don't think any solution would bring an EC2 instance into the sub-200 ms range. But I'm sure we can bring it down into the low 300 ms or even high 200 ms range, which would be fine for most applications.

Thanks for the comment.

Jon Alexander May 30, 2008 at 10:38 am

You should try using Velocix.

We've just announced a free service, http://www.velocix.com/accelerator/, that should do exactly what you want. Let me know if you want to get access to the service (we're still in closed beta).

David Cancel May 30, 2008 at 10:42 am

Jon, would love access to the Velocix beta. My email is dcancel ~at~ dcancel dot com.

Thanks,
David

teh June 10, 2008 at 2:52 pm

Price is also a big deal breaker/maker. If you have to serve hundreds or even thousands of gigabytes, which of the above-mentioned options do you think will bring peace and balance?

teh June 10, 2008 at 7:07 pm

Correction: if you have to store and eventually serve thousands of gigabytes, maybe S3 will be the cheapest in that scenario.

Walter July 13, 2008 at 5:47 am

Nginx's creator has said that serving static files from disk is usually significantly faster than serving data from memcached. That's because the files aren't really served from disk, they're served directly from Linux's speedy file cache, at least if you're serving a small enough set of content to fit in Linux's cache. So if you only have, say, <100 MB of static content that you'll be serving up on a regular basis, files will beat memcached.

Stu Thompson August 23, 2008 at 9:09 pm

Interesting read, thanks for the analysis.

I've just about concluded a two month beta period using a) EC2 to host my streaming media/podcasting application, and b) S3 as a CDN for 80% of my static traffic. While I love the prices and flexibility, EC2 instance bandwidth caps and, more importantly, the too frequent per connection bandwidth performance limits and problems are discouraging. We are based in Switzerland, which probably has a lot to do with the performance issues, but not even the EU-based buckets provided a manageable improvement.

So…AWS is only suitable for my price sensitive customers.

Time to try someone else. I think I'll have a look at CacheFly.

Thanks again for the post!

Stu

PS: My own efforts have kindled the blogging itch. Maybe I'll be able to post my own metrics in the next week or so.

quelish September 10, 2008 at 7:20 am

What were you using as an origin server for these tests? That's not clear from your post.

David Cancel September 10, 2008 at 11:12 am

The origin server was running on Amazon's EC2 service.

tekgems May 6, 2009 at 8:32 pm

I would add Wuala as a potential CDN. I haven't measured ping response times, but it is free. You upload your file via Wuala and can hotlink to it via HTTP or HTTPS. The service is an inexpensive cloud CDN.

maxkir March 6, 2010 at 8:45 pm

Hello David,

I've just started looking into CDN solutions and your post is really useful. Thanks a lot for the analysis and for sharing it with us.

Regards,
KIR

Joe July 20, 2010 at 10:40 pm

Hi David,

I'm not sure if you've considered building a CDN from scratch. We've built one for a client using Nginx, Varnish, Bind, and GeoIP. The cost of such an undertaking is low considering the availability of affordable VM nodes. Here's the link in question:

http://blog.unixy.net/2010/07/how-to-build-your-own-cdn-using-bind-geoip-nginx-and-varnish/

Regards
Joe

costicanu May 24, 2011 at 6:11 pm

Hi, if I have a few sites without many users, what is the cheapest CDN company to use? I heard that with Amazon S3 you pay for what you transfer, something like $1/month… Any ideas?

David Cancel May 31, 2011 at 6:00 pm

I use Amazon CloudFront now and usually get a bill for less than $0.50 each month.

David
