Scaling Lessons Learned from About.me
By True Ventures, April 12, 2011
1. The Perfect Cloud
Before scaling out a system, the first axiom is that in most cases ‘the cloud’ won’t provide perfect insulation from real-world faults, and hardware issues will bring you back down to earth (I hope to eventually be proved wrong on this). Reducing single points of failure (more on this later), as well as having a proper backup policy in place, will help. If you don’t take a daily snapshot of your entire site and store it in a separate account under a different set of credentials, then stop reading this post and go implement it now.
2. Don’t Reinvent the Wheel
We use Amazon’s S3 for handling user-uploaded images, as it integrates nicely with Amazon’s CDN (CloudFront) to allow for speedier downloads and will scale as we scale. We use another CDN (Edgecast) for serving static files that are pulled from our origin server. Google Analytics provides site-wide analytics, and SendGrid is our outbound email provider. It was nice to partner with many True Ventures portfolio companies (we joked that we’d be the first company to use all of them), as they help fill various needs. For example, we use TypeKit for font serving and Loggly for log aggregation. Infectious artists provided the initial backgrounds that users could choose from if they didn’t have a picture ready, and KISSMetrics’ KISSInsights is used to occasionally poll our user base when they interact with the site.
3. Right Tool for the Right Job
This is really an extension of the above section but deserves its own mention. jQuery and other JavaScript libraries helped us get going quickly in terms of engineering a dynamic site on the client. For storing data, we use a combination of CouchDB (using BigCouch) and MySQL (running on Amazon’s RDS). The reason for these two data stores is that some structured data (dictionaries, lists of varying types) is best stored in a document store. Other things like tweets lend themselves to a SQL store. We use replication and Amazon’s EBS volumes to get better data durability. Tool selection for team communication and task prioritization is also important; to this end we use a combination of Google Apps, FogBugz, Kiln, Campfire and Strides, a new product from Socialcast in private beta.
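To illustrate the split between a document store and a SQL store described above, here is a minimal sketch. It is not our production code: a JSON string stands in for a CouchDB document, an in-memory SQLite database stands in for MySQL, and every field name, table name, and value is hypothetical.

```python
import json
import sqlite3

# Document-style record: nested, variably shaped data kept as one document.
# (Stand-in for a CouchDB document; all field names are hypothetical.)
profile_doc = {
    "_id": "user:example",
    "name": "Example User",
    "links": [{"service": "twitter", "handle": "example"}],
    "settings": {"background": "default", "font": "serif"},
}
serialized = json.dumps(profile_doc)

# Relational-style records: many uniform rows that are filtered and sorted.
# (SQLite stands in for MySQL; the schema is hypothetical.)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tweets ("
    "id INTEGER PRIMARY KEY, user_id TEXT NOT NULL, "
    "body TEXT NOT NULL, created_at TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO tweets (user_id, body, created_at) VALUES (?, ?, ?)",
    [
        ("user:example", "first tweet", "2011-04-01T10:00:00"),
        ("user:example", "second tweet", "2011-04-02T10:00:00"),
    ],
)

# Fetch the most recent tweet for one user.
latest = conn.execute(
    "SELECT body FROM tweets WHERE user_id = ? "
    "ORDER BY created_at DESC LIMIT 1",
    ("user:example",),
).fetchone()
```

The profile can grow new nested fields without a schema change, while the tweets table stays easy to query by user and time.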
4. Optimization
Get the low-hanging optimization fruit that enhances the user experience. There are a handful of optimizations that are relatively easy to take advantage of for delivering a speedier experience: compressing text-based content with gzip, setting long expiration times for static files, and using a CDN (see YSlow and Google Page Speed for more). On the backend, we have a Membase cluster for caching data; it is protocol-compatible with memcached, so we did not have to expend engineering resources in order to target it and obtain its benefits.
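The first two front-end optimizations above (gzip and long expiration times) can be sketched with the Python standard library alone. This is an illustration under assumptions, not our server configuration: the function name static_response and the one-year lifetime are hypothetical choices, and in practice the web server or CDN usually handles this for you.

```python
import gzip
import time
from email.utils import formatdate
from typing import Dict, Optional, Tuple

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # assumed lifetime for static files


def static_response(
    body: bytes, accept_encoding: str, now: Optional[float] = None
) -> Tuple[Dict[str, str], bytes]:
    """Build headers and body for a static, text-based file (hypothetical helper)."""
    if now is None:
        now = time.time()
    headers = {
        # Far-future expiry lets browsers and CDNs reuse the file.
        "Cache-Control": "public, max-age=%d" % ONE_YEAR_SECONDS,
        "Expires": formatdate(now + ONE_YEAR_SECONDS, usegmt=True),
        # Caches must keep compressed and uncompressed copies apart.
        "Vary": "Accept-Encoding",
    }
    # Compress only when the client says it accepts gzip.
    if "gzip" in accept_encoding.lower():
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    headers["Content-Length"] = str(len(body))
    return headers, body
```

With long expiration times, a changed file needs a new URL (for example, a version number in the file name), otherwise clients keep using the cached copy.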
5. Redundancy/Distribution
Reduce/eliminate single points of failure. If there is a ‘key to scale’, this is it. While it’s easy to over-engineer, it’s worth revisiting the site architecture when time and resources are available. Doing so lets you take advantage of new products and software features, and rethink how to handle the next wave of growth. Expecting to have all failure cases covered is a fool’s errand, but you can reduce downtime and be better prepared with proper monitoring (for us it’s a combination of Pingdom, Server Density, CloudKick, and Cloudwatch). Over time, targeted monitoring will help alert you to looming issues even though you won’t know about all failure cases. In general, this approach also lends itself to a ‘smart proxy’ model where each request is routed to a self-contained cluster that can stand on its own (first X users get routed to cluster A, the second X users get routed to cluster B, …). Implementing this is also where a lot of engineering resources will be spent as a site scales.
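The routing rule behind the ‘smart proxy’ model can be sketched in a few lines. This is a simplified illustration, not our actual proxy: it assumes each user has a zero-based signup index, and the names cluster_for_user, users_per_cluster, and the cluster labels are all hypothetical.

```python
from typing import Sequence


def cluster_for_user(
    user_index: int, users_per_cluster: int, clusters: Sequence[str]
) -> str:
    """Pick a cluster by signup order.

    The first `users_per_cluster` users map to clusters[0],
    the next `users_per_cluster` users to clusters[1], and so on.
    """
    if user_index < 0:
        raise ValueError("user_index must be non-negative")
    if users_per_cluster <= 0:
        raise ValueError("users_per_cluster must be positive")
    slot = user_index // users_per_cluster
    if slot >= len(clusters):
        # Growth has outrun capacity: a new cluster must be provisioned.
        raise LookupError("no cluster provisioned for user %d" % user_index)
    return clusters[slot]


# Example: with X = 1000, users 0..999 go to "A" and users 1000..1999 to "B".
clusters = ["A", "B"]
first = cluster_for_user(0, 1000, clusters)
boundary = cluster_for_user(1000, 1000, clusters)
```

Because the mapping depends only on the user's index, a user always lands on the same cluster, and adding capacity means appending a cluster rather than moving existing users.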
Don’t be afraid to ask for help. When times get rough, it’s good to seek advice from those that have been through it before.
This post was written by Luke Gotszling of About.me, a True Ventures Portfolio Company.