
Rails, Slashdotted: no problem

By Steve Conover and Brian Takita

Peer-to-Patent, one of Pivotal Labs’ clients, got Slashdotted last week, and we had no trouble handling the load. The site was just as responsive as it always is, and we didn’t come close to having a scale problem.

Moral of the story: the technology for serving static web pages is old, boring, and extremely scalable. If you have the type of site that can be page-cached, do so aggressively, starting with the front page and any pages likely to be linked to. We got a huge payoff for the engineering time that we invested in our page-caching strategy.


  • We moved away from Rails page-caching and developed our own “holeless cache”, which uses a symlink trick (see below) to instantly and “holelessly” switch to a new version of a cached page. (The cache “hole” is the time between the expiration or purge of a cached page and the time when it’s regenerated. The danger is that in that time your Mongrels can be saturated with requests – something we proved to ourselves could easily happen.)
  • Here’s our symlink trick, using the front page as an example:
    1. Have index.html point to index.html.current
    2. If (index.html.current is >= 20 minutes old)
      1. Copy index.html.current to index.html.old
      2. Point index.html to index.html.old
      3. Rewrite index.html.current by asking Rails for the page (using the process method)
      4. Repoint index.html back at index.html.current
    3. Repeat step 2 every minute using a cron job.
  • For cache expiration that’s model-based, we make a call from the model observer class to our holeless cache routine, instead of using Rails cache sweepers. So, instead of just deleting the cached page we regenerate it in place.
  • It was important to write tests proving that the HTML we generated for cached pages looked exactly the same in different “modes” (user logged in vs. not, for example). This forced us to push modal decision logic out of Markaby templates and into JavaScript, meaning that view-oriented RSpec tests asserting modal differences became useless. We rewrote them as Selenium tests.
  • Performance/load testing: we tried several tools and approaches and found that a simple Ruby script that launches wget requests (that write to /dev/null) in many separate threads worked best for us.
  • We send down exactly one .js and one .css file. If you are sending down more than one of each of these to the browser, you have a performance problem. Fix it with asset packager.
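The threaded load-test approach mentioned above can be sketched in a few lines of Ruby. This is a hypothetical reconstruction, not our actual script; the `hammer` helper name and the thread/request counts are made up.

```ruby
# Launch N threads, each firing a request some number of times. The block
# is whatever does the actual work -- in our case, shelling out to wget.
def hammer(threads:, requests:, &fire)
  threads.times.map do
    Thread.new { requests.times { fire.call } }
  end.each(&:join)
end

# e.g. hammer(threads: 50, requests: 100) do
#        system("wget", "-q", "-O", "/dev/null", "http://example.com/")
#      end
```

Writing the response body to /dev/null keeps disk I/O on the load-generating machine out of the measurement.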

Update: one clarification about the cron job: we deploy this “automatically” using capistrano.

  1. watt says:

    The symlink trick is not very efficient – since “move” operations on Linux are atomic, you could simply “move” the new page over the “old” page:

    1) have index.html point to index.html.current
    2) generate the fresh cached page under a temporary name
    3) have a cron script check whether the temporary file exists, and if so, “mv” it over index.html.current
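    watt’s rename-based alternative can be sketched like this (the `publish` helper and temp-file naming are illustrative):

    ```ruby
    # Write the fresh page under a temporary name, then rename(2) it over
    # the live file. rename is atomic on a single filesystem, so readers
    # never see a missing or partial index.html.
    def publish(fresh_html, live_path)
      tmp = "#{live_path}.new"
      File.write(tmp, fresh_html)
      File.rename(tmp, live_path)   # atomic replace
    end
    ```

    The temporary file must live on the same filesystem as the live file, or the rename degrades to a non-atomic copy-and-delete.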

  2. Steve says:

    watt – thanks for your suggestion. This is a fine solution to the problem, and anyone thinking of doing something similar to what we did should consider it.

  3. Claus says:

    So “Rails, Slashdotted – no problem” means “Avoid Rails if there is any load” – the strategy you talk about has little to do with Rails.

  4. Dav says:

    Claus, actually it means you can code your application in Ruby on Rails, taking advantage of the short time to market (I think Peer to Patent took under two months), vastly improved maintainability (over PHP, for instance) and enjoy the pleasure of coding in Ruby, yet still put the application into production in a manner that survives a slashdotting. That’s pretty nice.

  5. Juan says:

    “So ‘Rails, Slashdotted – no problem’ means ‘Avoid Rails if there is any load’ – the strategy you talk about has little to do with Rails.”

    Yes, it does. Caching is the life of any web app, and Rails excels at enabling you to cache exactly the content you need.

  6. Quite a useful caching trick, and one that works with all application languages too.

    Nothing Rails specific here, if not for the “moved away” part…

  7. Claus says:

    The development speed benefits of Rails are largely anecdotal and incidental to the caching technique.
    As for Juan’s assertion that “Rails excels” here – it’s funny, considering that the technique largely involves not using Rails to solve the problem.

    I’m totally OK with Rails and with people loving Rails – I use it too on some projects – but it is quite simply not true that Rails development is orders of magnitude faster/better than the other dynamic environments.
    It is exactly as ridiculous as the “Enterprise == Java” claims that the Rails community is so fond of laughing at.

  8. Thanks for the caching trick.

    On the “one .js and one .css” comment, though, I am pretty sure you aren’t advocating it for everyone in every context. With YUI on a CDN you don’t want to be sending down your own copy of YUI inside your one .js file. Let Yahoo’s CDN serve it for you, and it may already be cached on the visitor’s machine. And with some web apps now shipping many hundreds of KB of JavaScript, you need to find a balance between HTTP requests and file size.

  9. Cameron says:

    Hi guys,

    Interesting solution – thanks for posting it. A question not so much about this, but about your JavaScript logged-in/logged-out code: I’d love to hear more about how you do that (so that you can do full page caching) while keeping the “Hello Cameron” on the screen. Where did you get that idea, and especially, does it fail gracefully if a user has JavaScript disabled?



  10. Mike Bailey says:

    Do you have any nice tricks for managing cron via Capistrano? Do you maintain entire cron files under version control and push them out with cap or have you written tasks to add/remove cron entries individually?

  11. Jim Meyer says:

    @Mike: Just surmising, but you could keep your crons somewhere in your app’s dir structure, then make a cap recipe to call “crontab [filename]” when appropriate. Problematic if you have multiple apps with crons all to manage under a single username, though.

  12. Dan Kubb says:

    Another approach that you may want to think about is using the Varnish HTTP Accelerator. With a bit of configuration all you’d have to do is set your Expires and Cache-Control headers to keep the pages fresh for 20 minutes, and it’ll automatically keep refreshing the content every 20 minutes without any file copying.

    Plus Varnish is fast, faster even than Nginx. I read that the Joyent guys say it’s about 10x the speed of Nginx, but I’ve only gotten it to run about 40% faster – still not too shabby, considering Nginx is one of the fastest web servers around. Keep in mind that Varnish isn’t a full web server, it’s just a cache, but it’s probably worth testing in front of Nginx.
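    For reference, the headers Dan describes for a 20-minute TTL look like the following (values illustrative; in a Rails action you would set them with the built-in `expires_in 20.minutes, :public => true` helper):

    ```ruby
    require "time"

    # Headers a cache like Varnish keys on for a 20-minute TTL.
    max_age = 20 * 60
    headers = {
      "Cache-Control" => "public, max-age=#{max_age}",
      "Expires"       => (Time.now + max_age).httpdate,
    }
    ```

    With these set, the cache serves stale-free content and refetches from the backend when the TTL lapses, with no symlink or file juggling on the app server.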

  13. Brian Takita says:

    @Cameron – We store the user’s basic information in a cookie. The JavaScript code then reads/alters the cookies.

    Cookies work out well because they can be read/altered by the server on POSTs and passed to the next request on cache hits.

    Obviously, we need to consider privacy, so sensitive information cannot be stored this way.

    Since the vast majority of our target users have JavaScript enabled, we did not implement a non-JavaScript way to interact with the site.
    We also make heavy use of AJAX techniques for the site’s interactions, which, as far as I know, makes a parallel non-JavaScript solution expensive to maintain.
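    The cookie round-trip Brian describes can be sketched as follows; the helper names are illustrative, and only non-sensitive fields should ever go through it:

    ```ruby
    require "uri"

    # Server side: pack non-sensitive user info into a cookie value that
    # client-side JavaScript can read to personalize the cached page.
    def user_cookie_value(attrs)
      URI.encode_www_form(attrs)   # e.g. "name=Cameron&id=42"
    end

    # The inverse, as the JavaScript (or the server on a POST) would do it.
    def parse_user_cookie(value)
      URI.decode_www_form(value).to_h
    end
    ```

    Because the cooked page is identical for every user, the personalization lives entirely in the cookie and the JavaScript that reads it.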

  14. Brian Takita says:

    @mike + @jim – We have a cron installer that overwrites the crontab on deploy. Since P2P is the only app running on our slice, it’s not a problem.

    If you have a shared server, you could use a modified technique, like search/replace using comments and your favorite text munging tool.
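    A minimal sketch of such an installer as a Capistrano (v2-era) recipe; the file location and task name are assumptions, not our actual setup:

    ```ruby
    # Hypothetical Capistrano recipe: overwrite the crontab on each deploy.
    # Assumes the app's crontab is checked in at config/crontab.
    namespace :deploy do
      task :install_crontab, :roles => :app do
        run "crontab #{current_path}/config/crontab"
      end
    end
    after "deploy:update_code", "deploy:install_crontab"
    ```

    Since `crontab [file]` replaces the whole crontab for that user, this only works cleanly when the app owns the account, as Brian notes.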
