Making Rails Wicked Fast: Pagecaching Highly Personalized Web Pages

Consider the following snippet for a page showing blog articles. Notice how content on the page differs based on who is viewing it:

<% if current_user.nil? %>
  You are logged out
<% elsif current_user.admin? %>
  You are an admin
<% elsif @article.author == current_user %>
  You are the author of this blog article
<% end %>

Pagecaching such a page is difficult because all of this conditional logic would need to be translated to Javascript. The appropriate data (whether the user is logged in, etc.) needs to be available to the client; usually it is stored in a cookie or comes from an Ajax request (presumably the Ajax request is much faster than having Rails generate the entire page).

While we can translate this conditional logic to Javascript, a much simpler approach is to use CSS:

<style>
  .logged_out, .admin, .author { display: none; }
  body.logged_out .logged_out { display: block; }
  body.admin .admin { display: block; }
  body.author .author { display: block; }
</style>
<body class="">
  <div class="logged_out">
    You are logged out
  </div>
  <div class="admin">
    You are an admin
  </div>
  <div class="author">
    You are the author...
  </div>
</body>

By default, anything of class admin, logged_out, etc. is invisible. But simply by adding a class to the body tag, we can “unlock” these hidden parts of the page:

<body class="admin author">
</body>

And voilà! Both the admin and author sections are visible to the end user.

Implementation Details

So how do we add classes to the page? And where do we get the appropriate data for the end user? Use Javascript to add classes to the body tag:

for(var i = 0; i < classes.length; i++) {
  $$('body').first().addClassName(classes[i]);
}

Data then comes from one of three places.

Constant Data about the Current User

Set constant data about the current user at the start of a session (for example, we know whether the person is logged in and we know whether she is an admin):

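class ApplicationController < ActionController::Base
  after_filter :set_classes

  # Copy the current user's CSS classes into a cookie on every Rails request.
  def set_classes
    # Assumption: logged-out visitors (current_user is nil) get the
    # 'logged_out' class used in the CSS above.
    cookies[:user_classes] = current_user ? current_user.classes : ['logged_out']
  end
end

class User
  # CSS class names describing this user, independent of the page viewed.
  def classes
    [admin? ? 'admin' : 'not_admin', ...]
  end
end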
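
On the client, the classes array used in the loop earlier has to be read back out of this cookie. Here is a minimal sketch of that step. It assumes the cookie is named user_classes, as in the controller above, and that its value arrives as a URL-encoded list of names separated by "&" or spaces. How Rails serializes an array into a cookie depends on the version, so check what your app actually sends. classesFromCookie is a hypothetical helper, not part of Prototype.

```javascript
// Hypothetical helper: extract class names from a cookie string such as
// document.cookie. Assumes the value is a URL-encoded list of names
// separated by "&" or whitespace; adjust the separator to match your app.
function classesFromCookie(cookieString, name) {
  var pairs = cookieString.split(/;\s*/);
  for (var i = 0; i < pairs.length; i++) {
    var eq = pairs[i].indexOf('=');
    if (eq === -1) continue;
    if (pairs[i].substring(0, eq) !== name) continue;

    // Decode the value, then split it into individual class names.
    var value = decodeURIComponent(pairs[i].substring(eq + 1).replace(/\+/g, ' '));
    var names = value.split(/[&\s]+/);
    var classes = [];
    for (var j = 0; j < names.length; j++) {
      if (names[j] !== '') classes.push(names[j]);
    }
    return classes;
  }
  return []; // cookie not present
}

// In the browser (not run here):
//   var classes = classesFromCookie(document.cookie, 'user_classes');
```

Keeping the parsing in a function that takes the cookie string as an argument makes it easy to test outside a browser.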

Personalized Data

Set data about the current user’s relationship to the presently displayed content (for example, whether she is the author of the article she is currently looking at) using an Ajax request.

new Ajax.Request('#{url_for(:format => :js)}', {
  method: 'post',
  onComplete: function(transport, json) {
    if (json.classes) {
      for(var i = 0; i < json.classes.length; i++) {
        $$('body').first().addClassName(json.classes[i]);
      }
    }
  }
});

class ArticlesController < ApplicationController
  caches_page :show
  def show
    @article = Article.find(params[:id])
    respond_to do |format|
      format.html {}
      format.js do
        headers['X-JSON'] = @article.to_json_for(current_user)
      end
    end
  end
end

class Article < ActiveRecord::Base
  def to_json_for(user)
    {
      :classes => [
        user == author ? :is_author : :is_not_author,
        ...
      ]
    }.to_json
  end
end

A couple of gotchas here. First, you must override Rails’ pagecaching functionality to ensure it doesn’t cache requests for JSON. Put this in ApplicationController:

def self.caches_page(*actions)
  return unless perform_caching
  actions.each do |action|
    class_eval "after_filter { |c| c.cache_page if c.action_name == '#{action}' && c.params[:format] != 'js' }"
  end
end

(Some details are missing in the above implementation since respond_to seems to delete the :format parameter from the params hash.)

Second, since we’re making an Ajax request, we are hitting the Rails stack; nevertheless, this is still wicked fast because the Ajax request returns only the data that is essential, so very few objects need be instantiated. Also, you can almost always avoid making this Ajax request for logged out users, which should take enormous load off the server.
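
One way to skip the request for logged out users is to check the classes already read from the cookie before firing it. This is a minimal sketch, assuming a logged out visitor either has no class cookie at all or carries the logged_out class used in the CSS above. needsPersonalizationRequest is a hypothetical name.

```javascript
// Hypothetical helper: decide whether the per-page Ajax request is worth
// making. `userClasses` is the array of class names read from the cookie.
function needsPersonalizationRequest(userClasses) {
  // No cookie data at all: treat the visitor as logged out.
  if (!userClasses || userClasses.length === 0) return false;
  for (var i = 0; i < userClasses.length; i++) {
    if (userClasses[i] === 'logged_out') return false;
  }
  return true;
}

// In the page (not run here): only issue the Ajax.Request shown earlier
// when needsPersonalizationRequest(classes) returns true.
```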

Stateful Session Data

Some data related to the current user is not constant; it lasts only for some finite part of the session. Still, it should persist longer than just the current page. An example is flash[:error] content, but many sophisticated web sites use this kind of personalization extensively (think wizards and contextual help). The easiest way to populate this data is as part of the Ajax request, but rather than return it in the JSON, return it in a cookie.

class ArticlesController < ApplicationController
  caches_page :show
  def show
    @article = Article.find(params[:id])
    respond_to do |format|
      format.html {}
      format.js do
        cookies[:temporary_classes] += ...
      end
    end
  end
end

You may want to remove these “temporary” classes from the cookie as you use them on the client side.
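
Clearing the cookie once its classes have been applied could look like the following sketch. It assumes the cookie is named temporary_classes, as in the controller above, that its value is a URL-encoded list separated by "&" or spaces, and that it was set on the path "/". consumeTemporaryClasses is a hypothetical helper.

```javascript
// Hypothetical helper: returns the temporary classes found in a cookie
// string, plus the string to assign to document.cookie to expire the cookie.
function consumeTemporaryClasses(cookieString) {
  var match = /(?:^|;\s*)temporary_classes=([^;]*)/.exec(cookieString);
  var classes = [];
  if (match) {
    var names = decodeURIComponent(match[1]).split(/[&\s]+/);
    for (var i = 0; i < names.length; i++) {
      if (names[i] !== '') classes.push(names[i]);
    }
  }
  return {
    classes: classes,
    // An expiry date in the past tells the browser to delete the cookie.
    expire: 'temporary_classes=; expires=Thu, 01 Jan 1970 00:00:00 GMT; path=/'
  };
}

// In the browser (not run here):
//   var result = consumeTemporaryClasses(document.cookie);
//   add each of result.classes to the body, then:
//   document.cookie = result.expire;
```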

CSS is Easy to Use

The reason this technique works is that CSS selectors can express complex Boolean expressions. Though far from Turing complete, CSS is powerful enough to express “and” and “or”, and it does so elegantly: body.admin.author .x matches only when both classes are on the body (and), while the selector list body.admin .x, body.author .x matches when either one is (or). It’s far easier and more elegant than translating your conditional logic to inlined Javascript.

Disadvantages

There are a few downsides to this approach. One is security. Because we render content for all possible viewers into the page, there is a potential security violation: though the hidden content is not displayed in the browser, it is visible in the page source. For most applications, these security concerns are unimportant because the “real” security rules are enforced during write operations. But your mileage may vary.

Some Cache Design Principles

There are a few principles to bear in mind when implementing a caching strategy.

  • Distinguish data a) independent of the current user (e.g., the title of an article) and data b) dependent on the current user.
  • In the latter category, distinguish data b1) concerning the current user only (e.g., whether the user is an admin) and data b2) concerning the relationship between the current user and some other object (e.g., whether the user is the author of an article).
  • Data concerning the relationship between the current user and some other object (b2) can usually be segmented into axes of variability; that is, we can reduce the “space” of the data from the number of users, to some smaller set of criteria. For example, we can usually categorize the kinds of relationships into is_author, is_not_author, is_friend, is_not_friend, etc.

Data of type (a) and (b2) are usually pre-rendered into the pagecached page. Data of type (b2) is then shown and hidden using the CSS technique. Data of type (b1) is usually set into the cookie when the user logs in, or alternatively any time the user hits the Rails app.

Designing Cachable URLs

It is very important to know when to use and when to avoid putting a user id in your url. For example, if you model the current user’s profile page with the following url:

http://www.mysite.com/profile

You’ve effectively made that page uncachable, since all of the content on that page depends upon the current user. A better URL would be:

http://www.mysite.com/profiles/nick

If my profile looks different to me than to others, model that as data of type (b2), that is, data concerning the relationship between the current user and some other object.

That said, never put the current user into the url:

http://www.mysite.com/users/nick/articles/1

If this isn’t an article written by Nick (data of type a), but is rather Article 1 as seen by Nick (data of type b2), you’ve just pagecached a page with a cache hit ratio of zero. So consider carefully how you model a resource like the following:

http://www.mysite.com/account/password vs. http://www.mysite.com/users/nick/account/password

The latter format ensures that the password page has a zero cache-hit ratio, so screw that. Furthermore, a page that links to the latter url (for example, an “Account Settings” link in the site’s header) that happens to be cached must now generate that link using Javascript since the url differs based upon the current user. To reiterate this point:

Ensure that any pages that concern only the current user do not have the user identifier in the url. Examples include: the logged-in home page, account settings page, edit my profile page, my message inbox page, etc.

Hope you guys find these techniques helpful. Along with a little bit of scriptaculous templates, this technique will make it easy to make your highly personalized Rails apps scale massively.

Comments
  1. Brilliant tutorial, thank you for putting this together! I’ve known about this methodology of using a callback for page caching, however seeing an example like this with code snippets is very useful! Thanks for sharing the knowledge..

  2. Excellent techniques here, and I think it’s especially cool how using this method keeps logic out of the views and pushes it back to the controller, where it can be more easily tested.

  3. Tom Armitage says:

    Not sure I agree. By putting all that data into the source HTML, whether it’s visible or not, you don’t just have a security violation; you also have the issue that it’s all available to Google. Which, to be honest, is confusing for Google (to say the least).

    Next, you also have a huge accessibility issue. Using javascript like this can be a major hang-up for useragents which don’t have it, or which don’t use it appropriately. This is especially an issue given you’re using XmlHttpRequest.

    Those two reasons alone make it just feel wrong.

    I’m also not convinced by Duncan’s commentary; this kind of conditional logic should be in the view. I mean, the highest level I can pull it back to is to put it into the controller, and tell it to render the “admin” view, but that still feels like a conditional about the view. You shouldn’t be making decisions about functionality based upon “how easy it is to test”. If it belongs in the view, put it in the view, and find a way to test it.

  4. Nick Kallen says:

    Tom — the issue about google is an insightful criticism. However, concerning accessibility, pagecaching a dynamic page will require javascript — there’s no alternative. Most apps choose which browsers to support–they don’t support all of them. “Accessibility,” therefore, is not simply defined as “support all browsers including Lynx”.

    Duncan — I agree about the testing issue. Although I would test this in the model not the Controller (the latter just delegates to the model, right?). Tom — architecting software based upon how easy it is to test is one of the primary imperatives of TDD. That doesn’t make it a law, or even correct, but it is well understood that testing first and architecting around testability are synonymous.

  5. Eleo says:

    I think I agree with Tom on the accessibility issue. Accessibility is a sort of arbitrary standard, but I think that knowingly cutting off a certain portion of users when you don’t have to (I say that because YouTube ain’t gonna work in Lynx), is design that can use improvement.

    In this case we’re sacrificing potential usability (admittedly from a small percentage of users) to save ourselves resources. From a financial standpoint it could be looked at as a good idea because you are saving more resources than you are losing customers/adsponges. However, it’s still putting our needs before those of the users. In the end it really comes down to personal standards and also target audience.

  6. JB says:

    “Highly Personalized Web Pages” to me means Facebook home page or equivalent. Regardless of the other drawbacks discussed in the comments, the technique outlined here won’t scale to Facebook-level personalization. “Somewhat Personalized”? Sure!
