Railsconf: HTTP's Best-Kept Secret: Caching – Ryan Tomayko (Heroku)

About Ryan

  • http://tomayko.com
  • Sinatra maintainer.
  • Rack core team.
  • Creator and maintainer of Rack::Cache.

HTTP Caching?

  • NOT Rails Caching
  • HTTP caching headers in requests: Cache-Control, If-Modified-Since, If-None-Match
  • and in responses: Cache-Control, Last-Modified, ETag, Vary
  • This is all defined in RFC 2616; we won’t be going into it that deeply.

Types of Cache

Client cache

  • Built into browsers and other types of client.
  • 1:1 relationship between cache and client. The cache only serves one client (private cache).
  • Bandwidth saved per cache: can’t be beaten, because a hit never touches the network.

Shared Proxy Cache

  • Set up for an organization
  • 1:many relationship between cache and clients. Serves more than one client (shared cache).
  • Is closer to the client than the server, therefore saves a lot of bandwidth.

Gateway Cache

  • a.k.a. Reverse Proxy Cache
  • Situated inside of the origin site
  • 1:everyone relationship between cache and clients.
  • Reduces bandwidth the least.

Why cache?

  • The answer to this has changed over time.
  • In Nov 1990 there was 1 guy on the web – Tim Berners-Lee.
  • In Feb 1996 the web population was 20M. State-of-the-art connectivity was a 28.8kbps modem. At that speed, loading the current http://yahoo.com (~350k) would take 2 minutes 48 seconds. Bandwidth was the largest issue. RFC 1945 (HTTP 1.0) included the Expires: and Last-Modified: headers.
  • In March 1999 RFC 2616 (HTTP 1.1) was released. It addressed the caching problems of 1996.
  • Today: we cache so we can scale. Keep your back-ends free from as much work as possible. Push as much work up the stack as possible.

HTTP 1.1 defines 2 caching models

Expiration

  • Back-end sets Cache-Control: public, max-age=60
  • Gets cached in the gateway cache and the browser cache.
  • Public says it is good for many clients.
  • Cached for 60s.

Rails example

def show
  expires_in 60.seconds, :public => true
  # stuff
  render ...
end

Sinatra example

headers['Cache-Control'] = 'public, max-age=60'
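
As a rough sketch of what a cache does with that header, the following plain Ruby parses a Cache-Control value and decides whether a stored response is still fresh. The method names (parse_cache_control, fresh?, shared_cacheable?) are made up for illustration, and treating only explicitly public responses as shareable is a simplifying assumption, not the full RFC 2616 rule set.

```ruby
# Parse a Cache-Control header value into a hash of directives,
# e.g. "public, max-age=60" => { "public" => true, "max-age" => "60" }.
def parse_cache_control(value)
  value.to_s.split(',').each_with_object({}) do |part, directives|
    name, arg = part.strip.split('=', 2)
    next if name.nil? || name.empty?
    directives[name.downcase] = arg.nil? ? true : arg.delete('"')
  end
end

# A stored response is fresh while its age is below max-age.
def fresh?(cache_control, age_in_seconds)
  max_age = parse_cache_control(cache_control)['max-age']
  return false if max_age.nil? || max_age == true
  age_in_seconds < max_age.to_i
end

# Simplification: only responses explicitly marked public are
# considered storable by a shared (proxy or gateway) cache.
def shared_cacheable?(cache_control)
  parse_cache_control(cache_control).key?('public')
end

puts fresh?('public, max-age=60', 30)
puts fresh?('public, max-age=60', 60)
```

With max-age=60, a response that is 30 seconds old can be served straight from the cache, while one that is 60 seconds old has to go back to the back-end.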

Validation (Conditional GET)

  • Back-end adds ETag or Last-Modified, e.g. ETag: abcdef012345
  • Last-Modified is redundant, basically there for HTTP 1.0 clients.
  • On the 2nd request, the gateway cache realizes it has this page in cache, then sends GET /foo, Host: foo.com, If-None-Match: abcdef012345 to the back-end.
  • If the back-end returns a 304 Not Modified, the gateway cache returns its cached version.

Rails example:

def show
  @foo = Foo.find(params[:id])
  fresh_when :etag => @foo,
             :last_modified => @foo.updated_at.utc
end

Alternative idiom:

def show
  @foo = Foo.find(params[:id])
  modified = @foo.updated_at.utc
  if stale?(:etag => @foo, :last_modified => modified)
    respond_to ...
  end
end

Sinatra example:

get '/foo' do
  @foo = Foo.find(params[:id])
  etag @foo.etag
  erb :foo
end
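
To show what the framework helpers do underneath, here is a minimal framework-free sketch of the back-end side of a conditional GET. It takes a Rack-style env hash and returns a Rack-style [status, headers, body] triple. The names page_body and conditional_get are hypothetical, and hashing the rendered body is just one possible way to derive an ETag.

```ruby
require 'digest/md5'

# Hypothetical page content; a real app would render a template here.
def page_body
  '<h1>Hello, foo</h1>'
end

# Compare the validator sent by the cache (If-None-Match) with the
# current ETag and answer 304 without a body when they match.
def conditional_get(env)
  body = page_body
  etag = %("#{Digest::MD5.hexdigest(body)}")
  if env['HTTP_IF_NONE_MATCH'] == etag
    [304, { 'ETag' => etag }, []]
  else
    [200, { 'ETag' => etag, 'Content-Type' => 'text/html' }, [body]]
  end
end

first  = conditional_get({})
second = conditional_get('HTTP_IF_NONE_MATCH' => first[1]['ETag'])
puts first[0]
puts second[0]
```

This sketch still builds the body before deciding. The Rails and Sinatra helpers above are more useful when the ETag comes from something cheap, such as the model, so that a 304 skips rendering entirely.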

Combine Expiration & Validation

  • Back-end sets Cache-Control: public, max-age=60 and ETag: abcdef012345
  • For the first 60 seconds, Cache-Control takes precedence and the cache answers on its own.
  • After 60 seconds, the cache queries the back-end using the ETag.
  • The back-end can then send back a 304 Not Modified with a new Cache-Control: public, max-age=60
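
The interplay of the two models can be sketched with a toy single-entry gateway cache. TinyCache, its fetch method, and the shape of the backend callable are all invented for this illustration. Time is passed in as plain seconds so the behavior is easy to follow.

```ruby
# A toy single-entry gateway cache (hypothetical, for illustration only).
# backend is any callable: backend.call(if_none_match) returns
# [status, etag, max_age, body].
class TinyCache
  Entry = Struct.new(:body, :etag, :max_age, :stored_at)

  def initialize(backend)
    @backend = backend
    @entry = nil
  end

  # Returns [body, how], where how is :miss, :fresh or :revalidated.
  def fetch(now)
    if @entry && (now - @entry.stored_at) < @entry.max_age
      # Expiration model: still fresh, the back-end is not contacted.
      return [@entry.body, :fresh]
    end

    status, etag, max_age, body = @backend.call(@entry && @entry.etag)
    if status == 304 && @entry
      # Validation model: keep the stored body, restart the freshness clock.
      @entry.max_age = max_age
      @entry.stored_at = now
      [@entry.body, :revalidated]
    else
      @entry = Entry.new(body, etag, max_age, now)
      [body, :miss]
    end
  end
end

backend = lambda do |if_none_match|
  if if_none_match == 'abcdef012345'
    [304, 'abcdef012345', 60, nil]
  else
    [200, 'abcdef012345', 60, 'page']
  end
end

cache = TinyCache.new(backend)
[0, 30, 61, 90].each { |t| p cache.fetch(t) }
```

The back-end is contacted only at t=0 (full response) and t=61 (cheap 304). The requests at t=30 and t=90 never leave the cache.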

Misc

  • Never Generate the Same Response Twice

Recommend using Rack::Cache

gem install rack-cache

config.middleware.use Rack::Cache,
  :verbose          => true,
  :metastore        => "file:/var/cache/rack/meta",
  :entitystore      => "file:/var/cache/rack/body",
  :allow_reload     => false,
  :allow_revalidate => false

The client, as well as the server, controls what happens at the cache through Cache-Control. A browser refresh sends Cache-Control: no-cache, which means the gateway cache MUST go back to the back-end before sending a response. This is bad, because people can pound your back-end just by hitting refresh. Setting :allow_reload => false makes the cache ignore such requests.
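
A simplified sketch of that decision follows. The keyword names mirror the Rack::Cache options above, but cache_action itself and its return values are hypothetical, and the logic is an assumption about the general idea rather than the library's actual code.

```ruby
# Decide how a gateway cache treats a request based on the client's
# Cache-Control header and two policy switches (illustrative only).
def cache_action(request_cache_control, allow_reload: false, allow_revalidate: false)
  directives = request_cache_control.to_s.downcase.split(',').map(&:strip)
  if allow_reload && directives.include?('no-cache')
    :reload       # honor the client: go back to the back-end
  elsif allow_revalidate && directives.include?('max-age=0')
    :revalidate   # honor the client: treat the stored copy as stale
  else
    :lookup       # ignore the client hint, use normal cache rules
  end
end

puts cache_action('no-cache')
puts cache_action('no-cache', allow_reload: true)
```

With both switches off, a client hammering refresh only ever hits the cache, which is the point of the configuration shown above.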

  • High-Performance Caches: Squid, Varnish (Heroku uses this)
  • Interesting discussion about ESI at the end.
  • By default Rails builds the ETag as an MD5 hash of the model's class name, id and last-updated timestamp.
  • The hash needs to start with a seed that covers your release version, otherwise the ETag will not change when you release new code. Rails now has a mechanism to handle this.
  • The 2.3 branch has a new “touch” mechanism too.
  • Browser behavior differs and varies quite significantly when using SSL.
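
The ETag points above can be illustrated with a small sketch. This is not Rails' own implementation: etag_for and its parameters are hypothetical and only show the idea of hashing a release seed together with the class name, id and update timestamp.

```ruby
require 'digest/md5'
require 'time'

# Build a quoted ETag from a release seed plus the record's identity
# and last-update time. All parameter names are illustrative.
def etag_for(class_name, id, updated_at, release_seed)
  key = [release_seed, class_name, id, updated_at.getutc.iso8601].join('/')
  %("#{Digest::MD5.hexdigest(key)}")
end

stamp = Time.at(0)
puts etag_for('Foo', 1, stamp, 'v1') == etag_for('Foo', 1, stamp, 'v1')
puts etag_for('Foo', 1, stamp, 'v1') == etag_for('Foo', 1, stamp, 'v2')
```

The same record yields the same ETag within one release, while changing the seed or updating the record produces a different one, so caches stop matching their old copies.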
