Standup 5/27/2010

Ask for Help

Recurring jobs

A pivot asked “what is the current state of the art in scheduling recurring processes?”

The first-order answer was simply “cron”, but then the conversation got interesting.

Cron has a few downsides:

  • Each task execution has to reload the entire ruby/rails runtime, so you pay a significant startup-time penalty on every run
  • Crontabs often don’t get checked into source control, so there ends up being little visibility into which jobs are running when

One suggestion to solve the visibility problem was to use the whenever gem, which lets you express your cron schedules in a ruby DSL that can easily be kept in source control.
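To illustrate the general idea of keeping the schedule as ruby in the project repo, here is a minimal, self-contained sketch. It is not the gem's actual DSL: the `JOBS` list, the `render_crontab` helper, the rake task names, and the paths are all hypothetical and exist only for this example.

```ruby
# Hypothetical job list kept in the project repo; names and schedules are illustrative.
JOBS = [
  { :schedule => "*/10 * * * *", :command => "rake reports:refresh" },
  { :schedule => "0 4 * * *",    :command => "rake db:cleanup" }
]

# Render one crontab line per job, running each command from the app directory.
def render_crontab(jobs, app_dir)
  jobs.map { |job| "#{job[:schedule]} cd #{app_dir} && #{job[:command]}" }.join("\n") + "\n"
end

# Print the generated crontab when the file is run directly.
puts render_crontab(JOBS, "/path/to/app") if __FILE__ == $PROGRAM_NAME
```

Because the schedule is ordinary data in the repo, a change to when a job runs shows up in code review like any other change. Every job still boots the runtime each time cron fires it.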

A suggested alternative that eliminates cron altogether is resque-scheduler with resque.

The upsides with resque-based scheduling are that all your schedule logic is expressed in ruby, and you don’t pay the ruby/rails startup penalty for each worker.

The downside is that it adds additional operational infrastructure for you to manage (the resque workers and the redis server(s)).
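A rough sketch of what "schedule logic expressed in ruby" might look like with this setup follows. The job class `NightlyCleanup`, its argument, and the `run_entry` helper are hypothetical. The schedule keys (`cron`, `class`, `queue`, `args`) follow the entry format described in the resque-scheduler README as best I recall it, so treat them as an assumption and check them against the version you install.

```ruby
# A resque-style job: a plain class with a queue name and a class-level perform.
# NightlyCleanup and its argument are hypothetical.
class NightlyCleanup
  @queue = :maintenance

  def self.perform(days_old)
    "purging records older than #{days_old} days"
  end
end

# Schedule entries keyed by name (assumed resque-scheduler entry shape).
SCHEDULE = {
  "nightly_cleanup" => {
    "cron"  => "0 4 * * *",
    "class" => "NightlyCleanup",
    "queue" => "maintenance",
    "args"  => [30]
  }
}

# In an app with resque-scheduler loaded, the hash would be handed over with
#   Resque.schedule = SCHEDULE
# Here we only simulate what a long-running worker does when an entry fires.
def run_entry(entry)
  Object.const_get(entry["class"]).perform(*entry["args"])
end

puts run_entry(SCHEDULE["nightly_cleanup"]) if __FILE__ == $PROGRAM_NAME
```

The worker process stays up between jobs, which is where the saved startup time comes from. The cost is that the workers, the scheduler process, and redis all have to be kept running.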

  1. You could also:

    1) Store a cron file in your version control
    2) symlink /etc/cron.d/project-name => location of cron file

    This would let you manage cron and its schedule in source control.

    You’d still have the rails boot issue.

    Also, this means that anyone who can deploy code could deploy something in the cron job that would be executed as root, so there is a security issue.

  2. …or add your cron file / cron setup to a chef (“Opscode Chef Wiki”) script kept in source control. We can set up (almost) an entire server / dev machine with chef alone; cron is just a small piece of that.

  3. Malc says:

    @Nick – This would only work if root owns the symlinked file. Cron doesn’t like it if any user other than root owns any of the files in /etc/cron.d, because those files have the extra ‘run as user’ parameter.

    Similar to Bruce, I generally control crontabs via puppet, so there is some source control but it is separate from the project that the jobs are being run against. I.e. it’s in the puppet repo rather than the project repo.

  4. Joe Van Dyk says:

    I use cron to add jobs to resque.

    For example, I have script/periodic_tasks/minute.rb. Cron runs this file every minute. This file (without loading the whole rails environment) adds a few jobs to resque.

    It’s worked perfectly and takes about 15 lines of ruby for all of my jobs (minute, 10 minute, hourly, daily).

  5. Steve Conover says:

    “Crontabs often don’t get checked into source control, so there ends up being little visibility into which jobs are running when”

    Well you should definitely be committing crontab to source control and writing it out with something like chef.

    The part that’s not as visible is, what’s going on in your task queue – what’s currently in progress, etc. Or, not without ssh’ing in.

    @Joe check out resque-scheduler.
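The cron-feeds-resque approach from Joe's comment could be sketched roughly as follows. This is a single-script variation, not his actual code: the job names and the `due_jobs` and `enqueue_due` helpers are hypothetical. The enqueue step is passed in as a callable so the sketch runs without resque or redis; in a real script it might be something like `lambda { |name| Resque.enqueue(Object.const_get(name)) }`.

```ruby
# Hypothetical job names; the comment does not list the real ones.
# Returns the jobs due at the given time, assuming cron runs this script once a minute.
def due_jobs(time)
  jobs = ["MinuteJob"]
  jobs << "TenMinuteJob" if time.min % 10 == 0
  jobs << "HourlyJob"    if time.min == 0
  jobs << "DailyJob"     if time.hour == 0 && time.min == 0
  jobs
end

# Hand each due job to the injected enqueue callable.
def enqueue_due(time, enqueue)
  due_jobs(time).each { |name| enqueue.call(name) }
end

# When run directly, just report what would be enqueued right now.
if __FILE__ == $PROGRAM_NAME
  enqueue_due(Time.now, lambda { |name| puts "would enqueue #{name}" })
end
```

The script itself only needs plain ruby plus a redis connection, so cron pays a small startup cost each minute while the heavy rails boot happens once, in the long-running workers.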
