Close
Glad You're Ready. Let's Get Started!

Let us know how we can contact you.

Thank you!

We'll respond shortly.

LABS
Collapsing Migrations

(6:30 pm: updated to use mysqldump)
(12/14/07: updated to remove db:reset since the Rails 2.0 version now does something different.)
(12/15/07: updated to not set ENV[‘RAILS_ENV’] since that gets passed down to child processes)

There was an old hacker who lived in a shoe; she had so many migrations she didn’t know what to do. Every time her build ran clean, she spent a whole minute staring at the screen.

Fortunately, she read this blog post and now her db:setup task is so fast she’s started building multiple test environments so she can run tests in parallel!

  • Figure out what migration to collapse to. This number should be less than or equal to the oldest deployed version of your app. E.g. if most of your deployments are on version 348 but there’s one client running a branch that’s only up to version 298, then pick 298 (or 297 if you’re afraid of off-by-one errors). For this example we will use 100.

  • Install lib/tasks/db.rake and lib/db_tasks.rb (source below)

  • Clear the development database by running

    rake db:clear

  • Dump the development structure by running

    rake db:dump

  • Delete all the migrations up to and including your target version. Here’s a sneaky awk script that deletes everything up to and including 100. (Go ahead and run it, it won’t bite, and you can always revert.)

    ls db/migrate/ | awk ‘{split($0, a, “_”); if(a[1]<=100) print $0}’ | xargs svn rm

  • Create a new migration called “100_collapsed_migrations.rb” using the following template.

100_collapsed_migrations.rb:

class CollapsedMigrations < ActiveRecord::Migration
  def self.up
    sql = <<-SQL
  # development_structure.sql goes here
    SQL

    execute("SET FOREIGN_KEY_CHECKS=0")
    sql.split(";").each do |statement|
      execute(statement)
    end
  ensure
    execute("SET FOREIGN_KEY_CHECKS=1")
  end

  def self.down
    raise IrreversibleMigration
  end
end
  • Open up db/development_dump.sql and copy its entire contents into your clipboard, then paste it above the “SQL” line in your new migration 100.

  • Search for the statement that creates the schema_info table and remove it.

Mine looks like this:

CREATE TABLE `schema_info` (
  `version` int(11) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
  • Set up your databases and run your tests.

    rake db:setup test

  • Congratulations! Your migrations are now blazingly fast, just like back in the (scaff)old days. You can run “rake db:setup” any time you get a svn update that looks like it may have done something funky to your schema, rather than shying away from that minute-long migration and just hoping your tests still pass.

Why do we need to use db:dump rather than db:schema:dump? Well, unfortunately, db:schema:dump doesn’t dump everything. It misses CONSTRAINT statements and also seems to get the charset wrong (although that may have been a function of how I constructed the db in my test). And db:structure:dump misses any data that may have been added by your migrations.

Here’s my current db.rake. Unfortunately, it only works with MySQL, but if you want to make it support your favorite DB (or even your least favorite) then please go right ahead.

Oh, and that part about multiple test environments and parallellized tests? Stay tuned… :-)

db.rake:

require "db_tasks"

namespace :db do
  def tasks
    (@db_tasks ||= DbTasks.new(self))
  end

  desc "Drop and recreate database"
  task :clear => :environment do
    tasks.clear
  end

  desc "Clear and migrate dev and test databases, and load fixtures into development db"
  task :setup => :environment do
    tasks.setup
  end

  desc "Dump the current environment's database schema and data to, e.g., db/development_dump.sql (optional param: FILE=foo.sql)"
  task :dump => :environment do
    if ENV['FILE']
      tasks.dump ENV['FILE']
    else
      tasks.dump
    end
  end

  desc "Load an sql file (by default db/development_dump.sql). (Optional param: FILE=foo.sql)"
  task :load => :environment do
    if ENV['FILE']
      tasks.load ENV['FILE']
    else
      tasks.load
    end
  end
end

db_tasks.rb:

# This creates a duplicate of the database config for a db config as defined in database.yml.
# For example, if the "test" database is named "myapp_test",
# for clone number 0, the new environment is named "test0", and the database is "myapp_test0".
# All other settings are preserved (esp. username and password).
module ActiveRecord
  class Base
    def self.clone_config(original_config, worker_number)
      original = configurations[original_config.to_s]
      raise "Could not find conguration '#{original_config}' to clone" if original.nil?
      worker_config = original.dup
      worker_config["database"] += worker_number.to_s
      configurations["#{original_config}#{worker_number}"] = worker_config
    end
  end
end

class DbTasks
  def initialize(rake)
    @rake = rake
  end

  def init
    connect_to('development')
    clear_database
    migrate_database
    dump
    test_environments.each do |test_db|
      if test_db =~ /([0-9]+)$/
        clone_test_config($1.to_i)
      end
      connect_to(test_db)
      clear_database
      load
    end
  end

  # db:clear -> drop and create db for RAILS_ENV
  def clear
    clear_database
  end

  # db:setup -> drop, create, and migrate dbs for test and development environments, and import fixtures into development
  def setup
    init
    connect_to 'development'
    load_fixtures
  end

  def dump(file = "#{RAILS_ROOT}/db/#{environment}_dump.sql")
    puts "Dumping #{database} into #{file}"
    system "mysqldump #{database} -u#{username} #{password_parameter} --default-character-set=utf8 > #{file}"
  end

  def load(sql_file = "#{RAILS_ROOT}/db/development_dump.sql")
    puts "Loading #{sql_file} into #{database}"
    query('SET foreign_key_checks = 0')
    sql_file = File.expand_path(sql_file)
    IO.readlines(sql_file).join.split(";").each do |statement|
      query(statement.strip) unless statement.strip == ""
    end
    query('SET foreign_key_checks = 1')
  end

  protected

  def clone_test_config(worker_num)
    ActiveRecord::Base.clone_config("test", worker_num)
  end

  def connect_to(environment)
    ActiveRecord::Base.establish_connection(environment)
    @environment = environment
    Object.const_set(:RAILS_ENV, environment)
    # Note: don't set ENV['RAILS_ENV'] since that gets passed down to invoked tasks (including 'rake test')
  end

  def environment
    (@environment ||= RAILS_ENV)
  end

  def test_environments
    environments = ['test']
    if Object.const_defined?(:TEST_WORKERS)
      TEST_WORKERS.times do |worker_num|
        environments << "test#{worker_num}"
      end
    end
    environments
  end

  def load_fixtures
    puts "Loading fixtures into #{environment}"
    Rake::Task["db:fixtures:load"].invoke
  end

  def clear_database
    puts "Clearing #{environment} database"
    sql = "drop database if exists #{database}; create database #{database} character set utf8;"
    cmd = %Q|mysql -u#{username} #{password_parameter} -e "#{sql}"|
    # puts "executing #{cmd.inspect}"
    system(cmd)
  end

  def migrate_database
    puts "Migrating #{environment} database"
    ActiveRecord::Migration.verbose = false
    Rake::Task["db:migrate"].invoke
  end

  def config(env = environment)
    ActiveRecord::Base.configurations[env]
  end

  def query(sql)
    ActiveRecord::Base.connection.execute(sql)
  end

  def database
    config["database"]
  end

  def username
    config["username"]
  end

  def password
    config["password"]
  end

  def password_parameter
    if password.nil? || password.empty?
      ""
    else
      "-p#{password}"
    end
  end

  def execute(cmd)
    puts "t#{cmd}"
    unless system(cmd)
      puts "tFailed with status #{$?.exitstatus}"
    end
  end

  def system(cmd)
    @rake.send(:system, cmd)
  end
end

Comments
  1. JB says:

    A lot of people use migrations to seed data as well as structure. Regardless of whether this is a good practice, can you tell us whether the structure dump includes data, or just schema?

  2. Tim Connor says:

    Sweet, I was just thinking about collapsing our hefty migrations at work. They’re getting to be a bit of a pain with lots of seed data migrations for really outdated schemas. This is a good excuse to do it and then fixturize our dev seed data.

  3. Alex C says:

    The structure dump does not include data, which makes a certain amount of sense because loading fixtures wipes out all data regardless of whether it was added by migrations or later. But it is a flaw with the current code, since some tables may not be fixturized, and I’m working right now on making it use mysqldump rather than db:structure:dump.

    Too bad, I was hoping to use the built-in Rails rake tasks as much as possible to make it more shareable (and easier to port to other db engines).

  4. Alex Chaffee says:

    The new Rails task “db:migrate:reset” does *almost* what db:setup does. It drops, creates and migrates — but it only does it for one database (by default, development). db:setup does it for test and development, which means it can serve the purpose of “make it work” after a questionable update or during a continuous build.

Post a Comment

Your Information (Name required. Email address will not be displayed with comment.)

* Copy This Password *

* Type Or Paste Password Here *