Close
Glad You're Ready. Let's Get Started!

Let us know how we can contact you.

Thank you!

We'll respond shortly.

LABS
Converting Rails application data from MySQL to PostgreSQL

One of our projects had a pending chore in Tracker to move its backend to PostgreSQL from MySQL. This project has about a quarter of a million rows of production data and around a hundred tables in its schema which needed to be exactly migrated into PostgreSQL.

Forklifting the data proved more complicated than expected due to incompatibilities in the two DBMS’ syntax such as in the way string escaping worked, how booleans were represented and a bunch of other small but painful differences. Despite MySQL’s mysqldump utility including a command-line option to write statements in PostgeSQL format, it became clear that it wasn’t going to be simple to create a repeatable procedure to do this work across our environments.

There’s a bunch of information out there about how to approach this problem but none felt right. Most are multi-step manual procedures that require altering a dump file using sed or perl and others require the data to be loaded into an intermediary database and massaged prior to import. After testing some of these approaches, Todd and I decided to timebox ourselves to an hour to test the viability of a Ruby script using the DBI gem to move the data. We came up with:

require 'dbi'
require 'dbd/mysql'
require 'dbd/pg'

begin
  mysql = DBI.connect("DBI:Mysql:source:localhost", "username", "password")
  postgres = DBI.connect("DBI:Pg:destination:localhost", "username", "password")

  mysql.select_all("SHOW TABLES") do |table|
    next if ['schema_migrations', 'sessions'].include?(table.to_s)
    select = mysql.execute("SELECT * FROM #{table}")
    columns = select.column_names.map { |key| ""#{key}"" }.join(', ')
    placeholders = (['?'] * select.column_names.size).join(', ')
    insert = postgres.prepare("INSERT INTO #{table} (#{columns}) VALUES(#{placeholders})")
    select.each { |row| insert.execute(*row) }
    insert.finish
  end
rescue DBI::DatabaseError => e
  puts "Error #{e.err}: #{e.errstr}"
ensure
  mysql.disconnect if mysql
  postgres.disconnect if postgres
end

Our antiquely Perl-like script worked better than we expected — our application started right up with all of its data intact.

Has anybody out there encountered this need before? What kinds of solutions did you come up with?

Comments
  1. Michael Siebert says:

    we had almost the same need for our app and after some poking around, i wrote ar_dbcopy gem which worked really welll for a db even bigger than what you outlined. code is at github.com/siebertm/ar_dbcopy

  2. John Pignata says:

    Cool, thanks Michael. The link is actually: [ar-dbcopy](http://github.com/siebertm/ar-dbcopy)

    Looks like [Rama McIntosh](http://myutil.com/2008/8/31/rake-task-transfer-rails-database-mysql-to-postgres) also came up with an [ActiveRecord-ish solution](http://github.com/face/rails_db_convert_using_adapters).

  3. Joe Van Dyk says:

    Heroku uses the taps gem to extract data from a mysql database to import into postgresql.

    http://adam.blog.heroku.com/past/2009/2/11/taps_for_easy_database_transfers/

  4. John Pignata says:

    Whoa. That’s awesome. Thanks, Joe. The known issue “Foreign Keys get lost in the schema transfer” seems like a doozey, though.

  5. Trevor Turk says:

    I used the taps gem, too. It worked pretty well, but I was also caught by some nasty encoding issues moving from latin1 to utf8, I believe. This was just for a hobby project, though, so that wasn’t a big deal :)

  6. John Pignata says:

    Trevor — Cool. In our little hack, we had similar encoding pain that we got around by explicitly setting the MySQL connection to output in UTF8:

    mysql.execute(“SET NAMES ‘UTF8′”)

  7. You should be aware that changing from a MySQL-backed Rails app to a PostgreSQL-backed application may have some hidden gotchas depending on your table schemas.

    For example, and this is the most common that I’ve seen: if you have a string column that limits to 100 characters, the Rails MySQL adapter will invisibly truncate the data for you when saving. However, the PostgreSQL adapter will raise a data too long exception, instead of invisibly truncating the data, so you may have to add a local model validation or override the setter to auto-truncate to mimic the MySQL functionality.

  8. Nathaniel — Do you know if there is good information detailing all those little gotchas?

  9. Xac Stegner says:

    This script works great, except for one thing. When inserting these rows, the sequences aren’t getting incremented.

    If you don’t reset the sequences to the highest value you’ll get an error like this:
    PGError: ERROR: duplicate key value violates unique constraint “tablename_pkey”

    This is fixed by adding this line after the insert.finish line:
    postgres.execute(“select setval(‘#{table}_id_seq’, (select max(id) + 1 from #{table}));”)

Post a Comment

Your Information (Name required. Email address will not be displayed with comment.)

* Copy This Password *

* Type Or Paste Password Here *