Sometimes you come across a problem where your go-to approaches start to fail you. I came across one of these examples recently where a Rails app required a report to be built which consumed data from almost every model within …
Is there a way to get MySQL indexing to speed up queries involving greater-than and less-than operators on date columns?
PostgreSQL handles these operators a little better than MySQL, but switching may not actually solve this problem.
Storing milliseconds since the epoch instead of dates would give the database the best chance of handling this scenario.
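A minimal sketch of the milliseconds idea in Ruby; the column name in the comment is hypothetical, and the point is just that an indexed integer column handles range scans well:

```ruby
# Sketch: store timestamps as integer milliseconds since the epoch so
# range queries hit a plain integer index. Names are illustrative.
t = Time.utc(2011, 1, 15, 12, 0, 0)

millis = (t.to_f * 1000).to_i          # value for an indexed INTEGER column
back   = Time.at(millis / 1000.0).utc  # convert back when reading

# A range query then becomes a plain integer comparison, e.g.
#   WHERE created_millis BETWEEN ? AND ?
```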
We are using Git's subtree merge facility instead of submodules to stay synced to a different repository for part of our project. How do we push changes back to that repo?
See Tim Connor's blog post "Git sub-tree merging back to the subtree for pushing to an upstream". Early in that post is a pointer to an article describing the subtree merge operation. Tim goes on to explain how to push your changes back through the chain.
Some cloud environments leave the names of temp files visible even when their contents are not accessible. Be sure to use obfuscated names for your temporary files!
The "Headless" gem makes it easy to set up an alternate "display" so that programs which need one can run in a headless environment. See this blog post on using Headless to run Selenium tests on a CI box: http://www.aentos.com/blog/easy-setup-your-cucumber-scenarios-using-headless-gem-run-selenium-your-ci-server
CruiseControl.rb (ccrb) will bog down to painfully slow levels if more than a couple of CCTray clients are pinging it repeatedly during a build.
Cron will not honor your .rvmrc file unless you do some work to set up the environment. If you set up your cron job like this:
0 6 * * * /bin/bash -l -c 'echo /home/someuser/.rvm/bin/rvm rvmrc trust ... && cd ...'
the -l and -c flags cause bash to load your environment, as if you were logging in, before running the specified commands. Someone also mentioned that rvm-shell can be used to solve this problem.
My company launched our app, Cohuman, a few weeks ago. The rush of finishing features, fixing bugs, and responding to user feedback has subsided a bit, and it's time to go back and give the little baby a tune-up. I find that a good development process will ebb and flow, and as long as you don't let something slide for too long, it's perfectly acceptable to let bugs, or performance issues, or development chores pile up for a bit and then attack them concertedly for an entire day or two. A bug-fest or chore-fest or tuning-fest can actually increase efficiency as you get in a rhythm... and it feels really good at the end of the day when you see all the bugs you slayed or all the milliseconds you shaved.
In this article I'd like to describe some of my techniques. I make no claim of originality or great expertise; I just want to share what I know, and hear (in comments) what other people have learned. I'm using Sinatra and ActiveRecord, but not Rails; hopefully this discussion will help people no matter what framework they're using.
Now that I'm starting to use DelayedJob to perform jobs in the future in my Heroku Sinatra app, it's important that they happen at the scheduled time. But unless you pay attention, you'll find that times get mysteriously shifted -- in my case, since I'm in San Francisco in the wintertime, by +/-8 hours -- which means that some conversion to or from UTC is being attempted, but only working halfway.
Trying to keep a handle on which libraries are attempting, and which are failing, to convert times is a losing battle, so I'm trying to do the right thing: save all my times in the database in UTC, and convert them to and from the user's local time as close to the UI as possible. Unfortunately, a variety of gotchas in Ruby, ActiveRecord, and PostgreSQL make this trickier than it should be. Here's a little catalog of my workarounds.
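As a concrete illustration of the "store UTC, convert at the edges" rule (the times and the -8 offset are just examples, not from any particular workaround below):

```ruby
require 'time'

# A user in San Francisco (UTC-8 in winter) enters a local time.
local = Time.parse("2010-12-15 09:00:00 -0800")

stored = local.getutc              # persist this in the database: 17:00 UTC
shown  = stored.getlocal("-08:00") # convert back as close to the UI as possible
```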
We all have multi-core machines these days, but most RSpec suites still run in one sequential stream. Let's parallelize!
The big hurdle here is managing multiple test databases. When multiple specs are running simultaneously, they each need exclusive access to the database, so that one spec's setup doesn't clobber the records of another spec's setup. We could create and manage multiple test databases within our RDBMS. But I'd prefer something a little more ... ephemeral, that won't hang around after we're done or require any manual management.
Enter SQLite's in-memory database, which is a full SQLite instance, created entirely within the invoking process's own memory footprint.
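A sketch of wiring this up, assuming the sqlite3 and activerecord gems are available (connection details beyond the adapter and the ":memory:" database name are illustrative):

```ruby
require 'active_record'
require 'sqlite3'

# Point each spec-runner process at its own in-memory database.
# Nothing to create or clean up: the database disappears with the process.
ActiveRecord::Base.establish_connection(
  :adapter  => 'sqlite3',
  :database => ':memory:'
)

# The schema must be loaded into each fresh in-memory database, e.g.
#   load 'db/schema.rb'
```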
What's the best way to import a million records into a Postgres database via ActiveRecord (which is needed to implement some application-specific logic)? We anticipate waiting a second (or so) between inserts to avoid slowing down the production database (which is under load, almost entirely reads). If there is an ActiveRecord feature that helps batch inserts together, no one knew about it. As for how long this will take overall (estimates range from 9 to 27 hours), and what the load on the production database will be, we planned to answer that with a trial run on a small number of these records.
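A sketch of the pacing idea; the method name and batch parameters are hypothetical, and the block stands in for whatever ActiveRecord call applies:

```ruby
# Sketch: insert rows in small batches, sleeping between batches so the
# busy production database gets breathing room. The block does the
# per-row ActiveRecord work (e.g. Record.create!(attrs)).
def paced_import(rows, batch_size, pause)
  rows.each_slice(batch_size) do |batch|
    batch.each { |row| yield row }
    sleep pause
  end
end

# e.g. paced_import(million_rows, 500, 1.0) { |attrs| Record.create!(attrs) }
```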
We're thinking of having Capistrano deploy to two demo servers: one aimed at showing the application to prospective users, the other mostly for story acceptance. The former would be hosted at a hosting company; the latter on an internally run machine. Several people reported having done this on their projects, and the problems were minor, mostly having to do with whether the deployed location (/u/apps/whatever or some such) differs between the two machines (the solution is to use Capistrano variables, though tracking down all the places that need them could be an issue).
Erector tip of the day: in a Rails project, you can put a file (named edit.rb or edit.html.rb) in your view directory, and Rails/Erector will find the template implicitly (as it would for ERB, HAML, etc). It is not necessary to explicitly call render from your controller method.
Using multiple buckets for Amazon S3. One of our sites has a lot of images (perhaps 30+ photos per page, different for each page and user) and got significant benefits from using four buckets instead of one. Multiple buckets allow browsers to fetch several images in parallel. Increasing the count beyond four probably wouldn't help, as browsers have a limit on how many parallel requests they will send.
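One way to spread images across buckets deterministically (bucket names here are hypothetical): hash each image key, so the same image always comes from the same bucket (preserving browser caching) while a page's images fan out across all four:

```ruby
require 'zlib'

# Pin each image to one of four buckets by hashing its key. The same key
# always maps to the same bucket, so cached URLs stay stable, while a
# page full of different images fans out across all four hosts.
BUCKETS = %w[assets0 assets1 assets2 assets3]  # hypothetical bucket names

def bucket_for(key)
  BUCKETS[Zlib.crc32(key) % BUCKETS.size]
end
```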
Amazon S3 now has a copy command. This could be useful, for example, if you have a lot of data in a single bucket and want to move it to multiple buckets; copying is faster than downloading and re-uploading all that data. The Ruby S3 gem, however, only supports copying within a single bucket, so you'll need to bypass the gem.
We wrote a script to dump a local SQL database and copy it up to a remote server (for example, a demo or production server). This is in contrast with a script we wrote some months ago which copies from demo to a local workstation (for test data, reproducing data-driven bugs, etc). The push to remote feature was for a situation in which there was a bunch of data to be generated (based on some XML input files) and we could afford to bog down a workstation for half an hour, but not an overloaded (and perhaps underpowered) server.
Deprec is a set of Capistrano recipes for setting up a remote server (in conjunction with deploying an application): creating accounts, ssh keys, init scripts, logrotate, and so on.
Capistrano 2.3 has weird sudo issues (when deleting old releases, or something similar). We recommend Capistrano 2.5.
(6:30 pm: updated to use mysqldump) (12/14/07: updated to remove db:reset since the Rails 2.0 version now does something different.) (12/15/07: updated to not set ENV['RAILS_ENV'] since that gets passed down to child processes)
There was an old hacker who lived in a shoe; she had so many migrations she didn't know what to do. Every time her build ran clean, she spent a whole minute staring at the screen.
Fortunately, she read this blog post, and now her db:setup task is so fast she's started building multiple test environments so she can run tests in parallel!