Last week, I was tasked with diving into Pivotal's allocations application to figure out why it was operating so slowly and hopefully make it a bit better. The application was written as a side project about 4 years ago, and clearly showed its age. It's not every day I run into an application that uses RJS! Anyway, I was able to use an incremental, refactoring-based approach to improve the speed by about 80%. Edward and Josh Knowles suggested that I write up a bit about what I saw and how I improved it, in hopes that other engineers can make use of these performance tuning and refactoring concepts on other projects.
So, without further ado...
Any time I look into a performance problem, I try to focus on getting a clear piece of functionality to optimize (in this case, the main project matrix page) and measure the heck out of its performance, both in production and locally. So, the first place to go was NewRelic. NewRelic told me a couple of things: first, there were a lot of database queries hogging most of the time; second, most of those queries involved an ActiveRecord object called 'PersonRange'.
Adding some manual benchmarking (it's easy! Just add a 'stamp' function and a before/after filter to generate a report of your timestamps) told me that many of the database queries were actually happening during view processing - a big no-no. I had a direction of investigation.
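The stamp-and-report idea can be sketched in a few lines of plain Ruby. The module and method names below are my own invention, not from the allocations codebase:

```ruby
# Minimal sketch of the 'stamp' benchmarking approach; names are hypothetical.
module Stamping
  def reset_stamps
    @stamps = [['start', Time.now]]
  end

  def stamp(label)
    @stamps << [label, Time.now]
  end

  # Reports each stamp as seconds elapsed since reset_stamps was called.
  def stamp_report
    start = @stamps.first.last
    @stamps.map { |label, time| format('%-20s %.3fs', label, time - start) }.join("\n")
  end
end

# In a Rails 2.x controller this might be wired up with filters:
#   before_filter :reset_stamps
#   after_filter  { |c| Rails.logger.info(c.stamp_report) }
# with stamp('loaded projects') calls sprinkled through the action and views.
```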
Like a lot of Rails projects, the view on allocations relies on a series of partials to generate a large matrix. All of these partials are looped - and with the main matrix page looping over projects, each project looping over weeks, each week looping over allocation tiles, you can imagine how the numbers add up quickly.
My first line of improvement was to streamline the innermost partial as much as possible. First off, I replaced the partial with a helper method. In general, rendering a partial inside a loop is slower than rendering it with :collection, which is in turn slower than using a helper method to generate the markup directly. The innermost partial was very simple and lent itself easily to being a helper method. For good measure, I turned the 'loop over allocation tiles' code snippet into a helper method as well.
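As an illustration of that conversion (the tile markup and all names here are invented for the example, not taken from the allocations code):

```ruby
require 'erb'

# Hypothetical sketch of replacing the innermost partial with a helper.
# Before, the view rendered a partial per tile:
#   <%= render :partial => 'allocation_tile', :collection => allocations %>
# After, a helper builds the markup directly, with no partial setup cost:
module AllocationsHelper
  def allocation_tile(allocation)
    name = ERB::Util.html_escape(allocation.person_name)
    %(<div class="tile #{allocation.role}">#{name}</div>)
  end

  # The 'loop over allocation tiles' snippet, also pulled into a helper.
  def allocation_tiles(allocations)
    allocations.map { |a| allocation_tile(a) }.join("\n")
  end
end
```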
When I did this, I naturally started looking at the parameters this method needed. One strangeness: a lookup hash named 'roles' was passed to this method/partial, and the partial then looked up the person's role from this hash. The lookup hash itself was built by a DB query inside a helper method in the next outer partial (project_week_cell), so it generated a DB query per project per week.
On a parallel note, we also needed to look up people's locations on a week by week basis, and there were some on the fly DB calls happening in the view layer for this as well.
So where did role and location come from? Lo and behold, both of these properties were methods of a single PersonRange object.
My direction was clear. In the controller layer, I made one set of queries to figure out the relevant person range for each person for each week. This cache was used for all location and role questions from that point forward. The cache was plopped into the main ProjectAllocationMatrix object along with the other preexisting caches, of which there were many. Boom, 50% speed improvement.
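A rough sketch of that cache, with the PersonRange fields (person_id, starts_on, ends_on, role, location) guessed from context rather than taken from the real schema:

```ruby
# Sketch of the per-person, per-week cache; field names are assumptions.
class PersonRangeCache
  # ranges: all relevant PersonRange rows, loaded in one query up front
  # weeks:  the week-start dates shown by the matrix
  def initialize(ranges, weeks)
    @lookup = {}
    ranges.each do |range|
      weeks.each do |week|
        next unless range.starts_on <= week
        next unless range.ends_on.nil? || week <= range.ends_on
        @lookup[[range.person_id, week]] = range
      end
    end
  end

  def role_for(person_id, week)
    range = @lookup[[person_id, week]]
    range && range.role
  end

  def location_for(person_id, week)
    range = @lookup[[person_id, week]]
    range && range.location
  end
end

# In the controller: build once, then answer every view-layer role and
# location question without touching the database again.
#   cache = PersonRangeCache.new(PersonRange.find(:all), matrix.weeks)
```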
Before I could tackle a bigger refactor, I needed to simplify and organize the code a bit. The codebase had some obvious things to organize, that wouldn't affect the performance much but would clarify flows and responsibilities.
Even after these changes, I was still seeing some database calls in the view layer. I decided to track them down and get rid of them.
All of these changes chopped another 25% or so off of the load time.
Now the basic matrix page render was in fairly decent shape, and I moved on to the other mandate - making the drag/drop operate more smoothly. The mechanism was basically this: perform the allocation change, recalculate the matrix for the changed projects, and then render RJS that refreshed the project rows for those changed projects.
When I looked at performance, I discovered that most of the time was spent generating two copies of the allocation matrix. The first matrix was the one mentioned in the controller (just calculating for those two projects); the second matrix was a full matrix calculated to generate refreshed billable percentages.
The interesting thing was that both matrices took almost the same time as each other. Restricting projects did not matter for performance.
As a first pass, I decided that the 'restricted projects' matrix was redundant. If we used the bigger matrix for everything, we only had to calculate everything once, and we would have everything we needed. I made this conversion and cut server time roughly in half.
I was now ready to do another performance pass (beyond the minor improvements from having smaller markup).
I returned to some of the oddities of the first drag/drop pass.
First oddity - making an allocation matrix with only 2 projects of interest was just about as expensive as making an allocation matrix with all projects. I discovered that the reason was buried in the caches - there was a low level shared cache of allocation information used both to calculate project allocations and unallocated people.
Getting rid of this cache and making these two calculations retrieve just the information they needed was the way to go. When I did this, the full matrix stayed at about the same performance level it was before. The smaller matrix, however, became much faster.
This led to the question of whether I could get away with only using the smaller matrix. The answer appeared to be 'yes', provided I figured out how to keep the billable percentages over all projects up to date.
Every refactor leads to a new refactoring idea. Even though my week was finished, I saw plenty of other things that could be tightened up. Among my random thoughts:
"bundle install seems very slow every time, but bundle check seems fast. Why doesn't bundle install run bundle check before doing its thing?"
Consensus was that this seemed like a good idea.
"When setting up a cc.rb box, the box could not connect to Github, yielding the 'You don't exist, go away!' message. How do we fix this situation? We can get to github through the command line without any issues."
"In Rails 2.3, we tried mocking a has_one association. However, it looks like the association isn't actually being mocked. Why?"
Rails 2.3 associations have a proxy object that delegates to lower-level objects. This proxy isn't mockable, but the target (proxy_target) is.
"What is the current best of breed passenger config beyond what you get from the passenger site?"
Recommendations were given for mod_speed.
"What are some easy ways to implement CSS spriting on my site?"
For a quick definition of CSS sprites, look here. Recommendations included Compass/SASS.
"When I tried to clear cookies on IE8, the cookies stuck around anyway. I was only able to delete them through the developer toolbar. What's going on?"
The consensus theory was that the developer toolbar might be affecting IE8's cookie behavior (IE8 is not known for its robust extensions). More investigation seems in order.
Here's the situation: Rails validates_uniqueness_of has a flag called :case_sensitive. This flag defaults to 'true', but can be flipped.
MySQL's default collation is case-insensitive. As a result, queries will, in general, ignore case unless specifically overridden.
So one might imagine that setting :case_sensitive to false would be completely harmless in a standard MySQL application.
One would be wrong. Setting :case_sensitive to false wraps the field in question in LOWER(), which prevents MySQL from using any index on that column and turns the validates_uniqueness_of check from something cheap and quick into a full table scan.
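The difference shows up in the generated SQL. The model and column below are invented for illustration, and the SQL shown is approximate Rails 2.x output:

```ruby
class User < ActiveRecord::Base
  # Default (:case_sensitive => true) compares the raw column, so MySQL
  # can use an index on users.email. Roughly:
  #   SELECT users.id FROM users
  #   WHERE (users.email = 'bob@example.com') LIMIT 1
  validates_uniqueness_of :email

  # With :case_sensitive => false, Rails wraps the column in LOWER(),
  # which defeats the index and forces a full table scan. Roughly:
  #   SELECT users.id FROM users
  #   WHERE (LOWER(users.email) = 'bob@example.com') LIMIT 1
  # validates_uniqueness_of :email, :case_sensitive => false
end
```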
The open Lighthouse ticket on this issue is: https://rails.lighthouseapp.com/projects/8994/tickets/2503-validates_uniqueness_of-is-horribly-inefficient-in-mysql
"Any clever ways to catch out of bounds exceptions from Solr?"
This is a follow-up to yesterday's Solr question. After some investigation, it looks like none of the major providers catch out-of-bounds exceptions for very large numbers. Rather than instrumenting every Ruby call with validations to prevent these numbers from getting into Solr, are there any other brilliant ideas?
Follow-up to the help from 5/19/2010's SEO routing question. The latest hotness appears to be FriendlyId (http://github.com/norman/friendly_id). This plugin makes human-friendly slugs and comes with a variety of interesting features, including versioning and slug scoping.
Power RubyMine commands:
Goto File + line #: If you use ctrl-shift-N to go to a file, try typing in a line number after a colon, something like "my_file:30". You'll end up on that line.
Analyze stack trace: This tool lets you paste in an external stack trace, and gives you the ability to browse to all of the pieces of that stack trace.
"Our site requires crafting URLs in a very particular SEO-friendly way. Rails doesn't seem to give us a good solution for our URLs. Any suggestions?"
One of our clients needs to make their app accept and generate compound URLs built from author, series, and book segments, where author, series, and book are all different domain concepts. Rails RESTful resources don't really support this format. There wasn't an immediate solution, but among the peanut gallery of ideas:
Hyphens are better than slashes in URL crafting, but Rails routing doesn't separate on hyphens at all
to_param solutions - Overriding to_param to return a string starting with the integer ID generates URLs that look very slug-like but still work with standard Rails find mechanisms. For example, a book.to_param might be overridden to return "1-bookname". The problem with this solution is that it doesn't quite fit the requirements here, and it doesn't cover the compound needs.
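A minimal sketch of that override; the Book class and slug format are invented for the example:

```ruby
# Sketch of the integer-prefixed slug approach; Book is hypothetical.
class Book
  attr_reader :id, :name

  def initialize(id, name)
    @id, @name = id, name
  end

  # Rails uses to_param when generating URLs, e.g. /books/1-the-hobbit.
  def to_param
    "#{id}-#{name.downcase.gsub(/[^a-z0-9]+/, '-')}"
  end
end

# Because String#to_i reads only the leading digits, a lookup by
# params[:id] behaves as if the plain integer id had been passed:
#   '1-the-hobbit'.to_i  # => 1
```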
Custom routes are always a possibility. You can hook up a special (non-resource) controller that understands flexible browse-y routes like the one above, parses them, and delegates to the more standard resource controllers. The problem here is that you have to figure out a decent delegate pattern and route generation pattern.
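A sketch of what such a custom route might look like in Rails 2.x; the segment names, controller, and slug finders are all invented:

```ruby
# Hypothetical route for flexible, browse-y compound URLs.
ActionController::Routing::Routes.draw do |map|
  # e.g. /books/some-author/some-series/some-book
  map.browse 'books/:author_slug/:series_slug/:book_slug',
             :controller => 'browse', :action => 'show'
end

# The browse controller parses the slugs and delegates to the standard
# resource lookups (find_by_slug! assumes a slug column on each model):
#
# class BrowseController < ApplicationController
#   def show
#     @author = Author.find_by_slug!(params[:author_slug])
#     @series = @author.series.find_by_slug!(params[:series_slug])
#     @book   = @series.books.find_by_slug!(params[:book_slug])
#   end
# end
```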
In general, URL crafting is a separate art from domain model crafting, and Rails doesn't really cater to this. You will have to design URL-centric code to suit your URL crafting.
"Any ideas on ways to performance test IE7?"
No immediate ideas, but potentially more later.
"When users enter very large search parameters for numbers we get the following exception out of RSolr:"
RSolr::RequestError: Solr Response: For_input_string_11111111111111111111__javalangNumberFormatException
Is there an elegant solution to this aside from validating that all input parameters aren't larger than max int?
When using named scope methods that refer to other named scope methods, you may discover that your SQL has some redundant condition clauses. This is a bug in Rails 2, and has been true for several versions. However, it's a harmless bug - MySQL will understand the extraneous condition clauses just fine, without performance implications.
Mocking Paperclip for tests is a careful art. See our other blog post: Stubbing out Paperclip ImageMagick in Tests
"Anyone have good strategies for using S3 as a content delivery network for static files?"
Using S3 as a CDN is pretty common. S3 is certainly cheap, and fairly easy to set up. However, latency can be high - S3 isn't built to act as a CDN, so performance can be lacking. In addition, you need to sort out the paths in your CSS files so background images resolve correctly; relative paths are a common technique here.
Serving files from your public directory performs much better. Amazon's CloudFront is another (more expensive) option.
Note observation #4 in this blog article: link
"I can't get ImageMagick to work on Snow Leopard. What gives?"
A brief look online shows several step-by-step instructions. It's unclear what this particular problem is about.
"After upgrading to the latest version of Mocha, any_instance doesn't clean up after itself. Why?"
Mocha's any_instance stubbing is one of the few features that distinguish Mocha from other mock frameworks.
One suggestion was to update rspec as well.
"Heroku 1.5.3 isn't letting me use heroku rake commands. What can I do?"
Upgrade to Heroku 1.5.6.
EY Cloud's slave database functionality is broken right now. It's supposed to be fixed this afternoon.
Amazon restricts you to 20 EBS volumes/EC2 instances per account by default. The trick here is that deleting volumes does not immediately free up quota - volumes stay in a 'Deleting volume' state for an indefinite amount of time before they are truly gone, making it hard to plan capacity around them. Finding these deleting volumes can be a real challenge: the AWS API can find them, but the EY Cloud GUI cannot.
If you need to manipulate AWS credentials for EY Cloud, it's fairly easy to go to the machine and find the appropriate file - /etc/.mysql.backups.yml
"Ever since we upgraded to RSpec 1.2.9, we haven't seen any stack traces. What gives?"
One of our projects lost stack traces as soon as they upgraded to RSpec 1.2.9. Reverting to RSpec 1.2.8 fixed the problem. No other projects have reported the issue yet.
"What's a good design for sharing a page cache across multiple servers?"
One of our clients would like to have a distributed server environment share its page cache. At this point, they're relying on GFS to do this, but that solution appears to have reliability problems.
Several engineers questioned the necessity for such a thing, but memcached appeared to be the solution of choice.
"Any information on RabbitMQ?"
One of our engineers is beginning to play with RabbitMQ. Anyone who has good comments about this technology, please feel free to chime in.
Rails maintains an internal array of files that it 'knows' about for class reloading. However, load and require bypass this mechanism and lock files into place.
If you'd like to add a require that plays nicely with class reloading, use require_dependency "location" instead. This method, added by Rails, requires the file AND registers it with ActiveSupport::Dependencies.
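The distinction in practice, with an invented file name:

```ruby
# In development mode, Rails unloads and reloads constants between
# requests, but only for files tracked by ActiveSupport::Dependencies.

require 'report_builder'
# Loads the file once; the constants it defines are locked in place
# until the server restarts, even in development mode.

require_dependency 'report_builder'
# Loads the file AND registers it with ActiveSupport::Dependencies,
# so edits to it show up on the next request in development mode.
```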
"How do I make attachment_fu use both the file system and S3 as storage backends?"
One of our clients would like to migrate attachments from the file system to S3. They want a clever way to make attachment_fu look in S3 or the filesystem, where new files are in S3 and old files are in the filesystem.
Their current solution, which they're not super happy with, is to monkeypatch the S3 backend by extending it with file system methods. This solution doesn't really seem to work too well, since the two backends share some of the same methods, and calling "extend FileSystemBackend" doesn't give them the freedom to pick and choose their methods. In addition, their patch makes the S3 backend not be an S3 backend any more, which could cause problems for maintenance down the road.
A better solution is to define a new backend object, based on the S3 backend but falling back to file system-style methods. Attachment_fu supports defining custom backend modules; a class using :storage => :my_storage would look for a backend module called Technoweenie::AttachmentFu::Backends::MyStorageBackend.
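A sketch of such a backend, assuming attachment_fu's module layout; the stored_in_s3? predicate (say, a boolean column flipped when an attachment is migrated) and legacy_filesystem_path helper are invented for the example:

```ruby
# Hypothetical hybrid backend: behaves like the S3 backend, but falls
# back to the file system for attachments that haven't been migrated.
module Technoweenie
  module AttachmentFu
    module Backends
      module MyStorageBackend
        include Technoweenie::AttachmentFu::Backends::S3Backend

        # current_data is how attachment_fu backends fetch file contents.
        # super reaches S3Backend#current_data via the module chain.
        def current_data
          stored_in_s3? ? super : File.read(legacy_filesystem_path)
        end

        # legacy_filesystem_path (invented) would reproduce the old
        # FileSystemBackend full_filename logic for pre-migration rows.
      end
    end
  end
end

# A model then opts in with:
#   has_attachment :storage => :my_storage
```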
Attachment_fu still has a design problem. The backend objects are all modules, not classes. As a result, it's not easy to make a new backend descend from one of the existing backends.
Issue tracking: http://github.com/wycats/bundler/issues/#issue/134