We'll respond shortly.
Last week, I was tasked with diving into Pivotal’s allocations application to figure out why it was operating so slowly and hopefully make it a bit better. The application was written as a side project about 4 years ago, and clearly showed its age. It’s not every day I run into an application that uses RJS! Anyways, I was able to use an incremental refactoring-based approach to improve the speed by about 80% or so. Edward and Josh Knowles suggested that I write up a bit about what I saw and how I improved it, in hopes that other engineers can make use of these performance tuning and refactoring concepts on other projects.
So, without further ado…
Any time I look into a performance feature, I try to focus on getting a clear piece of functionality to optimize (in this case, the main project matrix page) and measure the heck out of its performance both on production and locally. So, the first place to go was NewRelic. NewRelic told me a couple of things – first off, there were a lot of database queries hogging most of the time. Secondly, most of those database queries involved an ActiveRecord object called ‘PersonRange’.
Adding some manual benchmarking (it’s easy! Just add a ‘stamp’ function and a before/after filter to generate a report of your timestamps) told me that many of the database queries were actually happening during view processing – a big no-no. I had a direction of investigation.
Like a lot of Rails projects, the view on allocations relies on a series of partials to generate a large matrix. All of these partials are looped – and with the main matrix page looping over projects, each project looping over weeks, each week looping over allocation tiles, you can imagine how the numbers add up quickly.
My first line of improvement was to streamline the innermost partial as much as possible. First off, I replaced the partial with a helper method. In general, looping over a partial is slower than calling the partial as a collection, which is in turn slower than using a helper method to generate markup. The innermost partial was very simple and easily lent itself to being a helper method. For good measure, I turned the ‘loop over allocation tiles’ code snippet into a helper method as well.
When I did this, I naturally started looking at the parameters this method needed. One strangeness – a random lookup hash named ‘roles’ was passed to this method/partial. The partial then looked up an the person’s role from this hash. The lookup hash itself was generated through a DB query generated by a helper method in the next outer partial (project_week_cell), so it generated a DB query per project per week.
On a parallel note, we also needed to look up people’s locations on a week by week basis, and there were some on the fly DB calls happening in the view layer for this as well.
So where did role and location come from? Lo and behold, both of these properties were methods of a single PersonRange object.
My direction was clear. In the controller layer, I made one set of queries to figure out the relevant person range for each person for each week. This cache was used for all location and role questions from that point forward. The cache was plopped into the main ProjectAllocationMatrix object along with the other preexisting caches, of which there were many. Boom, 50% speed improvement.
Before I could tackle a bigger refactor, I needed to simplify and organize the code a bit. The codebase had some obvious things to organize, that wouldn’t affect the performance much but would clarify flows and responsibilities.
Even after these changes, I was still seeing some database calls in the view layer. I decided to track them down and get rid of them.
All of these changes chopped another 25% or so off of the load time.
Now the basic matrix page render was in fairly decent shape, and I moved onto the other mandate – making the drag/drop operate more smoothly. The mechanism was basically that it would perform the allocation change, recalculate the matrix for the changed projects, and then render RJS that refreshed the project rows for these changed projects.
When I looked at performance, I discovered that most of the time was spent generating two copies of the allocation matrix. The first matrix was the one mentioned in the controller (just calculating for those two projects); the second matrix was a full matrix calculated to generate refreshed billable percentages.
The interesting thing was that both matrices took almost the same time as each other. Restricting projects did not matter for performance.
As a first pass, I decided that the ‘restricted projects’ matrix was useless. If we used the bigger matrix for everything, we only had to calculate everything once and we would have everything we needed. I made this conversion and I cut server time in roughly half.
I was now ready to do another performance pass (beyond the minor improvements from having smaller markup).
I returned to some of the oddities of the first drag/drop pass.
First oddity – making an allocation matrix with only 2 projects of interest was just about as expensive as making an allocation matrix with all projects. I discovered that the reason was buried in the caches – there was a low level shared cache of allocation information used both to calculate project allocations and unallocated people.
Getting rid of this cache and making these two calculations retrieve just the information they needed was the way to go. When I did this, the full matrix stayed at about the same performance level it was before. The smaller matrix, however, became much faster.
This led to the question of whether I could get away with only using the smaller matrix. The answer appeared to be ‘yes’, provided I figured out how to keep the billable percentages over all projects up to date.
Every refactor leads to a new refactoring idea. Even though my week was finished, I saw plenty of other things that could be tightened up. Among my random thoughts: