before(:all) doesn’t do what you’d expect

Like many before me, last week I was bitten by RSpec’s interesting implementation of before(:all). Although the documentation (http://rspec.info/documentation/) clearly states that before(:all) ‘is run once and only once, before all of the examples and before any before(:each) blocks’, it in fact runs once for every context in the scope. This might be useful, but in every situation I’ve seen it isn’t. The documentation goes on to say

Warning: The use of before(:all) and after(:all) is generally discouraged because it introduces dependencies between the Examples. Still, it might prove useful for very expensive operations if you know what you are doing.

which implies that it really does only get invoked once – you wouldn’t want an expensive operation to be executed for every context.

RSpec’s author has explained that the behaviour is actually as he intended it and proposed to change the documentation, but so far that hasn’t happened.

So, why did I need before(:all) to operate as advertised? I have written a test that collects some metadata and then iterates over it to generate common Examples for every attribute in the metadata. This is great because, as the metadata is expanded, the test suite expands automatically, ensuring consistent implementation across the whole of my API. The metadata is nested, so it makes sense to take advantage of RSpec’s contexts to nest the Examples so that the generated test results are self-documenting. Unfortunately, the setup for this test involves one expensive operation: logging in to the API to get the metadata and prepare to probe every attribute. Logging in takes several seconds. It should be faster, but that’s a different story: my test expands to nearly 5000 examples, so even if logging in took only half a second we would still waste over 40 minutes. As it is, with RSpec running my expensive operation for every context, the test was taking over 4 hours!
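As a rough check on that 40-minute figure, here is the arithmetic as a small Ruby sketch. It assumes one login per example and rounds the example count up to 5000; the variable names are mine, purely for illustration.

```ruby
# Back-of-the-envelope estimate, assuming one login per generated example.
examples      = 5000   # "nearly 5000 examples", rounded up
login_seconds = 0.5    # the hypothetical best-case login time

wasted_minutes = examples * login_seconds / 60.0
puts format('%.1f minutes spent logging in', wasted_minutes)
```

With the real login time of several seconds, the same sum lands in the hours.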

A blog post helped me on my way, but it focussed on resetting the database between tests and didn’t work with Rails RSpec. I modified it to work with Rails RSpec and refactored it to make it usable in any test. Using it is simple: add the following to your spec_helper.rb file:

# HORRIBLE HACK to work around RSpec's broken before(:all) which does not get run
# once as per the documentation - it runs once at the level defined and once per
# sub-context!
# Extend ActiveSupport::TestCase with a 'before_once' that takes a block just like
# before(:all), but only gets called once at the top-level.
# Inspired by http://sickill.net/blog/2009/11/23/quick-and-dirty-hack-for-rspec-before-all.html
# but recast so that it can be used in any test.
class ActiveSupport::TestCase
  # Like before(:all), but only runs once at the top context and not for every
  # sub-context. It keeps track of the classes that are registered and does not
  # invoke the block in subclasses.
  def self.before_once
    _before_once_class_parents = []
    self.instance_variable_set(:"@_before_once_class_parents", _before_once_class_parents)
    before(:all) do
      unless _before_once_class_parents.select{|g| self.class.to_s =~ /^#{g}/}.any?
        _before_once_class_parents << self.class.to_s
        yield
      end
    end
  end
end

It adds a before_once method to ActiveSupport::TestCase that you can use just like before(:all), but its block runs only once, for the context in which it is declared, and not again for each nested context.
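To see why the prefix check works, here is a plain-Ruby sketch of the same guard with RSpec taken out of the picture. The OnceGuard class, its method names and the context names are all hypothetical; the only assumption carried over from the hack is that a nested context’s class name starts with its parent’s class name.

```ruby
# Plain-Ruby sketch of the guard behind before_once (no RSpec required).
# OnceGuard and run_for are hypothetical names, for illustration only.
class OnceGuard
  def initialize
    @seen_prefixes = []   # names of contexts whose setup has already run
  end

  # Runs the block unless a context whose name is a prefix of
  # `context_name` has already run it. Returns true if the block ran.
  def run_for(context_name)
    return false if @seen_prefixes.any? { |prefix| context_name.start_with?(prefix) }
    @seen_prefixes << context_name
    yield
    true
  end
end

logins = 0
guard  = OnceGuard.new

# Stand-ins for the class names of a top-level context and its nested contexts.
contexts = ["ApiSpec", "ApiSpec::Users", "ApiSpec::Users::Name", "ApiSpec::Orders"]
contexts.each do |name|
  guard.run_for(name) { logins += 1 }   # the "expensive login"
end

puts "logged in #{logins} time(s) for #{contexts.size} contexts"
```

Only the first, top-level name gets past the guard; every nested name matches its prefix and is skipped.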

So now, using before_once, I can ensure that my test only logs into the API once and not once per attribute. The net effect is a test that runs in about 90 seconds rather than taking half a day and climbing!

Performance implications of procs and blocks in Ruby

Whilst profiling the performance of some code, I wondered why a method that was yielding to a small block of code was taking so long.

The answer was in the method arguments, where out of habit I capture the block argument with the ampersand (&) operator, which converts it to a proc. If all the method does is yield to the code passed to it, there is no need for that conversion: yield invokes the block directly, in the context it was written in, without creating a Proc object for it.

An example to clarify:

# if you just want to yield to code, this way is inefficient
# as the block argument will be converted to a proc
def foo(argument, &block)
  if some_logic
    block.call  # if we are just yielding to the block in the
                # context it is already in, this is unnecessary.
  end
end

# if you are just yielding to code, this way is better
def bar(argument)
  if some_logic
    yield  # much better, there was no overhead
           # of converting the block into a proc
  end
end

# called like so
bar(my_argument) { p "Hello" }
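If you want to measure the difference on your own machine, a minimal sketch using Ruby’s standard Benchmark library follows. The method names and the iteration count are my own, and the size of the gap depends heavily on the Ruby version, so treat the numbers as indicative only.

```ruby
require 'benchmark'

# Captures the block as a Proc object and calls it explicitly.
def with_block_arg(&block)
  block.call
end

# Yields to the block without capturing it.
def with_yield
  yield
end

iterations = 200_000   # arbitrary; increase for steadier timings

Benchmark.bm(12) do |x|
  x.report("&block.call") { iterations.times { with_block_arg { 1 } } }
  x.report("yield")       { iterations.times { with_yield { 1 } } }
end
```

Both methods return the same value for the same block, so the only thing being compared is the cost of the call style.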

Blocks and procs are complicated; here are a couple of resources:

Profiling your Ruby on Rails application

At some point you want to check that there aren’t any really slow bits in your application and, even if there aren’t, you might like to know where to spend effort on optimising. Luckily for you, the script/performance/request script coupled with the ruby-prof gem produces very useful profiling reports.

Getting script/performance/request to work with the standard gem (version 0.6.0) is impossible; however, that nice man Jeremy Kemper from 37signals has published a version (0.6.1) that does work! Hurrah!

You can just install jeremy-ruby-prof from the git gem repo; however, this installs the gem under the wrong name for use with script/performance/request. Instead, download the source, build the gem and then install it from the local gem file. E.g. (on an Ubuntu box):

wget http://github.com/jeremy/ruby-prof/tarball/89e2a4bc3f5881519a2fe1e5c5c05f7e1e0acf6e
tar -xf jeremy-ruby-prof-89e2a4bc3f5881519a2fe1e5c5c05f7e1e0acf6e
cd jeremy-ruby-prof-89e2a4bc3f5881519a2fe1e5c5c05f7e1e0acf6e
rake gem
sudo gem install pkg/ruby-prof-0.6.1.gem

Ta da! It is installed with the right name, and now you will be able to create yourself a benchmarking environment and profiling script like the tutorial at Railscasts – Request Profiling.

By default, two outputs are generated in your tmp/ directory: an HTML call graph (see Reading Call Graphs) and a flat profile (txt) file.

Writing ruby code that only executes in development mode

Our application does a lot of parsing of tree structures in development mode, which helps our application developers create squeaky-clean code. However, in production we don’t want or need the overhead of all these additional recursive tree-parsing methods, but we also don’t want to change the application code.

The answer is to leave the application code making calls to the platform, but adapt the methods in the platform so that their bodies only execute in development mode.

We have a module that is mixed in where appropriate:

module DefineInDevelopment

  # Defines a method used to define other methods that are only available in development mode
  if ENV['RAILS_ENV'] == "development"
    def define_in_development
      yield
    end
  else
    def define_in_development
    end
  end

end

Then in any method where we only want the processing overhead in development, we wrap the method’s body in a define_in_development method call passing a block:

include DefineInDevelopment
def my_expensive_method_for_development(arg1, arg2)
  define_in_development do
    # really expensive code
  end
end
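To convince yourself that the block really is skipped outside development, here is a self-contained sketch you can run without Rails. It forces RAILS_ENV to "production" before the module is defined; the TreeChecker class and its counter are hypothetical stand-ins for the real expensive code.

```ruby
# The ENV check happens once, when the module body is evaluated, so the
# environment must be set before the module is loaded.
ENV['RAILS_ENV'] = 'production'

module DefineInDevelopment
  if ENV['RAILS_ENV'] == 'development'
    def define_in_development
      yield
    end
  else
    def define_in_development
      # no-op: the block is accepted but never called
    end
  end
end

# Hypothetical class, for illustration only.
class TreeChecker
  include DefineInDevelopment

  attr_reader :checks_run

  def initialize
    @checks_run = 0
  end

  def validate(tree)
    define_in_development do
      @checks_run += 1   # stands in for the expensive tree parsing
    end
  end
end

checker = TreeChecker.new
checker.validate([:root, [:leaf]])
puts "checks run: #{checker.checks_run}"
```

Because the choice between the two definitions is made at load time, changing RAILS_ENV afterwards has no effect, and there is no per-call environment check to pay for.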

Simple, but incredibly effective! Exploring this also identified a performance gotcha which I discuss in my next blog entry.
