Nov

Solving Cucumber's Problems


Cucumber took the Rails community by storm a couple of years ago. For the first time, we had an easy way of exercising the full stack of our applications. Many people didn't even realize that behind the scenes there was another library, Webrat, doing all the hard work. Cucumber became the de facto way of writing end-to-end tests in Ruby.

When I wrote Capybara, it was mostly to improve the experience of writing Cucumber features. Over time, however, with the arrival of Steak and similar approaches, people realized that the Capybara API was quite efficient at driving acceptance tests on its own, and began using it with plain RSpec or another testing framework.

And all was well, case closed, right?

Over the last year, I kind of abandoned Cucumber to write acceptance tests in plain Ruby with Capybara and RSpec. Throughout this experience, I have tried to keep an open mind. My verdict is: I don't buy it. In the beginning, it was fantastic: the overhead of Cucumber was gone and we were insanely productive. But over time, cracks appeared. As the projects grew larger, the tests became more and more difficult to maintain. I have since tried to figure out why this is.

Imagine the scenario of creating a task with a particular title. In Capybara, this might look something like this:

visit(new_task_path)
fill_in('Title', :with => 'Buy milk')
click_button('Create')

Simple enough. Now imagine that we do this a lot, so we want to abstract it. That is also quite convenient:

create_task(:title => 'Buy milk')

That looks good. Now imagine that this task is attached to a milestone:

milestone = create_milestone(:name => '1.0')
create_task(:title => 'Buy milk', :milestone => '1.0')
create_task(:title => 'Drink milk', :milestone => '1.0')

That's quite okay too, but what if this is a common pattern, a milestone with multiple tasks?

create_milestone(:name => '1.0', :tasks => ['Buy milk', 'Drink milk'])

That looks fantastic!

Here's the problem though: no one ever builds this abstraction. There is so much overhead involved in implementing the create_milestone method, that in practice, it's simply not done. It's certainly not done for the first acceptance test that could have used it. And herein lies the whole crux of the problem: the default behaviour for acceptance tests in Ruby is to be unnecessarily verbose, and you have to constantly fight this behaviour in order to write maintainable tests.
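To give a feel for that overhead, here is a rough sketch of what the two helpers might look like. It assumes it is mixed into a test context that already provides Capybara's DSL (visit, fill_in, select, click_button). The route helpers (new_task_path, new_milestone_path) and the form labels ('Title', 'Name', 'Milestone') are assumptions made up for this illustration, not something taken from a real application.

```ruby
# Hypothetical helper module for acceptance tests. It expects the including
# context to provide Capybara's DSL and the (assumed) Rails route helpers.
module AcceptanceHelpers
  # Creates a task through the UI, optionally attaching it to a milestone.
  def create_task(options = {})
    visit(new_task_path)
    fill_in('Title', :with => options.fetch(:title))
    # Assumes the task form has a 'Milestone' select box.
    select(options[:milestone], :from => 'Milestone') if options[:milestone]
    click_button('Create')
  end

  # Creates a milestone through the UI, then one task per given title.
  def create_milestone(options = {})
    name = options.fetch(:name)
    visit(new_milestone_path)
    fill_in('Name', :with => name)
    click_button('Create')
    Array(options[:tasks]).each do |title|
      create_task(:title => title, :milestone => name)
    end
  end
end
```

Even in this small form, the helper has to know about two forms, their labels, and how tasks relate to milestones, which is exactly the kind of work that tends to get postponed.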

It is in abstracting these kinds of common patterns that Cucumber shines. In fact, this abstraction is probably still at too high a level for Cukes. If cukes are written like this:

Given there is a milestone called "1.0"
And there is a task called "Buy milk" for the milestone "1.0"
And there is a task called "Drink milk" for the milestone "1.0"
When I visit the homepage
And I click "Milestones"
And I click "1.0"
Then I should see "Buy milk"
And I should see "Drink milk"

Then you are not gaining any benefit from Cucumber at all. You really want something like this:

Given I am looking at a milestone with the tasks "Buy milk" and "Drink milk"
Then I should see "Buy milk" and "Drink milk"

In my experience, it's very difficult to write tests at this level of abstraction with Ruby and a lot easier to write them with Gherkin, the language that Cucumber features are written in.

Problems

Still, going back to Cucumber after being in Ruby land for a while, I encountered a number of problems. These are the same problems mentioned by many of the people who have abandoned Cucumber for plain Ruby.

  1. Having a separate test framework is annoying
  2. Mapping steps to regexps is hard
  3. Cucumber has a huge, messy codebase
  4. Steps are always global

I have written a new library, which I believe solves these problems.

Turnip

Turnip parses Gherkin feature files and runs them in RSpec. You run your feature files the exact same way you would run a normal spec file, and they are automatically run when you run your RSpec suite. So to run a feature file with Turnip, you would do something like:

$ rspec spec/acceptance/view_milestone.feature

Steps are implemented with strings instead of regexps, like this:

step "there is a task called :title" do |title|
  Task.create(:title => title)
end

It still allows for some variation in natural language by allowing a pseudo syntax for optional letters or alternative words:

step "there is/are :count monster(s)" do |count|
end

Just like Markdown, we're aiming for something which follows the natural conventions of writing text, instead of using the more arcane regexp syntax. The idea is to cover the 90% use case very well, instead of allowing every possible variation.
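To make the idea concrete, here is a rough sketch of how such a string pattern could be translated into a regexp. This is not Turnip's actual implementation, and compile_step_pattern and match_step are names made up for this illustration. It assumes a placeholder matches either a double-quoted string or a single bare word.

```ruby
# Converts a step pattern such as "there is/are :count monster(s)" into a
# Regexp. Handles three conventions, one whitespace-separated word at a time:
#   :placeholder  captures a "quoted string" or a single bare word
#   this/that     matches either alternative
#   word(s)       makes the parenthesized suffix optional
def compile_step_pattern(pattern)
  words = pattern.split(' ').map do |word|
    if word.start_with?(':')
      '(?:"([^"]*)"|(\S+))'
    elsif word.include?('/')
      alternatives = word.split('/').map { |alt| Regexp.escape(alt) }
      "(?:#{alternatives.join('|')})"
    else
      # After escaping, "monster(s)" reads "monster\(s\)"; turn the escaped
      # parentheses into an optional group.
      Regexp.escape(word).gsub(/\\\((\w+)\\\)/, '(?:\1)?')
    end
  end
  Regexp.new('\A' + words.join('\s+') + '\z')
end

# Returns the captured placeholder values, or nil if the text does not match.
def match_step(pattern, text)
  match = compile_step_pattern(pattern).match(text)
  match && match.captures.compact
end

match_step('there is/are :count monster(s)', 'there are 3 monsters') # => ["3"]
match_step('there is/are :count monster(s)', 'there is 1 monster')   # => ["1"]
```

Captured values come back as strings, so a step that needs a number would still have to convert it itself in this sketch.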

Turnip was written just to solve this particular, rather simple problem. There is no support for other programming languages, no wire protocol, it doesn't have its own runner or formatters or anything. Its only dependencies are rspec and gherkin.

In Turnip, steps can be local by scoping them to tags:

steps_for :interface do
  step "I do it" do
    click_link('Do it')
  end
end

steps_for :database do
  step "I do it" do
    Do.it!
  end
end

Now just tag the scenarios with the @interface and @database tags and you have different behaviour for the same step in different scenarios.

@interface
Scenario: do it through the interface
  When I do it

@database
Scenario: do it through the database
  When I do it
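
One way to picture tag scoping is as a lookup table keyed by tag. The toy registry below is only an illustration of that idea under my own assumptions, not how Turnip is implemented; StepRegistry and its find method are hypothetical names.

```ruby
# A toy registry that stores step definitions per tag, so the same step
# text can map to different behaviour depending on a scenario's tags.
class StepRegistry
  def initialize
    @steps = Hash.new { |hash, tag| hash[tag] = {} }
  end

  # Registers every step defined inside the block under the given tag.
  def steps_for(tag, &block)
    @current_tag = tag
    instance_eval(&block)
  ensure
    @current_tag = nil
  end

  # Stores the step body under the tag currently being defined.
  def step(description, &body)
    @steps[@current_tag][description] = body
  end

  # Returns the step body visible to a scenario carrying the given tags.
  def find(description, tags)
    tags.each do |tag|
      body = @steps.key?(tag) && @steps[tag][description]
      return body if body
    end
    raise ArgumentError, "no step #{description.inspect} for tags #{tags.inspect}"
  end
end

registry = StepRegistry.new
registry.steps_for(:interface) { step('I do it') { 'clicked the "Do it" link' } }
registry.steps_for(:database)  { step('I do it') { 'called Do.it!' } }

registry.find('I do it', [:interface]).call # => "clicked the \"Do it\" link"
registry.find('I do it', [:database]).call  # => "called Do.it!"
```

A scenario without a matching tag simply has no such step in this sketch, which is what keeps steps from leaking globally.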

Conclusion

I don't know if Turnip solves the problems that Cucumber has. I don't know if Cucumber is the right solution for you. I do believe that Cucumber has a lot of benefits which the hivemind of this community has too easily dismissed this past year or so. I have tried to separate the ideas of Cucumber from its implementation. Try it out and see if you like the result!
