Blog


Nov

Solving Cucumber's Problems


Cucumber took the Rails community by storm a couple of years ago. For the first time, we had an easy way of excercising the full stack of our applications. Many people didn't even realize that behind the scenes there was another library, Webrat, doing all the hard work. Cucumber became the de-facto way of writing end-to-end tests in Ruby.

When I wrote Capybara, it was mostly to improve the experience of writing Cucumber features. Over time however, with the arrival on the scene of Steak and similar approaches, people realized that the Capybara API was quite efficient at driving acceptance tests on its own and began using it with just plain RSpec or another testing framework.

And all was well, case closed, right?

Over the last year, I kind of abandoned cucumber, to write acceptance tests in plain Ruby, Capybara and RSpec. Throughout this experience, I have tried to keep an open mind. My verdict is: I don't buy it. In the beginning, it was fantastic, the overhead of Cucumber was gone, we were insanely productive. But over time, cracks appeared. As the projects grew larger, the tests became more and more difficult to maintain. I have since tried to figure out why this is.

Imagine the scenario of creating a task with a particular title, in Capybara, this might look something like this:

visit(new_tasks_path)
fill_in('Title', :with => 'Buy milk')
click_button('Create')

Simple enough. Imagine that we do this a lot, now we want to abstract this. Also quite convenient:

create_task(:title => 'Buy milk')

That looks good. Now imagine that this task is attached to a milestone:

milestone = create_milestone('name' => '1.0')
create_task(:title => 'Buy milk', :milestone => '1.0')
create_task(:title => 'Drink milk', :milestone => '1.0')

That's quite okay too, but what if this is a common pattern, a milestone with multiple tasks?

create_milestone('name' => '1.0', :tasks => ['Buy milk', 'Drink milk'])

That looks fantastic!

Here's the problem though: no one ever builds this abstraction. There is so much overhead involved in implementing the create_milestone method, that in practice, it's simply not done. It's certainly not done for the first acceptance test that could have used it. And herein lies the whole crux of the problem: the default behaviour for acceptance tests in Ruby is to be unnecessarily verbose, and you have to constantly fight this behaviour in order to write maintainable tests.

It is in abstracting these kinds of common patterns that Cucumber shines. In fact, this abstraction is probably still at too high a level for Cukes. If cukes are written like this:

Given there is a milestone called "1.0"
And there is a task called "Buy milk" for the milestone "1.0"
And there is a task called "Drink milk" for the milestone "1.0"
When I visit the homepage
And I click "Milestones"
And I click "1.0"
Then I should see "Buy milk"
And I should see "Drink milk"

Then you are not gaining any benefit from cucumber at all. You really want something like this:

Given I am looking at a milestone with the tasks "Buy milk" and "Drink milk"
Then I should see "Buy milk" and "Drink milk"

In my experience, it's very difficult to write tests at this level of abstraction with Ruby and a lot easier to write them with Gherkin, the language that cucumber features are written in.

Problems

Still, going back to Cucumber after being in Ruby land for a while, I encountered a number of problems. These are the same problems that are mentioned by many abandoning cucumber for plain Ruby.

  1. Having a separate test framework is annoying
  2. Mapping steps to regexps is hard
  3. Cucumber has a huge, messy codebase
  4. Steps are always global

I have written a new library, which I believe solves these problems.

Turnip

Turnip parses Gherkin feature files and runs them in RSpec. You run your feature files the exact same way you would run a normal spec file, and they are automatically run when you run your RSpec suite. So to run a feature file with Turnip, you would do something like:

$ rspec spec/acceptance/view_milestone.feature

Steps are implemented with strings instead of regexps, like this:

step "there is a task called :name" do |title|
  Task.create(:title => title)
end

It still allows for some variation in natural language by allowing a pseudo syntax for optional letters or alternative words:

step "there is/are :count monster(s)" do |count|
end

Just like Markdown, we're aiming for something which follows the natural conventions of writing text, instead of using the more arcane regexp syntax. The idea is to cover the 90% use case very well, instead of allowing every possible variation.

Turnip was written just to solve this particular, rather simple problem. There is no support for other programming languages, no wire protocol, it doesn't have its own runner or formatters or anything. Its only dependencies are rspec and gherkin.

In Turnip, steps can be local by scoping them to tags:

steps_for :interface do
  step "I do it" do
    click_link('Do it')
  end
end

steps_for :database do
  step "I do it" do
    Do.it!
  end
end

Now just tag the scenarios with the @interface and @database tags and you have different behaviour for the same step in different scenarios.

@interface
Scenario: do it through the interface
  When I do it

@database
Scenario: do it through the database
  When I do it

Conclusion

I don't know if Turnip solves the problems that Cucumber has. I don't know if Cucumber is the right solution for you. I do believe that Cucumber has a lot of benefits which the hivemind of this community has too easily dismissed this past year or so. I have tried to separate the ideas of Cucumber from its implementation. Try it out and see if you like the result!

Aug

You're Cuking It Wrong


Opinions on cucumber seem to be divided in the Ruby community. Here at Elabs we've been using cucumber to fantastic success on all of our projects for more than a year. At the same time Steak and projects like it seem to be gaining traction; some people are seemingly frustrated and fed up with cucumber.

So where does this gulf of experiences come from, why is cucumber loved by some and hated by others. At the risk of over-generalisation and mischaracterisation I recently came up with a theory: the cucumber detractors are not using cuke the way it was intended.

This is in fact not their fault. The entire cucumber ecosystem, and in fact even cucumber itself, encourage its misuse.

A while ago someone created an issue on the Capybara issue tracker. The interesting thing about this issue wasn't the problem itself, but rather the cucumber feature that the author presented in order to replicate the problem. This is the feature the author submitted:

Scenario: Adding a subpage
  Given I am logged in
  Given a microsite with a Home page
  When I click the Add Subpage button
  And I fill in "Gallery" for "Title" within "#document_form_container"
  And I press "Ok" within ".ui-dialog-buttonpane"
  Then I should see /Gallery/ within "#documents"

At first glance this seems reasonable. But contrast this with the following, improved version:

Scenario: Adding a subpage
  Given I am logged in
  Given a microsite with a home page
  When I press "Add subpage"
  And I fill in "Title" with "Gallery"
  And I press "Ok"
  Then I should see a document called "Gallery"

The difference isn't huge, the steps are largely the same, and there's an argument to be made for writing in a more declarative style, but there's one crucial difference: the first feature is code, the second isn't.

The argument against cucumber that's often presented is that as a programmer, plain text is unnecessary, because we can all read code. While it's true that we all can read code, I still find it beneficial to jump out of the code writing mode for describing the behaviour of the application. When you're writing features first, you don't want to be bothered with the details of how this functionality works. In this initial stage you care nothing about the implementation, about how the result is achieved. You care nothing about things like #document_form_container or .ui-dialog-buttonpane.

I believe that it's in this switching between designer mode and developer mode where cucumber, done right, really shines.

In order to evaluate the bigger picture before hacking
As a developer
I want to write my stories before writing my code

There are some secondary benefits as well. Writing truly plain text features leads to better maintainability as well, since the features are robust against code changes. Plain text is also easier to understand for new developers coming to an existing project. Probably the nicest advantage though is that over time a library of steps is built up, which can then be simply combined to describe new features.

The above feature is nicely illustrative of this anti-pattern, but it is far from the only example. In many of our cucumber suites here at Elabs, we have steps like the above, some of them were written by me. Which leads me to what's really wrong with the last three lines of the above feature. They are written using nothing else than the standard web steps generated by cucumber-rails own generator. Cucumber itself ships with steps which in my opinion encourage an anti-pattern.

Pickle my fancy

Another tool which we've experimented a bit with is Pickle, which allows you to easily generate models from your feature files. A basic example from the README:

Given a user exists
And a post exists with author: the user

Given a person: "fred" exists
And a person: "ethel" exists
And a fatherhood exists with parent: user "fred", child: user "ethel"

It actually looks fairly nice, reads quite naturally, so a first instinct might be to call this plain text. But on closer inspection, there is a whole language in there. To comprehend what these steps are doing you'd need to understand not only the domain models involved, but also the language Pickle uses to manipulate these. I'm pretty sure a non-technical person couldn't make sense of the above. This is really no different, and in fact worse, than writing actual code:

@user = User.make
Post.make(:user => @user)

@fred = Person.make(:name => 'Fred')
@ethel = Person.make(:name => 'Ethel')
Fatherhood.make(:user => @fred, :child => @ethel)

Note how there is an almost one-to-one mapping between the feature above, and the code below. The only thing cucumber does in this case is act as some kind of phoney translator. We write code, but not actual code. So we can do some stuff, but mostly it comes out worse than if we'd just written it as code in the first place. I can't blame anyone for disliking cucumber when using it like this.

However, try this instead:

Given there is a user called "Jimmy"
And there is a post authored by "Jimmy"

Given there is a person called "Fred"
And there is a person called "Ethel"
And "Fred" is the father of "Ethel"

There's not a huge difference between the first couple of lines, even though they read somewhat nicer when written out like this. The real difference is in the last line. Here cucumber is adding value by explaining this abstract concept of a Fatherhood into something very concrete: one person is the other's dad. Cucumber added value to this feature, instead of only acting as a hindrance.

I believe that Pickle is flawed as a concept, in order to achieve readable steps, they need to be written by hand.

The worst feature ever written.

As a curiosity, I present the worst cucumber feature known to man. If you are responsible for something like this, please go slap yourself in the face as hard as you can.

Scenario: User creates some sites and circuits, check connected sites list
    Given a "site" exists with {"name"=>"Somewhere1", "identifier" => "TER1", "provider"=>"TER1 Provider"}
    And a "site" exists with {"name"=>"Somewhere2", "identifier" => "TER2", "provider"=>"Some Provider"}
    And a "site" exists with {"name"=>"Somewhere3", "identifier" => "TER3", "provider"=>"TER3 Provider"}
    And a "circuit" exists with {"provider_name"=>"Another provider", "redacted_circuit_id"=>"ABC1", "provider_circuit_id"=>"C1", "circuit_type"=>CircuitType.find_by_name("Peering"), "service_type"=>CircuitServiceType.find_by_name("Dark Fiber"), :capacity => CircuitCapacity.find_by_name("1 Gbps"), "physical_wire_type"=>PhysicalWireType.find_by_name("Multi Mode Fiber"), "status"=> CircuitStatus.find_by_name("Cancelled"), "a_end"=>Site.find_by_identifier("TER1"), "b_end"=>Site.find_by_identifier("TER2")}
    And a "circuit" exists with {"provider_name"=>"Switch and Data", "redacted_circuit_id"=>"ABC2", "provider_circuit_id"=>"C2", "circuit_type"=>CircuitType.find_by_name("Backbone"), "service_type"=>CircuitServiceType.find_by_name("Dark Fiber"), :capacity => CircuitCapacity.find_by_name("1 Gbps"), "physical_wire_type"=>PhysicalWireType.find_by_name("Multi Mode Fiber"), "status"=> CircuitStatus.find_by_name("Cancelled"), "a_end"=>Site.find_by_identifier("TER1"), "b_end"=>Site.find_by_identifier("TER3")}
    When I am on the "connected_sites" page for site "TER1"
    Then the "connected-sites-list" should look like
      |   Site ID    |  Site Name | Site Provider | Provider Circuit ID  | Provider Name    | Circuit Status |
      |     TER2     | Somewhere2 | Some Provider |         C1           | Another provider |  Cancelled     |
      |     TER3     | Somewhere3 | TER3 Provider |         C2           | Switch and Data  |  Cancelled     |
    When I am on the "connected_sites" page for site "TER2"
    Then the "connected-sites-list" should look like
      |   Site ID    |  Site Name | Site Provider | Provider Circuit ID  | Provider Name    | Circuit Status |
      |     TER1     | Somewhere1 | TER1 Provider |         C1           | Another provider |  Cancelled     |
    When I am on the "connected_sites" page for site "TER3"
    Then the "connected-sites-list" should look like
      |   Site ID    |  Site Name | Site Provider | Provider Circuit ID  | Provider Name    | Circuit Status |
      |     TER1     | Somewhere1 | TER1 Provider |         C2           | Switch and Data  |  Cancelled     |

Yes, those are Hashes inside a feature, which are then eval'd. Make sure to scroll to the right to experience the full horror of it all. I challenge anyone to find a worse cucumber feature than this. I assure you, that thing is real (from one of our rescue mission projects), and there is much more where it came from.

Writing better steps

So how do we write better steps? For me personally, I've found that sticking to the following rule seems to lead to nice, maintainable steps:

A step description should never contain regexen, CSS or XPath selectors, any kind of code or data structure. It should be easily understood just by reading the description.

Aug

You're Cuking It Wrong


Opinions on cucumber seem to be divided in the Ruby community. Here at Elabs we've been using cucumber to fantastic success on all of our projects for more than a year. At the same time Steak and projects like it seem to be gaining traction; some people are seemingly frustrated and fed up with cucumber.

So where does this gulf of experiences come from, why is cucumber loved by some and hated by others. At the risk of over-generalisation and mischaracterisation I recently came up with a theory: the cucumber detractors are not using cuke the way it was intended.

This is in fact not their fault. The entire cucumber ecosystem, and in fact even cucumber itself, encourage its misuse.

A while ago someone created an issue on the Capybara issue tracker. The interesting thing about this issue wasn't the problem itself, but rather the cucumber feature that the author presented in order to replicate the problem. This is the feature the author submitted:

Scenario: Adding a subpage
  Given I am logged in
  Given a microsite with a Home page
  When I click the Add Subpage button
  And I fill in "Gallery" for "Title" within "#document_form_container"
  And I press "Ok" within ".ui-dialog-buttonpane"
  Then I should see /Gallery/ within "#documents"

At first glance this seems reasonable. But contrast this with the following, improved version:

Scenario: Adding a subpage
  Given I am logged in
  Given a microsite with a home page
  When I press "Add subpage"
  And I fill in "Title" with "Gallery"
  And I press "Ok"
  Then I should see a document called "Gallery"

The difference isn't huge, the steps are largely the same, and there's an argument to be made for writing in a more declarative style, but there's one crucial difference: the first feature is code, the second isn't.

The argument against cucumber that's often presented is that as a programmer, plain text is unnecessary, because we can all read code. While it's true that we all can read code, I still find it beneficial to jump out of the code writing mode for describing the behaviour of the application. When you're writing features first, you don't want to be bothered with the details of how this functionality works. In this initial stage you care nothing about the implementation, about how the result is achieved. You care nothing about things like #document_form_container or .ui-dialog-buttonpane.

I believe that it's in this switching between designer mode and developer mode where cucumber, done right, really shines.

In order to evaluate the bigger picture before hacking
As a developer
I want to write my stories before writing my code

There are some secondary benefits as well. Writing truly plain text features leads to better maintainability as well, since the features are robust against code changes. Plain text is also easier to understand for new developers coming to an existing project. Probably the nicest advantage though is that over time a library of steps is built up, which can then be simply combined to describe new features.

The above feature is nicely illustrative of this anti-pattern, but it is far from the only example. In many of our cucumber suites here at Elabs, we have steps like the above, some of them were written by me. Which leads me to what's really wrong with the last three lines of the above feature. They are written using nothing else than the standard web steps generated by cucumber-rails own generator. Cucumber itself ships with steps which in my opinion encourage an anti-pattern.

Pickle my fancy

Another tool which we've experimented a bit with is Pickle, which allows you to easily generate models from your feature files. A basic example from the README:

Given a user exists
And a post exists with author: the user

Given a person: "fred" exists
And a person: "ethel" exists
And a fatherhood exists with parent: user "fred", child: user "ethel"

It actually looks fairly nice, reads quite naturally, so a first instinct might be to call this plain text. But on closer inspection, there is a whole language in there. To comprehend what these steps are doing you'd need to understand not only the domain models involved, but also the language Pickle uses to manipulate these. I'm pretty sure a non-technical person couldn't make sense of the above. This is really no different, and in fact worse, than writing actual code:

@user = User.make
Post.make(:user => @user)

@fred = Person.make(:name => 'Fred')
@ethel = Person.make(:name => 'Ethel')
Fatherhood.make(:user => @fred, :child => @ethel)

Note how there is an almost one-to-one mapping between the feature above, and the code below. The only thing cucumber does in this case is act as some kind of phoney translator. We write code, but not actual code. So we can do some stuff, but mostly it comes out worse than if we'd just written it as code in the first place. I can't blame anyone for disliking cucumber when using it like this.

However, try this instead:

Given there is a user called "Jimmy"
And there is a post authored by "Jimmy"

Given there is a person called "Fred"
And there is a person called "Ethel"
And "Fred" is the father of "Ethel"

There's not a huge difference between the first couple of lines, even though they read somewhat nicer when written out like this. The real difference is in the last line. Here cucumber is adding value by explaining this abstract concept of a Fatherhood into something very concrete: one person is the other's dad. Cucumber added value to this feature, instead of only acting as a hindrance.

I believe that Pickle is flawed as a concept, in order to achieve readable steps, they need to be written by hand.

The worst feature ever written.

As a curiosity, I present the worst cucumber feature known to man. If you are responsible for something like this, please go slap yourself in the face as hard as you can.

Scenario: User creates some sites and circuits, check connected sites list
    Given a "site" exists with {"name"=>"Somewhere1", "identifier" => "TER1", "provider"=>"TER1 Provider"}
    And a "site" exists with {"name"=>"Somewhere2", "identifier" => "TER2", "provider"=>"Some Provider"}
    And a "site" exists with {"name"=>"Somewhere3", "identifier" => "TER3", "provider"=>"TER3 Provider"}
    And a "circuit" exists with {"provider_name"=>"Another provider", "redacted_circuit_id"=>"ABC1", "provider_circuit_id"=>"C1", "circuit_type"=>CircuitType.find_by_name("Peering"), "service_type"=>CircuitServiceType.find_by_name("Dark Fiber"), :capacity => CircuitCapacity.find_by_name("1 Gbps"), "physical_wire_type"=>PhysicalWireType.find_by_name("Multi Mode Fiber"), "status"=> CircuitStatus.find_by_name("Cancelled"), "a_end"=>Site.find_by_identifier("TER1"), "b_end"=>Site.find_by_identifier("TER2")}
    And a "circuit" exists with {"provider_name"=>"Switch and Data", "redacted_circuit_id"=>"ABC2", "provider_circuit_id"=>"C2", "circuit_type"=>CircuitType.find_by_name("Backbone"), "service_type"=>CircuitServiceType.find_by_name("Dark Fiber"), :capacity => CircuitCapacity.find_by_name("1 Gbps"), "physical_wire_type"=>PhysicalWireType.find_by_name("Multi Mode Fiber"), "status"=> CircuitStatus.find_by_name("Cancelled"), "a_end"=>Site.find_by_identifier("TER1"), "b_end"=>Site.find_by_identifier("TER3")}
    When I am on the "connected_sites" page for site "TER1"
    Then the "connected-sites-list" should look like
      |   Site ID    |  Site Name | Site Provider | Provider Circuit ID  | Provider Name    | Circuit Status |
      |     TER2     | Somewhere2 | Some Provider |         C1           | Another provider |  Cancelled     |
      |     TER3     | Somewhere3 | TER3 Provider |         C2           | Switch and Data  |  Cancelled     |
    When I am on the "connected_sites" page for site "TER2"
    Then the "connected-sites-list" should look like
      |   Site ID    |  Site Name | Site Provider | Provider Circuit ID  | Provider Name    | Circuit Status |
      |     TER1     | Somewhere1 | TER1 Provider |         C1           | Another provider |  Cancelled     |
    When I am on the "connected_sites" page for site "TER3"
    Then the "connected-sites-list" should look like
      |   Site ID    |  Site Name | Site Provider | Provider Circuit ID  | Provider Name    | Circuit Status |
      |     TER1     | Somewhere1 | TER1 Provider |         C2           | Switch and Data  |  Cancelled     |

Yes, those are Hashes inside a feature, which are then eval'd. Make sure to scroll to the right to experience the full horror of it all. I challenge anyone to find a worse cucumber feature than this. I assure you, that thing is real (from one of our rescue mission projects), and there is much more where it came from.

Writing better steps

So how do we write better steps? For me personally, I've found that sticking to the following rule seems to lead to nice, maintainable steps:

A step description should never contain regexen, CSS or XPath selectors, any kind of code or data structure. It should be easily understood just by reading the description.