

Apr

My Craftsman Swap with Bendyworks


Last week I flew from Sweden to Madison, WI to work for a week together with the developers at Bendyworks, or as they call themselves, Bendyworkers.

After the 23-hour flight I was met by Stephen, who didn’t hesitate for a moment to welcome me to Madison in the middle of the night on a Sunday. Over the week I’ve learned that a great talent for hospitality is something all the Bendyworkers have in common. Even though it was late, Stephen gave me a quick tour around the beautiful Capitol building located right next to the Bendyworks office.

Capitol building

Bendyworkers perform their craft in a rustic, triangle-shaped building dating from before 1900. The office is located right downtown and is surrounded by an abundance of cafés, restaurants, and even a theatre. Inside I found that the rooms are all very open and people move naturally between desks and programming pairs. During Bendyworks’ monthly “release valve” meeting I learned that not even the owners, Stephen, Brad, and Jim, take a dedicated office for granted. To me this illustrates well how flat and transparent the company structure is at Bendyworks.

I had the opportunity to work on two different projects over the week. On Monday I worked together with Chris on a CMS for Internet Week New York. The project was wrapping up; since all the major features had already been delivered, we got some time to spend on refactoring a few acceptance tests and making them execute faster. Tuesday through Thursday I paired up with Josh on work for SEOmoz. Josh has some serious shell and terminal Vim skills going on, while I’m more of an mvim user, depending a bit more on Mac OS X to do window handling for me. The SEOmoz work spanned three different Rails-based applications with a very heavy emphasis on client-side JavaScript.

Office

Bendyworkers all have a genuine passion for their craft. When they’re not attending meetups, they’re working on numerous open source projects or catching up on their self-assigned book club homework. Lunches are spent preparing for the book club or sharing knowledge through more organized presentations, like when Joe gave a great walkthrough of his blog post on giving yourself a security makeover.

With all that time spent on perfecting their craft, you’d think Bendyworkers wouldn’t know how to have fun. Well, you’d be wrong. Ping-pong games, comedy clubs, taco Tuesdays, arcade halls, and great food and drinks are just a few of the activities Bendyworkers have treated me to over the week.

With that, I’d like to thank Bendyworks for a week full of fun, productive, and educational experiences!

Apr

Steve Jobs on what's important in the development of a product


You know, one of the things that really hurt Apple was after I left John Sculley got a very serious disease. It’s the disease of thinking that a really great idea is 90% of the work. And if you just tell all these other people “here’s this great idea,” then of course they can go off and make it happen.

And the problem with that is that there’s just a tremendous amount of craftsmanship in between a great idea and a great product. And as you evolve that great idea, it changes and grows. It never comes out like it starts because you learn a lot more as you get into the subtleties of it. And you also find there are tremendous tradeoffs that you have to make. There are just certain things you can’t make electrons do. There are certain things you can’t make plastic do. Or glass do. Or factories do. Or robots do.

Designing a product is keeping five thousand things in your brain and fitting them all together in new and different ways to get what you want. And every day you discover something new that is a new problem or a new opportunity to fit these things together a little differently.

And it’s that process that is the magic.

— Steve Jobs, Triumph of the Nerds

Via 37signals and CNN Fortune Tech

Mar

Advanced topics in Ruby FFI


Short primer: what is FFI?

This article is not a tutorial on the basics of FFI. However, if you’ve never heard of FFI before, I’d like to whet your appetite before continuing on.

FFI is an alternative to writing C when you want to use functionality locked within native libraries from Ruby. It allows you to describe, with an intuitive Ruby DSL, which functions your native library contains and how they should be used. Once the functionality of your native library is mapped out, you can call the functions directly from Ruby.
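
As a minimal illustration, a binding against a function from the standard C library might look something like this (the module name Sys is arbitrary):

require "ffi"

module Sys
  extend FFI::Library
  ffi_lib FFI::Library::LIBC  # bind against the standard C library

  # pid_t getpid(void); — pid_t is an int on common platforms
  attach_function :getpid, [], :int
end

Sys.getpid # => the current process id, fetched through libc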

Furthermore, gems using FFI do not need to be compiled, and will run without modification on CRuby, JRuby, and Rubinius! In practice there can be small differences between the platforms in the behaviour and usage of FFI; if you find any, you should report them to the Ruby FFI issue tracker so they can be dealt with.

As far as basic tutorials on using FFI go, your best resource is the FFI wiki. It also has a list of projects using FFI, which is your second-best resource for learning how to use FFI.

Aliasing with typedef

If we look at the header for a function from libspotify:

SP_LIBEXPORT(sp_error) sp_session_player_prefetch(sp_session *session, sp_track *track);

Naively mapping this to FFI we’ll need:

enum :error, [ … ]
attach_function :sp_session_player_prefetch, [ :pointer, :pointer ], :error

Unfortunately, we lost two pieces of valuable information here. Both sp_session and sp_track are types that occur many times in the library, yet when we look at the Ruby implementation there is no hint whatsoever of what type the two pointers should be.

It does not need to be like this. Using typedef we can name our parameter types and bring back the information that was lost in translation.

typedef :pointer, :session
typedef :pointer, :track
enum :error, [ … ]
attach_function :sp_session_player_prefetch, [ :session, :track ], :error

The functionality of our method does not change, but the implementation is now slightly clearer and more maintainable.

Specializing in attach_function

C libraries do not follow Ruby naming conventions, which makes sense since they’re not written in Ruby. However, bindings written with Ruby FFI are in Ruby and will be called from Ruby, so they should have the look and feel of Ruby.

attach_function allows you to call it in two ways:

attach_function :c_name, [ :params ], :returns, { :options => values } # 1
attach_function :ruby_name, :c_name, [ :params ], :returns, { :options => values } # 2

Using the first form will create your Ruby methods with the same name as your native library’s functions. Using the second form allows you to rename the bound method, giving it a more expected final name.

Native libraries you bind with FFI will have naming conventions of their own. For example, OpenAL prefixes its functions with al or alc and uses camelCase, while libspotify prefixes its functions with sp_. Apart from removing the prefix and snake_casing the function name, we want the Ruby method to be named similarly. We could repeat ourselves for every method:

attach_function :open_device, :alcOpenDevice, [ :string ], :device
attach_function :close_device, :alcCloseDevice, [ :device ], :bool

But remember! When you use FFI, you extend FFI::Library inside a module of your own. This also means you can override the attach_function call without your specialized version leaking to the outside world. By overriding attach_function we can avoid unnecessary noise in our FFI bindings.

def self.attach_function(c_name, args, returns)
  ruby_name = c_name.to_s.sub(/\Aalc?/, "").gsub(/(?<!\A)\p{Lu}/u, '_\0').downcase
  super(ruby_name, c_name, args, returns)
end

attach_function :alcOpenDevice, [ :string ], :device # gets bound to open_device
attach_function :alcCloseDevice, [ :device ], :bool # gets bound to close_device

This does not end here. After calling super inside attach_function, you have the option of further specializing the newly bound method. You could implement automatic error checking for every API call, alter the parameters based on native library conventions, and more. Just remember that the added complexity should be worth the savings.
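
As a sketch of what such a specialization could look like, here is one possible take on automatic error checking, assuming libspotify-style names, the :error enum from earlier, and that :ok signals success (the last two are assumptions about the library):

def self.attach_function(c_name, args, returns)
  ruby_name = c_name.to_s.sub(/\Asp_/, "")
  super(ruby_name, c_name, args, returns)
  return unless returns == :error

  # additionally bind a bang version that raises unless the call reports success
  define_singleton_method("#{ruby_name}!") do |*arguments|
    error = public_send(ruby_name, *arguments)
    raise "#{c_name} failed with #{error}" unless error == :ok
    error
  end
end

attach_function :sp_session_player_prefetch, [ :session, :track ], :error
# session_player_prefetch!(session, track) now raises on any non-ok result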

FFI::Structs as parameters

Structs in FFI can be used as parameters, and doing so is by default equivalent to specifying a type of :pointer.

class SomeStruct < FFI::Struct
end

attach_function :some_function, [ SomeStruct ], :void
# equivalent to:
attach_function :some_function, [ :pointer ], :void

callback :some_callback, [ SomeStruct ], :void
# equivalent to:
callback :some_callback, [ :pointer ], :void

I’d like to bring forth an alternative for your referenced struct parameters, namely FFI::Struct.by_ref. It behaves very similarly to the above, with the important difference that it has type safety built in!

attach_function :some_function, [ SomeStruct ], :void
some_function FFI::Pointer.new(0xADDE55) # this is possibly unsafe, but allowed

attach_function :some_function, [ SomeStruct.by_ref ], :void
some_function FFI::Pointer.new(0xADDE55) # BOOM, wrong argument type FFI::Pointer (expected SomeStruct) (TypeError)
some_function SomeOtherStruct.new # BOOM, wrong argument type SomeOtherStruct (expected SomeStruct) (TypeError)

Furthermore, if you use FFI::Struct.by_ref for your callback parameters or function return values, FFI will automatically cast the pointer to an instance of your struct for you!

callback :some_callback, [ SomeStruct.by_ref ], :void
attach_function :some_function, [ :some_callback ], :void

returned_struct = some_function(proc do |struct|
  # struct is an instance of SomeStruct, instead of an FFI::Pointer
end)

attach_function :some_other_function, [ ], SomeStruct.by_ref
some_other_function.is_a?(SomeStruct) # true, instead of being an FFI::Pointer

Keep in mind that on JRuby 1.7.3, the FFI::Struct.by_ref type accepts any descendant of FFI::Struct, not only instances of YourStruct. See https://github.com/jruby/jruby/issues/612 for updates.

Piggy-back on Ruby’s garbage collection with regular FFI::Structs

Let’s take another look at the above code with SomeStruct as the return value.

attach_function :some_other_function, [ ], SomeStruct.by_ref

In some libraries, the memory for the pointer to SomeStruct returned from some_other_function is expected to be managed by us. This means we’ll most likely need to call some function free_some_struct to explicitly free the memory used by SomeStruct when the object is no longer needed. Here’s how it would be used:

begin
  some_struct = some_other_function
  # do something with some_struct
ensure
  free_some_struct(some_struct)
end

Unfortunately, if we pass some_struct somewhere else beyond our control, we must be able to trust that the new guardian of some_struct calls free_some_struct in the future, or we will have a memory leak! Oh no!

Fear not, for FFI::Struct has a trick up its sleeve for us. Have a look at this.

class SomeStruct < FFI::Struct
  def self.release(pointer)
    MyFFIBinding.free_some_struct(pointer) unless pointer.null?
  end
end

attach_function :some_other_function, [], SomeStruct.auto_ptr

With the above binding code, some_other_function still returns an instance of SomeStruct. However, when our object is garbage collected, FFI will call upon SomeStruct.release to free the native memory used by our struct. We can safely pass our instance of SomeStruct around everywhere and to everyone, resting assured that when the object goes out of scope and Ruby garbage collects it, FFI will call upon us to free the underlying memory!

Related to this, you should look into FFI::ManagedStruct and FFI::AutoPointer if you have not already.
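
For reference, FFI::ManagedStruct bakes the pattern above into the struct itself; a rough sketch, reusing the hypothetical free_some_struct from before and an assumed layout, could look like this:

class SomeManagedStruct < FFI::ManagedStruct
  layout :id, :int # assumed layout, purely for illustration

  # called automatically when the wrapping Ruby object is garbage collected
  def self.release(pointer)
    MyFFIBinding.free_some_struct(pointer) unless pointer.null?
  end
end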

Writing our own data types

class Device < FFI::Pointer
end
attach_function :some_function, [ ], Device

Subclassing FFI::Pointer is a convenient way of making the pointers you get from native libraries less generic. Using the above code, when we call some_function we’ll receive an instance of Device instead of the FFI::Pointer we would get if we had specified the return value as a :pointer.

If objects in our native library are not pointers, we can’t do what we’ve done above. For example, OpenAL has a concept of audio sources, but they are represented by integers, not pointers. Passing arbitrary integers around is not a nice practice, so what we could do is wrap the source in an object for further use.

class Source
  def initialize(id)
    @id = id
  end
  attr_reader :id
end

typedef :int, :source
attach_function :create_source, [], :source
attach_function :destroy_source, [ :source ], :void

# Usage
source = Source.new(create_source)
destroy_source(source.id)

While the code above is not bad, we can do much better by utilizing something in FFI called DataConverters. A DataConverter tells FFI how to convert a native value to a Ruby value and back. By using one, we can have FFI automatically wrap the source above in an object, making the conversion completely transparent to the developer using the library.

class Source
  extend FFI::DataConverter
  native_type FFI::Type::INT

  class << self
    # `value` is a ruby object that we want to convert to a native object
    # this method should return a type of the native_type we specified above
    def to_native(value, context)
      if value
        value.id # in our case, we convert a Source to an int
      else
        -1 # if value is nil, we represent a `no source` value as -1
      end
    end

    # `value` is a type of the native_type specified above, we should return
    # a ruby object we wish to pass around in our application
    def from_native(value, context)
      new(value)
    end

    # this is needed when FFI needs to figure out the native size of your native type
    # for example, if you want to generate a pointer to hold something of this type
    # e.g. FFI::MemoryPointer.new(Source) # <= requires size to be defined and correct
    def size
      FFI.type_size(FFI::Type::INT)
    end

    # this method is a hint to FFI that the object returned from to_native needs to
    # be kept alive for the native value in the object to remain valid, so that if we
    # return an object that automatically frees itself on garbage collection, ffi will
    # prevent it from being garbage collected while it’s still needed, mainly useful
    # for to_native methods that allocate memory
    def reference_required?
      false
    end
  end

  def initialize(id)
    @id = id
  end

  attr_reader :id
end

attach_function :create_source, [], Source
attach_function :destroy_source, [ Source ], :void

source = create_source # an instance of Source, created through Source.from_native!
source.id # => the native value
destroy_source(source) # converts source to native value through Source.to_native!

You can do this with all types, even pointers. What’s more, you are not constrained to only doing type conversion in to_native and from_native: you could perform validation, making sure your values have the correct type, length, or whatever else you may need!

If you’d like some more examples of custom types, I’ve written down a few in this gist: https://gist.github.com/elabs-dev/41c27fdb0a007ad4cac6

Implementing type safety

Do you remember what I mentioned earlier about FFI::Struct.by_ref automatically giving us some kind of type safety, preventing shenanigans where somebody sends invalid values to native functions? We can implement the very same kind of type safety ourselves for all types, by overriding to_native in our DataConverters.

# A to_native DataConverter method that raises an error if the value is not of the same type.
module TypeSafety
  def to_native(value, ctx)
    if value.kind_of?(self)
      super
    else
      raise TypeError, "expected a kind of #{name}, was #{value.class}"
    end
  end
end

We can now mix the above module into our own custom data types from the previous sections.

# Even if we have another object that happens to look like a Source from the previous section,
# by having an #id method, we now won’t allow sending it down to C unless it’s an instance of
# Source or one of its subclasses.
Source.extend(TypeSafety)

# Remember Device from earlier? It’s a descendant of FFI::Pointer. Now all parameters of type Device
# will only accept instances of Device or any of its subclasses. Anything else results in a type error.
Device.extend(TypeSafety)

Duck typing is very useful in Ruby, where the worst thing that can happen when we call a method an object does not respond to is an exception. However, when interfacing with C libraries, passing in the wrong type can segfault your application with little information on what went wrong. Using this TypeSafety module, we catch such errors early and get a useful error message as a result.

Final words

Personally, I really like using FFI. It’s a low-pain way of writing gems that use native libraries, and if you set your types up properly, not having a compiler that type-checks your code won’t be so bad. If you can work with native libraries through FFI instead of writing a C extension, by all means do. Even if you intend to write a C extension, using FFI can be a quick way of exploring a native API without wiring up C functions and data structures with the Ruby C API.

Something that FFI excels at, in comparison to writing a C extension, is handling asynchronous callbacks from non-Ruby threads in C. FFI can save you a lot of headache in that area.

Thank you.


Mar

Introducing Capybara 2.1


With the release of Capybara 2.0, we made a few changes to how Capybara acts in certain situations, which were designed to reduce unexpected and unintuitive behaviour. We got a lot of feedback that the new behaviour was too restrictive and that upgrading existing test suites from Capybara 1.x was too difficult.

Our goal with Capybara 2.1 has been to correct these problems: to provide more forgiving defaults and more configurability for those who need it, and to offer a smoother upgrade path for those on Capybara 1.x who have held off upgrading their apps because of too many breaking changes.

We focused on these key problems:

  • Matching exactly or allowing substrings
  • Behaviour when multiple elements match a query
  • Visibility
  • Asserting against the page title
  • Finding disabled elements

Aside from this, Capybara 2.1 contains many tweaks and new features.

Since we're following semver, we promise to maintain backward compatibility in all minor releases. Capybara 2.1 strikes a compromise by maintaining backwards compatibility while changing a few defaults. In order to have Capybara 2.1.0 behave identically to 2.0.x, set these options:

Capybara.configure do |config|
  config.match = :one
  config.exact_options = true
  config.ignore_hidden_elements = true
  config.visible_text_only = true
end

We have also enabled a smoother upgrade path for Capybara 1.x users. There are still changes which break compatibility between 1.x and 2.x, even with this configuration enabled, but most of them should be fairly easy to deal with. Try this:

Capybara.configure do |config|
  config.match = :prefer_exact
  config.ignore_hidden_elements = false
end

Matching exactly or allowing substrings

Capybara has always been very lenient about what it matches. Sometimes that's not what you want. To that end, find, as well as all action methods like click_link and fill_in, now accepts an option called :exact, which works together with the is expression inside the XPath gem. Without going too much into the internals, it allows you to specify whether, for example, click_link will accept a substring of the link text, or will need to match the entire link text exactly.

We have also added a global config option, Capybara.exact, which controls the default value of this option. Just as in Capybara 2.0.x, however, it defaults to false. That is, by default, matches are not exact, and substrings are allowed.
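
For illustration, assuming a page with a field labelled Password (a hypothetical label):

fill_in "Pass", :with => "secret"                 # inexact: "Pass" is allowed to match the "Password" field
fill_in "Pass", :with => "secret", :exact => true # exact: only a field labelled exactly "Pass" will match
Capybara.exact = true                             # make exact matching the default everywhere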

Behaviour when multiple elements match a query

When using methods such as click_link and fill_in, what happens when more than one element matches? Under Capybara 1.x, we tried to find an element that matched exactly, and failing that, or if there were multiple such elements, we would simply return the first one and move on.

This has the obvious problem that sometimes, the element you end up interacting with isn't the one you expected at all.

We tried to solve this problem in Capybara 2.0 by being stricter and raising an exception in such a case instead. This was the biggest change in terms of compatibility between 1.x and 2.0 and has been very frustrating for many users.

In Capybara 2.1 we are making this behaviour configurable, and allowing users to pick which strategy they prefer, including reverting to the 1.x behaviour. We also changed the default behaviour to a slightly more lenient strategy.

We've added the match option, which takes four possible arguments: :one, :first, :prefer_exact, and :smart. Let's go through what they do:

:one is the current behaviour in Capybara 2.0.x. When two elements are found which both match the selector, a Capybara::Ambiguous error is raised.

:first is a looser behaviour which, when confronted with two elements which match the selector, simply grabs the first one and uses that one.

:prefer_exact is the behaviour present in Capybara 1.x. If multiple matches are found, some of which are exact and some of which are not, then the first exactly matching element is returned.

:smart is the new default. The behaviour of :smart depends on the value of :exact. If :exact is true, the behaviour is identical to :one, that is, Capybara will perform a search for an exactly matching element, and if there is more than one, it will raise an error.

If :exact is false, things get more interesting. In that case, Capybara will first perform a search for elements which match the selector exactly. If there is exactly one, that element is returned. If there is more than one, a Capybara::Ambiguous error is raised. If no element matches, a new search is performed, allowing inexact matches. Again, if more than one element matches that search, Capybara::Ambiguous is raised.

This solves the much-discussed password confirmation problem: if there is a field with the exact label Password and another with Password Confirmation, and someone does this:

fill_in "Password", :with => "Capybara"

Then Capybara will pick the Password field and fill that in. If, however, the label of the password field had been Password * (to indicate that it is required), then an ambiguous error would have been raised, since both Password * and Password Confirmation are inexact matches for Password.

The idea is to strike a compromise between strictness and user friendliness. Those that prefer strictness can set:

Capybara.exact = true

Exactness of options

With Capybara 2.0 we changed the behaviour so that in the following case…

select "1", :from => "Number of people"

The option needed to match "1" exactly, so that if "10" was also an option, it would be possible to differentiate between them. Options now use the same smart matching by default, as outlined above. To revert to the old behaviour of always requiring an exact match, the config option exact_options can be set to true.
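
For example, assuming a select box labelled "Number of people" with the options "1" and "10":

select "1", :from => "Number of people" # smart matching prefers the option labelled exactly "1" over "10"
Capybara.exact_options = true           # always require option text to match exactly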

Visibility

For a long time we have had an option which makes Capybara ignore all hidden elements: Capybara.ignore_hidden_elements. It has always been false by default, which has confused people for a long time, on occasion even myself. We've made no change to this behaviour other than the fact that the option now defaults to true. To revert to the old behaviour, simply set:

Capybara.ignore_hidden_elements = false

Visibility of text

In Capybara 1.x, the behaviour of text in the presence of invisible (display: none) DOM elements was undefined. While RackTest offers some rudimentary support for visibility, it was ignored for text, and even text in, for example, script tags was returned. Selenium ignored hidden text, and other drivers did what they wanted.

In Capybara 2.0.x, we specified that text should only return text visible to the user, never hidden text, even for RackTest.

In Capybara 2.1.0, the visibility of text depends on ignore_hidden_elements. Setting ignore_hidden_elements to false means that even invisible text is returned.

We also make it possible to override this default by passing :all or :visible to the text method:

find("#thing").text           # depends on Capybara.ignore_hidden_elements
find("#thing").text(:all)     # all text
find("#thing").text(:visible) # only visible text

Since that is a departure from the Capybara 2.0.x API, we offer a special option for backward compatibility, which will make text always return only the text which is visible, even if ignore_hidden_elements is false.

Capybara.visible_text_only = true

Asserting against the page title

A lot of people migrating to Capybara 2.0 had problems with code like this:

page.should have_css("title", :text => "Whatever")

They found that title has no text, and thus this never matches.

This is not, however, a bug in Capybara. The above code is wrong, and should not work. To understand why, it's important to realize that the page title is quite distinct from the title element. The title element is invisible by default, and thus has no text which is visible to the user. Asking for its text quite correctly returns nothing.

Don't believe me? Try pasting the following CSS into any web page:

head, head title { display: block }

You can now see the title element on the page. And making it visible this way will make the Selenium driver return the text inside that element, just as it should.

Instead of fixing Capybara to work with broken code, we are introducing a new API, which provides a nicer way of querying the page title:

page.title # => "The title"
page.has_title?("The title") # => true
page.should have_title("The title")

The has_title? and have_title matchers both have the same waiting behaviour as all other matchers in Capybara.

Finding disabled elements

Since Capybara 2.0, methods which interact with form fields and buttons, such as fill_in and click_button, do not allow interaction with disabled form elements and buttons. In addition, the matcher has_field? and the finder method find_field no longer match on disabled fields. It is still possible to locate disabled fields and buttons through other means, such as via find or has_selector?.

Capybara 2.1 makes it possible to override this behaviour by passing :disabled => true to any of these methods, which will find only disabled elements.
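
For example, with a hypothetical disabled field labelled "Email":

find_field "Email", :disabled => true      # only matches a disabled field labelled "Email"
page.has_field? "Email", :disabled => true # true only if such a disabled field exists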

More

There are more new features in Capybara 2.1 which did not fit in this blog post. Please refer to the History file for a complete list.

Mar

Handle secret credentials in Ruby on Rails


This blog post aims to lay out a simple, concrete strategy for handling sensitive data in your Ruby on Rails applications, and to explain the importance of having such a strategy.

Never, ever check them into source control

Even if your project is closed source and your trusted colleagues are the only ones with access, you never know when a freelancer or consultant might join the project. Even if that never happens, how do you keep track of all the locations where that repository is checked out? Who knows how many hard drives your company's credit card transaction secret API key might be stored on? What happens when someone with a weak login password forgets their laptop on the bus or at the airport?

Also note that it's not always as simple as removing secrets after the fact, especially with version control: it's usually impossible to do so without drastically rewriting your entire project's history!
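
As an illustration of how drastic that is: purging a single file from a Git repository means rewriting every commit that ever touched it, for example with something along these lines (using the config/app_secret.yml path introduced later in this post):

git filter-branch --index-filter \
  'git rm --cached --ignore-unmatch config/app_secret.yml' HEAD

Every existing clone then has to be re-fetched, and the leaked credential should still be considered compromised and rotated.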

Do it right

For a long time, we've been using YAML files to store our application configuration. It's easy to manage and can be configured for different Rails environments. These YAML files could look like the following:

config/app.yml:

development: &defaults
  awesomeness_score: 3
  host: "localhost:3000"
  s3_bucket: "example-development-us"

production:
  <<: *defaults
  host: "example.com"
  s3_bucket: "example-production-us"

test:
  <<: *defaults

config/app_secret.yml.example:

development: &defaults
  aws_access_key_id: ""
  aws_secret_access_key_id: ""

production:
  <<: *defaults

test:
  <<: *defaults

config/app_secret.yml:

development: &defaults
  aws_access_key_id: "ACTUAL-ID-WOULD-GO-HERE"
  aws_secret_access_key_id: "ACTUAL-SECRET-WOULD-GO-HERE"

production:
  <<: *defaults

test:
  <<: *defaults

Only the first two files would be checked in to source control, and the application's README would instruct developers to cp config/app_secret.yml.example config/app_secret.yml and fill in the gaps from the company keychain.

To make sure we never check in the secrets by mistake, we ignore the app_secret.yml file:

.gitignore:

# ...
/config/app_secret.yml

We then use the econfig gem written by Jonas Nicklas to easily merge them together:

Gemfile

# ...
gem "econfig", require: "econfig/rails"

config/application.rb

# ...
module YourApp
  extend Econfig::Shortcut
  # ...
end

Now we can access any configuration variable and secret credential:

YourApp.host # => "localhost:3000"
YourApp.aws_secret_access_key_id # => "ACTUAL-SECRET-WOULD-GO-HERE"

Deploy

When you deploy the application, you must manually manage the secrets on the server(s).

Capistrano

If you deploy with Capistrano, you'll want to place app_secret.yml in your shared folder. Once that's done, it can be symlinked into each release with a task:

deploy.rb

# ...
namespace :config do
  desc "Symlink application config files."
  task :symlink do
    run "ln -s {#{shared_path},#{release_path}}/config/app_secret.yml"  
  end
end

after "deploy", "config:symlink"

Heroku

If you're deploying your application somewhere you don't have file access, such as Heroku, you're better off storing this kind of information in ENV. The econfig gem has built-in support for this and a few other storage backends, but that's another blog post.
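
Independent of econfig, the basic idea is simply to set the value on the platform and read it from ENV (the variable name here is just an example):

heroku config:set AWS_SECRET_ACCESS_KEY_ID=ACTUAL-SECRET-WOULD-GO-HERE

ENV["AWS_SECRET_ACCESS_KEY_ID"] # => "ACTUAL-SECRET-WOULD-GO-HERE" inside your dynos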

Conclusion

With this method, we now have a clear separation of sensitive and non-sensitive data. There's no risk of checking in any sensitive data, since we have only one place to put it all and it's hidden from source control. Data access within the application hasn't changed, and we no longer have to concern ourselves with how sensitive it is.

We can now be sure that giving access to a repository does not imply giving access to other systems.

Epilogue

If you have any feedback on how the blog post can be improved, or if you spot any errors, please let me know by posting a comment below!