From: Tim Bates
Date: 2004-05-17T22:52:22+09:00
Subject: How to duck type? - the psychology of static typing in Ruby

Hi all,

Following a discussion in #ruby-lang, I have a suggestion about how to approach Duck Typing. Below is my dissertation on the subject. :P My intention is to incorporate any comments people might have into the text and then place it on the Wiki as an introduction to Duck Typing for the static typist.

For those not in on the secret, the idea is that if an object walks like a duck and quacks like a duck, it may as well be a duck - this being a metaphor for an arbitrary object that may not be exactly the same class your code was expecting, but still behaves the same way - see [1] if you don't follow.

---

Many people coming to Ruby from a statically-typed language are somewhat afraid of Ruby's dynamism, or "don't get it(TM)". David Black and I believe that this is in part because the uncertainty and changeability built into Ruby are thought to be dangerous, and one wants to find shelter from them. Please bear with me while I describe some of the possible approaches.

1) People with a Static Typing background often have the urge to do something like this:

    attr_reader :date

    def date=(val)
      raise ArgumentError.new("Not a Date") if val.class != Date
      @date = val
    end

This is not duck typing - this is trying to get Ruby to do Static Typing.

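To make the cost of example (1) concrete, here is a minimal sketch of what that check turns away. The names StrictEvent and SimpleDate are made up for illustration and are not part of any library; the setter is the one from example (1), with the assignment included.

```ruby
require "date"

# Hypothetical class using the strict class check from example (1).
class StrictEvent
  attr_reader :date

  def date=(val)
    raise ArgumentError, "Not a Date" if val.class != Date
    @date = val
  end
end

# Hypothetical caller-side type: not a Date, but it has year, month and day.
SimpleDate = Struct.new(:year, :month, :day)

event = StrictEvent.new
event.date = Date.new(2004, 5, 17) # accepted: exactly a Date

# An object that quacks like a date is rejected purely because of its class.
begin
  event.date = SimpleDate.new(2004, 5, 17)
rescue ArgumentError => e
  puts "SimpleDate rejected: #{e.message}"
end

# DateTime is a subclass of Date, yet val.class != Date rejects it as well.
begin
  event.date = DateTime.new(2004, 5, 17)
rescue ArgumentError => e
  puts "DateTime rejected: #{e.message}"
end
```

Both of the rejected objects would have answered year, month and day perfectly well; the check refuses them on the strength of their class alone.
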
2) Okay, you say, if that's not duck typing, let's do duck typing by accepting a whole bunch of different input formats and trying to turn them into something we know how to deal with, like this:

    def date=(val)
      case val
      when Date
        @date = val
      when Time
        @date = Date.new(val.year, val.month, val.day)
      when String
        if val =~ /(\d{4})\s*[-\/\\]\s*(\d{1,2})\s*[-\/\\]\s*(\d{1,2})/
          @date = Date.new($1.to_i,$2.to_i,$3.to_i)
        else
          raise ArgumentError, "Unable to parse #{val} as date"
        end
      when Array
        if val.length == 3
          @date = Date.new(val[0], val[1], val[2])
        else
          raise ArgumentError, "Unable to parse #{val} as date"
        end
      else
        raise ArgumentError, "Unable to parse #{val} as date"
      end
    end

This "normalization" approach has the advantage that the date attribute getter will always return a Date (producing certainty), but the setter can take input in a variety of formats.

2.a) Discussing this on #ruby-lang, David Black suggested the following optimization:

    def date=(val)
      begin
        @date = Date.new(val.year, val.month, val.day)
      rescue
        begin
          val =~ /(\d{4})\s*[-\/\\]\s*(\d{1,2})\s*[-\/\\]\s*(\d{1,2})/
          @date = Date.new($1.to_i,$2.to_i,$3.to_i)
        rescue
          begin
            @date = Date.new(val[0], val[1], val[2])
          rescue
            raise ArgumentError, "Unable to parse #{val} as date"
          end
        end
      end
    end

This has the advantage over (2) that it doesn't depend upon the class of val - if it acts enough like a string to use the =~ operator, then that clause will handle it, even if it's not descended from String. This makes it "more duck-typed", but still addresses the static-typist's fear of uncertainty and dynamism by providing a predictable response from #date (it will always be a Date). Unfortunately it's also slow.

3) Even "more duck-typed" is the approach of just testing that it responds to the appropriate methods, like so:

    # Accepts an object which responds to the +year+, +month+ and +day+
    # methods.
    def date=(val)
      [:year, :month, :day].each do |meth|
        raise ArgumentError unless val.respond_to?(meth)
      end
      @date = val
    end

In this case, we have removed the normalization instituted in example (2), but we have still ensured that the #date attribute conforms to some sort of interface, providing certainty. It is now the caller's responsibility to make sure what they pass fits the [:year, :month, :day] specification - but this responsibility is documented. However, this approach violates the Don't Repeat Yourself principle - both the code and the comment contain the specification, and are therefore not guaranteed to be in sync.

This approach is what many people believe to be embodied by "Duck Typing". Given an object, we're checking whether it walks and quacks like a duck; we're not forcing our caller to use a particular class, like example (1), but we are forcing our caller to put the data in a format we can understand, unlike (2) which attempts to deal with every possible representation of a date, causing volumes of maintenance work - imagine trying to write a normalization routine like that for every attribute of every class! In this way, we are moving the responsibility of putting the data into a reasonable format to the caller, who knows what format their data is in, from the receiver, who has to guess at every possible format the caller might send them.

4) The fourth and final approach, which I believe to be the Zen of Duck Typing, is as follows:

    # Accepts an object which responds to the +year+, +month+ and +day+
    # methods.
    attr_accessor :date

"What?" I hear you cry. "There's no checking there at all! You could pass it anything!" Yes, gentle reader, but why would you? After all, the documentation for this method is exactly the same as the one above. If the programmer using this method does what the documentation says then the class's behaviour is exactly the same.

If they hand it the wrong thing (accidentally, we assume) then the only difference is that example (3) breaks when the setter is called, whereas this version breaks some time later, when the getter is called and we try to call a non-existent method on the result. A common response to this often contains the phrase "meaningless error messages", but the results of such a mistake are usually, if not always, far from meaningless. For the most part, they look something like this:

    NoMethodError: undefined method `year' for "notadate":String

This tells me a lot: namely, that some part of my code (whose location is given in the subsequent backtrace) expected "notadate" to have a :year method, and it didn't. From this it is fairly trivial to deduce that something, somewhere, has fed the wrong thing to the date= setter method. Chances are that if your code is well-factored, there aren't a whole lot of places that set the date, and the location of the error can be found through a little judicious testing; you've lost the certainty and immediacy of the inline check, but not by much, and you've gained the flexibility of dynamic typing, and a whole lot less code to maintain.

Now if you'd been writing and collecting unit tests as you went along, instead of

    NoMethodError: undefined method `year' for "notadate":String

you would be seeing

    1) Failure:
    test_stuff(MyClassTest) [./test/myclasstest.rb:13]:
    <false> is not true.

which makes the error even easier to find: you go to test/myclasstest.rb and see something like:

    10: def test_date
    11:   @obj = Foo.new
    12:   @obj.date = MyClass.new.notadate
    13:   assert(@obj.date.respond_to?(:year))
    14: end

and now the error is trivial to trace - the moral of the story being that when Duck Typing, do your checking in your unit tests, rather than in the live code.

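As a rough sketch of what "checking in your unit tests" might look like in full, here is a self-contained version. It assumes Minitest (bundled with current Ruby) rather than the Test::Unit of the era, and the names Appointment, AppointmentTest and DATE_INTERFACE are invented for illustration.

```ruby
require "date"
require "minitest/autorun"

# Hypothetical class for illustration: the setter does no checking at all,
# as in example (4).
class Appointment
  # Accepts an object which responds to the +year+, +month+ and +day+
  # methods.
  attr_accessor :date
end

class AppointmentTest < Minitest::Test
  # The methods the documentation above promises callers may rely on.
  DATE_INTERFACE = [:year, :month, :day]

  def test_date_quacks_like_a_date
    appt = Appointment.new
    appt.date = Date.new(2004, 5, 17)
    DATE_INTERFACE.each do |meth|
      assert_respond_to appt.date, meth
    end
  end

  # A Time is not a Date, but it satisfies the same documented interface.
  def test_time_is_an_acceptable_duck
    appt = Appointment.new
    appt.date = Time.at(0)
    DATE_INTERFACE.each do |meth|
      assert_respond_to appt.date, meth
    end
  end
end
```

The interface check lives in the test file only; the class itself stays a one-line attr_accessor.
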
Type errors such as this one are usually the least common and easiest to trace of errors; if the attribute's documentation specifies what it is supposed to be, as in the example above, and the callers of both the getter and the setter methods make no assumptions about any more or less than what the documentation says, then apart from keyboarding accidents this will never be a problem.

At [1], Dave Thomas describes Duck Typing as "a way of thinking about programming in Ruby." I think he means to go a step further than that - Duck Typing is the _best_ way of thinking about programming in Ruby, and possibly the _only_ way; as David Black puts it:

"I think the concept of duck typing needs to be supplemented and expanded on. if, as seems to be the case, Dave thinks of it as a component of programming style, then it doesn't address language design itself. As long as duck typing is viewed as a stylistic choice, rather than a radical language principle, the door is always open to people saying 'I don't do duck typing', by which they usually mean that they use kind_of? a lot... of course Ruby itself *does* do duck typing, whether a given programmer thinks they're doing it or not."

Using kind_of? (or respond_to?) a lot isn't "not doing Duck Typing", it's simply adding in at run time the kinds of checks that Statically Typed languages do at compile time, in a usually verbose and necessarily incomplete fashion. Rather than trying to make Ruby do Static Typing because one is from a Static Typing background and that's what one is comfortable with, one should become comfortable with the dynamic nature of Ruby instead.

I have found that once I stopped assuming that the callers of my method (who may well be me, in five minutes' time, or some user of my library on the other side of the planet) are stupid and don't know how to read my documentation (you did write some, didn't you?) then writing in Ruby became a whole lot more natural and somewhat less verbose.

The unit tests took care of the psychological need to check, somewhere, that the method was getting passed the right thing, but in reality the whole debacle is a non-issue; type errors are the most trivial of bugs.

And if you're still worried about that date example, an alternative solution is this:

    def set_date(year, month, day)
      @date = Date.new(year, month, day)
    end

which, if year, month and day are not numeric, will catch the problem straight away - without resorting to Static Typing or some approximation of it. And the way it catches it is telling:

    irb(main):027:0> Date.new(2004.0, Rational(12,2), "17")
    ArgumentError: comparison of String with 0 failed
            from /usr/lib/ruby/1.8/date.rb:560:in `<'
            from /usr/lib/ruby/1.8/date.rb:560:in `valid_civil?'
            from /usr/lib/ruby/1.8/date.rb:590:in `new'
            from (irb):27

This is not "ArgumentError: parameters must be numbers" - the error is discovered when the Date class attempts to compare that parameter to zero and can't do it, after assuming that it was valid. And it didn't make the mistake any harder to find, did it? Notice that it didn't balk at Floats or Rationals, and with no extra coding from the implementor; Floats and Rationals look, and quack, like numbers. That's Duck Typing in action.

[1] http://rubygarden.org/ruby?DuckTyping

---

Tim.

-- 
Tim Bates
tim@bates.id.au