Friendly Ruby Objects

Eric Hodel | Thu, 18 Dec 2008 01:28:00 GMT

This post is intended to supplement the Ruby Quickref with the various ways you can make your objects play nicely with each other.

Most of the examples below are taken from RubyGems; some won't work until the next release of RubyGems.

Enumerable

The Enumerable module is built on the #each method and provides well-known methods like #map and #each_with_index. If the enumerated objects implement the #<=> method, you also get a useful #sort, #min and #max.

Gem::SourceIndex has a Hash internally that it exposes via #each:

class Gem::SourceIndex
  include Enumerable

  # ...

  def each(&block)
    @gems.each(&block)
  end
end

Which allows handy things like:

dep = Gem::Dependency.new ARGV.shift, Gem::Requirement.default

found = Gem.source_index.any? do |name, spec|
  dep =~ spec
end

puts "found gem for #{dep.name}!" if found
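
If you don't have RubyGems internals handy, here's a minimal standalone sketch of the same idea. Playlist is a made-up class for illustration, not part of RubyGems:

```ruby
# Hypothetical class for illustration; not part of RubyGems.
class Playlist
  include Enumerable

  def initialize(*titles)
    @titles = titles
  end

  # Enumerable only needs #each: hand every element to the caller's block.
  def each(&block)
    @titles.each(&block)
    self
  end
end

playlist = Playlist.new 'b-side', 'a-side', 'c-side'

p playlist.sort                                      # String implements #<=>
p playlist.map { |title| title.upcase }
p playlist.any? { |title| title.start_with? 'a' }
```

Everything after the class definition comes for free from Enumerable; the class itself only knows how to walk its own elements.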

Comparable

The Comparable module is based on the #<=> method and gives you all the comparison methods: #<, #<=, #==, #>=, #> and #between?.

Gem::Specification objects are sorted via name, version and platform:

class Gem::Specification
  include Comparable

  def <=>(other)
    my_platform = Gem::Platform::RUBY == @platform ? -1 : 1
    other_platform = Gem::Platform::RUBY == other.platform ? -1 : 1

    # compare the -1/1 platform ranks computed above so pure-ruby gems sort first
    [@name, @version, my_platform] <=>
      [other.name, other.version, other_platform]
  end
end

RubyGems uses this to sort specifications both for display on the screen, as in gem list, and internally when installing gems.

For a way to reduce the repetition in the above code and some other sorting speed-ups, see my post on #sort_by and #sort_obj.
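
Here's a small sketch you can run without RubyGems. Release is a made-up class, and comparing Arrays of attributes is just one convenient way to write #<=>:

```ruby
# Hypothetical class for illustration; not RubyGems code.
class Release
  include Comparable

  attr_reader :name, :major, :minor

  def initialize(name, major, minor)
    @name, @major, @minor = name, major, minor
  end

  # Arrays compare element by element, so this orders by name,
  # then major, then minor. Returning nil means "not comparable".
  def <=>(other)
    return nil unless other.is_a? Release

    [name, major, minor] <=> [other.name, other.major, other.minor]
  end
end

older = Release.new 'rake', 0, 8
newer = Release.new 'rake', 0, 9

puts older < newer                  # supplied by Comparable
puts newer.between?(older, newer)   # also supplied by Comparable
puts [newer, older].min.minor       # Enumerable#min uses #<=> too
```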

#to_s and #inspect

Overriding #to_s and #inspect prevents you from puking all over the screen when somebody wants to look at your object. To a certain extent, limiting the amount of information shown can aid debugging.

Gem::Specification's #to_s gives only the two most important attributes:

class Gem::Specification
  def to_s
    "#<Gem::Specification name=#{@name} version=#{@version}>"
  end
end

Gem::Platform's #to_s gives a friendly string:

class Gem::Platform
  def to_a
    [@cpu, @os, @version]
  end

  def to_s
    to_a.compact.join '-'
  end
end

Gem::Version's inspect ignores internal instance variables and only exposes @version:

class Gem::Version
  def inspect # :nodoc:
    "#<#{self.class} #{@version.inspect}>"
  end
end
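
To see which method gets called when, here's a minimal sketch with a made-up Connection class. String interpolation and puts go through #to_s, while p goes through #inspect:

```ruby
# Hypothetical class for illustration.
class Connection
  def initialize(host, port)
    @host, @port = host, port
    @buffer = 'x' * 10_000 # noisy internal state we don't want on screen
  end

  # Used by string interpolation and puts.
  def to_s
    "#{@host}:#{@port}"
  end

  # Used by p and irb; reuses #to_s and hides @buffer.
  def inspect
    "#<#{self.class} #{self}>"
  end
end

conn = Connection.new 'example.com', 80

puts "connecting to #{conn}"
p conn
```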

Case equality with #===, Matching with #=~

While #=== is more commonly overridden, it can also be useful to implement #=~ to allow your objects to be used in a very readable manner.

Gem::Platform overrides #===:

class Gem::Platform
  def ===(other)
    return nil unless Gem::Platform === other

    # cpu
    (@cpu == 'universal' or other.cpu == 'universal' or @cpu == other.cpu) and

    # os
    @os == other.os and

    # version
    (@version.nil? or other.version.nil? or @version == other.version)
  end
end

In short, one platform matches another if they have the same cpu (architecture) or either is universal, they have the same os, and their versions match when both have versions.

You can use this to group gems:

platform_count = Hash.new 0

Gem.source_index.each do |name, spec|
  case spec.platform
  when Gem::Platform.new('linux') then
    platform_count['linux'] += 1
  # ...
  else
    platform_count['other'] += 1
  end
end

p platform_count
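
The thing to remember is that case/when calls pattern === value, so the object in the when clause is the receiver. Here's a standalone sketch; Near and label_for are made-up names for illustration:

```ruby
# Hypothetical class for illustration: matches numbers within a tolerance.
class Near
  def initialize(target, tolerance)
    @target, @tolerance = target, tolerance
  end

  # Called as Near.new(...) === value by case/when and by Array#grep.
  def ===(value)
    return false unless value.is_a? Numeric

    (value - @target).abs <= @tolerance
  end
end

# Hypothetical helper showing the pattern objects in a case statement.
def label_for(temperature)
  case temperature
  when Near.new(0, 2)   then 'freezing'
  when Near.new(100, 2) then 'boiling'
  else 'in between'
  end
end

puts label_for(1)
puts label_for(99.5)
puts label_for(50)
p [1, 50, 99].grep(Near.new(100, 2))
```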

Gem::Dependency overrides #=~. It is a little strange because it converts the right-hand side to a Gem::Dependency object:

class Gem::Dependency
  def =~(other)
    other = case other
            when self.class then
              other
            else
              return false unless other.respond_to? :name and
                                  other.respond_to? :version

              Gem::Dependency.new other.name, other.version
            end

    pattern = @name
    pattern = /\A#{Regexp.escape @name}\Z/ unless Regexp === pattern

    return false unless pattern =~ other.name

    reqs = other.version_requirements.requirements

    return false unless reqs.length == 1
    return false unless reqs.first.first == '='

    version = reqs.first.last

    version_requirements.satisfied_by? version
  end
end

You can use this as a filter:

dep = Gem::Dependency.new(/ruby/, Gem::Requirement.default)

ruby_named = Gem.source_index.select do |name, spec|
  dep =~ spec
end

p ruby_named.map { |name, spec| name }
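
If you want to try the filter idea without an installed gem index, here's a stripped-down sketch. NameFilter and FakeSpec are made up for illustration and only look at the name, not the version:

```ruby
# Hypothetical stand-in for a specification.
FakeSpec = Struct.new :name, :version

# Hypothetical class for illustration; matches on name only.
class NameFilter
  def initialize(pattern)
    # Plain strings must match the whole name; Regexps are used as given.
    @pattern = Regexp === pattern ? pattern : /\A#{Regexp.escape pattern}\z/
  end

  def =~(other)
    return false unless other.respond_to? :name

    @pattern.match? other.name
  end
end

specs = [
  FakeSpec.new('rubygems-update', '1.3.1'),
  FakeSpec.new('rake', '0.8.3'),
  FakeSpec.new('rubyforge', '1.0.1'),
]

filter = NameFilter.new(/ruby/)

ruby_named = specs.select { |spec| filter =~ spec }

p ruby_named.map { |spec| spec.name }
```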

As a Hash key

Ruby uses #hash and #eql? to determine if two different objects really mean the same hash key.

Gem::Version is usable as a Hash key based on the internal version string:

class Gem::Version
  def hash
    @version.hash
  end

  def eql?(other)
    self.class === other and @version == other.version
  end
end

In Gem::Version, the internal version string looks like "1.3" or "1.3.0". In this implementation the two versions would belong to different hash keys.

Using #eql? instead of #== to determine if two keys are the same is a nice distinction since it allows you to have interesting behaviors (but I'm not sure they are useful). For Gem::Version, a version of "1.3" is equal to "1.3.0", but they occupy different slots in a Hash.
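
Here's a sketch of that split between #== and #eql?. SimpleVersion is a made-up, much-simplified class, not how Gem::Version is actually implemented:

```ruby
# Hypothetical class for illustration; a simplified stand-in for a version.
class SimpleVersion
  attr_reader :string

  def initialize(string)
    @string = string
  end

  # Loose equality: trailing zero segments are ignored, so 1.3 == 1.3.0.
  def ==(other)
    SimpleVersion === other && segments == other.segments
  end

  # Strict equality used by Hash: the strings must be identical.
  def eql?(other)
    SimpleVersion === other && @string == other.string
  end

  # Objects that are #eql? must return the same #hash.
  def hash
    @string.hash
  end

  protected

  def segments
    parts = @string.split('.').map { |part| part.to_i }
    parts.pop while parts.length > 1 && parts.last.zero?
    parts
  end
end

short = SimpleVersion.new '1.3'
long  = SimpleVersion.new '1.3.0'

counts = Hash.new 0
counts[short] += 1
counts[long]  += 1
counts[SimpleVersion.new('1.3')] += 1

puts short == long # equal as versions
puts counts.size   # but they are separate Hash keys
```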

#initialize_copy

#initialize_copy is called during #dup and #clone to copy object-specific state beyond instance variables. The object being cloned from is passed to the new instance. #initialize_copy could be used for cleaning out a per-object cache:

def initialize_copy(other)
  @cache = []
end
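
A slightly fuller sketch, using a made-up Catalog class, shows both uses together: resetting a cache and un-sharing an Array that #dup would otherwise copy by reference:

```ruby
# Hypothetical class for illustration.
class Catalog
  attr_reader :items, :cache

  def initialize(items)
    @items = items
    @cache = {}
  end

  # Remembers the position of each key it has looked up.
  def lookup(key)
    @cache[key] ||= @items.index(key)
  end

  # Runs on the new copy; other is the object being duplicated.
  def initialize_copy(other)
    super
    @items = other.items.dup # don't share the Array with the original
    @cache = {}              # the copy starts with an empty cache
  end
end

original = Catalog.new %w[a b c]
original.lookup 'b'

copy = original.dup
copy.items << 'd'

p original.items # the original's Array is untouched
p copy.cache     # the copy's cache starts empty
```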

#exception

#exception is called on objects given to #raise to cast them to Exception objects. It must return an Exception object (an instance of Exception or one of its subclasses). You can use this to turn arbitrary objects into exceptions, centralizing all your exception raising code:

require 'json'
require 'open-uri'

class Result
  class Error < RuntimeError; end

  def initialize(json)
    @result = JSON.parse json
  end

  def exception(message = nil)
    Error.new "#{message} (#{@result['error']})"
  end

  def [](key)
    @result[key]
  end
end

r = Result.new open('http://example.com/api/blah').read

raise r if r['error']
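
If you want to try this without a network connection or JSON, here's a self-contained sketch. Response is a made-up class for illustration:

```ruby
# Hypothetical class for illustration; no network or JSON required.
class Response
  class Error < RuntimeError; end

  def initialize(status, body)
    @status, @body = status, body
  end

  def ok?
    @status < 400
  end

  # raise calls this with no argument for `raise obj`
  # and with the message for `raise obj, "message"`.
  def exception(message = nil)
    detail = "status #{@status}: #{@body}"
    Error.new([message, detail].compact.join(' - '))
  end
end

response = Response.new 503, 'try again later'

begin
  raise response, 'fetch failed' unless response.ok?
rescue Response::Error => e
  puts e.message
end
```

The caller never has to know which exception class to use; the object picks it.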

Marshal

Ruby will marshal most objects automatically, but sometimes you want a custom format, either to ignore cached data that can be reconstructed or to reduce the size of the data you're saving out. There are two ways to do this: the older way is #_dump/::_load, and the newer way is #marshal_dump/#marshal_load, which takes priority when both are defined. If you upgrade to the newer way, you can leave ::_load in place to restore older marshaled objects.

Using the older way, #_dump returns a String representation of the object (usually another Marshal string) and ::_load receives that String representation. Note that ::_load is a class method, so it is responsible for creating the object, which may be important in some instances.

Using the newer way, #marshal_dump returns an Object and #marshal_load receives that Object. The object is already allocated, but #initialize won't be called. The newer way can result in a smaller marshal dump size since it uses the existing symbol and object reference tables.
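
Here's a minimal sketch of the older #_dump/::_load pair. GridPoint is a made-up class, and the "x,y" String format is just an assumption for the example:

```ruby
# Hypothetical class for illustration.
class GridPoint
  attr_reader :x, :y

  def initialize(x, y)
    @x, @y = x, y
  end

  # Older protocol: return a String. The argument is the depth limit
  # that was passed to Marshal.dump.
  def _dump(limit)
    "#{@x},#{@y}"
  end

  # Class method: it receives that String and builds the object itself.
  def self._load(string)
    x, y = string.split(',').map { |n| Integer(n) }
    new x, y
  end
end

copy = Marshal.load Marshal.dump(GridPoint.new(3, 4))

p [copy.x, copy.y]
```

Because ::_load creates the object, it goes through #initialize here, which the newer way does not.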

Gem::Specification uses #_dump/::_load and is fairly complicated because I designed it to be backward and forward-compatible. This is a slightly-stripped-down version:

class Gem::Specification

  CURRENT_SPECIFICATION_VERSION = 2

  # number of fields per version
  MARSHAL_FIELDS = { -1 => 16, 1 => 16, 2 => 16 }

  def self._load(str)
    array = Marshal.load str

    spec = Gem::Specification.new
    spec.instance_variable_set :@specification_version, array[1]

    # validate object
    current_version = CURRENT_SPECIFICATION_VERSION

    field_count = if spec.specification_version > current_version then
                    spec.instance_variable_set :@specification_version,
                                               current_version
                    MARSHAL_FIELDS[current_version]
                  else
                    MARSHAL_FIELDS[spec.specification_version]
                  end

    if array.size < field_count then
      raise TypeError, "invalid Gem::Specification format #{array.inspect}"
    end

    # restore object
    spec.instance_variable_set :@rubygems_version,          array[0]
    # ...
    spec.instance_variable_set :@platform,                  array[16].to_s
    spec.instance_variable_set :@loaded,                    false

    spec
  end

  def _dump(limit)
    Marshal.dump [
      @rubygems_version,
      @specification_version,
      # ...
      @new_platform,
    ]
  end
end

Gem::Version uses #marshal_dump/#marshal_load and ignores the internal instance variables, only dumping @version:

class Gem::Version
  def marshal_dump
    [@version]
  end

  def marshal_load(array)
    self.version = array[0]
  end
end
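
The same pattern works for dropping cached data. In this sketch Report is a made-up class whose @total can always be recomputed, so it never goes into the dump:

```ruby
# Hypothetical class for illustration.
class Report
  attr_reader :rows

  def initialize(rows)
    @rows = rows
    @total = nil
  end

  # Computed on first use, then cached.
  def total
    @total ||= @rows.sum
  end

  def cached?
    !@total.nil?
  end

  # Newer protocol: return any marshalable object, leaving the cache out.
  def marshal_dump
    [@rows]
  end

  # Called on an allocated object; #initialize is not run,
  # so every instance variable has to be set here.
  def marshal_load(array)
    @rows = array[0]
    @total = nil
  end
end

report = Report.new [1, 2, 3]
report.total

restored = Marshal.load Marshal.dump(report)

p restored.cached? # the cache was not dumped
p restored.total   # but it can be rebuilt on demand
```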

Pretty-print with PP

With a little work PP can give you easily readable output for your objects, even output that you can copy and paste back into a script. Primarily you'll use the PrettyPrint#group, PrettyPrint#text, PrettyPrint#breakable and PP#pp methods inside a #pretty_print method on your object. You can find documentation for these methods using ri.

Here's #pretty_print from Gem::Dependency and Gem::Requirement:

class Gem::Dependency
  def pretty_print(q)
    q.group 1, 'Gem::Dependency.new(', ')' do
      q.pp @name
      q.text ','
      q.breakable

      q.pp @version_requirements

      q.text ','
      q.breakable

      q.pp @type
    end
  end
end

class Gem::Requirement
  def pretty_print(q)
    q.group 1, 'Gem::Requirement.new(', ')' do
      q.pp as_list
    end
  end

  def as_list
    normalize
    @requirements.map do |op, version| "#{op} #{version}" end
  end
end

Together these make pretty, copy-pastable output:

require 'pp'

gem 'ParseTree'

pp Gem.loaded_specs["ParseTree"].dependencies
[Gem::Dependency.new("RubyInline",
  Gem::Requirement.new([">= 3.7.0"]),
  :runtime),
 Gem::Dependency.new("sexp_processor",
  Gem::Requirement.new([">= 3.0.0"]),
  :runtime),
 Gem::Dependency.new("hoe", Gem::Requirement.new([">= 1.8.0"]), :development)]
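
If you want to experiment outside of RubyGems, here's a minimal sketch using the same group/text/breakable/pp calls. Money is a made-up class for illustration:

```ruby
require 'pp'

# Hypothetical class for illustration.
class Money
  def initialize(amount, currency)
    @amount, @currency = amount, currency
  end

  # q is the PP instance doing the printing.
  def pretty_print(q)
    q.group 1, 'Money.new(', ')' do
      q.pp @amount
      q.text ','
      q.breakable
      q.pp @currency
    end
  end
end

pp [Money.new(5, 'USD'), Money.new(12, 'EUR')]
```

Since each q.breakable only turns into a newline when the group won't fit on the line, short objects like these stay on one line and longer ones wrap like the dependencies above.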


200thish Seattle.rb Meeting!

Eric Hodel | Wed, 10 Dec 2008 03:07:50 GMT

Next week is the 200thish Seattle Ruby Brigade meeting. We’re not sure exactly how many meetings we’ve had, but we’re celebrating right around our bicentennial!

Our 200thish meeting will be at the usual place and time. The meeting will feature cake and party hats! It’s also Aaron Patterson’s 28th birthday, so there will even be an extra cake!
