Reflection, ObjectSpace, and Distributed Ruby

One of the many advantages of dynamic languages such as Ruby is the ability to introspect—to examine aspects of the program from within the program itself. Java, for one, calls this feature reflection.

The word “reflection” conjures up an image of looking at oneself in the mirror—perhaps investigating the relentless spread of that bald spot on the top of one's head. That's a pretty apt analogy: we use reflection to examine parts of our programs that aren't normally visible from where we stand.

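In this deeply introspective mood, while we are contemplating our navels and burning incense (being careful not to swap the two tasks), what can we learn about our program? We might discover what objects it contains, the current class hierarchy, the contents and behaviors of objects, and information on methods.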

Armed with this information, we can look at particular objects and decide which of their methods to call at runtime—even if the class of the object didn't exist when we first wrote the code. We can also start doing clever things, perhaps modifying the program as it's running.

Sound scary? It needn't be. In fact, these reflection capabilities let us do some very useful things. Later in this chapter we'll look at distributed Ruby and marshaling, two reflection-based technologies that let us send objects around the world and through time.

Looking at Objects

Have you ever craved the ability to traverse all the living objects in your program? We have! Ruby lets you perform this trick with ObjectSpace::each_object. We can use it to do all sorts of neat tricks.

For example, to iterate over all objects of type Numeric, you'd write the following.

ObjectSpace.each_object(Numeric) {|x| p x }

Run in a program that has created a few Float objects, this prints them along with a couple of numbers the program never defined. Where did those come from? If you look in the reference, you'll see that the Math module defines constants for e and PI; since we are examining all living objects in the system, these turn up as well.

Fixnum objects, however, never show up. That's because ObjectSpace doesn't know about objects with immediate values: Fixnum, true, false, and nil.
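The difference between heap objects and immediate values can be checked directly. The following is a minimal sketch, assuming a recent CRuby (where ObjectSpace.each_object returns an enumerator when called without a block); the helper name visible_to_object_space? and the two constants are made up for illustration. Newer interpreters treat more values as immediates than the ones listed above, so the example uses a very large integer as its heap object.

```ruby
# Hypothetical helper: is this exact object reachable through ObjectSpace?
def visible_to_object_space?(obj)
  ObjectSpace.each_object(obj.class).any? { |candidate| candidate.equal?(obj) }
end

BIG   = 12_345_678_987_654_321_000_000   # too large to be an immediate value
SMALL = 95                               # small enough to be an immediate value

p visible_to_object_space?(BIG)     # => true
p visible_to_object_space?(SMALL)   # => false
p visible_to_object_space?(nil)     # => false
```

The helper compares with equal? (object identity) rather than ==, so a heap object that merely has the same numeric value is never mistaken for the immediate.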

Looking Inside Objects

Once you've found an interesting object, you may be tempted to find out just what it can do. Unlike static languages, where a variable's type determines its class, and hence the methods it supports, Ruby supports liberated objects. You really cannot tell exactly what an object can do until you look under its hood. (Or under its bonnet, for objects created to the east of the Atlantic.)

For instance, we can get a list of all the methods to which an object will respond.

r = 1..10 # Create a Range object
list = r.methods
list.length → 60
list[0..3] → ["size", "end", "length", "exclude_end?"]

r.respond_to?("frozen?") → true
r.respond_to?("hasKey") → false
"me".respond_to?("==") → true

We can determine our object's class and its unique object id, and test its relationship to other classes.

num = 1
num.id → 3
num.class → Fixnum
num.kind_of? Fixnum → true
num.kind_of? Numeric → true
num.instance_of? Fixnum → true
num.instance_of? Numeric → false
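Putting these queries together, a program can decide at runtime how to treat an object it knows nothing about in advance. The sketch below assumes Ruby 1.9 or later, where object_id is the spelling of the older id; the helper name describe is hypothetical.

```ruby
# Hypothetical helper: describe any object by probing it at runtime.
def describe(obj)
  if obj.respond_to?(:length)
    "#{obj.class} with length #{obj.length}"
  elsif obj.kind_of?(Numeric)
    "#{obj.class} number #{obj}"
  else
    "#{obj.class} (object id #{obj.object_id})"
  end
end

puts describe("me")        # String with length 2
puts describe([1, 2, 3])   # Array with length 3
puts describe(1.5)         # Float number 1.5
```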

Looking at Classes

Knowing about objects is one part of reflection, but to get the whole picture, you also need to be able to look at classes—the methods and constants that they contain.

Looking at the class hierarchy is easy. You can get the parent of any particular class using Class#superclass. For classes and modules, Module#ancestors lists both superclasses and mixed-in modules.

klass = Fixnum
begin
  print klass
  klass = klass.superclass
  print " < " if klass
end while klass
puts
p Fixnum.ancestors

Fixnum < Integer < Numeric < Object
[Fixnum, Integer, Precision, Numeric, Comparable, Object, Kernel]

If you want to build a complete class hierarchy, just run that code for every class in the system. We can use ObjectSpace to iterate over all Class objects:
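One way this might look is sketched below. It assumes CRuby 2.1 or later (for Module#singleton_class?); the method name class_tree is hypothetical. Rather than printing one chain per class, it records each class under its direct superclass.

```ruby
# Build a map from each class to its direct subclasses by walking every
# Class object that ObjectSpace knows about.
def class_tree
  tree = Hash.new { |hash, key| hash[key] = [] }
  ObjectSpace.each_object(Class) do |klass|
    next if klass.singleton_class?   # skip per-object singleton classes
    parent = klass.superclass
    tree[parent] << klass if parent  # BasicObject has no superclass
  end
  tree
end

tree = class_tree
# Anonymous classes have no name, hence the compact.
puts tree[Numeric].map(&:name).compact.sort.join(", ")
```

The exact list printed depends on the Ruby version and on which libraries have been loaded, since every class that currently exists is included.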

Looking Inside Classes

We can find out a bit more about the methods and constants in a particular object. Instead of just checking to see whether the object responds to a given message, we can ask for methods by access level, we can ask for just singleton methods, and we can have a look at the object's constants.

class Demo
  private
    def privMethod
    end
  protected
    def protMethod
    end
  public
    def pubMethod
    end

  def Demo.classMethod
  end

  CONST = 1.23
end

Demo.private_instance_methods → ["privMethod"]
Demo.protected_instance_methods → ["protMethod"]
Demo.public_instance_methods → ["pubMethod"]
Demo.singleton_methods → ["classMethod"]
Demo.constants - Demo.superclass.constants → ["CONST"]

Module.constants returns all the constants available via a module, including constants from the module's superclasses. We're not interested in those just at the moment, so we'll subtract them from our list.
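In Ruby 1.9 and later the same filtering can be requested directly: the reflection methods accept a false argument that excludes anything inherited, and they return Symbols rather than Strings. The sketch below uses a cut-down variant of Demo with hypothetical method names.

```ruby
class Demo
  CONST = 1.23

  def pub_method; end

  private

  def priv_method; end
end

# Passing false limits each list to what Demo itself defines.
p Demo.public_instance_methods(false)    # => [:pub_method]
p Demo.private_instance_methods(false)   # => [:priv_method]
p Demo.constants(false)                  # => [:CONST]
```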

Given a list of method names, we might now be tempted to try calling them. Fortunately, that's easy with Ruby.

Calling Methods Dynamically

C and Java programmers often find themselves writing some kind of dispatch table: functions that are invoked based on a command. Think of a typical C idiom where you have to translate a string to a function pointer:

typedef struct {
  char *name;
  void (*fptr)();
} Tuple;

Tuple list[]= {
  { "play",   fptr_play },
  { "stop",   fptr_stop },
  { "record", fptr_record },
  { 0, 0 },
};

...

void dispatch(char *cmd) {
  int i = 0;
  for (; list[i].name; i++) {
    if (strncmp(list[i].name,cmd,strlen(cmd)) == 0) {
      list[i].fptr();
      return;
    }
  }
  /* not found */
}

In Ruby, you can do all this in one line. Stick all your command functions into a class, create an instance of that class (we called it commands), and ask that object to execute the method with the same name as the command string.

commands.send(commandString)

Oh, and by the way, it does much more than the C version—it's dynamic. The Ruby version will find new methods added at runtime just as easily.
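A fuller sketch of such a dispatcher follows. The Commands class, its methods, and the dispatch helper are all hypothetical, and the code assumes Ruby 1.9 or later for public_send, which behaves like send but refuses to call private methods.

```ruby
# Hypothetical command set; the method names double as the command strings.
class Commands
  def play;   "playing";   end
  def stop;   "stopped";   end
  def record; "recording"; end
end

# Dispatch a command string, refusing anything the target does not
# publicly respond to.
def dispatch(target, command)
  return "unknown command: #{command}" unless target.respond_to?(command)
  target.public_send(command)
end

commands = Commands.new
puts dispatch(commands, "play")     # playing
puts dispatch(commands, "rewind")   # unknown command: rewind

# A method added at runtime is picked up with no change to dispatch.
class Commands
  def rewind; "rewinding"; end
end
puts dispatch(commands, "rewind")   # rewinding
```

Note that respond_to? also answers true for methods every object inherits, such as class or freeze, so a dispatcher fed with untrusted input would want an explicit list of allowed commands.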

You don't have to write special command classes for send: it works on any object.

"John Coltrane".send(:length) → 13
"Miles Davis".send("sub", /iles/, '.') → "M. Davis"

Another way of invoking methods dynamically uses Method objects. A Method object is like a Proc object: it represents a chunk of code and a context in which it executes. In this case, the code is the body of the method, and the context is the object that created the method. Once we have our Method object, we can execute it sometime later by sending it the message call.

trane = "John Coltrane".method(:length)
miles = "Miles Davis".method("sub")

trane.call → 13
miles.call(/iles/, '.') → "M. Davis"

You can pass the Method object around as you would any other object, and when you invoke Method#call, the method is run just as if you had invoked it on the original object. It's like having a C-style function pointer but in a fully object-oriented style.

def double(a)
  2*a
end

mObj = method(:double)

[ 1, 3, 5, 7 ].collect(&mObj) → [2, 6, 10, 14]

As good things come in threes, here's yet another way to invoke methods dynamically. The eval method (and its variations such as class_eval, module_eval, and instance_eval) will parse and execute an arbitrary string of legal Ruby source code.

trane = %q{"John Coltrane".length}
miles = %q{"Miles Davis".sub(/iles/, '.')}

eval trane → 13
eval miles → "M. Davis"

When using eval, it can be helpful to state explicitly the context in which the expression should be evaluated, rather than using the current context. You can obtain a context by calling Kernel#binding at the desired point.

class CoinSlot
  def initialize(amt=Cents.new(25))
    @amt = amt
    $here = binding
  end
end

a = CoinSlot.new
eval "puts @amt", $here
eval "puts @amt"

The first eval evaluates @amt in the context of the instance of class CoinSlot. The second eval evaluates @amt in the context of Object, where the instance variable @amt is not defined.
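The same effect can be shown without a global variable. This is a cut-down sketch, not the listing above: CoinSlot holds a plain integer instead of a Cents object, and the context method is a hypothetical addition that hands out the object's binding on request.

```ruby
class CoinSlot
  def initialize(amt = 25)
    @amt = amt
  end

  # Hypothetical accessor: the binding captures self, so instance
  # variables evaluated through it belong to this CoinSlot.
  def context
    binding
  end
end

slot = CoinSlot.new
p eval("@amt", slot.context)   # => 25
p eval("@amt")                 # => nil, since the top-level object has no @amt
```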

Performance Considerations

As we've seen in this section, there are several ways to invoke an arbitrary method of some object: Object#send, Method#call, and the various flavors of eval.

You may prefer to use any one of these techniques depending on your needs, but be aware that eval is significantly slower than the others (or, for optimistic readers, send and call are significantly faster than eval).

require "benchmark"   # from the Ruby Application Archive
include Benchmark

test = "Stormy Weather"
m = test.method(:length)
n = 100000

bm(12) {|x|
  x.report("call") { n.times { m.call } }
  x.report("send") { n.times { test.send(:length) } }
  x.report("eval") { n.times { eval "test.length" } }
}

                  user     system      total        real
call          0.220000   0.000000   0.220000 (  0.214065)
send          0.210000   0.000000   0.210000 (  0.217070)
eval          2.540000   0.000000   2.540000 (  2.518311)

System Hooks

A hook is a technique that lets you trap some Ruby event, such as object creation.

The simplest hook technique in Ruby is to intercept calls to methods in system classes. Perhaps you want to log all the operating system commands your program executes. Simply rename the method Kernel::system and substitute it with one of your own that both logs the command and calls the original Kernel method. (This Eiffel-inspired idiom of renaming a feature and redefining a new one is very useful, but be aware that it can cause problems. If a subclass does the same thing, and renames the methods using the same names, you'll end up with an infinite loop. You can avoid this by aliasing your methods to a unique symbol name or by using a consistent naming convention.)

module Kernel
  alias_method :old_system, :system
  def system(*args)
    result = old_system(*args)
    puts "system(#{args.join(', ')}) returned #{result}"
    result
  end
end

system("date")
system("kangaroo", "-hop 10", "skippy")

Sun Jun  9 00:09:44 CDT 2002
system(date) returned true
system(kangaroo, -hop 10, skippy) returned false

A more powerful hook is catching objects as they are created. If you can be present when every object is born, you can do all sorts of interesting things: you can wrap them, add methods to them, remove methods from them, add them to containers to implement persistence, you name it. We'll show a simple example here: we'll add a timestamp to every object as it's created.

One way to hook object creation is to do our method renaming trick on Class#new, the method that's called to allocate space for a new object. The technique isn't perfect—some built-in objects, such as literal strings, are constructed without calling new—but it'll work just fine for objects we write.

class Class
  alias_method :old_new,  :new
  def new(*args, &block)
    result = old_new(*args, &block)  # pass any block on to the original new
    result.timestamp = Time.now
    return result
  end
end

We'll also need to add a timestamp attribute to every object in the system. We can do this by hacking class Object itself.

class Object
  def timestamp
    return @timestamp
  end
  def timestamp=(aTime)
    @timestamp = aTime
  end
end

Finally, we can run a test. We'll create a couple of objects a few seconds apart and check their timestamps.

class Test
end

obj1 = Test.new
sleep 2
obj2 = Test.new

obj1.timestamp → Sun Jun 09 00:09:45 CDT 2002
obj2.timestamp → Sun Jun 09 00:09:47 CDT 2002

All this method renaming is fine, and it really does work. However, there are other, more refined ways to get inside a running program. Ruby provides several callback methods that let you trap certain events in a controlled way.
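To give a flavor of what such callbacks look like, here is a minimal sketch that records instance methods as they are defined (Module#method_added) and subclasses as they are created (Class#inherited). The Tracked module, its events list, and the account classes are hypothetical names chosen for illustration.

```ruby
# Hypothetical mixin: a class that extends Tracked records what happens to it.
module Tracked
  def events
    @events ||= []
  end

  # Called by Ruby after each instance method is defined in the class.
  def method_added(name)
    events << [:method_added, name]
    super
  end

  # Called by Ruby when the class is subclassed.
  def inherited(subclass)
    events << [:inherited, subclass]
    super
  end
end

class Account
  extend Tracked

  def deposit(amount); end
  def withdraw(amount); end
end

class SavingsAccount < Account; end

p Account.events
# => [[:method_added, :deposit], [:method_added, :withdraw], [:inherited, SavingsAccount]]
```

Code that wraps methods could use the same callback to wrap each new method at the moment it appears, instead of merely recording its name.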

Runtime Callbacks

You can be notified whenever one of the following events occurs:

Event	Callback Method
Adding an instance method	`Module#method_added`
Adding a singleton method	`Kernel::singleton_method_added`
Subclassing a class	`Class#inherited`
Mixing in a module	`Module#extend_object`

These techniques are all illustrated in the library descriptions for each callback method. At runtime, these methods will be called by the system when the specified event occurs. By default, these methods do nothing. If you want to be notified when one of these events happens, just define the callback method, and you're in.

Keeping track of method creation and class and module usage lets you build an accurate picture of the dynamic state of your program. This can be important. For example, you may have written code that wraps all the methods in a class, perhaps to add transactional support or to implement some form of delegation. This is only half the job: the dynamic nature of Ruby means that users of this class could add new methods to it at any time. Using these callbacks, you can write code that wraps these new methods as they are created.

Tracing Your Program's Execution

While we're having fun reflecting on all the objects and classes in our programs, let's not forget about the humble statements that make our code actually do things. It turns out that Ruby lets us look at these statements, too.
First, you can watch the interpreter as it executes code. set_trace_func executes a Proc with all sorts of juicy debugging information whenever a new source line is executed, methods are called, objects are created, and so on. There's a full description under Kernel::set_trace_func, but here's a taste.

class Test
  def test
    a = 1
    b = 2
  end
end

set_trace_func proc { |event, file, line, id, binding, classname|
  printf "%8s %s:%-2d %10s %8s\n", event, file, line, id, classname
}
t = Test.new
t.test

    line prog.rb:11               false
  c-call prog.rb:11        new    Class
  c-call prog.rb:11 initialize   Object
c-return prog.rb:11 initialize   Object
c-return prog.rb:11        new    Class
    line prog.rb:12               false
    call prog.rb:2        test     Test
    line prog.rb:3        test     Test
    line prog.rb:4        test     Test
  return prog.rb:4        test     Test

There's also a method Kernel::trace_var that lets you add a hook to a global variable; whenever an assignment is made to the global, your Proc object is invoked.
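A minimal sketch of trace_var in use follows; the global $volume and the CHANGES array are hypothetical names. The block receives the new value each time the global is assigned, and untrace_var removes the hook again.

```ruby
$volume = 0
CHANGES = []

# The block is called with the new value after every assignment to $volume.
trace_var(:$volume) { |new_value| CHANGES << new_value }

$volume = 5
$volume = 11
untrace_var(:$volume)   # remove the hook
$volume = 3             # no longer recorded

p CHANGES   # => [5, 11]
```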

How Did We Get Here?

A fair question, and one we ask ourselves regularly. Mental lapses aside, in Ruby at least you can find out exactly “how you got there” by using the method caller, which returns an Array of String objects representing the current call stack.

def catA
  puts caller.join("\n")
end
def catB
  catA
end
def catC
  catB
end
catC

Marshaling and Distributed Ruby

Java features the ability to serialize objects, letting you store them somewhere and reconstitute them when needed. You might use this facility, for instance, to save a tree of objects that represent some portion of application state—a document, a CAD drawing, a piece of music, and so on.

Ruby calls this kind of serialization marshaling. (Think of railroad marshaling yards where individual cars are assembled in sequence into a complete train, which is then dispatched somewhere.) Saving an object and some or all of its components is done using the method Marshal::dump. Typically, you will dump an entire object tree starting with some given object. Later on, you can reconstitute the object using Marshal::load.

Here's a short example. We have a class Chord that holds a collection of musical notes. We'd like to save away a particularly wonderful chord so our grandchildren can load it into Ruby Version 23.5 and savor it, too. Let's start off with the classes for Note and Chord.

class Note
  attr :value
  def initialize(val)
    @value = val
  end
  def to_s
    @value.to_s
  end
end

class Chord
  def initialize(arr)
    @arr = arr
  end
  def play
    @arr.join('-')
  end
end

Now we'll create our masterpiece, and use Marshal::dump to save a serialized version of it to disk.

c = Chord.new( [ Note.new("G"),  Note.new("Bb"),
                 Note.new("Db"), Note.new("E") ] )

File.open("posterity", "wb") do |f|   # binary mode: marshaled data is not text
  Marshal.dump(c, f)
end

Finally, our grandchildren read it in, and are transported by our creation's beauty.

chord = File.open("posterity", "rb") do |f|
  Marshal.load(f)   # the block's value is returned by File.open
end

chord.play → "G-Bb-Db-E"

Custom Serialization Strategy

Not all objects can be dumped: bindings, procedure objects, instances of class IO, and singleton objects cannot be saved outside of the running Ruby environment (a TypeError will be raised if you try). Even if your object doesn't contain one of these problematic objects, you may want to take control of object serialization yourself.
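The restriction is easy to probe. The sketch below wraps Marshal.dump in a hypothetical helper, dumpable?, that reports whether an object can be serialized; it assumes nothing beyond the core Marshal module.

```ruby
# Hypothetical helper: report whether Marshal can serialize an object.
def dumpable?(obj)
  Marshal.dump(obj)
  true
rescue TypeError
  false
end

plain   = Object.new
special = Object.new
def special.shout; "hey"; end   # a singleton method makes this a singleton object

p dumpable?("a string")   # => true
p dumpable?(plain)        # => true
p dumpable?(proc { 1 })   # => false
p dumpable?($stdout)      # => false
p dumpable?(special)      # => false
```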

Marshal provides the hooks you need. In the objects that require custom serialization, simply implement two methods: an instance method called _dump, which writes the object out to a string, and a class method called _load, which reads a string that you'd previously created and converts it into a new object.

For instance, here is a sample class that defines its own serialization. For whatever reason, Special doesn't want to save one of its internal data members, “@volatile”.

class Special
  def initialize(valuable)
    @valuable = valuable
    @volatile = "Goodbye"
  end

  def _dump(depth)
    @valuable.to_str
  end

  def Special._load(str)
    Special.new(str)
  end

  def to_s
    "#{@valuable} and #{@volatile}"
  end
end

a = Special.new("Hello, World")
data = Marshal.dump(a)
obj = Marshal.load(data)
puts obj

Hello, World and Goodbye

Distributed Ruby

Since we can serialize an object or a set of objects into a form suitable for out-of-process storage, we can use this capability for the transmission of objects from one process to another. Couple this capability with the power of networking, and voilà: you have a distributed object system. To save you the trouble of having to write the code, we suggest downloading Masatoshi Seki's Distributed Ruby library (drb) from the RAA.

Using drb, a Ruby process may act as a server, as a client, or as both. A drb server acts as a source of objects, while a client is a user of those objects. To the client, it appears that the objects are local, but in reality the code is still being executed remotely.

A server starts a service by associating an object with a given port. Threads are created internally to handle incoming requests on that port, so remember to join the drb thread before exiting your program.

require 'drb'

class TestServer
  def doit
    "Hello, Distributed World"
  end
end

aServerObject = TestServer.new
DRb.start_service('druby://localhost:9000', aServerObject)
DRb.thread.join # Don't exit just yet!

A simple drb client simply creates a local drb object and associates it with the object on the remote server; the local object is a proxy.

require 'drb'
DRb.start_service()
obj = DRbObject.new(nil, 'druby://localhost:9000')
# Now use obj
p obj.doit

The client connects to the server and calls the method doit, which returns a string that the client prints out:

"Hello, Distributed World"

The initial nil argument to DRbObject indicates that we want to attach to a new distributed object. We could also use an existing object.

Ho hum, you say. This sounds like Java's RMI, or CORBA, or whatever. Yes, it is a functional distributed object mechanism—but it is written in just 200 lines of Ruby code. No C, nothing fancy, just plain old Ruby code. Of course, there's no naming service or trader service, or anything like you'd see in CORBA, but it is simple and reasonably fast. On the 233MHz test system, this sample code runs at about 50 remote message calls per second.

And, if you like the look of Sun's JavaSpaces, the basis of their JINI architecture, you'll be interested to know that drb is distributed with a short module that does the same kind of thing. JavaSpaces is based on a technology called Linda. To prove that its Japanese author has a sense of humor, Ruby's version of Linda is known as “rinda.”

Compile Time? Runtime? Anytime!

The important thing to remember about Ruby is that there isn't a big difference between “compile time” and “runtime.” It's all the same. You can add code to a running process. You can redefine methods on the fly, change their scope from public to private, and so on. You can even alter basic types, such as Class and Object.
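As a small illustration of this point, the sketch below reopens a class while the program is running, redefines one method, and makes another private. The Greeter class and its methods are hypothetical.

```ruby
class Greeter
  def greet;  "hello"; end
  def secret; "psst";  end
end

g = Greeter.new
puts g.greet                  # hello

# Reopen the class at runtime: objects that already exist see the change.
class Greeter
  def greet; "howdy"; end     # redefine a method
  private :secret             # change a method's visibility
end

puts g.greet                  # howdy
puts g.respond_to?(:secret)   # false
```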

Once you get used to this flexibility, it is hard to go back to a static language such as C++, or even to a half-static language such as Java.

Extracted from the book "Programming Ruby - The Pragmatic Programmer's Guide"

Copyright © 2001 by Addison Wesley Longman, Inc. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.0 or later (the latest version is presently available at http://www.opencontent.org/openpub/).

Distribution of substantively modified versions of this document is prohibited without the explicit permission of the copyright holder.

Distribution of the work or derivative of the work in any standard (paper) book form is prohibited unless prior permission is obtained from the copyright holder.