Reflection, ObjectSpace, and Distributed Ruby

One of the many advantages of dynamic languages such as Ruby is the ability to introspect—to examine aspects of the program from within the program itself. Java, for one, calls this feature reflection.

The word “reflection” conjures up an image of looking at oneself in the mirror—perhaps investigating the relentless spread of that bald spot on the top of one's head. That's a pretty apt analogy: we use reflection to examine parts of our programs that aren't normally visible from where we stand.

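In this deeply introspective mood, while we are contemplating our navels and burning incense (being careful not to swap the two tasks), what can we learn about our program? We might discover what objects it contains, the current class hierarchy, the contents and behaviors of objects, and information on methods.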

Armed with this information, we can look at particular objects and decide which of their methods to call at runtime—even if the class of the object didn't exist when we first wrote the code. We can also start doing clever things, perhaps modifying the program as it's running.

Sound scary? It needn't be. In fact, these reflection capabilities let us do some very useful things. Later in this chapter we'll look at distributed Ruby and marshaling, two reflection-based technologies that let us send objects around the world and through time.

Looking at Objects

Have you ever craved the ability to traverse all the living objects in your program? We have! Ruby lets you perform this trick with ObjectSpace::each_object. We can use it to do all sorts of neat tricks.

For example, to iterate over all objects of type Numeric, you'd write the following.

    a = 102.7
    b = 95.1
    ObjectSpace.each_object(Numeric) {|x| p x }

produces:

    95.1
    102.7
    2.718281828
    3.141592654

Hey, where did those last two numbers come from? We didn't define them in our program. If you look in the reference, you'll see that the Math module defines constants for e and PI; since we are examining all living objects in the system, these turn up as well.

However, there is a catch. Let's try the same example with different numbers.

    a = 102
    b = 95
    ObjectSpace.each_object(Numeric) {|x| p x }

produces:

    2.718281828
    3.141592654

Neither of the Fixnum objects we created showed up. That's because ObjectSpace doesn't know about objects with immediate values: Fixnum, true, false, and nil.
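As a small sketch of one practical use, the following collects the live instances of a single class. It assumes Ruby 1.9 or later, where each_object without a block returns an enumerator. The Widget class and the variable names are invented for the illustration, and the objects are kept in an array so the garbage collector cannot reclaim them before we look.

```ruby
# Widget is a hypothetical class used only for this illustration.
class Widget
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

# Hold references so the garbage collector keeps the objects alive.
widgets = %w[gear cog sprocket].map { |n| Widget.new(n) }

# Walk every live Widget and collect the names, sorted for a stable order.
live_names = ObjectSpace.each_object(Widget).map(&:name).sort
p live_names
```

Passing a class to each_object narrows the traversal, which is usually what you want: walking every object in the system can be slow in a large program.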

Looking Inside Objects

Once you've found an interesting object, you may be tempted to find out just what it can do. Unlike static languages, where a variable's type determines its class, and hence the methods it supports, Ruby supports liberated objects. You really cannot tell exactly what an object can do until you look under its hood. (Or under its bonnet, for objects created to the east of the Atlantic.)

For instance, we can get a list of all the methods to which an object will respond.

    r = 1..10        # Create a Range object
    list = r.methods
    list.length      # => 60
    list[0..3]       # => ["size", "end", "length", "exclude_end?"]

Or, we can check to see if an object supports a particular method.

    r.respond_to?("frozen?")   # => true
    r.respond_to?("hasKey")    # => false
    "me".respond_to?("==")     # => true

We can determine our object's class and its unique object id, and test its relationship to other classes.

    num = 1
    num.id                        # => 3
    num.class                     # => Fixnum
    num.kind_of? Fixnum           # => true
    num.kind_of? Numeric          # => true
    num.instance_of? Fixnum       # => true
    num.instance_of? Numeric      # => false
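These queries are most useful in combination: ask first, then call. Here is a minimal sketch, assuming Ruby 1.9 or later. The helper name describe and the keys of the hash it returns are made up for this example.

```ruby
# describe is a hypothetical helper, not part of Ruby's library.
def describe(obj)
  {
    class_name: obj.class.name,
    numeric:    obj.kind_of?(Numeric),
    sizable:    obj.respond_to?(:size),
    # Only call size when the object says it understands the message.
    size:       obj.respond_to?(:size) ? obj.size : nil
  }
end

p describe("bebop")
p describe(1..10)
p describe(3.5)
```

The string and the range both report a size, while the float is Numeric but has no size method, so the helper never sends it that message.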

Looking at Classes

Knowing about objects is one part of reflection, but to get the whole picture, you also need to be able to look at classes—the methods and constants that they contain.

Looking at the class hierarchy is easy. You can get the parent of any particular class using Class#superclass. For classes and modules, Module#ancestors lists both superclasses and mixed-in modules.

    klass = Fixnum
    begin
      print klass
      klass = klass.superclass
      print " < " if klass
    end while klass
    puts
    p Fixnum.ancestors

produces:

    Fixnum < Integer < Numeric < Object
    [Fixnum, Integer, Precision, Numeric, Comparable, Object, Kernel]

If you want to build a complete class hierarchy, just run that code for every class in the system. We can use ObjectSpace to iterate over all Class objects:

    ObjectSpace.each_object(Class) do |aClass|
      # ...
    end
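To make that concrete, here is one way the loop body might be filled in. It is a sketch only, assuming Ruby 1.9 or later. The Instrument classes and the print_tree helper are invented for the example, and the traversal is limited to one small family of classes so the result stays readable.

```ruby
# Hypothetical classes, defined only so the example has a known hierarchy.
class Instrument; end
class Horn < Instrument; end
class Saxophone < Horn; end
class Trumpet < Horn; end

# Map each class name to the names of its direct subclasses.
children = Hash.new { |hash, key| hash[key] = [] }
ObjectSpace.each_object(Class) do |klass|
  # Skip anonymous classes and anything outside the Instrument family.
  next unless klass.name && klass <= Instrument
  children[klass.superclass.name] << klass.name
end

# Print the tree under a given class, indenting one level per generation.
def print_tree(children, name, depth = 0)
  puts "#{'  ' * depth}#{name}"
  children[name].sort.each { |child| print_tree(children, child, depth + 1) }
end

print_tree(children, "Instrument")
```

Dropping the `klass <= Instrument` test would map every named class in the system in the same way.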

Looking Inside Classes

We can find out a bit more about the methods and constants in a particular object. Instead of just checking to see whether the object responds to a given message, we can ask for methods by access level, we can ask for just singleton methods, and we can have a look at the object's constants.

    class Demo
      private
        def privMethod
        end
      protected
        def protMethod
        end
      public
        def pubMethod
        end

      def Demo.classMethod
      end

      CONST = 1.23
    end

    Demo.private_instance_methods     # => ["privMethod"]
    Demo.protected_instance_methods   # => ["protMethod"]
    Demo.public_instance_methods      # => ["pubMethod"]
    Demo.singleton_methods            # => ["classMethod"]
    Demo.constants - Demo.superclass.constants   # => ["CONST"]

Module.constants returns all the constants available via a module, including constants from the module's superclasses. We're not interested in those just at the moment, so we'll subtract them from our list.
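If you try these queries on Ruby 1.9 or later, expect two differences from the listing above: the names come back as symbols rather than strings, and the lists include inherited methods unless you pass false. The following sketch assumes such a Ruby. The Tape class is invented for the illustration.

```ruby
# Tape is a hypothetical class used only for this illustration.
class Tape
  SPEED = 7.5

  def play; end

  def self.catalog; end

  private

  def rewind; end
end

# Passing false limits each list to what Tape itself defines,
# leaving out anything inherited from Object and Kernel.
p Tape.public_instance_methods(false)
p Tape.private_instance_methods(false)
p Tape.singleton_methods(false)
p Tape.constants(false)
```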

Given a list of method names, we might now be tempted to try calling them. Fortunately, that's easy with Ruby.

Calling Methods Dynamically

C and Java programmers often find themselves writing some kind of dispatch table: functions which are invoked based on a command. Think of a typical C idiom where you have to translate a string to a function pointer:

    typedef struct {
      char *name;
      void (*fptr)();
    } Tuple;

    Tuple list[]= {
      { "play",   fptr_play },
      { "stop",   fptr_stop },
      { "record", fptr_record },
      { 0, 0 },
    };

    ...

    void dispatch(char *cmd) {
      int i = 0;
      for (; list[i].name; i++) {
        if (strncmp(list[i].name,cmd,strlen(cmd)) == 0) {
          list[i].fptr();
          return;
        }
      }
      /* not found */
    }

In Ruby, you can do all this in one line. Stick all your command functions into a class, create an instance of that class (we called it commands), and ask that object to execute a method called the same name as the command string.

commands.send(commandString)

Oh, and by the way, it does much more than the C version—it's dynamic. The Ruby version will find new methods added at runtime just as easily.
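A bare send will happily run any method the string names, including ones you never meant to expose. The sketch below, which assumes Ruby 1.9 or later, keeps the one-line spirit but checks the command against the methods the command class itself defines. The Recorder class and the dispatch helper are invented for this example.

```ruby
# Recorder and dispatch are hypothetical names for this sketch.
class Recorder
  def play;   "playing"   end
  def stop;   "stopped"   end
  def record; "recording" end
end

# Run the command only if it names a public method defined in the
# target's own class, so inherited methods cannot be reached this way.
def dispatch(target, command)
  allowed = target.class.public_instance_methods(false)
  if allowed.include?(command.to_sym)
    target.public_send(command)
  else
    "unknown command: #{command}"
  end
end

commands = Recorder.new
puts dispatch(commands, "play")
puts dispatch(commands, "rewind")
```

Because the allowed list is computed on each call, a method added to Recorder at runtime becomes a valid command with no change to dispatch.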

You don't have to write special command classes for send: it works on any object.

    "John Coltrane".send(:length)            # => 13
    "Miles Davis".send("sub", /iles/, '.')   # => "M. Davis"

Another way of invoking methods dynamically uses Method objects. A Method object is like a Proc object: it represents a chunk of code and a context in which it executes. In this case, the code is the body of the method, and the context is the object that created the method. Once we have our Method object, we can execute it sometime later by sending it the message call.

    trane = "John Coltrane".method(:length)
    miles = "Miles Davis".method("sub")

    trane.call                # => 13
    miles.call(/iles/, '.')   # => "M. Davis"

You can pass the Method object around as you would any other object, and when you invoke Method#call, the method is run just as if you had invoked it on the original object. It's like having a C-style function pointer but in a fully object-oriented style.

You can also use Method objects with iterators.

    def double(a)
      2*a
    end

    mObj = method(:double)

    [ 1, 3, 5, 7 ].collect(&mObj)   # => [2, 6, 10, 14]
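Because a Method object carries its receiver with it, a collection of them makes a natural dispatch table, much like the C table of function pointers shown earlier. A minimal sketch follows. The Deck class and the table keys are invented for the illustration.

```ruby
# Deck is a hypothetical class used only for this illustration.
class Deck
  attr_reader :log

  def initialize
    @log = []
  end

  def play(track)
    @log << "play #{track}"
  end

  def stop
    @log << "stop"
  end
end

deck = Deck.new

# Each Method object remembers its receiver (deck), so the table can be
# handed to code that knows nothing about the Deck class.
table = { "p" => deck.method(:play), "s" => deck.method(:stop) }

table["p"].call("Giant Steps")
table["s"].call
p deck.log
p table["p"].arity   # number of required arguments for play
```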

As good things come in threes, here's yet another way to invoke methods dynamically. The eval method (and its variations such as class_eval, module_eval, and instance_eval) will parse and execute an arbitrary string of legal Ruby source code.

    trane = %q{"John Coltrane".length}
    miles = %q{"Miles Davis".sub(/iles/, '.')}

    eval trane   # => 13
    eval miles   # => "M. Davis"

When using eval, it can be helpful to state explicitly the context in which the expression should be evaluated, rather than using the current context. You can obtain a context by calling Kernel#binding at the desired point.

    class CoinSlot
      def initialize(amt=Cents.new(25))
        @amt = amt
        $here = binding
      end
    end

    a = CoinSlot.new
    eval "puts @amt", $here
    eval "puts @amt"

produces:

    $0.25USD
    nil

The first eval evaluates @amt in the context of the instance of class CoinSlot. The second eval evaluates @amt in the context of Object, where the instance variable @amt is not defined.
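The same idea works without a global variable if the object hands out its binding on request. This is a sketch, not the only way to do it. The Meter class and the method name expose_binding are invented for the example.

```ruby
# Meter is a hypothetical class; expose_binding is an illustrative name.
class Meter
  def initialize(amount)
    @amount = amount
  end

  # Kernel#binding captures self and the local variables at this point.
  def expose_binding
    binding
  end
end

inside = Meter.new(25).expose_binding

inner = eval("@amount", inside)       # self is the Meter instance
owner = eval("self.class", inside)
outer = eval("@amount")               # self is the top-level object

p inner, owner, outer
```

Here inner picks up the instance variable set in initialize, while outer is nil because no @amount exists at the top level.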

Performance Considerations

As we've seen in this section, there are several ways to invoke an arbitrary method of some object: Object#send, Method#call, and the various flavors of eval.

You may prefer to use any one of these techniques depending on your needs, but be aware that eval is significantly slower than the others (or, for optimistic readers, send and call are significantly faster than eval).

    require "benchmark"   # from the Ruby Application Archive
    include Benchmark

    test = "Stormy Weather"
    m = test.method(:length)
    n = 100000

    bm(12) {|x|
      x.report("call") { n.times { m.call } }
      x.report("send") { n.times { test.send(:length) } }
      x.report("eval") { n.times { eval "test.length" } }
    }

produces:

                      user     system      total        real
    call          0.220000   0.000000   0.220000 (  0.214065)
    send          0.210000   0.000000   0.210000 (  0.217070)
    eval          2.540000   0.000000   2.540000 (  2.518311)

System Hooks

A hook is a technique that lets you trap some Ruby event, such as object creation.

The simplest hook technique in Ruby is to intercept calls to methods in system classes. Perhaps you want to log all the operating system commands your program executes. Simply rename the method Kernel::system and substitute it with one of your own that both logs the command and calls the original Kernel method.

(This Eiffel-inspired idiom of renaming a feature and redefining a new one is very useful, but be aware that it can cause problems. If a subclass does the same thing, and renames the methods using the same names, you'll end up with an infinite loop. You can avoid this by aliasing your methods to a unique symbol name or by using a consistent naming convention.)

    module Kernel
      alias_method :old_system, :system
      def system(*args)
        result = old_system(*args)
        puts "system(#{args.join(', ')}) returned #{result}"
        result
      end
    end

    system("date")
    system("kangaroo", "-hop 10", "skippy")

produces:

    Sun Jun  9 00:09:44 CDT 2002
    system(date) returned true
    system(kangaroo, -hop 10, skippy) returned false

A more powerful hook is catching objects as they are created. If you can be present when every object is born, you can do all sorts of interesting things: you can wrap them, add methods to them, remove methods from them, add them to containers to implement persistence, you name it. We'll show a simple example here: we'll add a timestamp to every object as it's created.

One way to hook object creation is to do our method renaming trick on Class#new, the method that's called to allocate space for a new object. The technique isn't perfect—some built-in objects, such as literal strings, are constructed without calling new—but it'll work just fine for objects we write.

    class Class
      alias_method :old_new, :new
      def new(*args, &block)
        result = old_new(*args, &block)   # pass any block on to the original new
        result.timestamp = Time.now
        return result
      end
    end

We'll also need to add a timestamp attribute to every object in the system. We can do this by hacking class Object itself.

    class Object
      def timestamp
        return @timestamp
      end
      def timestamp=(aTime)
        @timestamp = aTime
      end
    end

Finally, we can run a test. We'll create a couple of objects a few seconds apart and check their timestamps.

    class Test
    end

    obj1 = Test.new
    sleep 2
    obj2 = Test.new

    obj1.timestamp   # => Sun Jun 09 00:09:45 CDT 2002
    obj2.timestamp   # => Sun Jun 09 00:09:47 CDT 2002

All this method renaming is fine, and it really does work. However, there are other, more refined ways to get inside a running program. Ruby provides several callback methods that let you trap certain events in a controlled way.

Runtime Callbacks

You can be notified whenever one of the following events occurs:

    Event                        Callback Method
    Adding an instance method    Module#method_added
    Adding a singleton method    Kernel::singleton_method_added
    Subclassing a class          Class#inherited
    Mixing in a module           Module#extend_object

These techniques are all illustrated in the library descriptions for each callback method. At runtime, these methods will be called by the system when the specified event occurs. By default, these methods do nothing. If you want to be notified when one of these events happens, just define the callback method, and you're in.

Keeping track of method creation and class and module usage lets you build an accurate picture of the dynamic state of your program. This can be important. For example, you may have written code that wraps all the methods in a class, perhaps to add transactional support or to implement some form of delegation. This is only half the job: the dynamic nature of Ruby means that users of this class could add new methods to it at any time. Using these callbacks, you can write code that wraps these new methods as they are created.
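As a rough sketch of two of these callbacks at work, the following base class records every subclass and every instance method added beneath it. The names Tracked, Session, and EVENTS are invented for the example; a real wrapper would do something more useful than append to a list.

```ruby
# Tracked is a hypothetical base class that records what happens to it.
class Tracked
  EVENTS = []

  # Called by Ruby each time a class inherits from Tracked.
  def self.inherited(subclass)
    super
    EVENTS << [:inherited, subclass.name]
  end

  # Called by Ruby each time an instance method is defined.
  def self.method_added(name)
    super
    EVENTS << [:method_added, name]
  end
end

class Session < Tracked
  def open; end
end

# Reopening the class later still triggers the callback.
class Session
  def close; end
end

p Tracked::EVENTS
```

The second class Session block stands in for a user adding a method long after the class was first written, which is exactly the case a method-wrapping library has to catch.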

Tracing Your Program's Execution

While we're having fun reflecting on all the objects and classes in our programs, let's not forget about the humble statements that make our code actually do things. It turns out that Ruby lets us look at these statements, too.

First, you can watch the interpreter as it executes code. set_trace_func executes a Proc with all sorts of juicy debugging information whenever a new source line is executed, methods are called, objects are created, and so on. There's a full description under Kernel::set_trace_func, but here's a taste.

    class Test
      def test
        a = 1
        b = 2
      end
    end

    set_trace_func proc { |event, file, line, id, binding, classname|
      printf "%8s %s:%-2d %10s %8s\n", event, file, line, id, classname
    }
    t = Test.new
    t.test

produces:

        line prog.rb:11               false
      c-call prog.rb:11        new    Class
      c-call prog.rb:11 initialize   Object
    c-return prog.rb:11 initialize   Object
    c-return prog.rb:11        new    Class
        line prog.rb:12               false
        call prog.rb:2        test     Test
        line prog.rb:3        test     Test
        line prog.rb:4        test     Test
      return prog.rb:4        test     Test

There's also a method Kernel::trace_var that lets you add a hook to a global variable; whenever an assignment is made to the global, your Proc object is invoked.
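A short sketch of trace_var in action follows. The global $volume and the changes array are invented for the example, and untrace_var is used at the end to remove the hook again.

```ruby
changes = []

# The block is called with the new value on every assignment to $volume.
trace_var(:$volume) { |new_value| changes << new_value }

$volume = 3
$volume = 11

# Remove the hook; later assignments are no longer reported.
untrace_var(:$volume)
$volume = 0

p changes
```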

How Did We Get Here?

A fair question, and one we ask ourselves regularly. Mental lapses aside, in Ruby at least you can find out exactly “how you got there” by using the method caller, which returns an Array of String objects representing the current call stack.

    def catA
      puts caller.join("\n")
    end
    def catB
      catA
    end
    def catC
      catB
    end
    catC

produces:

    prog.rb:5:in `catB'
    prog.rb:8:in `catC'
    prog.rb:10

Once you've figured out how you got there, where you go next is up to you.

Marshaling and Distributed Ruby

Java features the ability to serialize objects, letting you store them somewhere and reconstitute them when needed. You might use this facility, for instance, to save a tree of objects that represent some portion of application state—a document, a CAD drawing, a piece of music, and so on.

Ruby calls this kind of serialization marshaling. (Think of railroad marshaling yards where individual cars are assembled in sequence into a complete train, which is then dispatched somewhere.) Saving an object and some or all of its components is done using the method Marshal::dump. Typically, you will dump an entire object tree starting with some given object. Later on, you can reconstitute the object using Marshal::load.

Here's a short example. We have a class Chord that holds a collection of musical notes. We'd like to save away a particularly wonderful chord so our grandchildren can load it into Ruby Version 23.5 and savor it, too. Let's start off with the classes for Note and Chord.

    class Note
      attr :value
      def initialize(val)
        @value = val
      end
      def to_s
        @value.to_s
      end
    end

    class Chord
      def initialize(arr)
        @arr = arr
      end
      def play
        @arr.join('-')
      end
    end

Now we'll create our masterpiece, and use Marshal::dump to save a serialized version of it to disk.

    c = Chord.new( [ Note.new("G"),  Note.new("Bb"),
                     Note.new("Db"), Note.new("E") ] )

    File.open("posterity", "w+") do |f|
      Marshal.dump(c, f)
    end

Finally, our grandchildren read it in, and are transported by our creation's beauty.

    chord = nil   # declared here so it is still visible after the block
    File.open("posterity") do |f|
      chord = Marshal.load(f)
    end

    chord.play   # => "G-Bb-Db-E"
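The file is optional. Called without an IO argument, Marshal.dump returns the serialized form as a String, which makes a dump followed by a load a handy way to copy a whole object tree. A minimal sketch follows. The Score class is invented for the illustration.

```ruby
# Score is a hypothetical class holding a nested array of strings.
class Score
  attr_reader :notes

  def initialize(notes)
    @notes = notes
  end
end

original = Score.new(%w[G Bb Db E])

# With no IO argument, Marshal.dump returns the serialized bytes as a String.
bytes = Marshal.dump(original)
copy  = Marshal.load(bytes)

# The copy is a separate object tree: changing it leaves the original alone.
copy.notes << "F#"
p original.notes
p copy.notes
```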

Custom Serialization Strategy

Not all objects can be dumped: bindings, procedure objects, instances of class IO, and singleton objects cannot be saved outside of the running Ruby environment (a TypeError will be raised if you try). Even if your object doesn't contain one of these problematic objects, you may want to take control of object serialization yourself.
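To see the failure for yourself, the sketch below tries to dump two ordinary values and a procedure object and records what happens to each. The labels and variable names are invented for the example.

```ruby
# Try to dump two plain values and a Proc; note the outcome of each attempt.
samples = {
  "string" => "Hello",
  "array"  => [1, 2.5, :three],
  "proc"   => proc { 42 }
}

results = samples.map do |label, obj|
  begin
    Marshal.dump(obj)
    [label, :ok]
  rescue TypeError
    [label, :type_error]
  end
end

p results
```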

Marshal provides the hooks you need. In the objects that require custom serialization, simply implement two methods: an instance method called _dump, which writes the object out to a string, and a class method called _load, which reads a string that you'd previously created and converts it into a new object.

For instance, here is a sample class that defines its own serialization. For whatever reasons, Special doesn't want to save one of its internal data members, “@volatile”.

    class Special
      def initialize(valuable)
        @valuable = valuable
        @volatile = "Goodbye"
      end

      def _dump(depth)
        @valuable.to_str
      end

      def Special._load(str)
        Special.new(str)
      end

      def to_s
        "#{@valuable} and #{@volatile}"
      end
    end

    a = Special.new("Hello, World")
    data = Marshal.dump(a)
    obj = Marshal.load(data)
    puts obj

produces:

Hello, World and Goodbye

For more details, see the reference section on Marshal.

Distributed Ruby

Since we can serialize an object or a set of objects into a form suitable for out-of-process storage, we can use this capability for the transmission of objects from one process to another. Couple this capability with the power of networking, and voilà: you have a distributed object system. To save you the trouble of having to write the code, we suggest downloading Masatoshi Seki's Distributed Ruby library (drb) from the RAA.

Using drb, a Ruby process may act as a server, as a client, or as both. A drb server acts as a source of objects, while a client is a user of those objects. To the client, it appears that the objects are local, but in reality the code is still being executed remotely.

A server starts a service by associating an object with a given port. Threads are created internally to handle incoming requests on that port, so remember to join the drb thread before exiting your program.

    require 'drb'

    class TestServer
      def doit
        "Hello, Distributed World"
      end
    end

    aServerObject = TestServer.new
    DRb.start_service('druby://localhost:9000', aServerObject)
    DRb.thread.join # Don't exit just yet!

A simple drb client simply creates a local drb object and associates it with the object on the remote server; the local object is a proxy.

    require 'drb'
    DRb.start_service()
    obj = DRbObject.new(nil, 'druby://localhost:9000')
    # Now use obj
    p obj.doit

The client connects to the server and calls the method doit, which returns a string that the client prints out:

"Hello, Distributed World"

The initial nil argument to DRbObject indicates that we want to attach to a new distributed object. We could also use an existing object.

Ho hum, you say. This sounds like Java's RMI, or CORBA, or whatever. Yes, it is a functional distributed object mechanism—but it is written in just 200 lines of Ruby code. No C, nothing fancy, just plain old Ruby code. Of course, there's no naming service or trader service, or anything like you'd see in CORBA, but it is simple and reasonably fast. On the 233MHz test system, this sample code runs at about 50 remote message calls per second.

And, if you like the look of Sun's JavaSpaces, the basis of their JINI architecture, you'll be interested to know that drb is distributed with a short module that does the same kind of thing. JavaSpaces is based on a technology called Linda. To prove that its Japanese author has a sense of humor, Ruby's version of Linda is known as “rinda.”

Compile Time? Runtime? Anytime!

The important thing to remember about Ruby is that there isn't a big difference between “compile time” and “runtime.” It's all the same. You can add code to a running process. You can redefine methods on the fly, change their scope from public to private, and so on. You can even alter basic types, such as Class and Object.
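Here is a small sketch of that flexibility. The Greeter class is invented for the illustration; the point is that an object created before the changes sees them immediately.

```ruby
# Greeter is a hypothetical class used only for this illustration.
class Greeter
  def hello
    "hello"
  end
end

g = Greeter.new
before = g.hello

# Reopen the class while the program is running and replace the method.
class Greeter
  def hello
    "hello, world"
  end
end
after = g.hello   # the existing instance picks up the new definition

# Change the method's visibility from public to private on the fly.
Greeter.send(:private, :hello)
still_public = g.respond_to?(:hello)

p before, after, still_public
```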

Once you get used to this flexibility, it is hard to go back to a static language such as C++, or even to a half-static language such as Java.

But then, why would you want to?
