
Smalltalk Metaprogramming

Note: A much shinier, more printer-friendly version of this sits at http://www.prism.gatech.edu/~gtg220x/hotdraw_description.pdf.

Introduction

Most newer languages offer some measure of metaprogramming ability. Lisp is based entirely on it(1). Java offers reflection. C++ itself offers essentially no runtime reflection, but the Qt(2) toolkit layers some metaprogramming abilities on top with the QObject framework. Dynamic languages, however, are particularly suited to the world of metaprogramming, because there are no distractions like static type declarations involved. Smalltalk proves this. The Ruby community takes it a step further and argues that "...there's no natural separation between programming and metaprogramming..."(3).

It is in this context that I will discuss some basic Smalltalk metaprogramming that can be used to simplify code and enforce the principle of DRY (Don't Repeat Yourself).

Smalltalk Things People Miss

Every semester people learn Smalltalk, and, as is the case with any new
language, miss some of its most distinguishing features. So, here are a couple
that are critically important to understanding Smalltalk and its world:

Everything is an object.
More often than not, people don't quite realize what this means. Everything means everything, from actual objects you use to the classes that represent them to the messages inside them to the namespaces that contain them. Everything is an object. We'll see a little bit more of what that means in the Introspecting Namespaces section.
You send messages to objects, you do not call methods on objects.
Why is this important? It's a viewpoint and paradigm difference. Objects are not merely programmatic constructs in your application, they are actors in your application's universe. These actors may run into a message they don't understand, but even then you can know that a message was sent, much like when you tell a person to do something they don't understand and they have to think about it before they tell you they don't understand it. More on this in the section on respondsTo and doesNotUnderstand.
Every class is open to modification.
What significance does this have? Why do I call it critically important? Because it can be critically useful. Every time you type 'hello world', you create a new instance of a String object. Because the String class is open to modification, you can add messages to it that you can use later. For example, need something that'll let you repeat a string a certain number of times? Add a * message to the String class. We'll go into this in more depth in the Open Classes section.
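
To make the first point a little more concrete, here is a small workspace sketch. It is not taken from the project; it only uses messages that every object understands (class and respondsTo:), and the results in the comments are what a standard Smalltalk image should print.

```smalltalk
"Evaluate each line with 'Print it' in a workspace."
3 class.                      "SmallInteger: even a literal number is an object"
3 class class.                "SmallInteger class: the class is itself an object"
3 class class class.          "Metaclass: and so is the class of that class"
(3 respondsTo: #printString). "true: any object can be asked what it understands"
```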


Introspecting Namespaces

Namespaces are programmatic and structural groupings of classes that are
available at runtime. So this is how we separate classes and such. But, as
containers, they can also offer us a whole new way to simplify our code.

This is best seen as an example. In our most recent project, we had to put
together a graph manipulation program (graph in the sense of nodes and edges,
not in the sense of a plot). Part of the project was the presence of algorithms
that could run on the graphs (or specific nodes and edges therein). And part of
the extra credit was the possibility of adding algorithms.

Now, initially, we made a system whereby we would have an
AlgorithmManager class (we called it AlgorithmFactory, as it was
originally to follow the factory pattern, but things changed). This class's
purpose was to know about all the algorithms and provide an intermediate layer
through which the algorithms could be run. Why an intermediate layer? The
thinking was that at some point in the future this would offer the opportunity
of using algorithm pools much like connection pools for databases, where
multiple instances of the algorithms could be used to run on multiple graphs or
nodes or what have you. Eventually we realized our algorithms were all
stateless, and thus one instance could service multiple requests without too
much trouble. Regardless, that was the design.

Within this scope, our initial plan was to have an addAlgorithm message that
would take a name (for references to the algorithm to run) and a class (to
instantiate the class when appropriate). So one would send
AlgorithmManager addAlgorithm: #BFT withClass: BreadthFirstTraversal,
for example. Then, one could run it by sending AlgorithmManager run: #BFT
onNodes: nodes andEdges: edges. This all made sense, until the
realization hit that all of our algorithms were in their own namespace, and that
we weren't coding in C++, we were coding in Smalltalk. Thus, there was a simple
path to avoid the repetition of adding all of the algorithms at startup every
time: figure out the classes in the Algorithms namespace!

So let's have a look at the code we used:

algorithms = nil ifTrue:
	[ algorithms := Dictionary new.
	(Algorithms organization listAtCategoryNamed: #Algorithms) do: [ :algoName |
		algorithms at: algoName put: ((Algorithms at: algoName) new). ] ].


The first thing we see is that no initialization takes place if the
algorithms instance variable is not nil. If it is, then we put a
Dictionary in there and we add things to it. This is where things get
interesting. In this context, Algorithms refers to the namespace of
that name. We send it the organization message, which answers a
NameSpaceOrganizer object. These organizers have various messages, but the one
we're interested in is the listAtCategoryNamed: categoryName message.

This message relies on something you may or may not have noticed in the class
declaration:

Smalltalk.Algorithms defineClass: #Connectivity
	superclass: #{Core.Object}
	indexedType: #none
	private: false
	instanceVariableNames: ''
	classInstanceVariableNames: ''
	imports: ''
	category: 'Algorithms'

See that last line? That's where we define the "category" for the class. This
is particularly convenient in our case, since the Algorithms namespace
contains the algorithm classes as well as the AlgorithmManager itself. Thus, by
putting all the algorithm classes in a separate category, we eliminate the need
for additional filtering once we get our list.

The last bit of the code does another bit of magic. The list we get back is
a list of class names (these are always symbols). To get the actual Class
object, we can send the Algorithms namespace object the at:
message with the name of the class we want as a parameter. Once we
have that object, we send it the new message to get an instance of it.

respondsTo and Delegation with doesNotUnderstand

In Java and other statically typed languages, we often find ourselves asking
what type of object we have on our hands. It's not entirely uncommon
(though it is frowned upon) to see code that does:

if ( m... ...eof AwesomeClass )
	// do something
else
	// do something else

The question in a statically typed language is, in short, "what kind of
object am I dealing with?" In Smalltalk, this is not the case. In Smalltalk, the
question that is asked is "does this object respond to this message?" This is
a critically important distinction, and understanding it is key to understanding
Smalltalk's paradigm and its idioms.

It is in this context that we introduce the wonderful message
Object#respondsTo: aSymbol. This message answers true if the given
object understands a message whose name matches the symbol. So if you create a
new class that has a message called size, evaluating object respondsTo: #size,
where object is an instance of that class, will answer true. So now that we
have that basic understanding, we can talk about delegating and the
doesNotUnderstand: message.
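
As a quick illustration of asking "does this object respond to this message?" rather than "what class is this?", here is a small workspace sketch. The collection contents and the choice of asUppercase are just examples picked for this illustration; they are not from the project.

```smalltalk
| candidates |
candidates := Array with: 'hello' with: 42.

"Send asUppercase only to objects that understand it;
 fall back to the printString for everything else."
candidates do: [ :each |
	(each respondsTo: #asUppercase)
		ifTrue: [ Transcript show: each asUppercase; cr ]
		ifFalse: [ Transcript show: each printString; cr ] ].
```

The string understands asUppercase and the integer does not, so the two elements take different branches without the code ever mentioning their classes.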

doesNotUnderstand is the message that gets sent when an object realizes
that it doesn't understand a message (i.e., that it has no message by the given
name). It is a message of class Object, and as such can be overridden in any
child class. The default behavior, in class Object, is to raise an exception
notifying the user that a message was sent that the object didn't understand.
This is useful, but we can tap into the power of this message in other ways,
too. The primary one we'll discuss here is delegation.

Let's go back to the aforementioned Graph program. In our case, we used a
package called HotDraw (which is described in more detail later) to handle the
drawing aspects of our nodes and edges. Here, we ran into an issue: we needed a
class that kept track of both node/edge information and of a graphical
object's information. The typical approach to this is subclassing, but Smalltalk
(at least VisualWorks' implementation) does not support multiple inheritance.
Thus, we needed another approach.

Now, we could also have solved this by creating a subclass of the graphical
objects that also kept track of the actual Node or Edge objects that they were
associated with. However, we wanted to support multiple shapes, and that would
require multiple classes whose extensions to the base HotDraw classes would all
be fairly similar. Plus, when a new HotDraw shape was added, tapping into it
would require writing yet another subclass, and creating yet another replacement
for a HotDraw tool to draw the appropriate figure with the right setup. This
seemed fundamentally nasty, and a horrid violation of the DRY principle.

Thus, we come upon another solution: delegation. We created a single class,
GraphFigure, which set up the basic structure for nodes and edges,
including some delegation code. We'll look into this in a second. Subclassed
from this were NodeFigure and EdgeFigure, each of which held a
reference to a node or an edge, respectively. So what exactly is this delegation
thing?

The idea behind delegation is that the GraphFigure holds a reference to a
regular HotDraw Figure, and then, whatever messages it doesn't
understand, it passes them on to that object. There's a second phase to this,
however. Every node and edge in our program had to support being associated with
an arbitrary other object. Thus, we doubled the delegation up. If that
associated object knew about the message the GraphFigure didn't understand, then
it got the message. If it didn't, then the Figure got the message. If neither
understood it, then we raised an exception as usual. What does this result in?
Say the associated object was an image that we wanted to have associated with
our node. When HotDraw asked the GraphFigure to draw itself, it would ask the
associated object to draw itself, and we'd get the image.

Let's look, then, at the implementation of delegation (or "proxying", as we
labeled it). At its core, it simply consists of using doesNotUnderstand
to capture the moments when we miss a message. Here's the body of the
doesNotUnderstand message for GraphFigure:

doesN...
...Message arguments) ].

^ super doesNotUnderstand: aMessage.

Here, we see a combination of Smalltalk metaprogramming magic and some saucy
respondsTo goodness. First, we check if the object (answered by the object
message on the current object) responds to the selector of the message we didn't
understand. If it does, then we perform that message with the specified
arguments on the object and answer the result. If that doesn't work, then we try
it with the figure. If that doesn't work, either, then we send the superclass
the doesNotUnderstand message. Our superclass being Object, this will result in
the classical behavior of having an exception raised.
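
The two-stage lookup described above might look roughly like the following. This is a sketch under assumptions, not the project's actual method: it presumes GraphFigure already defines accessor messages named object and figure (if either accessor were missing, sending it would itself end up in doesNotUnderstand: and recurse).

```smalltalk
doesNotUnderstand: aMessage
	"Sketch: offer the message to the associated object first,
	 then to the wrapped HotDraw figure, and only then give up.
	 Assumes the accessors #object and #figure exist on this class."
	(self object respondsTo: aMessage selector)
		ifTrue: [ ^ self object
					perform: aMessage selector
					withArguments: aMessage arguments ].
	(self figure respondsTo: aMessage selector)
		ifTrue: [ ^ self figure
					perform: aMessage selector
					withArguments: aMessage arguments ].
	"Nobody understood it: fall back to the default behavior in Object."
	^ super doesNotUnderstand: aMessage
```

The order of the two checks is what gives the associated object priority over the figure.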

We can still have our class behave as a "subclass" of sorts by simply
implementing our own message, which will be responded to instead of the
figure's. In particular, we use this in the case of returning HotDraw handles,
which are drawn to allow the user to manipulate the figure - be it by resizing,
moving, or drawing edges between them. Here's the handles message:

handl...
...ndles do: [ :handle | handle owner: self ].

^ handles.

Notice how we send super handles - this isn't an actual message in the
superclass (which is Object); however, it triggers a send of
doesNotUnderstand. Thanks to polymorphism, it's still
GraphFigure's doesNotUnderstand that handles this message, so
we end up getting a proxied send to the figure's handles message. This
answers the handles for the figure, which we then manipulate to ensure that they
report that their owner is this object (i.e., the proxy object) instead of the
figure itself, so that we can get important messages like delete and
connectFromPoint:at:.

Proxying and delegation can be very powerful tools to use, and they're made
particularly easy in Smalltalk. The above approach of proxying every message
send is perhaps a little too open, and a delegation mechanism can be conceived
whereby you specify that you only want certain messages proxied without too much
difficulty. Regardless, it's definitely a tool to have in your toolkit.
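
One way such a restricted mechanism could be sketched is shown below. It is an illustration only: proxiedSelectors is a hypothetical message, assumed to answer a collection of the Symbols you are willing to forward, and figure is assumed to be an accessor for the wrapped object.

```smalltalk
doesNotUnderstand: aMessage
	"Sketch: forward only explicitly listed selectors to the figure.
	 #proxiedSelectors and #figure are assumed accessors, e.g.
	 proxiedSelectors might answer #(#handles #bounds)."
	((self proxiedSelectors includes: aMessage selector)
		and: [ self figure respondsTo: aMessage selector ])
			ifTrue: [ ^ self figure
						perform: aMessage selector
						withArguments: aMessage arguments ].
	"Anything not on the list is treated as a genuine error."
	^ super doesNotUnderstand: aMessage
```

The trade-off is that every newly needed message has to be added to the list by hand, in exchange for mistyped selectors failing loudly instead of being silently forwarded.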

Open Classes

Open classes are the last of the Smalltalk features we'll cover here. There's
something to be said for closed classes, of course. Open classes allow other
programmers to break your encapsulation. They give you plenty enough rope to
hang yourself with, if you aren't careful. But they also give you great power.
Perhaps one of the greatest examples is Ruby on Rails, a web application
framework that extends (as an example) the Fixnum class (used for numbers) with
enough useful methods to allow this construction:

5.days.from_now

This chain of method calls returns a Time object representing the time 5 days
from right now.

So let's talk about how Smalltalk does it. In VisualWorks, you can browse to a
class and just modify the message in place without any trouble. A far wiser
course of action, however, is to find the class, right-click on it, and select
"Override" → "In Package..." (or "In Parcel..."). This
allows you to override anything in the class, extend it, etc, in your own
package. Then, whenever you unload your package, your modifications go with it,
and whenever you load your package, they also accompany it.

Once you've done this, you can extend the classes to your heart's content.
Frustrated that there's no equivalent in OrderedCollection to
JavaScript's ['hello','goodbye'].join(' ') => 'hello goodbye'
functionality? Add one:


joinWith: aString
	"Answers a String containing the string representations
	of the elements, joined by the specified String."
	| joinStream |
	joinStream := WriteStream on: String new.

	"Emit the separator only between elements, so an empty
	collection simply answers an empty String."
	self
		do: [ :elt | joinStream nextPutAll: elt asString ]
		separatedBy: [ joinStream nextPutAll: aString ].

	^ joinStream contents.


Want to easily repeat Strings as if they had a multiplier operator like they
do in Ruby? Add one:

* anInteger
	"Answers this string repeated the specified number of times."
	| wholeStr |
	wholeStr := ''.

	anInteger timesRepeat: [ wholeStr := wholeStr, self ].

	^ wholeStr.

And the related unit test:

testMultiplier
	"Tests the String#* message."
	| result |
	result := 'hello' * 5.

	self should: [ result = 'hellohellohellohellohello' ].


The possibilities are nearly limitless, and open classes can be quite a boon to
building up a Domain Specific Language that can help you express your
application's actions in a cleaner, more natural way. Being more familiar with
the Ruby community, I advise you to look at some of the things they have achieved
as they constantly explore how to create DSLs in Ruby.

It also bears mentioning that such extension is an easy path to shooting
yourself in the foot. Be very wary of overwriting messages in the core Smalltalk
libraries, as parts of the development environment itself may rely on the
behavior of those messages, and may break thereafter. If that happens, you're
pretty much out of luck.

Closing

I was going to include a bit about HotDraw, but alas ran out of time. Maybe I'll
add it in at some later time.

Footnotes

(1) Lisp, of course, is one of the oldest languages, but bear with me.

(2) The Qt toolkit being a GUI toolkit for C++; see http://www.trolltech.com/.

(3) See http://dablog.rubypal.com/2007/1/7/meta-shmeta-learning-ruby-horizontally for more on this argument.
