
Subversion Revision Control with Squeak

Robert Edmonds
Spring 2006

Revision control with a typical source code management tool like CVS or Subversion is difficult with Squeak for several reasons:

1) Line endings are those used on classic Mac OS (a lone carriage return). When viewed on a Windows or Unix machine, Squeak's file-outs appear to have no line breaks (i.e., everything is on one line). Revision control systems are line-based, so they treat the whole file as a single line and cannot produce useful diffs or merges.

2) Everything is an object in Squeak, so files are only tolerated by the system in limited situations: file-ins and file-outs.

Subversion is a world-class open source SCM tool, so one might be tempted to use it to manage the code output of a four-person team. To do so, we have to overcome the line-ending problem and establish a good mapping between files and classes.
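The line-ending problem can be illustrated with a minimal sketch. It is written for Python 3 (the scripts further down this page are Python 2), and the function names to_unix and to_squeak are made up for this illustration; they are not part of our tree.

```python
def to_unix(text):
    """Convert CRLF or lone CR line endings to LF for line-based tools."""
    return text.replace('\r\n', '\n').replace('\r', '\n')


def to_squeak(text):
    """Convert LF line endings back to the CR endings Squeak file-ins use."""
    return text.replace('\n', '\r')


# A made-up fragment of a file-out, using CR line endings.
fileout = "Object subclass: #Camera\r\tinstanceVariableNames: ''\r"

# Splitting on LF finds no breaks at all in the raw file-out...
print(len(fileout.split('\n')))
# ...but finds them once the endings are converted.
print(len(to_unix(fileout).split('\n')))
```

Converting to LF on the way into the repository and back to CR on the way out keeps both Subversion and Squeak happy, as long as the conversion round-trips cleanly.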

File-ins have an additional problem in that parent classes must occur earlier in the file-in than child classes or else the class hierarchy will be lost during the import. We solve this problem with a manifest file which lists the order in which files representing our classes will be concatenated together for the generated file-in.

With the technical problems solved, we have to come up with a good workflow. Our team found that working on individual classes was the right level of granularity – once we reached a stable state in our changes of a class, we filed it out (using a handy option on the middle click menu in the system browser) and executed a script that imports newly created file-ins into the revision-controlled source code tree. When we want to check out the latest version and test the code at the tip of trunk (the very latest revision of the checked-in source), we execute another script that reads the manifest file and generates a single file-in for all of our classes.

Here is the manifest file for our Squeakmation package (we also had a Squeakmation-Test package):

AnimationMagic
AnimationMagicMorph
Camera
Event
EventSequence
ParseMagic
Squeakmation
SqueakmationMorph
Visual
CharacterMagic
VisualMorph
PersonMorph
BlankFaceMorph
CharacterMagicMorph
AttributesInputMorph
ScriptData
SpeakerMagic
SoundMagic


We typically added only a few classes per week while working on the project, so it's not an unreasonable amount of overhead to add new classes to the manifest file in the right order (parent classes before child classes). Another way to solve this problem would be to add logic in our export script to generate a dependency graph automatically by reading the class headers.
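A rough sketch of that automatic approach follows. It is not something we built: it is written for Python 3, the names parent_of and manifest_order are hypothetical, it only recognises the plain "subclass:" form of a class header, and it expects LF line endings (the form our files have inside the repository). The classes in the sample are invented for illustration.

```python
import re

# Matches the first line of a class definition in a file-out, e.g.
#   Morph subclass: #ExampleMorph
# Assumption: other forms (variableSubclass: and friends) are ignored.
HEADER = re.compile(r'^(\w+) subclass: #(\w+)', re.MULTILINE)


def parent_of(source):
    """Return the superclass name from a file-out's class header."""
    match = HEADER.search(source)
    if match is None:
        raise ValueError('no class definition found')
    return match.group(1)


def manifest_order(sources):
    """Order class names so every superclass precedes its subclasses.

    `sources` maps class name -> file-out text. Superclasses that are not
    in the mapping (Object, Morph, ...) are assumed to be in the image.
    """
    parents = {name: parent_of(text) for name, text in sources.items()}
    ordered, seen = [], set()

    def visit(name):
        if name in seen or name not in parents:
            return
        seen.add(name)
        visit(parents[name])  # emit the superclass first
        ordered.append(name)

    for name in sorted(parents):
        visit(name)
    return ordered


# Invented sample hierarchy: ExampleChild inherits from ExampleParent.
sample = {
    'ExampleChild': "ExampleParent subclass: #ExampleChild\n",
    'ExampleParent': "Morph subclass: #ExampleParent\n",
    'ExampleOther': "Object subclass: #ExampleOther\n",
}
print(manifest_order(sample))
```

The output of manifest_order could be written out one name per line to replace the hand-maintained manifest, though classes that depend on each other in ways other than inheritance would still need manual attention.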

We used a Makefile to control the flow of code between our Squeak systems and our "working copy" of the source tree. (A working copy is the individual programmer's personal copy of the code; it may not be the latest revision and it may have uncommitted changes.) Our 'fileout' target looks like this:

fileout:
        @echo "=> fileout"
        @lib/import.py local/*.st


'local' is the directory we stored our Squeak images in (which were under individual control of the programmer, not the SCM) and is thus the default directory for our file-outs. A few clicks in the system browser create a new file-out for a class, and we then execute this make target to introduce the new code into our repository.

Here is our filein target:

filein:
        @echo "=> filein"
        @for pkg in $(PACKAGES); do \
        echo "   src/$$pkg.manifest -> local/FILEIN-$$pkg.st"; \
                lib/export.py src/$$pkg.manifest local/FILEIN-$$pkg.st; \
        done


Our PACKAGES were Squeakmation and Squeakmation-Test, so the manifest files for these two packages are passed to our export script, which then creates two files in our 'local' directory named FILEIN-Squeakmation.st and FILEIN-Squeakmation-Test.st. A few clicks in the Squeak environment are then necessary to bring one's image up to date. Note that we had around 200 revisions in our final source tree, which represents only a few dozen of these import/export cycles per team member per week, so the process is not as onerous as I make it out to be here.

The enterprising reader will note that these Makefile targets just call scripts in our source tree's 'lib' directory. Our utility scripts were written in Python, which is a decent system scripting language – exactly what we needed to do a little filtering and composition of text files. Here is the import.py script. Comments are inline.

import os
import sys

# this function will find the category of a file-out so that the source can be placed in the correct directory.
def findcat(buf):
    cat = None
    for line in buf.splitlines():
        if line.startswith('\tcategory: '):
            cat = line
    if cat != None:
        left = cat.find("'") + 1
        right = cat.rfind("'")
        return cat[left:right]
    else:
        raise Exception, 'unable to determine class category'

# this function is run on each file-out.
def import_class(classfile):
    # this extracts the class name from the pathname of the file-out (e.g. local/Camera.st -> Camera).
    classname = classfile.split('/')[-1].split('.')[0]

    # we reserve the 'FILEIN-' prefix for our own generated file-ins, so we don't want to re-import these files back into our repository.
    if classname.startswith('FILEIN'):
        return

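    # this group of code reads the contents of the file-out into a buffer and converts the line endings (CRLF or CR) to LF, a format that is easily parsed by our source code management tools.
    buf = open(classfile).read()
    buf = buf.replace('\r\n', '\n')
    buf = buf.replace('\r', '\n')
    # drop the first line of the file-out (presumably Squeak's header line, which carries a timestamp and would otherwise show up as a change on every file-out).
    buf = '\n'.join(buf.splitlines()[1:])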

    # now we determine the category and select a directory to place the file in.
    cat = findcat(buf)
    srcdir = 'src/' + cat

    # here we write out the contents of the buffer to the new file.
    fname = '%s/%s.st' % (srcdir, classname)
    print 'importing', fname
    open(fname, 'w').write(buf)

    # and now we remove the original file-out since we've successfully imported the class into our repository.
    print 'removing', classfile
    os.unlink(classfile)

# our main code -- this just calls the import_class() function on each filename specified on the command line.
if __name__ == '__main__':
    if len(sys.argv) >= 2:
        for c in sys.argv[1:]:
            import_class(c)
    else:
        print >>sys.stderr, 'usage: import.py class.st [class2.st] [...] [classn.st]'
        sys.exit(1)
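To make the category lookup concrete, here is a small stand-alone sketch of the same logic as findcat, rewritten for Python 3 and run against a made-up class header. The name find_category and the instance variable in the sample are invented for this example.

```python
def find_category(source):
    """Return the category named on the last tab-indented category: line."""
    category_line = None
    for line in source.splitlines():
        if line.startswith('\tcategory: '):
            category_line = line
    if category_line is None:
        raise ValueError('unable to determine class category')
    # The category is the text between the first and last single quote.
    left = category_line.find("'") + 1
    right = category_line.rfind("'")
    return category_line[left:right]


# A made-up class header in the shape Squeak writes to a file-out.
sample = (
    "Object subclass: #Camera\n"
    "\tinstanceVariableNames: 'position'\n"
    "\tclassVariableNames: ''\n"
    "\tpoolDictionaries: ''\n"
    "\tcategory: 'Squeakmation'!\n"
)
print(find_category(sample))  # prints Squeakmation
```

With this header, import.py would place the class in src/Squeakmation/Camera.st. Note that the target directory has to exist already; the script does not create it.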


And here we have the export.py script:

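import os
import sys

# the function that does all the work
def export_class(manifest, filein):
    # we assume the classes are stored in a directory named after the manifest file, alongside it in the tree (e.g. src/Squeakmation.manifest -> src/Squeakmation/). os.path.splitext only strips the final extension, so a dot elsewhere in the path does no harm.
    srcdir = os.path.splitext(manifest)[0]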

    # we create an empty list called 'classes'
    classes = []

    # each line in the manifest file represents a different class and thus a different file, so we iterate over each item in the manifest file
    for f in open(manifest).read().splitlines():
        # here we read an individual class into a buffer
        buf = open('%s/%s.st' % (srcdir, f)).read()

        # now we append that buffer into the classes list, while at the same time converting the newlines to the format squeak expects
        classes.append(buf.replace('\n', '\r'))

    # now we write a single file-in
    out = open(filein, 'w')
    # the '\x0c'.join(classes) call will take each item in the classes list and interleave it with hex 0c characters (ASCII form-feed) which is how the squeak file-in/file-out format separates individual classes.
    out.write('\x0c'.join(classes))
    out.close()

# our main code -- first argument is the manifest file, second argument is the filein to generate.
if __name__ == '__main__':
    if len(sys.argv) == 3:
        export_class(sys.argv[1], sys.argv[2])
    else:
        print >>sys.stderr, 'usage: export.py [manifest] [filein.st]'
        sys.exit(1)
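The joining step can be tried in isolation with the following Python 3 sketch, which works on strings rather than files. The names build_filein and split_filein are hypothetical, and the inverse function assumes that no class source contains a form-feed character of its own.

```python
FORM_FEED = '\x0c'


def build_filein(class_sources):
    """Join LF-terminated class sources into one CR-terminated file-in."""
    return FORM_FEED.join(src.replace('\n', '\r') for src in class_sources)


def split_filein(filein):
    """Undo build_filein, assuming no source contains a form feed."""
    return [chunk.replace('\r', '\n') for chunk in filein.split(FORM_FEED)]


# Two made-up class sources, listed in manifest order (parent first).
sources = ["parent line 1\nparent line 2\n", "child line 1\n"]
filein = build_filein(sources)
print(repr(filein))
```

Because the list is joined in the order given, the manifest order carries straight through to the generated file-in.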


This is all that's needed to manage Squeak source code in much the same way most people manage Java source code. We make no judgment about the merits of Squeak's 'everything is an object' paradigm versus Java's 'everything is an object which is defined in a file', except to say that we found our system to be highly reliable once we'd worked out the bugs, and that the greater flexibility Subversion allowed us made difficult things, like resolving merge conflicts between two developers, quite easy.
