The Design and Implementation of FCL

FCL is an extension of the Tcl language that provides persistence for simple values, transactions, and a rudimentary object system. This document describes the high level design of the system along with some implementation details where appropriate. For a description of the user visible aspects of the system, see the accompanying user manual, FCL: The FAM Command Language.

FCL is implemented in three layers. The lowest level layer is a C interface between the persistent storage manager and Tcl. The second layer is some Tcl code that implements a simple object system. The final layer is a set of abstract classes for design objects, which is implemented in the object system. This document will describe the lower two layers, and will also show where the layers interact. The toplevel layer is described in the user's manual. The following diagram shows the major components of the FCL system.

Figure 1. Architectural Diagram for FCL.

2 Preliminaries

FCL is designed to use several pre-existing components that provide support for persistent objects, an extensible, interpreted command language, and a user interface toolkit. In the current implementation, we use the Exodus system for persistence and the Tcl language for the interpreter and the Tk toolkit for user interfaces. To make our future discussion more concrete, I'll spend some time here talking about each of our system's pre-existing components.

2.1 Exodus

The Exodus system is split into a client part and a server part. The Exodus server is a separate process that provides disk management, transactions, recovery and distributed data access for clients. Exodus clients are built as extensions on top of a runtime library that contains stub routines that call into the server via RPC. FCL is built on top of this library.

In Exodus, all data are stored in volumes, which are managed by one or more servers. A client must use the client library to mount volumes before they can access any data. Mounting a volume will also result in opening a connection to the server that is managing that volume. All operations on data in a volume must be performed within the scope of a transaction. Transactions encapsulate a set operations on a volume which must either succeed or fail atomically. If a transaction succeeds, then the results of the operations in the transaction become visible on the volume. Otherwise, the volume is left in the state that it was in before the transaction began. Transactions also manage concurrent access to the database through a set of locking protocols that make concurrent updates appear to be happening in some consistent, sequential order.(1)

Objects in an Exodus volume are logically clustered into files, which also provide efficient operations for scanning and bulk loading the objects that they contain. Within a file, the library provides primitives for creating, destroying and modifying data objects on the volume. EAch objects comes with its own OID, identifying the volume, file, and file address of the object on disk. FCL provides mechanisms for dealing with OIDs in a fairly transparent, though right now very inefficient, manor. Finally, the library provides indexes that allow applications to build efficient data structures for searching and scanning related objects in a volume.

FCL provides an explicit interface to transactions and objects only. There is a lower level interface to volumes and files that is not seen at the user level. In addition, and interface to indexes is implemented but it is not used in the current system.

2.2 Tcl/Tk

Tcl/Tk is Outserhout's embeddable command language and user interface toolkit system. Tcl provides a rudimentary, string-based command language that is easy to embed into existing applications. Tk is an X-based user interface toolkit providing a large set of widgets that implement a Motif-like "look-and-feel." Together, Tcl and Tk are a fairly easy to use environment for application prototyping, which is why the first cut of the FCL system was implemented using them. For real work, a more robust programming language is needed, along with a higher level toolkit.(2)

2.3 Other things we might steal

Other subsystems that may get integrated eventually include FDR for model checking, the Synthesizer Generator for custom, language based editors, Lucid Emacs for everything else :-), and ToolTalk for event-based tool integration.

3 Exodus Basics

This section describes the parts of the Exodus client library that the are in use in the current system. First, some terminolgy. We will call Exodus objects stored on disk data objects to distinguish them from FCL objects, which have higher level semantics. Data objects are just byte-ranges managed by the storage server. The Exodus library calls that the FCL runtime library uses come in two flavors, those used to manipulate objects, those used for manipulating tables, and those used for other things.

3.1 Object Manipulation

Each FCL object is stored as a string in a data object and is made available through the handle mechanism described above. Internally, the runtime library uses the following structure to represent a reference to a data object:

struct fcl_RefStruct {
 OID myOID;
 USERDESC *myHandle;
};

The myOID field of this record stores the unique object identifier used by Exodus to locate the object in the database. The myHandle field of this record stores a pointer into the client libraries buffer pool. Whenever an object is fetched from disk, the client library fixes the page or pages that the object occupies in the buffer pool and returns a handle to the FCL runtime system. We can then use this handle to manipulate the object in memory. The runtime library uses the following calls for these manipulations:

sm_ReadObject

Given an OID and other information, this fixes a data object in the buffer pool for later use. By default, it also obtains a shared-read lock on the page that the object occupies. If such a lock cannot be obtained, the client library blocks until the object is available.

sm_WriteObject

This call overwrites some portion of a data object with new data and logs the update with the server as part of a transaction. By defualt, it will automatically upgrade any read locks on the object to a write lock. If this lock upgrade can't happen, the client librart blocks. In some cases, the lock upgrade may cause the deadlock detection system in the server to abort the current transaction.

Note that this call cannot change the size of an object. For that we need to use other mechanisms.

sm_AppendToObject

This appends new data on the end of a data object. It is used by the FCL runtime when an object update makes the string representation of the object larger.

sm_DeleteFromObject

This procedure deletesa range of bytes from a data object. When an update causes the string representation of an object to shrink, the FCL runtime uses this call to shrink the data object.

sm_CreateObject

Creates space for a new dat a object on disk, returning its OID.

sm_ReleaseObject

This call tells the client that a particular data object can be flushed from the buffer pool if needed. Since the size of the buffer pool is fixed and limited, the current FCL implementation sacrifices performance for safety by always releasing object handles after every primitive operation. This also guarantees that aliasing of object handles will not cause problems.

3.2 Tables

There is an interface to the Exodus index mechanism, but it isn't currenty in use, so I won't talk about it there. Just ignore that code.

3.3 Other Calls

The other calls that we make into the client library are there for initialization, tranasction management and namspace management. The current runtime library allows the client to open one file, in one server with one buffer group. The client library restricts the client to running a single, flat transaction at a time. The files that are available for use are stored in a small table associated with the volume called the root table. The Exodus server maintains this table.

The higher levels of the FAM library use this primitive namespace to maintain many "logical" design files within a single "physical" Exodus file. This mechanism is described in the Users Manual.

sm_ReadConfigFile

This tells the client library to read the .sm_config file and configure itself appropriately.

sm_ParseCommandLine

Parses the command line for anything related to Exodus.

sm_Initialize

Initializes the client library. Doesn't open a server connection until the server is needed for something.

sm_OpenBufferGroup

Initializes a local buffer cache for our client library. Currently, we open a cache with 100 pages in it.

sm_GetRootEntry

Each Exodus volume has a table of 80 {name, data} pairs in it that clients can use for a rudimentary namespace. This fetches an entry from this table.

sm_SetRootEntry

Create or update an entry in the root table of the current volume.

sm_CreateFile

Create a file in the current volume. This is used by the persist init command to create a physical file if needed.

sm_{Begin,Commit,Abort}Transaction

Opens, commits or aborts the current transaction.

4 The C Layer

The C layer of the FCL runtime implements a Tcl veneer on top of the basic object manipulation calls in the Exodus library. For the most part, the implementation of this layer should be transparent to the higher levels of the FCL implementation. I have tried to keep the interface between Tcl and the C layer as "thin" as possible so that it is easy to implement replacements for it. We have already implemented a version of this layer that keeps all of its object in memory and does not talk to Exodus at all. This is useful for local debugging, and working when the server is broken.

Exodus objects are referenced in FCL through "handles", which are based on the OID of the object. For the sake of clarity, throughout this document, we will call the physical OID an object reference and its user-level form an object handle.

Object handles are actually Tcl commands. The name of each command at the interpreter level depends on the OID of the object in the database. Each commands calls the same dispatch routine which is written in C. The dispatch routine handles a couple of low level built-in methods and passes control off to the main method dispatcher which is written in Tcl. Note that this structure implies that before an object can be used to call methods, its handle must be registered as a command with the interpreter.

The C layer defines the following new commands:

persist <command>

Low level operations in the Exodus client The following sub-commands are available here:

init <filename>- Initializes the database. In addition, it opens the Exodus file called filename if it exists. The available files are stored in the Exodus volume's root table. See the Exodus documentation for how this works.

deref - "Dereferences" object handles. Object handles are encoded OIDs. This command decodes the OID and registers a command for the object so it will be possible to use the handle to dispatch methods.

getroot - Returns the first object in the currently open file.

new <value> - Create a new object the current file on the current volume.

table - Create a new index file on the current volume. Unused.

open, commit, abort - low level transaction control. Unused.

Note: The FCL runtime allows for one open volume and one open file. The user-level initialization of the runtime system manages this automatically, so users should never use the persist command at all.

trans, abort

Transaction control commands. trans opens a transaction block. abort explicitly aborts transactions.

<objectHandle> <method> <args>

Dispatch a method call on the object named by the command. See the user's manual for how this works.

<tableHandle> <command>

Commands for manipulating index files. These are not used. The dispatch works just like the dispatch for objects described above.

4.1 Basic Data Structures

The C layers uses two primitive data structures, one for representing object references and one for representing object values. An object reference is used only at runtime and is never stored to disk, although a machine-readable encoding might be. It is just a record holding the Exodus OID and also a field to hold a pointer to the in-memory copy of the object if it is currently in the Exodus buffer pool.

An object value is the data that is actually stored to disk in a transaction. It contains information about the object's size, reference count, and the actual data in the object. Values are stored as variable sized records with a fixed header and a variable sized data field. In C, this is implemented this way:

struct varRecord {
  HEADER header;
  char data[0];
}

Here, the data field is just a dummy to mark the beginning of the data block. When we create a value, we allocate enough space to contain both the header and the object's data and then use the dereference through the data field to fetch the object's data string.(3) We store objects like this to avoid using an extra level of indirection to get to the object data. This could be expensive, since it would be at the database level.

This design allows us to store an arbitrary simple Tcl value in the database as a string. Therefore, all encodings of persistent application data structures in Tcl code must be based on strings. This is generally the case anyway, but having hooks to Tcl tables would be nice.

Note: It should be possible to set up a transparent hook between Tcl arrays and an Exodus index file using Tcl calls that allow us to watch for all side effects to a variable. Index files and arrays go together naturally.

4.2 Creating and Using Persistent Objects

The persist new command creates a new object on disk with a given value. You say:

set foo [ persist new "some value" ].

The result of persist new is the object handle, which we put into a variable for later reference. To access this handle across sessions, it must be stored in the database somewhere. A side effect of this operation is that the object handle is registered as a command. In addition, a reference record containing the new object's OID is registered with this new command in the clientData portion of the Tcl command record.(4) Tcl specifies that whenever the new command is called, this data will be passed to the object dispatch routine so it knows what object to operate on.

The C layer of the FCL system provides four low level object management commands that are used by the upper layers of the system:

<objHandle> getvalue

Fetches the value of the object referred to in the handle.

<objHandle> setvalue <value>

Sets the value of the object referred to by the handle to <value>

<objHandle> destroy

Removes the disk image of the object referred to by the handle.

<objHandle> release

Releases the object referred to be the handle from the buffer pool. This is not used, as the current implementation never explicitly fixes anything in the buffer pool across more than one method call.

Object handles are encoded as strings of the form "_OBJ(xxx,yyy,zzz,www)" where each of the members to the 4-tuple correspond to a part of the object's OID. When objects are created, this encoding is used to register a command by that name in the Tcl interpreter. In addition, object handles for existing objects must be registered as commands they can be used to run methods.

Note: This mechanism is largely transparent to the higher layers of the system except for the fact that anything built on top of the C library must be careful to use persist deref to register object handles as commands before the objects themselves are referenced. In the current implementation, this shows up in the Tcl code implementing objects and classes. This should be the responsibility of the C portion of the library, probably.

If a non-primitive method is dispatched to an object, the C library calls a Tcl routine called "__fcl__methodDispatch" to resolve the call. This is the main low-level dependency between the C code and the Tcl code. The default dispatch is implemented in Tcl to make it easier to modify and prototype. Eventually, it will be moved to C.

4.3 Transactions

The trans command implements simple single level transactions. You say:

trans { <some Tcl commands > }.

The trans command acts just like a call to eval, but if any error is generated it implicitly aborts the current transaction. In addition, if all the commands succeed, it does an implicit commit. The code can also do an explicit abort using the abort command. This will abort the current transaction and signal an error.

Each instance of FCL can be running one top level transaction at a time. Calls to trans can be nested, in which case only the top-level call actually commits, but any abort aborts all the way to the top level.

Finally, the persist command has hooks for creating transaction blocks that span multiple Tcl commands, but they should not be used.

4.4 Initialization

The persist init command initializes the low level parts of the FCL and establishes a connection with the Exodus storage manager. In addition, it opens up a file buffer and a physical file to do its work in. Available file names are stored in the root table of the Exodus volume that it connects to. This table can hold up to 80 entries, and isn't suitable for production work.

In general, applications should build a higher level convention on top of this command. In our system, the basic database class library defines procedures for storing all designs in one physical file, and keeps its own directory of the logical files available in this space. This directory is stored as a Tcl list in the first page of the physical file.

Note: We desperately need a real namespace for design files.

All of the other FCL commands check for initialization and the existence of transaction before attempting database operations, so if you don't call persist init, you can't do anything.

5 The Object System

The FCL object system provides a simple class-based object system with single inheritance. The system design emphasizes simplicity and flexibility over performance and robustness. In particular, the underlying type system is largely ad-hoc and full of loopholes. The class mechanism and method dispatch code implemented in Tcl for maximum flexibility. They could be re-implemented in C for more performance when we nail down all the aspects of the final object model. The current class system is inspired mostly by Smalltalk, but it has a prototype-based flavor as well.

5.1 Classes

A class is a collection of attributes and code that many objects may share. A class can either be a root class, in which case all of its attributes are newly defined, or it can be a subclass of some other class. We may also call subclasses derived classes, following the C++ terminology. To define a new class, you say:

class <class-name> [ <super-name> ].

If the optional super-name argument is present, then the new class is a subclass of super-name.

In addition to encapsulating a set of common characteristics, classes create objects that are instances of themselves. An instance contains all of the newly defined characteristics of its class, plus everything defined in the superclass of its class and so on. The characteristics that a class can define are an object's slots and its methods. These are described in the following sections.

In the implementation, classes are represented as records stored in Tcl strings. Each record contains the name of the class, the name of its superclass (or "root" if it is a root class) a list of slots that should be in each instance, a list of default values for those slots, and a list of methods defined for each instance. This list structure is rather inefficient, and will eventually be replaced with something reasonable.

Note: Our classes are used in a way that is very close to prototypes in systems like SELF. But, systems like SELF have no classes at all, and use special objects to perform the duties that we have placed in our classes.

In FCL, each object is a member of some class. The primitive methods isa, myClass, in addition to the super procedure provide information about the type of an object at runtime. These are described in the user's manual.

5.2 Slots

Slots are attributes of an object that hold data. The slot command introduces a new slot to a class. If it is given one argument, the default value for the slot is "". Otherwise the second argument is used as the default value for a slot.

Sub-classes can redefine the slots of their super-classes. This has the effect of overriding the default value defined by the super-class. You cannot, however, define two slots by the same name in a particular class.

When an object is created, it is represented as a list with three components. The first is the name of its class, the second is a list of the names of its slots. The third is a list of the values of those slots. The getslot and setslot builtin methods provide access to these values. A slot is fetched by looking it up in the names list and then indexing into the values list based on that result. If an unknown slot name is given, an error occurs.

At creation time, an object is given all the slots defined for its class and recursively for all of the super-classes of its class. This implies that if new slots are later defined for any of these classes, they will not be available for objects created before the new definitions were added.

5.3 Methods and Method Dispatch

Methods are attributes of an object that hold code. The method command introduces a new method. The command's syntax is similar to the proc command. You say:

method <className> <methodName> <methodArgs> <body>

This defines a new method called methodName for the class className which takes the arguments methodArgs and runs the code in body. In addition, every method has an implicit argument called self that refers to the object that the method was dispatched from.

The method can be dispatched from any object that is of the right class. Syntactically, method dispatch is done this way:

<objectHandle> <methodName> args

The method dispatcher looks at the class of the object referred to by the object handle and sees if the method methodName has a definition in that class. If not, the super-class is searched and so on up to the root of the inheritance tree.

Sub-classes can redefine methods that were defined in their super-classes. The effect is to override whatever was there before, including the argument specification. This should probably be checked in future versions. of the system.

Note: Since methods names are not stored directly in an object, re-definitions of methods are available to all new and existing objects. This is inconsistent with the behavior for slots, and should probably be fixed.

Footnotes

(1): For more details on all of this, look at a good databases book.
(2): I'm not saying a better toolkit exists, I don't think one does.
(3): This is a standard, if crufty C trick, though it won't work right in general under C++ due to the fact that C++ hides extra fields in structures.
(4): See the Tcl manual for details on this mechanism.

Peter Su

Table of Contents

1 The Design an Implementation of FCLPeter SuIntroduction