Catacomb is a tool for generating and optimizing customized runtime libraries. The customized code takes advantage of parameters known at compile time, and solves the problem of composing multiple invocations of a runtime library. Catacomb is a compile-time module that interfaces with the source compiler. Instead of generating a call to a runtime library, the source compiler passes the parameters to Catacomb. Catacomb generates a customized C function to do the operation at run time, and the source compiler emits a call to this C function.
To generate the customized C code, Catacomb reads in external "stub files". The stub files contain both code to be generated and control constructs to direct the code generation. The control constructs, which Catacomb executes at compile time, can test properties of the input parameters, such as whether they are constants, symbols, array references, etc.; the control constructs also do the work (that the source compiler previously had to do) of composing multiple invocations of the algorithm.
The next section describes how to program in the Catacomb language. This assumes that the source compiler has already invoked a Catacomb stub with the appropriate arguments. The section thereafter describes the details of how the source compiler interfaces with Catacomb.
Catacomb executes the control constructs in a stub in the context of control variables and data variables. Data variables are the normal C variables that will be part of the emitted code. Control variables are only used during Catacomb's execution of the control constructs.
Control constructs consist of the control assignment statement (setting the value of a control variable), the control-IF statement (executing one branch and throwing out the other), the control-WHILE statement (repeat control execution of some code until the condition becomes false), the INCLUDE statement (execute another stub, passing arguments to it), and declarations (declare the type of a data variable, possibly in relation to the type of another variable).
Control variables are either global or local. Local variables are the symbols declared as LOCAL, and the stub arguments. References to other control variables are global. Control variables are statically scoped. When an INCLUDE is executed, Catacomb creates a new local symbol table, and the original local table is restored at the end of the INCLUDE execution. A stub can return values to the caller through "var" parameters. Var parameters are passed using "call by value-result" (i.e., copied in at the beginning and copied out at the end), and non-var parameters are passed using "call by value".
Catacomb stubs contain normal C code surrounded by control constructs. Not all C constructs are supported, but the syntax and semantics of the supported C constructs are the same as in the C language. The following C constructs are absent from Catacomb.
stub: "STUB" identifier ( stubvar-list ) depth-opt local-opt { stmts }
stubvar: identifier
| "var" identifier
depth: "DEPTH" integer ;
local: "LOCAL" identifier-list ;
Example:
STUB example(input1, input2, var result)
DEPTH 5;
LOCAL tmp1, tmp2, idx;
{
/* lines of code */
}
The stub is declared on the first line. The stub name must be a
normal identifier, as must its arguments. "var" arguments are copied
back to the caller at the end of stub execution. An optional "depth"
line specifies the maximum depth that this stub may be recursively
included (see the include section below for details). The last line
declares an optional list of scratch control variables, which are
local to the stub. The "LOCAL" variables plus the stub arguments
constitute the set of local control variables.
control_construct: cifblock
| cwhileblock
| include
| decl
| ctrlassn
cifblock: "cif" ( expr ) stmt
| "cif" ( expr ) stmt "else" stmt
cwhileblock: "cwhile" ( expr ) stmt
include: "include" simplevar ( expr_no_comma-list ) ;
decl: <see below>
ctrlassn: expr := expr ;
Examples:
int a[10][20];
int fun1();
double b;
void fun2();
cif (i == 3)
{
j := 0;
cwhile (j < 10)
{
include teststub(i*2,j);
j := j + 1;
}
}
A typical use of the control-WHILE construct is to loop over the dimensions of an array, producing a similar piece of code for each dimension.
A stub is allowed to include itself recursively. (In fact, this is the only way to create a loop nest whose depth is dependent on an input parameter.) Such a recursive include needs to be guarded by a control construct, usually a control-IF. But it is likely that once in a while, the stub programmer will accidentally use a regular IF statement to guard the recursion, which does not guard it at all. This programming error results in infinite inclusion. As a safeguard against infinite inclusion, Catacomb imposes an upper limit on how many instances of a stub may be recursively active. The default maximum depth is 20. Note that this limit applies separately to each stub; e.g., if two stubs are involved in a recursive include chain, the total include depth for the two of them combined can be up to 40. If the default of 20 is too small (or too large), the stub header allows the DEPTH specification, which sets the maximum number of times that this stub may be recursively active.
The left portion of a concatenation expression must evaluate to a variable name. The right portion must evaluate to either an integer constant or a string constant. If the right portion is an integer constant, a new variable is created by taking the left variable name and appending an underscore character and the integer. If the right portion is a string constant, a new variable is created by taking the left variable name and appending the string. For example, x#3 becomes x_3, x#"blah" becomes xblah, and x#(1+2)#"blah" becomes x_3blah.
Catacomb provides several external functions to handle this need. Each external function has a name (a normal identifier). When Catacomb sees a function call in a stub, it checks whether the function name matches one of the external functions. If so, it processes the function arguments, evaluates the external function with the arguments, and replaces the function call with the result of the external function evaluation. For example, CONSTANT(3) would be replaced by 1, and CONSTANT(x[3]) would be replaced by 0.
A list of the standard external functions, along with documentation, is provided elsewhere. Note that this is not necessarily a complete list, because Catacomb is designed so that new data structures, along with associated external functions for accessing the data structures, can be easily added.
Even if a control construct appears within a non-control construct that is obviously not executed, Catacomb still executes the control construct. For example, in the following code:
x := 0;
if (0) {
...
x := 1;
}
The value of control variable x at the end is 1, even though the code
in the IF branch clearly will not be executed at run time.
In this sense, Catacomb acts much like a macro processor, in that it follows its own sequential flow of control when doing substitutions. In fact, the INCLUDE construct is similar to the C preprocessor's #include statement, the control assignment statement is similar to #define, and the control-IF is similar to #if or #ifdef.
But Catacomb has a feature that makes it much more powerful than a macro processor. In conjunction with the processing of the control constructs, Catacomb also analyzes the C code and does constant and copy propagation. If a data variable appears within the condition of a control construct, and Catacomb can prove that the data variable has a valid propagation value at that point, the propagation value will be used in the condition instead. For example, consider the code:
In the CONSTANT(x) test, the value of 3 will be substituted for x and the special_case stub will be included, even though x is not a control variable. (CONSTANT() is an external function that tests whether its input evaluates to a constant. It always returns either 0 or 1.)x = 3; cif (CONSTANT(x)) include special_case(x); else include common_case(x);
A useful construct to use in stub programming is:
We can't omit CONSTANT(v) from the condition, because if v is not a constant, then "v==3" also will not evaluate to a constant.cif (CONSTANT(v) && v == 3) /* Do code for the special v==3 case. */ else /* Do the general-purpose code. */
The C-like syntax of the stub files makes it relatively simple to use emacs for automatically indenting and highlighting the stub code. To make this work correctly using cc-mode under xemacs, put the following in your .emacs file.
(setq auto-mode-alist (append '(("\\.ccom$" . c-mode)) auto-mode-alist))
(require 'font-lock)
(add-hook 'c-mode-hook
'(lambda ()
(setq c-conditional-key
"\\b\\(for\\|if\\|cif\\|do\\|else\\|while\\|cwhile\\|switch\\)\\b[^_]")
))
(setq c-font-lock-keywords-2
(cons
"[ \t]\\(cwhile\\|cif\\|else cif\\)[ \t\n(){};,]\\|[ \t]include[ \t]"
c-font-lock-keywords-2))
(defun c-backward-to-start-of-if (&optional lim)
;; Move to the start of the last "unbalanced" if and return t. If
;; none is found, and we are looking at an if clause, nil is
;; returned. If none is found and we are looking at an else clause,
;; an error is thrown.
(let ((if-level 1)
(here (c-point 'bol))
(case-fold-search nil)
(lim (or lim (c-point 'bod)))
(at-if (looking-at "\\(c\\|\\)if\\b[^_]")))
(catch 'orphan-if
(while (and (not (bobp))
(not (zerop if-level)))
(c-backward-syntactic-ws)
(condition-case nil
(backward-sexp 1)
(error
(if at-if
(throw 'orphan-if nil)
(error "No matching `if' found for `else' on line %d."
(1+ (count-lines 1 here))))))
(cond
((looking-at "else\\b[^_]")
(setq if-level (1+ if-level)))
((looking-at "\\(c\\|\\)if\\b[^_]")
;; check for else if... skip over
(let ((here (point)))
(c-safe (forward-sexp -1))
(if (looking-at "\\<else\\>[ \t]+\\<\\(c\\|\\)if\\>")
nil
(setq if-level (1- if-level))
(goto-char here))))
((< (point) lim)
(setq if-level 0)
(goto-char lim))
))
t)))