CONTENTS
What's a Runtime Environment System?
Example
The NounVerb Runtime environment system
Using NounVerb
----- (1) UNIX_TTY_PLATFORM
----- (2) UNIX_XW_PLATFORM
----- (3) PC_TTY_PLATFORM
----- (4) PC_MVIS_PLATFORM
----- Using the library
The Abstract Data Types in NounVerb
----- Verbs
----- Nouns
----- Reports
----- nstate
----- env
----- Noun information
----- Return Codes
The Functions on NounVerb ADTs
----- Verb Application functions --- provided by the programmer
----- Argument fetching functions --- provided by the programmer
----- Env functions
What's it all for?
What's a Runtime Environment System?
A common experience, especially if you are designing new software utilities in C, is that you write all your clever code, but then you find there's a whole bunch of extra code needed in order to test it, and to provide a simple user interface to it. That wastes time, and the resulting interface code usually turns out to be rather horrible, inconsistent programming that only the programmer can remember how to use without crashing it. And even the programmer forgets after a few days.
A decent runtime environment system can help here. If done properly, it consists of a set of C library calls that allow you to set up a user interface without having to do all the tedious bits. With luck, it might also enable automatic development of a crude graphical user interface and greatly ease the eventual development of production-quality graphical user interfaces if and when they become necessary.
Note that someone who is a fan of Lisp has good reason to be smug here. There you at least always have some kind of runtime environment support built in. If such a person gets too smug, remember: just ask them to go away and compute the eigenvectors of a 10,000 x 10,000 matrix. That will shut them up.
Example
You have been writing decision tree code, and the major functions you've ended up writing are:
decparams *mk_decparams_from_file(char *filename);
decparams *mk_user_edit_decparams(decparams *dp);
void save_decparams(FILE *s,decparams *dp);
dataset *mk_dataset_from_file(char *filename);
dectree *mk_dectree_from_dataset(decparams *dp,dataset *ds);
expo *mk_expo_from_dectree(dectree *dt);
You now want to put your work into a user interface. Here goes:
env *mk_dectree_env()
{
env *e = mk_minimal_env("Decision Trees");
add_noun(e,DECPARAMS_NN,"Decision Parameters","decparams",
"The parameters used during decision tree training",
env_free_decparams);
add_noun_autocreate(e,DECPARAMS_NN,env_autocreate_decparams);
add_noun(e,DATASET_NN,"The training data","dataset",
"The current training data",
env_free_dataset);
add_noun(e,DECTREE_NN,"Decision Tree","dectree",
"The current decision tree",
env_free_dectree);
add_verb(e,TRAIN_VB,"train","Train up a learning algorithm");
add_action(e,TRAIN_VB,DATASET_NN,env_train_dataset);
/* NB: LOAD_VB doesn't need to be defined as it is in the minimal env */
add_action(e,LOAD_VB,DECPARAMS_NN,env_load_decparams);
add_action(e,LOAD_VB,DATASET_NN,env_load_dataset);
add_action(e,EDIT_VB,DECPARAMS_NN,env_edit_decparams);
add_action(e,SAVE_VB,DECPARAMS_NN,env_save_decparams);
add_action(e,INSPECT_VB,DECTREE_NN,env_inspect_dectree);
return(e);
}
The various functions that are passed as parameters, with names like "env_train_dataset", are slightly tiresome wrappers that cast generic (char *) arguments back and forth. Here's an example:
int env_train_dataset(env *e,nstate *ns,int vb,int nn,report **r_rep)
{
int rc = OKAY; /* Return code */
decparams *dp;
dataset *ds;
dectree *dt;
if ( rc_ok(rc) ) dp = (decparams *) safe_ref(e,ns,DECPARAMS_NN,&rc);
if ( rc_ok(rc) ) ds = (dataset *) safe_ref(e,ns,DATASET_NN,&rc);
if ( rc_ok(rc) ) dt = mk_dectree_from_dataset(dp,ds);
if ( rc_ok(rc) ) nstate_set(e,ns,DECTREE_NN,(char *) dt);
if ( r_rep != NULL && rc_ok(rc) )
*r_rep = mk_report_incfree_expo(mk_expo_from_dectree(dt));
return(rc);
}
To run your user interface, put this in your main:
env *e = mk_dectree_env();
env_interface(e);
free_env(e);
On UNIX, or MSDOS, a console-like command-line interface will appear when it's run. Under NT, a crude but neat-and-tidy graphical user interface will appear. Here's how it might look to use the console version:
Decision Tree> help
Enter help <command> for help on a particular command.
Enter help <object> for help on a particular data object.
Currently legal commands are: help load save inspect edit
Data objects (some undefined now) are: DecisionTree DecParams DataSet
Decision Tree> help decparams
DecParams : The parameters used during decision tree training.
DecParams is currently loaded. Use inspect decparams to view it.
The following commands on decparams are available: load save edit help
Decision Tree> load
QUESTION> Which would you like to load? (1) DecParams (2) DataSet> 2
QUESTION> From what datafile?> polp.dat
Sorry, I can't open "polp.dat" when running the load command on DataSet.
Decision Tree> load DataSet plop.dat
plop.dat loaded successfully.
Dataset has 26 attributes, and 310 datapoints.
By default, the first 25 attributes are treated
as inputs and attribute 26 (called "Died?") is
the output.
Decision Tree> train
Decision Tree trained in 46 seconds. The following
attributes are used in the decision tree:
WasSick WentToHospital HadThreeEars
and the rest were ignored.
Decision Tree>
and so on.
The NounVerb Runtime environment system
This is a proposed environment system, based on our experiences building a prototype of EYE---a memory-based learning blackbox function approximator and experiment designer. It uses a nice metaphor suggested by Tom: There are things hanging around the program you've written (nouns) and there are things you can do to them (verbs). Nouns have obvious relations to instances of an ADT or a class. Verbs have obvious relations to functions (or, for the oopy amongst us, methods or messages).
We'll now look at all this in terms of a proposed C-library implementation.
The NounVerb C library will be available on the following four platforms:
(1) UNIX_TTY_PLATFORM: this code will run under Unix and not produce any graphics. Any graphics calls will be ignored, or possibly produce a printed warning.
(2) UNIX_XW_PLATFORM: this code will run under Unix and is free to produce graphics using damut/amgr.h graphics routines.
(3) PC_TTY_PLATFORM: this code will run on PCs (compiled by Visual C++) and is controlled from a console-type command-line interface. Any graphics calls will be ignored, or possibly produce a printed warning, or possibly pop up graphics windows, though all control will remain with the standard input.
(4) PC_MVIS_PLATFORM: this code will run on PCs, compiled by Visual C++ as part of a single-document project. It requires the use of (as yet unwritten) visualnv files. There will be no stdin input, though we hope that for debugging convenience there will be a stdout/stderr output console window. User communication uses expos, apicts, lingraphs, and aform interface functions. The programmer is free to produce graphics using damut/apict.h graphics routines.
You will link your code to a nounverb library: clinv for command-line and visualnv for Graphical User Interface. You will #include the header file "nounverb.h". You will create a data structure called an env with
env *en = mk_minimal_env("My Project Name");
You will then add Nouns and Verbs to the environment, in ways described below. You will then call
env_interface(en)
to run the interface.
And you will then call
free_env(en)
when you're finished. That's basically it, apart from one or two detailettes..
The Abstract Data Types in NounVerb
int vb;
Verbs are represented by non-negative integers. Each of them lies numerically between 0 .. num_verbs(e)-1 inclusive. For each such number you will have #defined a constant such as
#define BUILDTREE_VB (BASE_VB + 0)
#define RECOMMEND_VB (BASE_VB + 1)
#define TRAIN_VB (BASE_VB + 2)
etc..
There will also be some predefined verbs around such as
SAVE_VB , LOAD_VB , INSPECT_VB , EDIT_VB , HELP_VB , SET_VB
int nn;
Nouns are represented by non-negative integers. Each of them lies numerically between 0 .. num_nouns(en)-1 inclusive.
For each such number you will have #defined a constant such as
#define DTREE_NN (BASE_NN + 0)
#define PRINCOMP_MATRIX_NN (BASE_NN + 1)
#define DATASET_NN (BASE_NN + 2)
#define INCOLS_NN (BASE_NN + 3)
#define OUTCOLS_NN (BASE_NN + 4)
..etc.....
There may be some default NN's defined, though I'm not sure what they'd be.
A report is a typedef'd C structure that contains information to be presented to a user. It may be NULL, denoting no information. It may include one or more expos , apicts , lingraphs , surgraphs and other user-information ADTs. See doc/xdamut.{doc,html} for information on those goodies. (Note: not properly documented yet).
There are/will be functions to display reports.
nstate represents, for each noun, the current value of that noun.
char *nstate_ref(env *e,nstate *ns,int nn);
returns the pointer (represented as a C char pointer) to the data structure associated with the nn'th noun. Casting is used to convert the raw pointer (char *) value into the type appropriate for that noun. One could imagine this piece of code:
dectree *dt = (dectree *) nstate_ref(e,ns,DECTREE_NN);
and one could also imagine the following utility function
dectree *dectree_from_nstate(env *e,nstate *ns)
{
return((dectree *) nstate_ref(e,ns,DECTREE_NN));
}
nstate_ref will return NULL if the chosen noun is not currently
defined.
void nstate_set(env *e,nstate *ns,int nn,char *value);
updates the noun specified by nn to take the value (WARNING--NOT A
COPY OF THE VALUE) pointed to by "value". Assumes on entry that the
previous value is undefined... if not, it's an error.
void nstate_free(env *e,nstate *ns,int nn);
If noun "nn" is defined, frees the value of the noun specified by nn.
Makes no assumptions on entry...if noun not currently defined, does nothing.
char *safe_ref(env *e,nstate *ns,int nn,int *r_rc);
Just the same as nstate_ref, except that if the noun "nn" is NULL,
it sets *r_rc to contain an error code.
The only other nstate functions are: expo *mk_expo_from_nstate(env *e,nstate *ns); void free_nstate(env *e,nstate *ns);
There may also eventually be save_nstate , load_nstate , mk_copy_nstate functions, though these will not necessarily load, save, etc. all their noun values.
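To make the nstate semantics above concrete, here is a minimal sketch rather than the real library. The env and nstate layouts, MAX_NOUNS, and the numeric values of OKAY and ERROR are all assumptions made for illustration; only the function names and their described behaviour come from the text. nstate_free is left out because it needs the per-noun free function stored in the env.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, NOT the real NounVerb types: the env only records how
   many nouns exist, and the nstate is a fixed array of raw pointers. */
#define MAX_NOUNS 8
#define OKAY  0   /* assumed numeric values for the return codes */
#define ERROR 2

typedef struct env { int num_nouns; } env;
typedef struct nstate { char *value[MAX_NOUNS]; } nstate;

/* Returns the raw pointer stored for noun nn, or NULL if it is undefined. */
char *nstate_ref(env *e, nstate *ns, int nn)
{
  assert(nn >= 0 && nn < e->num_nouns && nn < MAX_NOUNS);
  return ns->value[nn];
}

/* Stores the pointer itself (no copy). The noun must be undefined on entry. */
void nstate_set(env *e, nstate *ns, int nn, char *value)
{
  assert(nn >= 0 && nn < e->num_nouns && nn < MAX_NOUNS);
  assert(ns->value[nn] == NULL);
  ns->value[nn] = value;
}

/* Like nstate_ref, but reports an undefined noun through *r_rc. */
char *safe_ref(env *e, nstate *ns, int nn, int *r_rc)
{
  char *result = nstate_ref(e, ns, nn);
  if (result == NULL) *r_rc = ERROR;
  return result;
}
```

Because safe_ref only ever writes an error into *r_rc, a caller can chain several lookups behind rc_ok checks, as env_train_dataset does in the example earlier.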
An env represents the set of nouns and verbs available to the user, and pointers to the code to be executed when the user requests that verbs be applied to nouns.
An env contains:
* An array of noun information
* An array of verb information
* A sparse array of (verb,noun) pair information.
Noun information consists of
* the name
* one-line information string
* a free function, that frees a noun value.
* Optionally, an autocreate function that generates noun values according to other noun values. If the noun value is undefined (which is represented by a NULL value) then an autocreate function tries to bring the noun value into existence if it can deduce a good default value from other defined noun values.
Verb information consists of
* the name
* one-line information string
* Optionally a mk_arg function. A mk_arg function has the job of making an argument string to pass to the verb_apply function defined shortly. Often this will consist of bringing up a popup menu to get information, such as a filename to load from, from the user. This is optional. Some verbs don't need arguments and so won't need a mk_arg function.
* An array of (nn , apply_verb function) pairs. Each verb has a set of nouns that it can be applied to. Many verbs, especially user-defined ones, have only one applicable noun, in which case this array will be sized 1.
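One way the noun and verb information listed above might be laid out in C is sketched here. Every type and field name (noun_info, verb_info, action, find_action, noop_action) is hypothetical, since the document does not give the real definitions; the function-pointer signatures are taken from the prototypes shown elsewhere in this document.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout, for illustration only. */
typedef struct env env;
typedef struct nstate nstate;
typedef struct report report;

typedef int (*apply_fn)(env *e, nstate *ns, int vb, int nn,
                        char *arg, report **r_rep);
typedef char *(*mk_arg_fn)(env *e, nstate *ns, int vb, int nn);
typedef void (*noun_fn)(env *e, nstate *ns, int nn);

typedef struct noun_info {
  char *name;
  char *oneline;
  noun_fn free_fn;        /* frees a noun value */
  noun_fn autocreate_fn;  /* optional: NULL when absent */
} noun_info;

typedef struct action { int nn; apply_fn apply; } action;

typedef struct verb_info {
  char *name;
  char *oneline;
  mk_arg_fn mk_arg;       /* optional: NULL when the verb takes no argument */
  action *actions;        /* the nouns this verb can be applied to */
  int num_actions;
} verb_info;

struct env {
  noun_info *nouns; int num_nouns;
  verb_info *verbs; int num_verbs;
};

/* Linear search of the verb's action list; NULL means "pair not defined". */
apply_fn find_action(env *e, int vb, int nn)
{
  int i;
  assert(e != NULL);
  if (vb < 0 || vb >= e->num_verbs) return NULL;
  for (i = 0; i < e->verbs[vb].num_actions; i++)
    if (e->verbs[vb].actions[i].nn == nn)
      return e->verbs[vb].actions[i].apply;
  return NULL;
}

/* A do-nothing action, present only so the lookup can be exercised. */
int noop_action(env *e, nstate *ns, int vb, int nn, char *arg, report **r_rep)
{
  (void) e; (void) ns; (void) vb; (void) nn; (void) arg; (void) r_rep;
  return 0;
}
```

Storing the (nn, function) pairs per verb is one simple reading of the "sparse array" of verb-noun pairs: most verbs apply to one or two nouns, so a linear scan is cheap.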
A return code is represented by a #define'd integer and must be one of the following values:
OKAY Everything executed fine, and I've now finished.
CONT Everything executed fine, but I want to do more work.
ERROR Some other kind of error. Other more specific error codes (e.g. file can't be opened, necessary parameter is undefined) will appear later.
Here are a couple of simple return code functions:
bool rc_ok(int rc)
Returns TRUE if and only if rc==OKAY or rc==CONT
char *name_of_rc(int rc);
Returns a not-to-be-updated and not-to-be freed string explaining the meaning of the rc.
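The two return-code helpers could be as small as the sketch below. The numeric values chosen for OKAY, CONT and ERROR and the wording of the strings are assumptions; the text fixes only the names and the rule that rc_ok accepts OKAY and CONT.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed numeric values; the text only names the codes. */
#define OKAY  0
#define CONT  1
#define ERROR 2

/* TRUE if and only if rc is OKAY or CONT. */
bool rc_ok(int rc)
{
  return rc == OKAY || rc == CONT;
}

/* Returns a static string: callers must neither modify nor free it. */
char *name_of_rc(int rc)
{
  static char *names[] = {
    "OKAY: finished successfully",
    "CONT: succeeded, more work to do",
    "ERROR: unspecified error"
  };
  assert(rc >= OKAY && rc <= ERROR);
  return names[rc];
}
```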
The Functions on NounVerb ADTs
Verb Application functions --- provided by the programmer
int <verb_apply_function>(env *e,nstate *ns,int vb,int nn, char *arg,report **rep);
This is a function written by the programmer to implement a particular verb being applied to a particular noun. As a result of it the nstate might change. The result is a return code, defined above.
vb and nn give the verb-number and noun-number involved. Upon reflection this might appear odd: the verb_apply_function has been written for a particular verb-noun combination, so surely it must know what nn and vb values are involved? Passing these values in allows a programmer to write the same verb-apply function for several different noun-verb pairs, using case statements to distinguish between nouns or verbs. A poor man's form of inheritance.
arg gives an optional argument. This must either be NULL or else a zero-terminated string.
The final argument points to a (report *). If it is NULL, that denotes that the caller does not want a report. If it is non-null then a report may be allocated, and stored in the pointer. (If so it must eventually be freed by someone). If no report is generated, NULL will be stored in the pointer. The programmer may assume that if r_rep is non-null, then on entry *r_rep will be NULL.
Here's an example use:
report *rep = NULL; /* must be NULL on entry */
int rc = env_train_dectree(e,ns,TRAIN_VB,DECTREE_NN,NULL,&rep);
if ( rep == NULL ) { /* Then no report was provided */ }
...
...
free_report(rep);
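The "poor man's inheritance" idea, one verb-apply function shared by several nouns, might look like the sketch below. The handler name env_load_any, the load_count array, and all numeric constants are invented for the illustration; a real handler would call functions such as mk_decparams_from_file or mk_dataset_from_file and store the result with nstate_set.

```c
#include <assert.h>
#include <stddef.h>

/* Opaque stand-ins for the NounVerb types; this sketch never looks inside. */
typedef struct env env;
typedef struct nstate nstate;
typedef struct report report;

/* All numeric values below are assumed for the illustration. */
#define OKAY  0
#define ERROR 2
#define LOAD_VB      0
#define DECPARAMS_NN 0
#define DATASET_NN   1
#define NUM_NOUNS    2

/* Records how often each noun was "loaded", in place of real file loading. */
int load_count[NUM_NOUNS];

/* One handler registered for both (LOAD_VB, DECPARAMS_NN) and
   (LOAD_VB, DATASET_NN); it switches on nn to pick the behaviour. */
int env_load_any(env *e, nstate *ns, int vb, int nn, char *arg, report **r_rep)
{
  (void) e; (void) ns; (void) r_rep;
  assert(vb == LOAD_VB);
  if (arg == NULL) return ERROR;      /* loading needs a filename */
  switch (nn) {
    case DECPARAMS_NN:                /* would load decision parameters */
      load_count[DECPARAMS_NN] += 1;
      break;
    case DATASET_NN:                  /* would load the training data */
      load_count[DATASET_NN] += 1;
      break;
    default:
      return ERROR;                   /* noun not handled here */
  }
  return OKAY;
}
```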
Argument fetching functions --- provided by the programmer
char *<mk_arg_function>(env *e,nstate *ns,int vb,int nn);
This is a function written by the programmer to implement the obtaining of an argument. Often, it will be obtained by querying a user.
It is anticipated that only a few mk_arg_function's will need to be written: by accessing their vb , nn and prompt fields they will decide what to do. (e.g. if vb==LOAD_VB, a generic string requester will ask the user which file to load xxx from, where xxx is the name of the noun involved).
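A generic string requester of the kind described could be sketched as follows for the console platforms. The toy env (just two name tables), the helper arg_from_line, and the choice that the returned string is heap-allocated and freed by the caller are all assumptions; the text does not say who owns the argument string.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy env holding only the names needed to build a prompt. */
typedef struct env { char **verb_names; char **noun_names; } env;
typedef struct nstate nstate;

/* Hypothetical helper: copies `line` up to its first newline into fresh
   memory. Returns NULL for an empty line or if allocation fails. */
char *arg_from_line(const char *line)
{
  size_t len;
  char *arg;
  assert(line != NULL);
  len = strcspn(line, "\r\n");
  if (len == 0) return NULL;
  arg = malloc(len + 1);
  if (arg == NULL) return NULL;
  memcpy(arg, line, len);
  arg[len] = '\0';
  return arg;
}

/* A mk_arg function usable for any verb/noun pair that wants one string. */
char *mk_arg_ask_string(env *e, nstate *ns, int vb, int nn)
{
  char line[256];
  (void) ns;
  printf("QUESTION> What argument for %s %s?> ",
         e->verb_names[vb], e->noun_names[nn]);
  fflush(stdout);
  if (fgets(line, sizeof line, stdin) == NULL) return NULL;
  return arg_from_line(line);
}
```

Keeping the line parsing separate from the prompting means the same helper could sit behind a popup requester on the graphical platforms.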
Env functions
env *mk_minimal_env(char *project_title)
void free_env(env *e)
expo *mk_expo_from_env(env *e)
env *mk_copy_env(env *e)
void add_noun(env *e,
int nn,
char *name,
char *type,
char *oneline,
void (*free_fn)(env *e,nstate *ns,int nn)
);
Adds the noun with noun-number "nn" to the environment. The name for this noun is "name", and its C type is "type". A oneline string explaining what the noun does is in "oneline".
"free_fn" frees the current noun value (it may assume that the current noun value is defined) and makes it undefined.
void add_noun_autocreate(env *e,int nn,
void (*autocreate_fn)(env *e,nstate *ns,int nn));
"autocreate_fn" tries to make and store a value for the current noun as a function of other noun values in the nstate. (It may assume that on entry the current noun value is undefined. On exit it may or may not be defined.)
void add_verb(env *e,
int vb,
char *name,
char *oneline
);
Defines a new verb in the environment. Adds a one-line description
and a name.
add_action(e,TRAIN_VB,DATASET_NN,env_train_dataset);
void add_action(env *e,
int vb,
int nn,
int (*act_fn)(env *e,nstate *ns,int vb,int nn,report **r_rep)
);
Adds an action (a noun-verb pair) to the environment. This is what is
called when someone asks to apply verb "vb" to noun "nn". Sometimes the
noun involved seems irrelevant: perhaps an action affects many nouns in
the system. In this case the programmer must nevertheless pick one noun
to associate with the action. It will be harmless to do so.
void add_mk_arg(env *e,int vb,char_ptr (*mk_arg_fn)(env *e,nstate *ns,
int vb,int nn));
int num_verbs(env *e)
int num_nouns(env *e)
char *mk_arg(env *e,nstate *ns,int vb,int nn)
Finds the appropriate argument for calling the action involving vb and nn
in the current nstate. Usually this will be by requesting information from
the user. If there is no argument required, or if no action involving
vb and nn is defined, returns NULL.
int env_apply(env *e,nstate *ns,int vb,int nn,char *arg,report **r_rep)
Applies verb vb to noun nn. arg must be NULL or else a zero-terminated string. r_rep may be NULL (denoting a lack of interest on the part of the caller in getting a report). If not NULL, then upon exit, *r_rep will point to a NEWLY MALLOCKED report.
Returns a return code.
If the requested verb-noun pair isn't defined, returns an error code and if r_rep is non-null issues a suitable complaint in r_rep.
If the arg is NULL and the vb-noun-pair expects an argument, returns an error code and if r_rep is non-null issues a suitable complaint in r_rep.
char *verb_name(env *e,int vb)
char *noun_name(env *e,int nn)
bool noun_is_defined(env *e,nstate *ns,int nn);
bool env_autocreate(env *e,nstate *ns,int nn);
Must not be called if nn is currently defined.
Returns TRUE if it succeeds in creating the nn.
int env_arg_and_apply(env *e,nstate *ns,int vb,int nn,report **r_rep)
This function calls the arg-gatherer if necessary in order to find
a required arg for calling the apply function.
int env_noun_arg_and_apply(env *e,nstate *ns,int vb,report **r_rep)
This function checks to see how many nouns the verb is applicable
to:
If none, returns an error code, and if r_rep is non-null, reports the problem.
If one, goes ahead and calls env_arg_and_apply.
If more than one, queries the user for which noun to apply the verb to.
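The three cases above amount to a small selection routine. The sketch below models only the noun-picking step, not the argument gathering or the apply call. The toy env (a table of which verb applies to which noun), the chooser callback that stands in for querying the user, and the names pick_noun_for_verb and choose_last are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VERBS 4
#define MAX_NOUNS 8
#define OKAY  0   /* assumed numeric values */
#define ERROR 2

/* Toy env: applicable[vb][nn] is nonzero when an action exists for the pair. */
typedef struct env {
  int num_verbs, num_nouns;
  int applicable[MAX_VERBS][MAX_NOUNS];
} env;

/* Stand-in for "query the user": given n candidate nouns, return one. */
typedef int (*choose_fn)(const int *candidates, int n);

/* Decides which noun verb vb should act on. Returns OKAY and stores the
   noun in *r_nn, or ERROR if the verb applies to no noun at all. */
int pick_noun_for_verb(env *e, int vb, choose_fn ask_user, int *r_nn)
{
  int candidates[MAX_NOUNS];
  int n = 0, nn;
  assert(vb >= 0 && vb < e->num_verbs);
  assert(e->num_nouns <= MAX_NOUNS);
  for (nn = 0; nn < e->num_nouns; nn++)
    if (e->applicable[vb][nn]) candidates[n++] = nn;
  if (n == 0) return ERROR;                        /* no applicable noun */
  *r_nn = (n == 1) ? candidates[0]                 /* unambiguous */
                   : ask_user(candidates, n);      /* let the user decide */
  return OKAY;
}

/* Example chooser that always takes the last candidate. */
int choose_last(const int *candidates, int n)
{
  return candidates[n - 1];
}
```

In a console interface the chooser would print a numbered QUESTION> prompt like the one in the example session; a graphical interface could show a menu instead.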
What's it all for?
Once you've written your code and put it in a noun-verb interface, and once the necessary supporting software is written, you'll be able to...
* Run your code on Unix in a teletype mode similar to (gdb) operation or a lisp run-time system
* Run your code under MSDOS, Windows, Windows95, WindowsNT in teletype fashion as above.
* Run your code under Windows95 and WindowsNT in a Graphical User Interface manner.
* Run your code as a CGI server under Netscape (if and when such supporting libraries are written)
* Your lower level functions can produce formatted text reports mixed with line/circle/string color diagrams mixed with line graphs mixed with histograms mixed with surface plots mixed with scattergrams etc..
* The user interface can permit histories, easy data saving, loading etc..
* The user interface can permit scripts of commands, custom initialization etc.
* The code you write can be used by programmers of fancy GUI systems without them needing to understand the details of your code.
* Most importantly, it will be easy to compatibalize your code: we have made no substantial platform commitments in this environment: merely the use of ANSI C, with the optional availability of a graphics window to draw in. If program-commercializers wish to make your code run under a fancier user interface style you won't need to rewrite bits of your own code.