22.1.4. Standard Dispatching Macro Character Syntax

Common Lisp the Language, 2nd Edition

The standard syntax includes forms introduced by the # character. These take the general form of a #, a second character that identifies the syntax, and following arguments in some form. If the second character is a letter, then case is not important; #O and #o are considered to be equivalent, for example.

Certain # forms allow an unsigned decimal number to appear between the # and the second character; some other forms even require it. Those forms that do not explicitly permit such a number to appear forbid it.

----------------------------------------------------------------
Table 22-4: Standard # Macro Character Syntax

#!  undefined *                #<backspace>  signals error 
#"  undefined                  #<tab>        signals error 
##  reference to #= label      #<newline>    signals error 
#$  undefined                  #<linefeed>   signals error 
#%  undefined                  #<page>       signals error 
#&  undefined                  #<return>     signals error 
#'  function abbreviation      #<space>      signals error 
#(  simple vector              #+      read-time conditional 
#)  signals error              #-      read-time conditional 
#*  bit-vector                 #.      read-time evaluation 
#,  load-time evaluation       #/      undefined 
#0  used for infix arguments   #A, #a  array 
#1  used for infix arguments   #B, #b  binary rational 
#2  used for infix arguments   #C, #c  complex number 
#3  used for infix arguments   #D, #d  undefined 
#4  used for infix arguments   #E, #e  undefined 
#5  used for infix arguments   #F, #f  undefined 
#6  used for infix arguments   #G, #g  undefined 
#7  used for infix arguments   #H, #h  undefined 
#8  used for infix arguments   #I, #i  undefined 
#9  used for infix arguments   #J, #j  undefined 
#:  uninterned symbol          #K, #k  undefined 
#;  undefined                  #L, #l  undefined 
#<  signals error              #M, #m  undefined 
#=  label following object     #N, #n  undefined 
#>  undefined                  #O, #o  octal rational 
#?  undefined *                #P, #p  pathname 
#@  undefined                  #Q, #q  undefined 
#[  undefined *                #R, #r  radix-n rational 
#\  character object           #S, #s  structure 
#]  undefined *                #T, #t  undefined 
#^  undefined                  #U, #u  undefined 
#_  undefined                  #V, #v  undefined 
#`  undefined                  #W, #w  undefined 
#{  undefined *                #X, #x  hexadecimal rational 
#|  balanced comment           #Y, #y  undefined 
#}  undefined *                #Z, #z  undefined    
#~  undefined                  #<rubout>     undefined

The combinations marked by an asterisk are explicitly reserved to the user
and will never be defined by Common Lisp.



X3J13 voted in June 1989 (PATHNAME-PRINT-READ) to
specify #P and #p (undefined in the first edition).



----------------------------------------------------------------

The currently defined # constructs are described below and summarized in table 22-4; more are likely to be added in the future. However, the constructs #!, #?, #[, #], #{, and #} are explicitly reserved for the user and will never be defined by the Common Lisp standard.

#\

#\x reads in as a character object that represents the character x. Also, #\name reads in as the character object whose name is name. Note that the backslash allows this construct to be parsed easily by EMACS-like editors.

In the single-character case, the character x must be followed by a non-constituent character, lest a name appear to follow the #\. A good model of what happens is that after #\ is read, the reader backs up over the \ and then reads an extended token, treating the initial \ as an escape character (whether it really is or not in the current readtable).

Uppercase and lowercase letters are distinguished after #\; #\A and #\a denote different character objects. Any character works after #\, even those that are normally special to read, such as parentheses. Non-printing characters may be used after #\, although for them names are generally preferred.

#\name reads in as a character object whose name is name (actually, whose name is (string-upcase name); therefore the syntax is case-insensitive). The name should have the syntax of a symbol. The following names are standard across all implementations:

newline         The character that represents the division between lines
space           The space or blank character

The following names are semi-standard; if an implementation supports them, they should be used for the described characters and no others.

rubout          The rubout or delete character
page            The form-feed or page-separator character
tab             The tabulate character
backspace       The backspace character
return          The carriage return character
linefeed        The line-feed character

In some implementations, one or more of these characters might be a synonym for a standard character; the #\Linefeed character might be the same as #\Newline, for example.

When the Lisp printer types out the name of a special character, it uses the same table as the #\ reader; therefore any character name you see typed out is acceptable as input (in that implementation). Standard names are always preferred over non-standard names for printing.
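The rules above can be checked with a small sketch. The variable name *char-syntax-demo* is hypothetical and exists only for illustration; the functions called are standard.

```lisp
;; A minimal sketch of the #\ rules; *CHAR-SYNTAX-DEMO* is a made-up name.
(defparameter *char-syntax-demo*
  (list (char= #\A #\a)             ; single characters keep their case
        (char= #\Space #\SPACE)     ; character names are case-insensitive
        (char= #\( (char "(" 0))    ; characters special to READ are allowed
        (char-name #\Newline)))     ; the printer uses the same name table
```

Under these assumptions the first element is false and the next two are true; the last is whatever name the implementation prints for the newline character.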

The following convention is used in implementations that support non-zero bits attributes for character objects. If a name after #\ is longer than one character and has a hyphen in it, then it may be split into the two parts preceding and following the first hyphen; the first part (actually, string-upcase of the first part) may then be interpreted as the name or initial of a bit, and the second part as the name of the character (which may in turn contain a hyphen and be subject to further splitting). For example:

#\Control-Space         #\Control-Meta-Tab 
#\C-M-Return            #\H-S-M-C-Rubout

If the character name consists of a single character, then that character is used. Another \ may be necessary to quote the character.

#\Control-%             #\Control-Meta-\" 
#\Control-\a             #\Meta->

If an unsigned decimal integer appears between the # and \, it is interpreted as a font number, to become the font attribute of the character object (see char-font).

X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to replace the notion of bits and font attributes with that of implementation-defined attributes. Presumably this eliminates the portable use of this syntax for font information, although the vote did not address this question directly.

#'

#'foo is an abbreviation for (function foo). foo may be the printed representation of any Lisp object. This abbreviation may be remembered by analogy with the ' macro character, since the function and quote special forms are similar in form.
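As a brief illustration of the abbreviation, the following sketch compares the two spellings and uses #' in ordinary calls. The function name sharp-quote-demo is hypothetical.

```lisp
;; SHARP-QUOTE-DEMO is a made-up name used only for this sketch.
(defun sharp-quote-demo ()
  (list (equal '#'car '(function car))  ; the reader expands #'car to (function car)
        (funcall #'+ 1 2)               ; → 3
        (mapcar #'1+ '(1 2 3))))        ; → (2 3 4)
```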

#(

A series of representations of objects enclosed by #( and ) is read as a simple vector of those objects. This is analogous to the notation for lists.

If an unsigned decimal integer appears between the # and (, it specifies explicitly the length of the vector. In that case, it is an error if too many objects are specified before the closing ), and if too few are specified, the last object (it is an error if there are none in this case) is used to fill all remaining elements of the vector. For example,

#(a b c c c c) #6(a b c c c c) #6(a b c) #6(a b c c)

all mean the same thing: a vector of length 6 with elements a, b, and four instances of c. The notation #() denotes an empty vector, as does #0() (which is legitimate because it is not the case that too few elements are specified).
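The fill rule can be confirmed with a short sketch. The function name vector-syntax-demo is hypothetical; the vectors are quoted so that the example does not depend on whether vectors evaluate to themselves.

```lisp
;; VECTOR-SYNTAX-DEMO is a made-up name used only for this sketch.
(defun vector-syntax-demo ()
  (list (equalp '#6(a b c) '#(a b c c c c)) ; last element fills the rest
        (length '#6(a b c))                  ; → 6
        (length '#())                        ; → 0
        (length '#0())                       ; → 0
        (simple-vector-p '#(1 2 3))))        ; the result is a simple vector
```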

#*

A series of binary digits (0 and 1) preceded by #* is read as a simple bit-vector containing those bits, the leftmost bit in the series being bit 0 of the bit-vector.

If an unsigned decimal integer appears between the # and *, it specifies explicitly the length of the vector. In that case, it is an error if too many bits are specified, and if too few are specified the last one (it is an error if there are none in this case) is used to fill all remaining elements of the bit-vector. For example,

#*101111     #6*101111     #6*101     #6*1011

all mean the same thing: a vector of length 6 with elements 1, 0, 1, 1, 1, and 1. The notation #* denotes an empty bit-vector, as does #0* (which is legitimate because it is not the case that too few elements are specified).

Compare this to #B, used for expressing integers in binary notation.
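A small sketch contrasting the two notations; the function name bit-vector-syntax-demo is hypothetical.

```lisp
;; BIT-VECTOR-SYNTAX-DEMO is a made-up name used only for this sketch.
(defun bit-vector-syntax-demo ()
  (list (equal #6*101 #*101111)  ; EQUAL compares bit-vectors element by element
        (bit #*101111 1)         ; → 0, the second digit from the left is bit 1
        (length #*)              ; → 0, the empty bit-vector
        #b101))                  ; → 5, an integer rather than a bit-vector
```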

#:

#:foo requires foo to have the syntax of an unqualified symbol name (no embedded colons). It denotes an uninterned symbol whose name is foo. Every time this syntax is encountered, a different uninterned symbol is created. If it is necessary to refer to the same uninterned symbol more than once in the same expression, the #= syntax may be useful.

#.

#.foo is read as the object resulting from the evaluation of the Lisp object represented by foo, which may be the printed representation of any Lisp object. The evaluation is done during the read process, when the #. construct is encountered.

X3J13 voted in June 1989 (DATA-IO) to add a new reader control variable, *read-eval*. If it is true, the #. reader macro behaves as described above; if it is false, the #. reader macro signals an error.

The #. syntax therefore performs a read-time evaluation of foo. By contrast, #, (see below) performs a load-time evaluation.

Both #. and #, allow you to include, in an expression being read, an object that does not have a convenient printed representation; instead of writing a representation for the object, you write an expression that will compute the object.
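The interaction between #. and *read-eval* might be sketched as follows. The helper name read-with-eval-setting and the :read-error marker are hypothetical; the sketch assumes an implementation that provides *read-eval* as voted by X3J13.

```lisp
;; READ-WITH-EVAL-SETTING is a made-up helper used only for this sketch.
(defun read-with-eval-setting (text allow-eval)
  "Read TEXT with *READ-EVAL* bound to ALLOW-EVAL; return :READ-ERROR on failure."
  (let ((*read-eval* allow-eval))
    (handler-case (values (read-from-string text))
      (error () :read-error))))
```

With this helper, (read-with-eval-setting "(a #.(* 6 7))" t) yields a list whose second element is 42, while the same call with nil as the second argument yields :read-error.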

#,

#,foo is read as the object resulting from the evaluation of the Lisp object represented by foo, which may be the printed representation of any Lisp object. The evaluation is done during the read process, unless the compiler is doing the reading, in which case it is arranged that foo will be evaluated when the file of compiled code is loaded.

The #, syntax therefore performs a load-time evaluation of foo. By contrast, #. (see above) performs a read-time evaluation. In a sense, #, is like specifying (eval load) to eval-when, whereas #. is more like specifying (eval compile). It makes no difference when loading interpreted code; when code is to be compiled, however, #. specifies compile-time evaluation and #, specifies load-time evaluation.

X3J13 voted in January 1989 (SHARP-COMMA-CONFUSION) to remove #, from the language. X3J13 noted that the first edition failed to make it clear that #, can be meaningful only within quoted forms. All sorts of anomalies can arise, including inconsistencies between the interpreter and compiler, if #, is not properly restricted. See load-time-eval.

#B

#brational reads rational in binary (radix 2). For example, #B1101 == 13, and #b101/11 == 5/3.

Compare this to #*, used for expressing bit-vectors in binary notation.

#O

#orational reads rational in octal (radix 8). For example, #o37/15 == 31/13, and #o777 == 511.

#X

#xrational reads rational in hexadecimal (radix 16). The digits above 9 are the letters A through F (the lowercase letters a through f are also acceptable). For example, #xF00 == 3840.

#nR

The syntax #nRrational (or #nrrational) reads rational in radix n. The radix n must consist of only digits, and it is read in decimal; its value must be between 2 and 36 (inclusive).

For example, #3r102 is another way of writing 11, and #11R32 is another way of writing 35. For radices larger than 10, letters of the alphabet are used in order for the digits after 9.
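The equivalences given for #B, #O, #X, and #nR can be collected into one sketch. The function name radix-syntax-demo is hypothetical, and the plain decimal literals assume that *read-base* has its default value of 10.

```lisp
;; RADIX-SYNTAX-DEMO is a made-up name; every comparison should be true.
(defun radix-syntax-demo ()
  (list (= #b1101 13)
        (= #b101/11 5/3)       ; numerator and denominator are both binary
        (= #o37/15 31/13)
        (= #o777 511)
        (= #xF00 #Xf00 3840)   ; digit letters may be in either case
        (= #3r102 11)
        (= #11R32 35)
        (= #36rZZ 1295)))      ; 35*36 + 35, using the largest permitted radix
```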

#nA

The syntax #nAobject constructs an n-dimensional array, using object as the value of the :initial-contents argument to make-array.

The value of n makes a difference: #2A((0 1 5) (foo 2 (hot dog))), for example, represents a 2-by-3 matrix:

0       1       5 
foo     2       (hot dog)

In contrast, #1A((0 1 5) (foo 2 (hot dog))) represents a length-2 array whose elements are lists:

(0 1 5)    (foo 2 (hot dog))

Furthermore, #0A((0 1 5) (foo 2 (hot dog))) represents a zero-dimensional array whose sole element is a list:

((0 1 5) (foo 2 (hot dog)))

Similarly, #0Afoo (or, more readably, #0A foo) represents a zero-dimensional array whose sole element is the symbol foo. The expression #1Afoo would not be legal because foo is not a sequence.
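The effect of n on the same contents can be sketched as follows. The function name array-syntax-demo is hypothetical; the arrays are quoted so the example does not rely on arrays evaluating to themselves.

```lisp
;; ARRAY-SYNTAX-DEMO is a made-up name used only for this sketch.
(defun array-syntax-demo ()
  (let ((matrix '#2A((0 1 5) (foo 2 (hot dog))))
        (row    '#1A((0 1 5) (foo 2 (hot dog))))
        (cell   '#0A((0 1 5) (foo 2 (hot dog)))))
    (list (array-dimensions matrix)   ; → (2 3)
          (aref matrix 1 2)           ; the list (hot dog)
          (array-dimensions row)      ; → (2)
          (aref row 0)                ; → (0 1 5)
          (array-dimensions cell)     ; → NIL, no dimensions at all
          (length (aref cell)))))     ; → 2, the sole element is a two-element list
```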

#S

The syntax #s(name slot1 value1 slot2 value2 ...) denotes a structure. This is legal only if name is the name of a structure already defined by defstruct and if the structure has a standard constructor macro, which it normally will. Let cm stand for the name of this constructor macro; then this syntax is equivalent to

#.(cm keyword1 'value1 keyword2 'value2 ...)

where each keywordj is the result of computing

(intern (string slotj) 'keyword)

(This computation is made so that one need not write a colon in front of every slot name.) The net effect is that the constructor macro is called with the specified slots having the specified values (note that one does not write quote marks in the #S syntax). Whatever object the constructor macro returns is returned by the #S syntax.
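A minimal sketch of #S follows, assuming a structure named demo-point that is invented for this example. The text is read at run time with read-from-string so that the structure is certain to be defined before the reader meets the #S syntax, and *package* is bound so that the structure name is found regardless of the caller's package.

```lisp
;; DEMO-POINT and READ-DEMO-POINT are made-up names used only for this sketch.
(defstruct demo-point (x 0) (y 0))

(defun read-demo-point ()
  "Read a DEMO-POINT written in #S syntax; slot names need no leading colon."
  (let ((*package* (symbol-package 'demo-point)))
    (values (read-from-string "#S(demo-point x 3 y 4)"))))
```

The object returned satisfies demo-point-p, with 3 in slot x and 4 in slot y.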

#P

X3J13 voted in June 1989 (PATHNAME-PRINT-READ) to define the reader syntax #p"..." to be equivalent to #.(parse-namestring "..."). Presumably this was meant to be taken descriptively and not literally. I would think, for example, that the committee did not wish to quibble over the package in which the name parse-namestring was to be read. Similarly, I would presume that the #p syntax operates normally rather than signaling an error when *read-eval* is false. I interpret the intent of the vote to be that #p reads a following form, which should be a string, that is then converted to a pathname as if by application of the standard function parse-namestring.

#n=

The syntax #n=object reads as whatever Lisp object has object as its printed representation. However, that object is labelled by n, a required unsigned decimal integer, for possible reference by the syntax #n# (below). The scope of the label is the expression being read by the outermost call to read. Within this expression the same label may not appear twice.

#n#

The syntax #n#, where n is a required unsigned decimal integer, serves as a reference to some object labelled by #n=; that is, #n# represents a pointer to the same identical (eq) object labelled by #n=. This permits notation of structures with shared or circular substructure. For example, a structure created in the variable y by this code:

(setq x (list 'p 'q)) 
(setq y (list (list 'a 'b) x 'foo x)) 
(rplacd (last y) (cdr y))

could be represented in this way:

((a b) . #1=(#2=(p q) foo #2# . #1#))

Without this notation, but with *print-length* set to 10, the structure would print in this way:

((a b) (p q) foo (p q) (p q) foo (p q) (p q) foo (p q) ...)

A reference #n# may occur only after a label #n=; forward references are not permitted. In addition, the reference may not appear as the labelled object itself (that is, one may not write #n= #n#), because the object labelled by #n= is not well defined in this case.
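The identity guarantees of #n= and #n# can be tested directly with eq. In this sketch the function name label-syntax-demo is hypothetical; it also shows the use of a label to refer twice to one uninterned symbol written with #:. The results are compared rather than printed, because printing the circular list without *print-circle* might not terminate.

```lisp
;; LABEL-SYNTAX-DEMO is a made-up name used only for this sketch.
(defun label-syntax-demo ()
  (let ((shared   (read-from-string "(#1=(p q) foo #1#)"))
        (circular (read-from-string "((a b) . #1=(#2=(p q) foo #2# . #1#))"))
        (fresh    (read-from-string "(#:g #:g)"))
        (same     (read-from-string "(#1=#:g #1#)")))
    (list (eq (first shared) (third shared))        ; one list, referenced twice
          (eq (cdr circular) (nthcdr 4 circular))   ; the tail loops back to label 1
          (eq (second circular) (fourth circular))  ; both are the list labelled 2
          (eq (first fresh) (second fresh))         ; two distinct uninterned symbols
          (eq (first same) (second same)))))        ; a single uninterned symbol
```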

#+

The #+ syntax provides a read-time conditionalization facility; the syntax is

#+feature form

If feature is "true," then this syntax represents a Lisp object whose printed representation is form. If feature is "false," then this syntax is effectively whitespace; it is as if it did not appear.

The feature should be the printed representation of a symbol or list. If feature is a symbol, then it is true if and only if it is a member of the list that is the value of the global variable *features*.

Compatibility note: MacLisp uses the status special form for this purpose, and Lisp Machine Lisp duplicates status essentially only for the sake of (status features). The use of a variable allows one to bind the features list, when compiling, for example.

Otherwise, feature should be a Boolean expression composed of and, or, and not operators on (recursive) feature expressions.

For example, suppose that in implementation A the features spice and perq are true, and in implementation B the feature lispm is true. Then the expressions on the left below are read the same as those on the right in implementation A:

(cons #+spice "Spice" #+lispm "Lispm" x)      (cons "Spice" x)
(setq a '(1 2 #+perq 43 #+(not perq) 27))     (setq a '(1 2 43))
(let ((a 3) #+(or spice lispm) (b 3))         (let ((a 3) (b 3))
  (foo a))                                      (foo a))
(cons a #+perq #-perq b c)                    (cons a c)

In implementation B, however, they are read in this way:

(cons #+spice "Spice" #+lispm "Lispm" x)      (cons "Lispm" x)
(setq a '(1 2 #+perq 43 #+(not perq) 27))     (setq a '(1 2 27))
(let ((a 3) #+(or spice lispm) (b 3))         (let ((a 3) (b 3))
  (foo a))                                      (foo a))
(cons a #+perq #-perq b c)                    (cons a c)

The #+ construction must be used judiciously if unreadable code is not to result. The user should make a careful choice between read-time conditionalization and run-time conditionalization.

The #+ syntax operates by first reading the feature specification and then skipping over the form if the feature is "false." This skipping of a form is a bit tricky because of the possibility of user-defined macro characters and side effects caused by the #. and #, constructions. It is accomplished by binding the variable *read-suppress* to a non-nil value and then calling the read function. See the description of *read-suppress* for the details of this operation.

X3J13 voted in January 1989 (SHARP-COMMA-CONFUSION) to remove #, from the language.

X3J13 voted in March 1988 (SHARPSIGN-PLUS-MINUS-PACKAGE) to specify that the keyword package is the default package during the reading of a feature specification. Thus #+spice means the same thing as #+:spice, and #+(or spice lispm) means the same thing as #+(or :spice :lispm). Symbols in other packages may be used as feature names, but one must use an explicit package prefix to cite one after #+.

#-

#-feature form is equivalent to #+(not feature) form.
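Because the features list is held in a variable, the behavior of #+ and #- in the hypothetical implementations A and B can be simulated by binding *features* around a call to the reader. The helper name read-with-features is invented for this sketch, and the feature names are given as keywords on the assumption that the implementation follows the X3J13 vote making the keyword package the default for feature specifications.

```lisp
;; READ-WITH-FEATURES is a made-up helper used only for this sketch.
(defun read-with-features (text features)
  "Read TEXT as if FEATURES were the complete features list."
  (let ((*features* features))
    (values (read-from-string text))))
```

For example, reading "(1 2 #+perq 43 #+(not perq) 27)" with the features (:spice :perq) gives (1 2 43), and with (:lispm) gives (1 2 27). Reading "(1 #-perq 2 3)" gives (1 3) in the first case and (1 2 3) in the second.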

#|

#|...|# is treated as a comment by the reader, just as everything from a semicolon to the next newline is treated as a comment. Anything may appear in the comment, except that it must be balanced with respect to other occurrences of #| and |#. Except for this nesting rule, the comment may contain any characters whatsoever.

The main purpose of this construct is to allow "commenting out" of blocks of code or data. The balancing rule allows such blocks to contain pieces already so commented out. In this respect the #|...|# syntax of Common Lisp differs from the /*...*/ comment syntax used by PL/I and C.
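The balancing rule can be seen by reading text that contains a comment nested inside another. The function name nested-comment-demo is hypothetical; the comment characters sit inside a string, so they are interpreted only when the string is handed to the reader.

```lisp
;; NESTED-COMMENT-DEMO is a made-up name used only for this sketch.
(defun nested-comment-demo ()
  (values
   (read-from-string
    "(1 #| outer #| inner |# still inside the outer comment |# 2)")))
```

The inner |# closes only the inner comment, so the result is the list (1 2).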

#<

This is not legal reader syntax. It is conventionally used in the printed representation of objects that cannot be read back in. Attempting to read a #< will cause an error. (More precisely, it is legal syntax, but the macro-character function for #< signals an error.)

The usual convention for printing unreadable data objects is to print some identifying information (the internal machine address of the object, if nothing else) preceded by #< and followed by >.

X3J13 voted in June 1989 (DATA-IO) to add print-unreadable-object, a macro that prints an object using #<...> syntax and also takes care of checking the variable *print-readably*.
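A sketch of the convention follows, assuming an implementation that provides print-unreadable-object. The helper names describe-unreadably and reads-back-p are hypothetical. The exact text between #< and > is implementation-dependent, so the sketch relies only on the prefix.

```lisp
;; DESCRIBE-UNREADABLY and READS-BACK-P are made-up helpers for this sketch.
(defun describe-unreadably (object)
  "Return a #<...> description of OBJECT showing its type and identity."
  (let ((*print-readably* nil))  ; otherwise an error would be signaled
    (with-output-to-string (out)
      (print-unreadable-object (object out :type t :identity t)))))

(defun reads-back-p (text)
  "True if TEXT can be read without signaling an error."
  (handler-case (progn (read-from-string text) t)
    (error () nil)))
```

The string produced by describe-unreadably begins with #<, and passing it to reads-back-p returns false.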

#<space>, #<tab>, #<newline>, #<page>, #<return>

A # followed by a whitespace character is not legal reader syntax. This prevents abbreviated forms produced via *print-level* cutoff from reading in again, as a safeguard against losing information. (More precisely, this is legal syntax, but the macro-character function for it signals an error.)

#)

This is not legal reader syntax. This prevents abbreviated forms produced via *print-level* cutoff from reading in again, as a safeguard against losing information. (More precisely, this is legal syntax, but the macro-character function for it signals an error.)
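The safeguard can be observed by printing a nested list with a small *print-level* and then trying to read the result. The function name print-level-cutoff-demo and the markers :read-ok and :read-error are hypothetical.

```lisp
;; PRINT-LEVEL-CUTOFF-DEMO is a made-up name used only for this sketch.
(defun print-level-cutoff-demo ()
  (let ((text (let ((*print-level* 1)
                    (*print-pretty* nil))
                ;; The inner list is at level 1, so it is abbreviated to #.
                (prin1-to-string '(1 (2 (3)))))))
    (list text
          (handler-case (progn (read-from-string text) :read-ok)
            (error () :read-error)))))
```

The printed text is "(1 #)", and the attempt to read it back signals an error at the #) combination, so the second element of the result is :read-error.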
