%* NEWIO.S
%************************************************************************
%*									*
%*		PC Scheme/Geneva 4.00 Scheme-WEB code			*
%*									*
%* (c) 1985-1988 by Texas Instruments, Inc. See COPYRIGHT.TXT		*
%* (c) 1992 by L. Bartholdi & M. Vuilleumier, University of Geneva	*
%*									*
%*----------------------------------------------------------------------*
%*									*
%*			I/O written in Scheme				*
%*									*
%*----------------------------------------------------------------------*
%*									*
%* Created by: Larry Bartholdi		Date: 1985			*
%* Revision history:							*
%* - 18 Jun 92:	Renaissance (Borland Compilers, ...)			*
%*									*
%*					``In nomine omnipotentii dei''	*
%************************************************************************

\documentstyle[a4,astyped]{article}
\setlength{\oddsidemargin}{0cm}
\setlength{\evensidemargin}{0cm}
\setlength{\topmargin}{0cm}
\setlength{\textheight}{22cm}
\setlength{\textwidth}{16cm}
\newcommand{\scheme}{{\sc GeScheme}}

\title{New Input/Output Routines in \scheme}
\author{Larry Bartholdi}
\date{\today}

\begin{document}
\maketitle

\begin{abstract}
We describe Input/Output routines written in Scheme. Programming
I/O in a high-level language has the advantage of speedy
development, high portablility and safety, as well as being useful
for hammering up a final lower-level version. Nonetheless, performance
of the Scheme routines is shown to be acceptable.
\end{abstract}

\section{Introduction}
This module contains definitions for user-level scheme input/output
routines. The first versions of \scheme\ (in its golden Texas days)
contained I/O routines written in Scheme and in C. For efficiency and
job security purposes, the better of this code has been translated
to assembly language.

Now that portability and stability have outweighted the clock-cycle
counting habits we all had some time ago, it became justified to
develop, at least for prototyping reasons, a purely Scheme-based I/O
library. Only then, with full knowledge of the pros and cons of different
approaches, could it be decided to translate {\em minor} parts of the
code to another more efficient language.

|Read| has traditionally always been programmed in Scheme. It returns
the next item available in the input, be it an atom or a list (maybe
itself containing lists). It calls |Read-atom| that returns the
next atom in the input if a valid atom has been entered;
or a singleton list containing the special syntactic
element entered in case of `|.|', `|'|', etc.

Output is basically made of two routines: |write| and |display|.
|Write| is intended to human interaction and will thus produce the
most compact and elegant representation of an item; a representation
that carries meaning directed towards human intellect, with the
minimum syntactic frosting.
|Display|, on the contrary, is required to produce an ASCII representation
of an item such that it can be re-read by |Read|.

Besides these high-class routines, the library includes |read-line| that
reads a full line from the input, |peek-char| that returns the next
character without removing it from the buffer (of |#F| if no caracter is
available), |unread-char| that reinserts the character just read into
the buffer, and so on. These procedures are not well-liked because they
imply the programmer is not using the full features of an interpreted
language with a powerful command-line shell.

\section{Header}
The purpose of the header is to create an environment that will contain the
new bindings. The exact code will be set later on, inside a |let|, so that
local objects common to all routines can be defined.

(begin
  (define @write)
  (define @display)
  (define @newline)
  (define @write-char)

  (define @read)
  (define @read-atom)
  (define @eof-object?)
  (define @char-ready?)
  (define @peek-char)
  (define @read-char)
  (define @read-line)
  (define @unread-char)

  (define @number->string)
  (define @integer->string)
)

\section{Helpers}
Among the routines coded in low-level (inside the microcode) are
character input and output, string output and a linkage to the
C formatting functions |sprintf| and |sscanf|\footnote{Consult
your local guru on |\%escape| interfacing}.

(define (sprintf . args)
  (%execute (compile `(%esc 39 ,@args))))

(define (sscanf string template)
  (%esc 40 string template))

(define %putchar display)
(define %putstring display)

\section{The bulk of it}
This huge |letrec| contains a few procedures of global interest,
and code for each of the interface routines described above.

|Quotify| surrounds its string argument with the separator (a character),
replacing all occurences of `|\|' by `|\\|' and all occurences of the
separator itself by `|\|{\em separator}'.

|Specchars| is an association list giving the printed representation
of a few special characters. It is used by |write|.

|Strange?| returns whether its argument requires quotification. Strings
are always quotified, but atoms will be quotified only if they contain
weird characters, are `|.|', or generally speaking would not evaluate
to a symbol if re-read unquoted.

|Put-list| prints the tail of a list (the heading `|(|' is printed by the
main routine). Note that this code does not check whether the list
is circular.

|Delimby| is used for input and can be understood as reversing the action
of a |Quotify|: it reads characters until it reaches a specified separator.

|Get-symbol| will return the next symbol from the input (unquoted:
surrounded neither by `\verb+"+' nor by `\verb+||+')

|Get-number| and |Get-number-prefixed| help reading a number prefixed
by `|#|': |Get-number| is called before the `|#|' occured, but when we
already know we're dealing with a number. |Get-number-prefixed| is
called when the sharp has been read, and we want to interpret the following
character. In this version, inexact and exact numbers are handled likewise.

{\footnotesize
(letrec ((quotify (lambda (string separator)
		    (define (s l)
		      (cond ((null? l) (list separator))
			    ((or (equal? (car l) separator)
				 (equal? (car l) #\\))
			     (cons #\\ (cons (car l) (s (cdr l)))))
			    (else (cons (car l) (s (cdr l))))))
		    (list->string (cons separator (s (string->list string))))))
	 (specchars '((#\space . "#\\SPACE")
		      (#\escape . "#\\ESCAPE")
		      (#\tab . "#\\TAB")
		      (#\newline . "#\\NEWLINE")
		      (#\page . "#\\PAGE")
		      (#\return . "#\\RETURN")))
	 (strange? (lambda (s)
		     (or (string->number s)
			 (string-null? s)
			 (equal? s ".")
			 (equal? (substring (string-append s "  ") 0 2) "#\\")
			 (do ((l (string->list s) (cdr l)))
			     ((or (null? l)
				  (not (char=? (car l) (char-upcase (car l))))
				  (member (car l) (list* #\| #\, #\' #\; #\"
							 #\: #\( #\) #\` #\\
							 (map car specchars))))
			      (not (null? l)))))))
	 (put-list (lambda (l printer output)
		     (cond ((null? l) (%putchar #\) output))
			   ((atom? l) (%putstring " . " output)
			    (printer l output) (%putchar #\) output))
			   (else (printer (car l) output)
				 (if (pair? (cdr l)) (%putchar #\space output))
				 (put-list (cdr l) printer output)))))

	 (get-char (lambda (input)
		     (let ((ch (@read-char input)))
		       (cond ((char-whitespace?	ch) (get-char input))
			     ((equal? ch #\;) (begin (@read-line input)
						     (get-char input)))
			     (else ch)))))
	 (delimby (lambda (ch input)
		    (let ((in (@read-char input)))
		      (cond ((@eof-object? in) (error "Unmatched " ch))
			    ((equal? in #\\)
                             (let ((next (@read-char input)))
			       (if (@eof-object? next)
				   (error "Escaped void")
				   (cons next (delimby ch input)))))
			    ((equal? ch in) ())
			    (else (cons in (delimby ch input)))))))
	 (list->symbol (lambda (l)
			 (string->symbol (list->string l))))
	 (get-symbol (lambda (input)
		       (let ((first (@read-char input)))
			 (cond ((@eof-object? first) '())
			       ((char-whitespace? first) '())
			       ((member first '(#\( #\) #\' #\` #\;
                                                #\, #\" #\| #\[ #\]
                                                #\{ #\}))
                                (@unread-char input) '())
			       (else (cons (char-upcase first)
					   (get-symbol input)))))))
	 (get-number (lambda (base input)
		       (let ((first (@read-char input)))
			 (if (equal? first #\#)
			     (get-number-prefixed base input)
			     (let* ((atom (list->string
					    (cons first (get-symbol input))))
				    (number (string->number atom base)))
			       (if (number? number)
				   number
				   (error "Number expected" atom)))))))
	 (get-number-prefixed (lambda (base input)
				(let ((next (char-upcase (@read-char input))))
				  (case next
				    ((#\B) (get-number 2 input))
				    ((#\O) (get-number 8 input))
				    ((#\D) (get-number 10 input))
				    ((#\X) (get-number 16 input))
				    ((#\E #\I) (get-number base input))
				    (else (error "Illegal use of #" next))))))
	 )
  (set! @write
	(lambda (datum . port)
	  (let ((output (cond ((null? port) (current-output-port))
			      ((port? (car port)) (car port))
			      (else (error "Port expected" output)))))
	    (cond ((number? datum) (%putstring (number->string datum) output))
		  ((pair? datum)
		   (%putchar "(" output)
		   (put-list datum @write output))
		  ((null? datum) (%putstring "()" output))
		  ((string? datum) (%putstring (quotify datum #\") output))
		  ((symbol? datum) (let ((datum (symbol->string datum)))
				     (%putstring (if (strange? datum)
						     (quotify datum #\|)
						     datum)
						 output)))
		  ((char? datum)
                   (if (assoc datum specchars)
		       (%putstring (cdr (assoc datum specchars)) output)
		       (%putstring (string #\# #\\ datum) output)))
		  ((vector? datum)
		   (%putstring "#(" output)
		   (do ((i 0 (1+ i)))
		       ((= i (vector-length datum)) (%putstring ")" output))
		       (@write (vector-ref datum i) output)
		       (if (< i (-1+ (vector-length datum)))
			   (%putstring " " output))))
		  )
	    *the-non-printing-object*)))
  (set! @display
	(lambda (datum . port)
	  (let ((output (if (null? port) (current-output-port) (car port))))
	    (cond ((number? datum) (%putstring (number->string datum) output))
		  ((pair? datum) (%putchar #\( output)
		   (put-list datum @display output))
		  ((null? datum) (%putstring "()" output))
		  ((string? datum) (%putstring datum output))
		  ((symbol? datum) (%putstring (symbol->string datum) output))
		  ((char? datum) (%putchar datum output))
		  ((vector? datum) (%putstring "#(" output)
		   (do ((i 0 (1+ i)))
		       ((= i (vector-length datum)) (%putstring ")" output))
		       (@display (vector-ref datum i))
		       (if (< i (-1+ (vector-length datum)))
			   (%putstring " "))))
		  )
	    *the-non-printing-object*)))
  (set! @newline
	(lambda port
	  (let ((output (if (null? port) (current-output-port) (car port))))
	    (%putchar #\newline output)
	    *the-non-printing-object*)))
  (set! @write-char
	(lambda (char . port)
	  (let ((output (if (null? port) (current-output-port) (car port))))
	    (if (char? char)
		(@write char output)
		(error '@WRITE-CHAR "Argument must be char" char))
	    *the-non-printing-object*)))

  (set! @read read)
  (set! @read-atom
	(lambda port
	  (let* ((input (if (null? port) (current-input-port) (car port)))
		 (first (get-char input)))
	    (case first
	      ((#\[ #\] #\{ #\} #\( #\) #\' #\`)
	       (list (list->symbol (list first))))
	      ((#\") (list->string (delimby first input)))
	      ((#\|) (list->symbol (delimby first input)))
	      ((#\,) (list (list->symbol
			     (cons first 
				   (if (member (@peek-char input) '(#\@ #\.))
				       (list (@read-char input))
				       ())))))
	      ((#\#) (let ((next (char-upcase (@read-char input))))
		       (case next
			 ((#\T) #T)
			 ((#\F) #F)
			 ((#\() '(|#(|))
			 ((#\\)
			  (let* ((third (char-upcase (@read-char input)))
				 (atom (list->string (list* first next third
							    (get-symbol input))))
				 (char (assoc atom
					      (map (lambda (l)
						     (cons (cdr l) (car l)))
						   specchars))))
			    (if (null? char)
				(if (= (string-length atom) 3)
				    third
				    (error "Illegal character constant" atom))
				(cdr char))))
			 ((#\!) (list->symbol (list* first next
						     (get-symbol input))))
			 (else (@unread-char input)
			       (get-number-prefixed 10 input)))))
	      (else (if (@eof-object? first)
			first
			(let* ((atom (list->string
				       (cons (char-upcase first)
					     (get-symbol input))))
			       (number (string->number atom)))
			  (cond ((number? number) number)
				((equal? atom ".") '(|.|))
				(else (string->symbol atom))))))))))
  (set! @eof-object? eof-object?)
  (set! @char-ready? char-ready?)
  (set! @peek-char peek-char)
  (set! @read-char read-char)
  (set! @read-line
	(lambda port
	  (let ((input (if (null? port) (current-input-port) (car port))))
	    (do ((l '() (cons char l))
		 (char (@read-char input) (@read-char input)))
		((or (equal? char #\NEWLINE)
		     (equal? char #\RETURN)
		     (@eof-object? char))
		 (if (and (null? l) (@eof-object? char))
		     char
		     (list->string (reverse! l))))))))
  (set! @unread-char unread-char)
)
}

\section{On Numbers and Strings}
Here we devise some auxiliary code that will convert a number to a string
and vice versa. These two routines are |number->string| and |string->number|.
The standard requires that |(string->number| |(number->string n))| be |n|
for all numerical |n|, but the reverse composition does not have the same
property.

The code handles in scheme the special case of integers, as they can only
be dealt with through |bignum| operations, which are not appropriately
callable from the C kernel. Floating-point conversion, though, is done
by calls to |sprintf| and |sscanf|. These has the advantage of being
highly standard, as well as requiring little effort from myself\ldots

{\footnotesize
(letrec ((0-hexa (char->integer #\0))
	 (10-hexa (- (char->integer #\A) 10))
	 (get-base (lambda (args)
		     (if (null? (cdr args))
			 (case (car args)
			   (() 10)
			   ((2 8 10 16) (car args))
			   (else (error "Invalid base" (car args))))
			 (error "Gimme 2 args, please" args))))
	 (integer->string
	   (lambda (num base)
	     (define (integer->list num)
	       (do ((num num (quotient num base))
		    (result '() (let ((digit (remainder num base)))
				  (cons (integer->char (+ digit (if (> digit 9)
								    10-hexa
								    0-hexa)))
					result))))
		   ((zero? num) result)))
	     (cond ((negative? num)
		    (string-append "-" (integer->string (- num) base)))
		   ((zero? num) "0")
		   (else (list->string (integer->list num))))))
	 (list->integer
	   (lambda (lyst base)
	     (call-with-current-continuation
	       (lambda (cont)
		 (let ((negative (equal? (car lyst) #\-)))
		   (do ((lyst (if negative (cdr lyst) lyst) (cdr lyst))
			(result 0 (+ (* result base)
				     (let* ((c (char-upcase (car lyst)))
					    (head (cond ((char>=? c #\A)
							 (- (char->integer c)
							    10-hexa))
							((char>=? c #\0)
							 (- (char->integer c)
							    0-hexa))
							(else (cont #F)))))
				       (if (< head base)
					   head
					   (cont #F))))))
		       ((null? lyst) (if negative (- result) result))))))))
	 )
  (set! number->string
	(lambda (num . args)
	  (let ((base (get-base args)))
	    (cond ((integer? num) (integer->string num base))
		  ((number? num) (if (= base 10)
				     (sprintf "%.16g" num)
				     (error "Invalid base" base)))
		  (else (error "Argument must be number" num))))))
  (set! string->number
	(lambda (str . args)
	  (let* ((base (get-base args))
		 (first-shot (list->integer (string->list str) base)))
	    (if (and (integer? first-shot) (not (equal? str "-")))
		first-shot
		(if (= base 10)
		    (let ((second-shot (sscanf str "%lg%c")))
		      (if (= (length second-shot) 1)
			  (car second-shot)
			  #F))
		    #F)))))
)
}

\section{Testing}
An important step in development was to be sure the replacement routines
were in all ways similar to the previous ones. For this reason, this piece
of code was written. One hands it two procedures (a reader and a writer)
and two file names (an input and an output). The tester will repeatedly
read an item with the reader and write it back with the writer.
|read-atom|, for instance, was tested on itself with\\
\begin{verbatim}
        (test read-atom write "sources.s/newio.sw" "old")
        (test @read-atom write "sources.s/newio.sw" "new")
        (dos-call "" "fc old new || more")
\end{verbatim}

(define (test reader writer in out)
  (define (move i o)
    (let ((a (reader i)))
      (when (not (@eof-object? a))
	    (writer a o)
	    (move i o))))
  (let ((i (open-input-file in))
	(o (open-output-file out)))
    (move i o)
    (close-input-port i)
    (close-output-port o)))

\section{Is it worth it?}
No definite answer can be given to this question. The algorithms
decribed have the same asymptotic behaviour as the previous |.ASM| code.
They do, though, require significant amounts of temporary storage
and thus strain somewhat the garbage collector. Performance could be
enhanced only at the price of re-coding significant amounts of code,
losing the advantages of these small, highly modular routines.

We wish to stress this document, with all its \LaTeX\ text, is less
than 500 lines long. The assembly routines used previously wound well
above the 1500 lines. If writing 10 lines of assembly removes 10
neurons from a programmer's brain, how many neurons are lost by a
10-person team writing these 1500 lines?
\end{document}

