| This assignment is due before 9am on 6 Mar 1998. | 
| Follow the instruction on the course web page for submitting your solution. Submit one solution per group. All your source files should be in the same directory. Do not use subdirectories. To submit your solution, you should copy just your source files to the submission directory. In particular, do not carry out your program development in the submission area. | 
| Points will be given for good documentation and style. See the coding style document on the course web page. For this assignment, there'll be a 60-40 split for correctness and style, respectively. | 
| See the course web page on collaboration and late submission policies. Make sure your working directory is private and cannot be read by others! | 
if else while), delimiters (such as 
         + - * / = ; { >= ), identifiers (such as  x foo 
        nStudents), and integers (such as 8 11 0 178123222432).
	Tokens form the terminal symbols of the context free grammar for the language.
      BigInteger in java.math
    to support such objects.)
    
    
    The basic lexical units or lexemes in SL are the following strings:
    if else while break read write function declare return
      ([a-z] | [A-Z]) ([a-z] | [A-Z] | [0-9])*[0-9][0-9]*
        +  -  *  /  %    <  >  !=  ==  >=  <=    =  
	(others)   {  }  (  )  ;  , 
      re turn is different from return,
    and 123 456 is two integer strings, not one.)  Also, one or more
    whitespaces are required between any adjacent pair of keywords, identifiers,
    and integers.  (Otherwise return 23 would be indistinguishable from
    return23.)
    
    
    This brief description of SL doesn't describe the syntax (how to write
    valid programs) or semantics (what a program actually does) of SL, but
    only the lexicon (the valid lexeme strings) of SL.  However, this is all
    you need to write a lexer for SL.
    
        Token getToken()null).
    
    
    The input stream might contain data that cannot be recognized as any lexeme.
    For example, a quote character is illegal input (there is no quote symbol in SL).
    Likewise, the "lexeme" 123a is illegal since it is not any of the
    valid tokens.  Upon any such error, the getToken() method should
    print an error message giving the line number where the error occurred, and the
    offending lexeme or substring itself.  It should then immediately exit, ending
    program execution.
    xyzzy is encountered one or more times in the input, the symbol table
    should contain exactly one entry for it.  Likewise, if the input contains the
    integer string 1234567890 (perhaps several occurrences), there should
    be one entry for it in the table.  Each entry in the symbol table should also
    contain some means of identifying the associated token type.
    
    This class should include the following public methods:
    insertSymbol(lexeme), where lexeme is some string type.
	(Remember there are many ways to build strings in Java.  Pick an appropriate
	one.)
      lookupSymbol(lexeme), which looks for the given lexeme in the
	table, and returns the table entry if found, and some other indication if not.
      printSymbolTable(), which prints the entire symbol table in some
	readable format.
      lookupSymbol() method should be implemented efficiently.  It should not
    simply search through the entire set of entries.  (Hint: Use a hash table.)
    
        main method for this assignment.  It is just
    a wrapper for exercising the lexer, building the symbol table, and producing
    test output.  It will not be used in future assignments.
    
    In addition to whatever initialization is needed for the Lexer and Token classes,
    the symbol table should be created inside the main method.  The
    main method should then initialize it with entries for all the
    reserved keywords and operator symbols in the language.  It should then repeatedly
    invoke the getToken() method to obtain the next token in the input,
    and insert the associated lexeme in the SymbolTable if not already
    there.  In addition, it should print the lexeme information as described in the
    following section.
    
    main method in class Proj2 should print the following
    information.  For each token returned when it invokes getToken, it
    should print one line containing two strings, the lexeme and the token.
    
    
    To illustrate, let's say the file sample.sl contains the following:
    
	{
	    while (i < 0)))) {i = anIdent234 12
	    if i22!=0 return 4;
        }
    
    Then, after compiling your code into Proj2.class and
    typing cat sample.sl | java Proj2, you should get the following
    output: 
    
	{		{
	while		while
	(		(
	i		Ident
	<		<
	0		Number
	)		)
	)		)
	)		)
	)		)
	{		{
	i		Ident
	=		=
	anIdent234	Ident
	12		Number
	if		if
	i22		Ident
	!=		!=
	0		Number
	return		return
	4		Number
	;		;
	}		}
    
    (We have deliberately chosen the above example input to emphasize that the
    lexical analyzer is concerned merely with separating the tokens, not with
    syntax checking.)
    
    
    Some programs are not lexically correct. For example, since SL doesn't
    allow string literals, the following SL code is lexically incorrect:
    
	{
	    while (i < 0)))) {i = anIdent234 12
	    if i22!=0 return "foobar";
        }
    
    As mentioned earlier your lexical analyzer should, if it detects a lexically
    illegal input, print out an error message.  This should include the source line
    number containing the error, and the erroneous text.  It should then halt execution
    immediately.