                  awklisp: a Lisp interpreter in awk
                             version 1.2

                             Darius Bacon
                         darius@accesscom.com
                  http://www.accesscom.com/~darius/


1. Usage

    mawk [-v profiling=1] -f awklisp <optional-Lisp-source-files>

The  -v profiling=1  option turns call-count profiling on.

If you want to use it interactively, be sure to include '-' (for the standard
input) among the source files.  For example:

    mawk -f awklisp startup -

It should work with nawk and gawk, too, but even less quickly.


2. Overview

This program arose out of one-upmanship.  At my previous job I had to
use MapBasic, an interpreter so astoundingly slow (around 100 times
slower than GWBASIC) that one must wonder if it itself is implemented
in an interpreted language.  I still wonder, but it clearly could be:
a bare-bones Lisp in awk, hacked up in a few hours, ran substantially
faster.  Since then I've added features and polish, in the hope of
taking over the burgeoning market for stately language
implementations.

This version tries to deal with as many of the essential issues in
interpreter implementation as is reasonable in awk (though most would
call this program utterly unreasonable from start to finish, perhaps...).
Awk's impoverished control structures put error recovery and tail-call
optimization out of reach, in that I can't see a non-painful way to code
them.  The scope of variables is dynamic because that was easier to
implement efficiently.  Subject to all those constraints, the language
is as Schemely as I could make it: it has a single namespace with
uniform evaluation of expressions in the function and argument positions,
and the Scheme names for primitives and special forms.

The rest of this file is a reference manual.  My favorite tutorial would be
_The Little LISPer_ (see section 5, References); don't let the cute name
and the cartoons turn you off, because it's a really excellent book with
some mind-stretching material towards the end.  All of its code will work
with awklisp, except for the last two chapters.  (You'd be better off
learning with a serious Lisp implementation, of course.)

The file Impl-notes in this distribution gives an overview of the
implementation.


3. Expressions and their evaluation

Lisp evaluates expressions, which can be simple (atoms) or compound (lists).

An atom is a string of characters, which can be letters, digits, and most
punctuation; the characters may -not- include spaces, quotes, parentheses,
brackets, '.', '#', or ';' (the comment character).  In this Lisp, case is
significant ( X  is different from  x ).

Atoms:          atom 42 1/137 + ok? hey:names-with-dashes-are-easy-to-read
Not atoms:      don't-include-quotes    (or spaces or parentheses)

A list is a '(', followed by zero or more objects (each of which is an atom
or a list), followed by a ')'.

Lists:          ()   (a list of atoms)  ((a list) of atoms (and lists))
Not lists:      )    ((())              (two) (lists)

The special object  nil  is both an atom and the empty list.  That is,
nil = ().  A non-nil list is called a -pair-, because it is represented by a
pair of pointers, one to the first element of the list (its -car-), and one to
the rest of the list (its -cdr-).  For example, the car of ((a list) of stuff)
is (a list), and the cdr is (of stuff).  It's also possible to have a pair
whose cdr is not a list; the pair with car A and cdr B is printed as (A . B).
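
As a small illustration of pairs, the expressions below use only the
primitives car, cdr, and cons (listed in section 4).  This is a sketch based
on the description above, not a transcript of a session; the values in the
comments are what that description implies.

```scheme
(car '((a list) of stuff))   ; => (a list)
(cdr '((a list) of stuff))   ; => (of stuff)
(cons 'a '(b c))             ; => (a b c), a pair whose cdr is a list
(cons 'a nil)                ; => (a), since nil is the empty list
(cons 'a 'b)                 ; => (a . b), a pair whose cdr is not a list
```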

That's the syntax of programs and data.  Now let's consider their meaning.  You
can use Lisp like a calculator: type in an expression, and Lisp prints its
value.  If you type 25, it prints 25.  If you type (+ 2 2), it prints 4.  In
general, Lisp evaluates a particular expression in a particular environment
(set of variable bindings) by following this algorithm:

If the expression is a number, return that number.

If the expression is a non-numeric atom (a -symbol-), return the value of that
symbol in the current environment.  If the symbol is currently unbound, that's
an error.

Otherwise the expression is a list.  If its car is one of the symbols: quote,
lambda, if, begin, while, set!, or define, then the expression is a -special
form-, handled by special rules.  Otherwise it's just a procedure call,
handled like this: evaluate each element of the list in the current environment,
and then apply the operator (the value of the car) to the operands (the values
of the rest of the list's elements).  For example, to evaluate (+ 2 3), we
first evaluate each of its subexpressions: the value of + is (at least in the
initial environment) the primitive procedure that adds, the value of 2 is 2,
and the value of 3 is 3.  Then we call the addition procedure with 2 and 3 as
arguments, yielding 5.  For another example, take (- (+ 2 3) 1).  Evaluating
each subexpression gives the subtraction procedure, 5, and 1.  Applying the
procedure to the arguments gives 4.
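
Because the car of a procedure call is evaluated like any other element, the
operator can itself be a compound expression.  A sketch, assuming the initial
environment, in which + and - still name the arithmetic primitives:

```scheme
(- (+ 2 3) 1)         ; => 4, the example worked through above
((if nil + -) 5 1)    ; the operator expression evaluates to the subtraction
                      ; procedure, so this should also give 5 - 1 => 4
```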

We'll see all the primitive procedures in the next section.  A user-defined
procedure is represented as a list of the form (lambda <parameters> <body>),
such as (lambda (x) (+ x 1)).  To apply such a procedure, evaluate its body
in the environment obtained by extending the current environment so that the
parameters are bound to the corresponding arguments.  Thus, to apply the above
procedure to the argument 41, evaluate (+ x 1) in the same environment as the
current one except that x is bound to 41.
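
To make the application rule concrete, here is a short sketch.  The names
show-x and call-with-x are invented for this illustration.  The second half
shows what the rule implies for a variable that is free in a procedure body,
given that scope is dynamic (see section 2).

```scheme
((lambda (x) (+ x 1)) 41)      ; body evaluated with x bound to 41 => 42

(define x 1)
(define show-x (lambda () x))                ; x is free in this body
(define call-with-x (lambda (x) (show-x)))
(call-with-x 2)   ; => 2: show-x runs in the caller's extended environment,
                  ;    where x is 2 (a lexically scoped Scheme would give 1)
x                 ; => 1 again once the call has returned
```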

If the procedure's body has more than one expression -- e.g.,
(lambda () (write 'Hello) (write 'world!)) -- evaluate them each in turn, and
return the value of the last one.

We still need the rules for special forms.  They are:

The value of (quote <x>) is <x>.  There's a shorthand for this form: '<x>.
E.g., the value of '(+ 2 2) is (+ 2 2), -not- 4.

(lambda <parameters> <body>) returns itself: e.g., the value of (lambda (x) x)
is (lambda (x) x).

To evaluate (if <test-expr> <then-expr> <else-expr>), first evaluate
<test-expr>.  If the value is true (non-nil), then return the value of
<then-expr>, otherwise return the value of <else-expr>.  (<else-expr> is
optional; if it's left out, pretend there's a  nil  there.)  Example:
(if nil 'yes 'no) returns no.
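
A few expressions illustrating quote and if, as a sketch that follows the
rules just given:

```scheme
'(+ 2 2)                  ; => (+ 2 2), the same as (quote (+ 2 2))
(if (< 1 2) 'yes 'no)     ; => yes
(if nil 'yes)             ; => nil, because the missing else branch acts as nil
(if '(+ 2 2) 'yes 'no)    ; => yes, since any non-nil value counts as true
```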

To evaluate (begin <expr-1> <expr-2>...), evaluate each of the subexpressions
in order, returning the value of the last one.

To evaluate (while <test> <expr-1> <expr-2>...), first evaluate <test>.  If
it's nil, return nil.  Otherwise, evaluate <expr-1>, <expr-2>,... in order,
and then repeat.

To evaluate (set! <variable> <expr>), evaluate <expr>, and then set the value
of <variable> in the current environment to the result.  If the variable is
currently unbound, that's an error.  The value of the whole set! expression
is the value of <expr>.

(define <variable> <expr>) is like set!, except it's used to introduce new
bindings, and the value returned is <variable>.
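
The loop and assignment forms combine as in the sketch below, which sums the
integers 1 through 5.  The variable names count and total are invented for
the example.

```scheme
(define count 0)                   ; define returns the name: count
(define total 0)
(while (< count 5)                 ; the test is re-evaluated on every pass
  (set! count (+ count 1))
  (set! total (+ total count)))    ; the while form itself returns nil
total                              ; => 15
```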

It's possible to define new special forms using the macro facility provided in
the startup file.  The macros defined there are:

(let ((<var> <expr>)...)
  <body>...)

  Bind each <var> to its corresponding <expr> (evaluated in the current
  environment), and evaluate <body> in the resulting environment.

(cond (<test-expr> <result-expr>...)... (else <result-expr>...))

  where the final  else  clause is optional.  Evaluate each <test-expr> in
  turn, and for the first non-nil result, evaluate its <result-expr>.  If
  none are non-nil, and there's no  else  clause, return nil.

(and <expr>...)

  Evaluate each <expr> in order, until one returns nil; then return nil.
  If none are nil, return the value of the last <expr>.

(or <expr>...)

  Evaluate each <expr> in order, until one returns non-nil; return that value.
  If all are nil, return nil.
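
The macros can be tried out as follows.  This sketch assumes the startup file
has been loaded (as in the usage example in section 1); the name sign is made
up for the illustration.

```scheme
(let ((x 2) (y 3))
  (* x y))                         ; => 6

(define sign
  (lambda (n)
    (cond ((< n 0) 'negative)
          ((< 0 n) 'positive)
          (else 'zero))))
(sign (- 0 5))                     ; => negative
(sign 0)                           ; => zero

(and 1 2 3)                        ; => 3, the value of the last <expr>
(and 1 nil 3)                      ; => nil
(or nil 2 3)                       ; => 2, the first non-nil value
```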


4. Built-in procedures

List operations:
(null? <x>) returns true (non-nil) when <x> is nil.
(atom? <x>) returns true when <x> is an atom.
(pair? <x>) returns true when <x> is a pair.
(car <pair>) returns the car of <pair>.
(cdr <pair>) returns the cdr of <pair>.
(cadr <pair>) returns the car of the cdr of <pair> (i.e., the second element).
(cddr <pair>) returns the cdr of the cdr of <pair>.
(cons <x> <y>) returns a new pair whose car is <x> and whose cdr is <y>.
(list <x>...) returns a list of its arguments.
(set-car! <pair> <x>) changes the car of <pair> to <x>.
(set-cdr! <pair> <x>) changes the cdr of <pair> to <x>.
(reverse! <list>) reverses <list> in place, returning the result.
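
A sketch of the list primitives in use; xs and ys are names invented here.
Since reverse! works in place, the code keeps the returned value rather than
relying on the old variable afterwards.

```scheme
(define xs (list 1 2 3))
(null? xs)                  ; => nil
(pair? xs)                  ; => true (non-nil)
(cadr xs)                   ; => 2
(cddr xs)                   ; => (3)
(set-car! xs 'one)
xs                          ; => (one 2 3)
(define ys (reverse! xs))
ys                          ; => (3 2 one)
```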

Numbers:
(number? <x>) returns true when <x> is a number.
(+ <n1> <n2>) returns the sum of its arguments.
(- <n1> <n2>) returns the difference, <n1> minus <n2>.
(* <n1> <n2>) returns the product of its arguments.
(quotient <n1> <n2>) returns the quotient of <n1> divided by <n2>, rounded
    towards zero.
(remainder <n1> <n2>) returns the remainder of that division.
(< <n1> <n2>) returns true when <n1> is less than <n2>.
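
The arithmetic procedures are listed with exactly two arguments, so a longer
sum is written by nesting calls.  A sketch; greater? is a name made up for
this example, showing one way to build another comparison from < .

```scheme
(+ 1 (+ 2 3))               ; => 6
(quotient 7 2)              ; => 3
(quotient (- 0 7) 2)        ; => -3, rounding towards zero rather than down
(remainder 7 2)             ; => 1

(define greater? (lambda (a b) (< b a)))
(greater? 3 2)              ; => true (non-nil)
```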

I/O:
(write <x>) writes <x> followed by a space.
(newline) writes the newline character.
(read) reads the next expression from standard input and returns it.
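
A sketch of the output procedures, reusing the two-expression body from
section 3; greet is a name invented for the example.

```scheme
(define greet
  (lambda ()
    (write 'Hello)      ; each write is followed by a space
    (write 'world!)
    (newline)))
(greet)                 ; prints: Hello world! and then ends the line
```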

Meta-operations:
(eval <x>) evaluates <x> in the current environment, returning the result.
(apply <proc> <list>) calls <proc> with arguments <list>, returning the result.
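
A sketch of eval and apply, assuming the initial environment so that + and *
name the arithmetic primitives:

```scheme
(eval '(+ 2 2))                            ; => 4
(eval (list '* 6 7))                       ; => 42, expression built at run time
(apply + '(2 3))                           ; => 5
(apply (lambda (x y) (- x y)) (list 5 1))  ; => 4
```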

Miscellany:
(eq? <x> <y>) returns true when <x> and <y> are the same object.  Be careful
    using eq? with lists, because (eq? (cons <x> <y>) (cons <x> <y>)) is false.
(put <x> <y> <z>) records <z> as the value for <x> and <y>.
(get <x> <y>) returns the last value <z> that was put for <x> and <y>, or nil
    if there is no such value.
(symbol? <x>) returns true when <x> is a symbol.
(gensym) returns a new symbol distinct from all symbols that can be read.
(random <n>) returns a random integer between 0 and <n>-1, inclusive (if <n>
    is positive).
(error <x>...) writes its arguments and aborts with error code 1.
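
A sketch of eq?, put, and get; the symbols apple, color, and weight are
arbitrary names chosen for the example.

```scheme
(eq? 'a 'a)                      ; => true (non-nil): the same symbol
(eq? (cons 1 2) (cons 1 2))      ; => nil: two distinct pairs
(put 'apple 'color 'red)
(get 'apple 'color)              ; => red
(get 'apple 'weight)             ; => nil, nothing was put for this key
(symbol? 'apple)                 ; => true (non-nil)
(symbol? 42)                     ; => nil
```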


5. References

Harold Abelson and Gerald J. Sussman, with Julie Sussman.
  Structure and Interpretation of Computer Programs.  MIT Press, 1985.

John Allen.  Anatomy of Lisp.  McGraw-Hill, 1978.

Daniel P. Friedman and Matthias Felleisen.  The Little LISPer.  Macmillan, 1989.

Roger Rohrbach wrote a Lisp interpreter in old awk (which has no
procedures!), called walk.  It can't do as much as this Lisp, but it
certainly has greater hack value.  Cooler name, too.  It's available at
http://www-2.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/lisp/impl/awk/0.html


6. Bugs

Eval doesn't check the syntax of expressions.  This is a probably-misguided
attempt to bump up the speed a bit, which also simplifies some of the code.
The macroexpander in the startup file would be the best place to add
syntax-checking.