239 lines
		
	
	
		
			10 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
		
		
			
		
	
	
			239 lines
		
	
	
		
			10 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
|  | 		  awklisp: a Lisp interpreter in awk | ||
|  | 			     version 1.2 | ||
|  | 
 | ||
|  | 			     Darius Bacon | ||
|  | 			 darius@accesscom.com | ||
|  | 		  http://www.accesscom.com/~darius/ | ||
|  | 
 | ||
|  | 
 | ||
|  | 1. Usage | ||
|  | 
 | ||
|  |     mawk [-v profiling=1] -f awklisp <optional-Lisp-source-files> | ||
|  | 
 | ||
|  | The  -v profiling=1  option turns call-count profiling on. | ||
|  | 
 | ||
|  | If you want to use it interactively, be sure to include '-' (for the standard  | ||
|  | input) among the source files.  For example: | ||
|  | 
 | ||
|  |     mawk -f awklisp startup - | ||
|  | 
 | ||
|  | It should work with nawk and gawk, too, but even less quickly. | ||
|  |      | ||
|  | 
 | ||
|  | 2. Overview | ||
|  | 
 | ||
|  | This program arose out of one-upmanship.  At my previous job I had to | ||
|  | use MapBasic, an interpreter so astoundingly slow (around 100 times | ||
|  | slower than GWBASIC) that one must wonder if it itself is implemented | ||
|  | in an interpreted language.  I still wonder, but it clearly could be: | ||
|  | a bare-bones Lisp in awk, hacked up in a few hours, ran substantially | ||
|  | faster.  Since then I've added features and polish, in the hope of | ||
|  | taking over the burgeoning market for stately language | ||
|  | implementations. | ||
|  | 
 | ||
|  | This version tries to deal with as many of the essential issues in | ||
|  | interpreter implementation as is reasonable in awk (though most would | ||
|  | call this program utterly unreasonable from start to finish, perhaps...). | ||
|  | Awk's impoverished control structures put error recovery and tail-call | ||
|  | optimization out of reach, in that I can't see a non-painful way to code | ||
|  | them.  The scope of variables is dynamic because that was easier to  | ||
|  | implement efficiently.  Subject to all those constraints, the language | ||
|  | is as Schemely as I could make it: it has a single namespace with  | ||
|  | uniform evaluation of expressions in the function and argument positions, | ||
|  | and the Scheme names for primitives and special forms. | ||
|  | 
 | ||
|  | The rest of this file is a reference manual.  My favorite tutorial would be | ||
|  | _The Little LISPer_ (see section 5, References); don't let the cute name | ||
|  | and the cartoons turn you off, because it's a really excellent book with  | ||
|  | some mind-stretching material towards the end.  All of its code will work | ||
|  | with awklisp, except for the last two chapters.  (You'd be better off | ||
|  | learning with a serious Lisp implementation, of course.) | ||
|  | 
 | ||
|  | The file Impl-notes in this distribution gives an overview of the  | ||
|  | implementation. | ||
|  | 
 | ||
|  | 
 | ||
|  | 3. Expressions and their evaluation | ||
|  | 
 | ||
|  | Lisp evaluates expressions, which can be simple (atoms) or compound (lists). | ||
|  | 
 | ||
|  | An atom is a string of characters, which can be letters, digits, and most | ||
|  | punctuation; the characters may -not- include spaces, quotes, parentheses, | ||
|  | brackets, '.', '#', or ';' (the comment character).  In this Lisp, case is | ||
|  | significant ( X  is different from  x ). | ||
|  | 
 | ||
|  | Atoms:		atom 42 1/137 + ok? hey:names-with-dashes-are-easy-to-read | ||
|  | Not atoms: 	don't-include-quotes 	(or spaces or parentheses) | ||
|  | 
 | ||
|  | A list is a '(', followed by zero or more objects (each of which is an atom | ||
|  | or a list), followed by a ')'. | ||
|  | 
 | ||
|  | Lists:		()   (a list of atoms)	((a list) of atoms (and lists)) | ||
|  | Not lists:	)    ((())		(two) (lists) | ||
|  | 
 | ||
|  | The special object  nil  is both an atom and the empty list.  That is, | ||
|  | nil = ().  A non-nil list is called a -pair-, because it is represented by a | ||
|  | pair of pointers, one to the first element of the list (its -car-), and one to | ||
|  | the rest of the list (its -cdr-).  For example, the car of ((a list) of stuff) | ||
|  | is (a list), and the cdr is (of stuff).  It's also possible to have a pair | ||
|  | whose cdr is not a list; the pair with car A and cdr B is printed as (A . B). | ||
|  | 
 | ||
|  | That's the syntax of programs and data.  Now let's consider their meaning.  You | ||
|  | can use Lisp like a calculator: type in an expression, and Lisp prints its  | ||
|  | value.  If you type 25, it prints 25.  If you type (+ 2 2), it prints 4.  In  | ||
|  | general, Lisp evaluates a particular expression in a particular environment | ||
|  | (set of variable bindings) by following this algorithm: | ||
|  | 
 | ||
|  | If the expression is a number, return that number. | ||
|  | 
 | ||
|  | If the expression is a non-numeric atom (a -symbol-), return the value of that | ||
|  | symbol in the current environment.  If the symbol is currently unbound, that's | ||
|  | an error. | ||
|  | 
 | ||
|  | Otherwise the expression is a list.  If its car is one of the symbols: quote,  | ||
|  | lambda, if, begin, while, set!, or define, then the expression is a -special- | ||
|  | -form-, handled by special rules.  Otherwise it's just a procedure call, | ||
|  | handled like this: evaluate each element of the list in the current environment, | ||
|  | and then apply the operator (the value of the car) to the operands (the values | ||
|  | of the rest of the list's elements).  For example, to evaluate (+ 2 3), we | ||
|  | first evaluate each of its subexpressions: the value of + is (at least in the | ||
|  | initial environment) the primitive procedure that adds, the value of 2 is 2, | ||
|  | and the value of 3 is 3.  Then we call the addition procedure with 2 and 3 as | ||
|  | arguments, yielding 5.  For another example, take (- (+ 2 3) 1).  Evaluating | ||
|  | each subexpression gives the subtraction procedure, 5, and 1.  Applying the | ||
|  | procedure to the arguments gives 4. | ||
|  | 
 | ||
|  | We'll see all the primitive procedures in the next section.  A user-defined  | ||
|  | procedure is represented as a list of the form (lambda <parameters> <body>), | ||
|  | such as (lambda (x) (+ x 1)).  To apply such a procedure, evaluate its body | ||
|  | in the environment obtained by extending the current environment so that the | ||
|  | parameters are bound to the corresponding arguments.  Thus, to apply the above | ||
|  | procedure to the argument 41, evaluate (+ x 1) in the same environment as the | ||
|  | current one except that x is bound to 41. | ||
|  | 
 | ||
|  | If the procedure's body has more than one expression -- e.g.,  | ||
|  | (lambda () (write 'Hello) (write 'world!)) -- evaluate them each in turn, and | ||
|  | return the value of the last one. | ||
|  | 
 | ||
|  | We still need the rules for special forms.  They are: | ||
|  | 
 | ||
|  | The value of (quote <x>) is <x>.  There's a shorthand for this form: '<x>. | ||
|  | E.g., the value of '(+ 2 2) is (+ 2 2), -not- 4. | ||
|  | 
 | ||
|  | (lambda <parameters> <body>) returns itself: e.g., the value of (lambda (x) x) | ||
|  | is (lambda (x) x). | ||
|  | 
 | ||
|  | To evaluate (if <test-expr> <then-exp> <else-exp>), first evaluate <test-expr>. | ||
|  | If the value is true (non-nil), then return the value of <then-exp>, otherwise | ||
|  | return the value of <else-exp>.  (<else-exp> is optional; if it's left out,  | ||
|  | pretend there's a  nil  there.)  Example: (if nil 'yes 'no) returns no. | ||
|  | 
 | ||
|  | To evaluate (begin <expr-1> <expr-2>...), evaluate each of the subexpressions | ||
|  | in order, returning the value of the last one. | ||
|  | 
 | ||
|  | To evaluate (while <test> <expr-1> <expr-2>...), first evaluate <test>.  If  | ||
|  | it's nil, return nil.  Otherwise, evaluate <expr-1>, <expr-2>,... in order, | ||
|  | and then repeat. | ||
|  | 
 | ||
|  | To evaluate (set! <variable> <expr>), evaluate <expr>, and then set the value | ||
|  | of <variable> in the current environment to the result.  If the variable is | ||
|  | currently unbound, that's an error.  The value of the whole set! expression | ||
|  | is the value of <expr>. | ||
|  | 
 | ||
|  | (define <variable> <expr>) is like set!, except it's used to introduce new | ||
|  | bindings, and the value returned is <variable>. | ||
|  | 
 | ||
|  | It's possible to define new special forms using the macro facility provided in | ||
|  | the startup file.  The macros defined there are:  | ||
|  | 
 | ||
|  | (let ((<var> <expr>)...)  | ||
|  |   <body>...) | ||
|  |    | ||
|  |   Bind each <var> to its corresponding <expr> (evaluated in the current | ||
|  |   environment), and evaluate <body> in the resulting environment. | ||
|  | 
 | ||
|  | (cond (<test-expr> <result-expr>...)... (else <result-expr>...)) | ||
|  |    | ||
|  |   where the final  else  clause is optional.  Evaluate each <test-expr> in | ||
|  |   turn, and for the first non-nil result, evaluate its <result-expr>.  If | ||
|  |   none are non-nil, and there's no  else  clause, return nil. | ||
|  | 
 | ||
|  | (and <expr>...) | ||
|  | 
 | ||
|  |   Evaluate each <expr> in order, until one returns nil; then return nil. | ||
|  |   If none are nil, return the value of the last <expr>. | ||
|  | 
 | ||
|  | (or <expr>...) | ||
|  | 
 | ||
|  |   Evaluate each <expr> in order, until one returns non-nil; return that value. | ||
|  |   If all are nil, return nil. | ||
|  | 
 | ||
|  | 
 | ||
|  | 4. Built-in procedures | ||
|  | 
 | ||
|  | List operations: | ||
|  | (null? <x>) returns true (non-nil) when <x> is nil. | ||
|  | (atom? <x>) returns true when <x> is an atom. | ||
|  | (pair? <x>) returns true when <x> is a pair. | ||
|  | (car <pair>) returns the car of <pair>. | ||
|  | (cdr <pair>) returns the cdr of <pair>. | ||
|  | (cadr <pair>) returns the car of the cdr of <pair>. (i.e., the second element.) | ||
|  | (cddr <pair>) returns the cdr of the cdr of <pair>. | ||
|  | (cons <x> <y>) returns a new pair whose car is <x> and whose cdr is <y>. | ||
|  | (list <x>...) returns a list of its arguments. | ||
|  | (set-car! <pair> <x>) changes the car of <pair> to <x>. | ||
|  | (set-cdr! <pair> <x>) changes the cdr of <pair> to <x>. | ||
|  | (reverse! <list>) reverses <list> in place, returning the result. | ||
|  | 
 | ||
|  | Numbers: | ||
|  | (number? <x>) returns true when <x> is a number. | ||
|  | (+ <n> <n>) returns the sum of its arguments. | ||
|  | (- <n> <n>) returns the difference of its arguments. | ||
|  | (* <n> <n>) returns the product of its arguments. | ||
|  | (quotient <n> <n>) returns the quotient.  Rounding is towards zero. | ||
|  | (remainder <n> <n>) returns the remainder. | ||
|  | (< <n1> <n2>) returns true when <n1> is less than <n2>. | ||
|  | 
 | ||
|  | I/O: | ||
|  | (write <x>) writes <x> followed by a space. | ||
|  | (newline) writes the newline character. | ||
|  | (read) reads the next expression from standard input and returns it. | ||
|  | 
 | ||
|  | Meta-operations: | ||
|  | (eval <x>) evaluates <x> in the current environment, returning the result. | ||
|  | (apply <proc> <list>) calls <proc> with arguments <list>, returning the result. | ||
|  | 
 | ||
|  | Miscellany: | ||
|  | (eq? <x> <y>) returns true when <x> and <y> are the same object.  Be careful | ||
|  |     using eq? with lists, because (eq? (cons <x> <y>) (cons <x> <y>)) is false.  | ||
|  | (put <x> <y> <z>) | ||
|  | (get <x> <y>) returns the last value <z> that was put for <x> and <y>, or nil | ||
|  |     if there is no such value. | ||
|  | (symbol? <x>) returns true when <x> is a symbol. | ||
|  | (gensym) returns a new symbol distinct from all symbols that can be read. | ||
|  | (random <n>) returns a random integer between 0 and <n>-1 (if <n> is positive). | ||
|  | (error <x>...) writes its arguments and aborts with error code 1. | ||
|  | 
 | ||
|  | 
 | ||
|  | 5. References | ||
|  | 
 | ||
|  | Harold Abelson and Gerald J. Sussman, with Julie Sussman. | ||
|  |   Structure and Interpretation of Computer Programs.  MIT Press, 1985. | ||
|  | 
 | ||
|  | John Allen.  Anatomy of Lisp.  McGraw-Hill, 1978. | ||
|  | 
 | ||
|  | Daniel P. Friedman and Matthias Felleisen.  The Little LISPer.  Macmillan, 1989. | ||
|  | 
 | ||
|  | Roger Rohrbach wrote a Lisp interpreter, in old awk (which has no | ||
|  | procedures!), called walk .  It can't do as much as this Lisp, but it | ||
|  | certainly has greater hack value.  Cooler name, too.  It's available at | ||
|  | http://www-2.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/lisp/impl/awk/0.html | ||
|  | 
 | ||
|  | 
 | ||
|  | 6. Bugs | ||
|  | 
 | ||
|  | Eval doesn't check the syntax of expressions.  This is a probably-misguided | ||
|  | attempt to bump up the speed a bit, that also simplifies some of the code. | ||
|  | The macroexpander in the startup file would be the best place to add syntax- | ||
|  | checking. |