		  awklisp: a Lisp interpreter in awk
			     version 1.2

			     Darius Bacon
			 darius@accesscom.com
		  http://www.accesscom.com/~darius/


1. Usage

    mawk [-v profiling=1] -f awklisp <optional-Lisp-source-files>

The  -v profiling=1  option turns call-count profiling on.

If you want to use it interactively, be sure to include '-' (for the standard
input) among the source files.  For example:

    mawk -f awklisp startup -

It should work with nawk and gawk, too, but even less quickly.


2. Overview

This program arose out of one-upmanship.  At my previous job I had to
use MapBasic, an interpreter so astoundingly slow (around 100 times
slower than GWBASIC) that one must wonder if it itself is implemented
in an interpreted language.  I still wonder, but it clearly could be:
a bare-bones Lisp in awk, hacked up in a few hours, ran substantially
faster.  Since then I've added features and polish, in the hope of
taking over the burgeoning market for stately language
implementations.

This version tries to deal with as many of the essential issues in
interpreter implementation as is reasonable in awk (though most would
call this program utterly unreasonable from start to finish, perhaps...).
Awk's impoverished control structures put error recovery and tail-call
optimization out of reach, in that I can't see a non-painful way to code
them.  The scope of variables is dynamic because that was easier to
implement efficiently.  Subject to all those constraints, the language
is as Schemely as I could make it: it has a single namespace with
uniform evaluation of expressions in the function and argument positions,
and the Scheme names for primitives and special forms.

The rest of this file is a reference manual.  My favorite tutorial would be
_The Little LISPer_ (see section 5, References); don't let the cute name
and the cartoons turn you off, because it's a really excellent book with
some mind-stretching material towards the end.  All of its code will work
with awklisp, except for the last two chapters.  (You'd be better off
learning with a serious Lisp implementation, of course.)

The file Impl-notes in this distribution gives an overview of the
implementation.


3. Expressions and their evaluation

Lisp evaluates expressions, which can be simple (atoms) or compound (lists).

An atom is a string of characters, which can be letters, digits, and most
punctuation; the characters may -not- include spaces, quotes, parentheses,
brackets, '.', '#', or ';' (the comment character).  In this Lisp, case is
significant ( X  is different from  x ).

Atoms:		atom 42 1/137 + ok? hey:names-with-dashes-are-easy-to-read
Not atoms: 	don't-include-quotes 	(or spaces or parentheses)

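The character rule for atoms can be sketched as a small check.  This is
an illustration in Python, not the reader awklisp actually uses.  The name
is_atom_text is made up for this example, and the exact forbidden set is an
assumption: "quotes" is taken to mean both ' and ", and "brackets" to mean
[ ] and { }.

```python
# Characters that may not appear in an atom, per the rule above.
# Assumption: "quotes" covers ' and ", "brackets" covers [ ] { },
# and "spaces" covers any whitespace.
FORBIDDEN = set(" \t\n'\"()[]{}.#;")

def is_atom_text(text):
    """Return True if text could be read as one atom (hypothetical helper)."""
    return len(text) > 0 and not any(ch in FORBIDDEN for ch in text)

for candidate in ["atom", "42", "1/137", "+", "ok?", "don't-include-quotes"]:
    print(candidate, is_atom_text(candidate))
```

Only the last candidate is rejected, because it contains a quote.
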
A list is a '(', followed by zero or more objects (each of which is an atom
or a list), followed by a ')'.

Lists:		()   (a list of atoms)	((a list) of atoms (and lists))
Not lists:	)    ((())		(two) (lists)

The special object  nil  is both an atom and the empty list.  That is,
nil = ().  A non-nil list is called a -pair-, because it is represented by a
pair of pointers, one to the first element of the list (its -car-), and one to
the rest of the list (its -cdr-).  For example, the car of ((a list) of stuff)
is (a list), and the cdr is (of stuff).  It's also possible to have a pair
whose cdr is not a list; the pair with car A and cdr B is printed as (A . B).

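The pair representation can be sketched as follows.  This is a Python
model of the idea only; it says nothing about how awklisp stores pairs
internally.  The names Pair, make_list and show are invented for the
example, None stands in for nil, and printing the empty list as () is a
choice made here (the text above does not say how awklisp prints nil).

```python
class Pair:
    """A pair: one pointer to the car, one to the cdr."""
    def __init__(self, car, cdr):
        self.car = car
        self.cdr = cdr

NIL = None  # assumption: Python's None plays the role of nil

def make_list(*items):
    """Build a list by chaining pairs, ending in nil."""
    result = NIL
    for item in reversed(items):
        result = Pair(item, result)
    return result

def show(obj):
    """Render an object in list notation, using a dot for a non-list cdr."""
    if obj is NIL:
        return "()"
    if not isinstance(obj, Pair):
        return str(obj)
    parts = []
    while isinstance(obj, Pair):
        parts.append(show(obj.car))
        obj = obj.cdr
    if obj is not NIL:              # the final cdr is not a list
        parts.extend([".", show(obj)])
    return "(" + " ".join(parts) + ")"

stuff = make_list(make_list("a", "list"), "of", "stuff")
print(show(stuff.car))              # (a list)
print(show(stuff.cdr))              # (of stuff)
print(show(Pair("A", "B")))         # (A . B)
```
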
That's the syntax of programs and data.  Now let's consider their meaning.  You
can use Lisp like a calculator: type in an expression, and Lisp prints its
value.  If you type 25, it prints 25.  If you type (+ 2 2), it prints 4.  In
general, Lisp evaluates a particular expression in a particular environment
(set of variable bindings) by following this algorithm:

If the expression is a number, return that number.

If the expression is a non-numeric atom (a -symbol-), return the value of that
symbol in the current environment.  If the symbol is currently unbound, that's
an error.

Otherwise the expression is a list.  If its car is one of the symbols: quote,
lambda, if, begin, while, set!, or define, then the expression is a -special-
-form-, handled by special rules.  Otherwise it's just a procedure call,
handled like this: evaluate each element of the list in the current environment,
and then apply the operator (the value of the car) to the operands (the values
of the rest of the list's elements).  For example, to evaluate (+ 2 3), we
first evaluate each of its subexpressions: the value of + is (at least in the
initial environment) the primitive procedure that adds, the value of 2 is 2,
and the value of 3 is 3.  Then we call the addition procedure with 2 and 3 as
arguments, yielding 5.  For another example, take (- (+ 2 3) 1).  Evaluating
each subexpression gives the subtraction procedure, 5, and 1.  Applying the
procedure to the arguments gives 4.

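The three rules above (number, symbol, procedure call) can be sketched in
a few lines.  This is a Python illustration under simplifying assumptions,
not awklisp's own code: expressions are Python numbers, strings (symbols)
and lists, the environment is a plain dict, and special forms are left out.
The names evaluate and initial_env are made up for the example.

```python
import operator

def evaluate(expr, env):
    """Evaluate a number, a symbol (str) or a call (list) in env."""
    if isinstance(expr, (int, float)):
        return expr                       # a number is its own value
    if isinstance(expr, str):
        if expr not in env:
            raise NameError("unbound symbol: " + expr)
        return env[expr]                  # a symbol's value is its binding
    # A procedure call: evaluate every element, then apply the first
    # value (the operator) to the remaining values (the operands).
    values = [evaluate(item, env) for item in expr]
    procedure, arguments = values[0], values[1:]
    return procedure(*arguments)

initial_env = {"+": operator.add, "-": operator.sub}

print(evaluate(["+", 2, 3], initial_env))             # 5
print(evaluate(["-", ["+", 2, 3], 1], initial_env))   # 4
```
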
We'll see all the primitive procedures in the next section.  A user-defined
procedure is represented as a list of the form (lambda <parameters> <body>),
such as (lambda (x) (+ x 1)).  To apply such a procedure, evaluate its body
in the environment obtained by extending the current environment so that the
parameters are bound to the corresponding arguments.  Thus, to apply the above
procedure to the argument 41, evaluate (+ x 1) in the same environment as the
current one except that x is bound to 41.

If the procedure's body has more than one expression -- e.g.,
(lambda () (write 'Hello) (write 'world!)) -- evaluate them each in turn, and
return the value of the last one.

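Because the body runs in an extension of the current (calling)
environment, free variables in a procedure are looked up dynamically.  A
rough Python sketch of that application rule follows.  It is not the
author's implementation: copying a dict per call is a simplification chosen
for clarity, and evaluate, apply_procedure and global_env are names invented
here.

```python
import operator

def evaluate(expr, env):
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return env[expr]
    if expr[0] == "lambda":
        return expr                       # a lambda expression is its own value
    values = [evaluate(item, env) for item in expr]
    return apply_procedure(values[0], values[1:], env)

def apply_procedure(procedure, arguments, env):
    if callable(procedure):               # a primitive
        return procedure(*arguments)
    _, parameters, *body = procedure      # (lambda <parameters> <body>...)
    extended = dict(env)                  # extend the caller's environment
    extended.update(zip(parameters, arguments))
    result = None
    for expression in body:               # value of the last body expression
        result = evaluate(expression, extended)
    return result

global_env = {
    "+": operator.add,
    # add-n refers to n, which is not one of its parameters.
    "add-n": ["lambda", ["x"], ["+", "x", "n"]],
}

print(evaluate([["lambda", ["x"], ["+", "x", 1]], 41], global_env))   # 42
# n is bound by the caller, and add-n sees it: dynamic scope.
print(evaluate([["lambda", ["n"], ["add-n", 1]], 10], global_env))    # 11
```

Under lexical scoping the second call would fail, since n is not bound
where add-n was defined.
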
We still need the rules for special forms.  They are:

The value of (quote <x>) is <x>.  There's a shorthand for this form: '<x>.
E.g., the value of '(+ 2 2) is (+ 2 2), -not- 4.

(lambda <parameters> <body>) returns itself: e.g., the value of (lambda (x) x)
is (lambda (x) x).

To evaluate (if <test-expr> <then-exp> <else-exp>), first evaluate <test-expr>.
If the value is true (non-nil), then return the value of <then-exp>, otherwise
return the value of <else-exp>.  (<else-exp> is optional; if it's left out,
pretend there's a  nil  there.)  Example: (if nil 'yes 'no) returns no.

To evaluate (begin <expr-1> <expr-2>...), evaluate each of the subexpressions
in order, returning the value of the last one.

To evaluate (while <test> <expr-1> <expr-2>...), first evaluate <test>.  If
it's nil, return nil.  Otherwise, evaluate <expr-1>, <expr-2>,... in order,
and then repeat.

To evaluate (set! <variable> <expr>), evaluate <expr>, and then set the value
of <variable> in the current environment to the result.  If the variable is
currently unbound, that's an error.  The value of the whole set! expression
is the value of <expr>.

(define <variable> <expr>) is like set!, except it's used to introduce new
bindings, and the value returned is <variable>.

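The special-form rules can be collected into one dispatch.  The Python
sketch below is only a model of the rules as stated, not awklisp's code.
Its assumptions: None stands in for nil, Python's True stands in for
whatever true value the < primitive returns, the environment is a single
flat dict, and only primitive procedures are called.  The names evaluate
and initial_environment are invented for the example.

```python
import operator

NIL = None   # assumption: None is nil; every other value counts as true

def initial_environment():
    return {"nil": NIL, "+": operator.add,
            "<": lambda a, b: True if a < b else NIL}

def evaluate(expr, env):
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return env[expr]
    head = expr[0]
    if head == "quote":
        return expr[1]                    # the operand, unevaluated
    if head == "lambda":
        return expr                       # returns itself
    if head == "if":
        if evaluate(expr[1], env) is not NIL:
            return evaluate(expr[2], env)
        return evaluate(expr[3], env) if len(expr) > 3 else NIL
    if head == "begin":
        result = NIL
        for sub in expr[1:]:
            result = evaluate(sub, env)
        return result
    if head == "while":
        while evaluate(expr[1], env) is not NIL:
            for sub in expr[2:]:
                evaluate(sub, env)
        return NIL
    if head == "set!":
        if expr[1] not in env:
            raise NameError("unbound variable: " + expr[1])
        env[expr[1]] = evaluate(expr[2], env)
        return env[expr[1]]               # value of <expr>
    if head == "define":
        env[expr[1]] = evaluate(expr[2], env)
        return expr[1]                    # the variable's name
    values = [evaluate(item, env) for item in expr]
    return values[0](*values[1:])

env = initial_environment()
print(evaluate(["if", "nil", ["quote", "yes"], ["quote", "no"]], env))  # no
```
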
It's possible to define new special forms using the macro facility provided in
the startup file.  The macros defined there are:

(let ((<var> <expr>)...)
  <body>...)

  Bind each <var> to its corresponding <expr> (evaluated in the current
  environment), and evaluate <body> in the resulting environment.

(cond (<test-expr> <result-expr>...)... (else <result-expr>...))

  where the final  else  clause is optional.  Evaluate each <test-expr> in
  turn, and for the first non-nil result, evaluate its <result-expr>.  If
  none are non-nil, and there's no  else  clause, return nil.

(and <expr>...)

  Evaluate each <expr> in order, until one returns nil; then return nil.
  If none are nil, return the value of the last <expr>.

(or <expr>...)

  Evaluate each <expr> in order, until one returns non-nil; return that value.
  If all are nil, return nil.


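The short-circuit behaviour of  and  and  or  can be modelled with
delayed expressions.  This Python sketch shows the evaluation order and
return values only; it is not how the startup file defines the macros.
Each <expr> is passed as a zero-argument function so it runs only when
reached, None stands in for nil, and the names lisp_and and lisp_or are
made up here.  The text above does not say what (and) with no arguments
returns; returning nil in that case is this sketch's choice.

```python
NIL = None  # assumption: None stands in for nil

def lisp_and(*thunks):
    """Stop at the first nil; otherwise return the last value."""
    result = NIL
    for thunk in thunks:
        result = thunk()
        if result is NIL:
            return NIL
    return result

def lisp_or(*thunks):
    """Return the first non-nil value; nil if there is none."""
    for thunk in thunks:
        result = thunk()
        if result is not NIL:
            return result
    return NIL

print(lisp_and(lambda: 1, lambda: 2))        # 2
print(lisp_or(lambda: NIL, lambda: 2))       # 2
# The division is never evaluated, so no error occurs:
print(lisp_and(lambda: NIL, lambda: 1 / 0))  # None
```
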
4. Built-in procedures

List operations:
(null? <x>) returns true (non-nil) when <x> is nil.
(atom? <x>) returns true when <x> is an atom.
(pair? <x>) returns true when <x> is a pair.
(car <pair>) returns the car of <pair>.
(cdr <pair>) returns the cdr of <pair>.
(cadr <pair>) returns the car of the cdr of <pair> (i.e., the second element).
(cddr <pair>) returns the cdr of the cdr of <pair>.
(cons <x> <y>) returns a new pair whose car is <x> and whose cdr is <y>.
(list <x>...) returns a list of its arguments.
(set-car! <pair> <x>) changes the car of <pair> to <x>.
(set-cdr! <pair> <x>) changes the cdr of <pair> to <x>.
(reverse! <list>) reverses <list> in place, returning the result.

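Reversing "in place" usually means redirecting the cdr of each existing
pair instead of building new pairs.  The Python sketch below shows that
technique; whether awklisp's reverse! works exactly this way is an
assumption.  Pair, make_list, to_python_list and reverse_in_place are names
invented for the example, and None stands in for nil.

```python
class Pair:
    def __init__(self, car, cdr):
        self.car = car
        self.cdr = cdr

NIL = None  # assumption: None stands in for nil

def make_list(*items):
    result = NIL
    for item in reversed(items):
        result = Pair(item, result)
    return result

def to_python_list(lst):
    items = []
    while lst is not NIL:
        items.append(lst.car)
        lst = lst.cdr
    return items

def reverse_in_place(lst):
    """Reverse by redirecting each pair's cdr; no new pairs are created."""
    previous = NIL
    while lst is not NIL:
        following = lst.cdr
        lst.cdr = previous
        previous, lst = lst, following
    return previous

numbers = make_list(1, 2, 3)
reversed_numbers = reverse_in_place(numbers)
print(to_python_list(reversed_numbers))   # [3, 2, 1]
print(to_python_list(numbers))            # [1]
```

With this technique the old variable ends up pointing at the last pair of
the reversed list, which is why the returned value is the one to keep.
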
Numbers:
(number? <x>) returns true when <x> is a number.
(+ <n> <n>) returns the sum of its arguments.
(- <n> <n>) returns the difference of its arguments.
(* <n> <n>) returns the product of its arguments.
(quotient <n> <n>) returns the quotient.  Rounding is towards zero.
(remainder <n> <n>) returns the remainder.
(< <n1> <n2>) returns true when <n1> is less than <n2>.

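Rounding towards zero matters for negative operands.  The Python sketch
below illustrates it for integers.  The quotient rule is the one stated
above; the sign convention for remainder is not stated, so the version here
(the remainder that pairs with the truncating quotient, as in Scheme) is an
assumption.

```python
def quotient(a, b):
    """Integer division that rounds towards zero."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

def remainder(a, b):
    """Assumed convention: a == b * quotient(a, b) + remainder(a, b)."""
    return a - b * quotient(a, b)

print(quotient(7, 2), remainder(7, 2))      # 3 1
print(quotient(-7, 2), remainder(-7, 2))    # -3 -1
print(-7 // 2)                              # -4 (Python's // rounds down)
```
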
I/O:
(write <x>) writes <x> followed by a space.
(newline) writes the newline character.
(read) reads the next expression from standard input and returns it.

Meta-operations:
(eval <x>) evaluates <x> in the current environment, returning the result.
(apply <proc> <list>) calls <proc> with arguments <list>, returning the result.

Miscellany:
(eq? <x> <y>) returns true when <x> and <y> are the same object.  Be careful
    using eq? with lists, because (eq? (cons <x> <y>) (cons <x> <y>)) is false.
(put <x> <y> <z>)
(get <x> <y>) returns the last value <z> that was put for <x> and <y>, or nil
    if there is no such value.
(symbol? <x>) returns true when <x> is a symbol.
(gensym) returns a new symbol distinct from all symbols that can be read.
(random <n>) returns a random integer between 0 and <n>-1 (if <n> is positive).
(error <x>...) writes its arguments and aborts with error code 1.


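The relationship between  put  and  get  can be pictured as a table
keyed by the pair (<x>, <y>).  The Python sketch below models only the
behaviour described above; how awklisp stores these values is not covered
here, and the dict named properties is invented for the example.  Symbols
are represented as strings and None stands in for nil.

```python
NIL = None          # assumption: None stands in for nil
properties = {}     # hypothetical table: (x, y) -> z

def put(x, y, z):
    """Remember z for the combination of x and y."""
    properties[(x, y)] = z

def get(x, y):
    """Return the last value put for x and y, or nil if none."""
    return properties.get((x, y), NIL)

print(get("apple", "color"))        # None
put("apple", "color", "red")
put("apple", "color", "green")
print(get("apple", "color"))        # green
```
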
5. References

Harold Abelson and Gerald J. Sussman, with Julie Sussman.
  Structure and Interpretation of Computer Programs.  MIT Press, 1985.

John Allen.  Anatomy of Lisp.  McGraw-Hill, 1978.

Daniel P. Friedman and Matthias Felleisen.  The Little LISPer.  Macmillan, 1989.

Roger Rohrbach wrote a Lisp interpreter, in old awk (which has no
procedures!), called walk .  It can't do as much as this Lisp, but it
certainly has greater hack value.  Cooler name, too.  It's available at
http://www-2.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/lisp/impl/awk/0.html


6. Bugs

Eval doesn't check the syntax of expressions.  This is a probably-misguided
attempt to bump up the speed a bit, which also simplifies some of the code.
The macroexpander in the startup file would be the best place to add
syntax-checking.