A Hawk program is composed of the following elements at the top level.
- pattern-action block pair
- BEGIN action block pair
- END action block pair
- action block without a pattern
- pattern without an action block
- user-defined function
-@global variable declaration
-@include directive
-@pragma directive
However, none of the above is mandatory. Hawk accepts an empty program.
### Pattern-Action Block Pair
A pattern-action pair is composed of a pattern and an action block as shown below:
pattern {
statement
statement
...
}
A pattern can be one of the followings when specified:
- expression
- first-expression, last-expression
- *BEGIN*
- *END*
An action block is a series of statements enclosed in a curly bracket pair. The *BEGIN* and *END* patterns require an action block while normal patterns don't. When no action block is specified for a normal pattern, it is treated
as if `{ print $0; }` is specified.
Hawk executes the action block for the *BEGIN* pattern when it starts executing a program; No start-up action is taken if no *BEGIN* pattern-action pair is specified. If a normal pattern-action pair and/or the *END*
pattern-action is specified, it reads the standard input stream. For each input line it reads, it checks if a normal pattern expression evaluates to true. For each pattern that evaluates to true, it executes the action block specified for
the pattern. When it reaches the end of the input stream, it executes the action block for the *END* pattern.
Hawk allows zero or more *BEGIN* patterns. When multiple *BEGIN* patterns are specified, it executes their action blocks in their appearance order in the program. The same applies to the *END* patterns and their action blocks. It
doesn't read the standard input stream for programs composed of BEGIN blocks only whereas it reads the stream as long as there is an action block for the END pattern or a normal pattern. It evaluates an empty pattern to true;
As a result, the action block for an empty pattern is executed for all input lines read.
You can compose a pattern range by putting 2 patterns separated by a comma. The pattern range evaluates to true once the first expression evaluates to true until the last expression evaluates to true.
The following code snippet is a valid Hawk program that prints the string *hello, world* to the console and exits.
BEGIN {
print "hello, world";
}
This program prints "hello, world" followed by "hello, all" to the console.
BEGIN {
print "hello, world";
}
BEGIN {
print "hello, all";
}
For the following text input,
abcdefgahijklmn
1234567890
opqrstuvwxyzabc
9876543210
this program
BEGIN { mr=0; my_nr=0; }
/abc/ { print "[" $0 "]"; mr++; }
{ my_nr++; }
END {
print "total records: " NR;
print "total records selfcounted: " my_nr;
print "matching records: " mr;
}
produces the output text like this:
[abcdefgahijklmn]
[opqrstuvwxyzabc]
total records: 4
total records selfcounted: 4
matching records: 2
See the table for the order of execution indicated by the number and the result
of pattern evaluation enclosed in parenthesis. The action block is executed if
The typical execution begins with the BEGIN block, goes through pattern-action blocks, and eaches the END block. If you like to use a function as an entry point, you may set a function name with @pragma entry.
| implicit | file | on, off | allow undeclared variables |
| multilinestr | file | on, off | allow a multiline string literal without continuation |
| entry | global | function name | change the program entry point |
| striprecspc | global | on, off | |
| striprecspc | global | on, off | trim leading and trailing spaces when convering a string to a number |
### @include and @include_once
The @include directive inserts the contents of the file specified in the following string as if they appeared in the source stream being processed.
Assuming the hello.inc file contains the print_hello() function as shown below,
function print_hello() { print "hello\n"; }
You may include the the file and use the function.
@include "hello.inc";
BEGIN { print_hello(); }
The semicolon after the included file name is optional. You could write @include "hello.inc" without the ending semicolon.
@include_once is similar to @include except it doesn't include the same file multiple times.
@include_once "hello.inc";
@include_once "hello.inc";
BEGIN { print_hello(); }
In this example, print_hello() is not included twice.
You may use @include and @include_once inside a block as well as at the top level.
BEGIN {
@include "init.inc";
...
}
### Comments
Hawk supports a single-line commnt that begins with a hash sign # and the C-style multi-line comment.
x = y; # assign y to x.
/*
this line is ignored.
this line is ignored too.
*/
### Reserved Words
The following words are reserved and cannot be used as a variable name, a parameter name, or a function name.
-@abort
-@global
-@include
-@include_once
-@local
-@pragma
-@reset
- BEGIN
- END
- break
- continue
- delete
- do
- else
- exit
- for
- function
- getbline
- getline
- if
- in
- next
- nextfile
- nextofile
- print
- printf
- return
- while
However, these words can be used as normal names in the context of a module call. For example, mymod::break. In practice, the predefined names used for built-in commands, functions, and variables are treated as if they are reserved since you can't create another denifition with the same name.
A regular expression literal is special in that it never appears as an indendent value and still entails a match operation against $0 without an match operator.
BEGIN { $0="ab"; print /ab/, typename(/ab/); }
For this reason, there is no way to get the type name of a regular expressin literal.
- str::tonum - convert a string to a number. a numeric value passed as a parameter is returned as it is. the leading prefix of 0b, 0, and 0x specifies the radix of 2, 8, 16 repectively. conversion stops when the end of the string is reached or the first invalid character for conversion is encountered.
There are subtle differences when you put an expression for a position variable. In Hawk, most of the ambiguity issues are resolved if you enclose the expression inside parentheses.
If you prefer C++, you may use the Hawk/HawkStd wrapper classes to simplify the task. The C++ classes are inferior to the C equivalents in that they don't allow creation of multiple runtime contexts over a single hawk instance. Here is the sample code that prints "hello, world".