Hawk - Embeddable AWK Interpreter in C/C++
Hawk is a stable and embeddable AWK interpreter written in C.
It can run AWK scripts inside your own applications or as a standalone AWK engine.
The library is stable, portable, and designed for projects that need a scripting engine with a small footprint.
Table of Contents
- Features
- Building Hawk From Source Code
- Embedding Hawk in C Applications
- Embedding Hawk in C++ Applications
- Language
- What Hawk Is
- Running Hawk
- Execution Model
- Values and Types
- Expressions and Operators
- Variables and Scope
- Arrays and Maps
- Functions
- Control Flow
- Input, Output, and Pipes
- Built-in Variables
- Built-in Functions
- Pragmas
- @include and @include_once
- Comments
- Reserved Words
- More Examples
- Garbage Collection
- Modules
- Incompatibility with AWK
Features
- Full AWK interpreter - mostly POSIX AWK compatible, with additional extensions.
- Embeddable library - integrate AWK scripting into C or C++ projects as an execution engine.
- C and C++ APIs - core functions exposed in C, with convenient C++ wrapper classes available.
- Flexible usage - usable as both a standalone command-line interpreter and a library.
- Portable core - the base library depends only on the standard C library.
- Optional extensions - loadable modules (e.g. MySQL access, FFI) can be built in or used via shared objects.
- Mature and stable - developed and maintained for many years with proven reliability.
- Embedded sed functionality - includes a sed engine that can be used from C/C++ or invoked via the CLI using --sed.
Building Hawk From Source Code
Hawk uses autoconf and automake for building. Run the following commands to configure and compile Hawk:
$ ./configure ## This step offers various build options
$ make
$ make install
Embedding Hawk in C Applications
Here's an example of how Hawk can be embedded within a C application:
#include <hawk.h>
#include <stdio.h>
#include <string.h>
static const hawk_bch_t* src =
"BEGIN { print ARGV[0];"
" for (i=2;i<=9;i++)"
" {"
" for (j=1;j<=9;j++)"
" print i \"*\" j \"=\" i * j;"
" print \"---------------------\";"
" }"
"}";
int main ()
{
hawk_t* hawk = HAWK_NULL;
hawk_rtx_t* rtx = HAWK_NULL;
hawk_val_t* retv;
hawk_parsestd_t psin[2];
int ret;
hawk = hawk_openstd(0, HAWK_NULL); /* create a hawk instance */
if (!hawk)
{
fprintf(stderr, "ERROR: cannot open hawk\n");
ret = -1; goto oops;
}
/* set up the script source to parse */
memset(&psin, 0, HAWK_SIZEOF(psin));
psin[0].type = HAWK_PARSESTD_BCS; /* the first source is a byte-character string */
psin[0].u.bcs.ptr = (hawk_bch_t*)src;
psin[0].u.bcs.len = hawk_count_bcstr(src);
psin[1].type = HAWK_PARSESTD_NULL; /* no more sources to read */
ret = hawk_parsestd(hawk, psin, HAWK_NULL); /* parse the script */
if (ret <= -1)
{
hawk_logbfmt(hawk, HAWK_LOG_STDERR, "ERROR(parse): %js\n", hawk_geterrmsg(hawk));
ret = -1; goto oops;
}
/* create a runtime context needed for execution */
rtx = hawk_rtx_openstd(
hawk,
0,
HAWK_T("hawk02"), /* ARGV[0] */
HAWK_NULL, /* stdin */
HAWK_NULL, /* stdout */
HAWK_NULL /* default cmgr */
);
if (!rtx)
{
hawk_logbfmt(hawk, HAWK_LOG_STDERR, "ERROR(rtx_open): %js\n", hawk_geterrmsg(hawk));
ret = -1; goto oops;
}
/* execute the BEGIN/pattern-action/END blocks */
retv = hawk_rtx_loop(rtx); /* alternatively, hawk_rtx_exec(rtx, HAWK_NULL, 0) */
if (!retv)
{
hawk_logbfmt(hawk, HAWK_LOG_STDERR, "ERROR(rtx_loop): %js\n", hawk_geterrmsg(hawk));
ret = -1; goto oops;
}
/* lower the reference count of the returned value */
hawk_rtx_refdownval(rtx, retv);
ret = 0;
oops:
if (rtx) hawk_rtx_close(rtx); /* destroy the runtime context */
if (hawk) hawk_close(hawk); /* destroy the hawk instance */
return ret; /* 0 on success, -1 on failure */
}
Embedding Hawk within an application involves a few key steps:
- Creating a Hawk Instance: The hawk_openstd() function is used to create a new instance of the Hawk interpreter, which serves as the entry point for interacting with Hawk from within the application.
- Parsing Scripts: The application can provide Hawk scripts as string literals or read them from files using the hawk_parsestd() function. This associates the scripts with the Hawk instance for execution.
- Creating a Runtime Context: A runtime context is created using hawk_rtx_openstd(), encapsulating the state and configuration required for script execution, such as input/output streams and other settings.
- Executing the Script: The hawk_rtx_loop() or hawk_rtx_exec() functions are used to execute the Hawk script within the created runtime context, returning a value representing the result of the execution.
- Handling the Result: The application can check the returned value for successful execution and handle any errors or results as needed.
- Cleaning Up: Finally, the application cleans up by closing the runtime context and destroying the Hawk instance using hawk_rtx_close() and hawk_close(), respectively.
By following this pattern, applications can seamlessly embed the Hawk interpreter, leveraging its scripting capabilities and data manipulation functionality while benefiting from its portability, efficiency, and extensibility.
Assuming the above sample code is stored in hawk02.c and the built Hawk library has been installed properly, you may compile the sample code by running the following commands:
$ gcc -Wall -O2 -o hawk02 hawk02.c -lhawk
The actual command may vary depending on the compiler used and the configure options used.
Embedding Hawk in C++ Applications
Hawk can also be embedded in C++ applications. Here's an example:
#include <Hawk.hpp>
#include <stdio.h>
int main ()
{
HAWK::HawkStd hawk;
if (hawk.open() <= -1)
{
fprintf(stderr, "unable to open hawk - %s\n", hawk.getErrorMessageB());
return -1;
}
HAWK::HawkStd::SourceString s("BEGIN { print \"hello, world\"; }");
if (hawk.parse(s, HAWK::HawkStd::Source::NONE) == HAWK_NULL)
{
fprintf(stderr, "unable to parse - %s\n", hawk.getErrorMessageB());
hawk.close();
return -1;
}
HAWK::Hawk::Value vr;
hawk.loop(&vr); // alternatively, hawk.exec(&vr, HAWK_NULL, 0);
hawk.close();
return 0;
}
Embedding Hawk within a C++ application involves the following key steps:
- Creating a Hawk Instance: Create a new instance of the Hawk interpreter using the HAWK::HawkStd class.
- Parsing Scripts: Provide Hawk scripts as strings using the HAWK::HawkStd::SourceString class, and parse them using the hawk.parse() method.
- Executing the Script: Use the hawk.loop() or hawk.exec() methods to execute the Hawk script, returning a value representing the result of the execution.
- Handling the Result: Handle the returned value or any errors that occurred during execution.
- Cleaning Up: Clean up by calling hawk.close() to destroy the Hawk instance.
The C++ classes are more limited than the C API in one respect: they don't allow creating multiple runtime contexts over a single hawk instance.
Language
What Hawk Is
Hawk is an embeddable awk interpreter with extensions. It can run awk scripts from the CLI or from C/C++ and provides modules like str::, sys::, ffi::, mysql::, and sqlite::.
Running Hawk
Run a script file:
$ hawk -f script.hawk input.txt
Run an inline program:
$ echo "a,b,c" | hawk 'BEGIN{FS=","} {print $2}'
Execution Model
Hawk follows the awk pipeline:
- Input is read as records (usually lines). RS controls record separation.
- Each record ($0) is split into fields $1, $2, ... by FS.
- A script is a sequence of pattern { action } blocks.
- BEGIN runs before input; END runs after input.
Example:
BEGIN { FS=","; print "start" }
$3 ~ /ERR/ { print NR, $1, $3 }
END { print "done", NR }
@pragma entry
Hawk can override the default BEGIN/pattern/END flow with a custom entry point:
@pragma entry main
function main(a, b) {
print "entry:", a, b
}
Run:
$ hawk -f script.hawk one two
entry: one two
Values and Types
Hawk is dynamically typed:
- Numbers: integer and floating-point.
- Strings: Unicode text.
- Characters can be written with single quotes (e.g., 'A') and are Unicode.
- Byte strings: raw bytes (@b"...").
- Byte characters use @b'X' and must fit in a single byte.
- Containers: array, map.
- @nil represents null.
Examples:
BEGIN {
a = 10
b = 3.14
s = "hello"
c = 'X'
bc = @b'x'
bs = @b"\x00\x01"
m = @{"k": 1}
arr = @["x", "y"]
}
Expressions and Operators
Arithmetic and Comparison
- Arithmetic: +, -, *, /, %, ** (exponentiation), ++, --, <<, >>.
- Comparisons: ==, !=, <, <=, >, >=.
- Type-precise compare: === and !==.
Example:
BEGIN {
x = 10 + 5 * 2
if (x >= 20) print x
if ("10" === 10) print "no"
}
Strings and Regex
- Concatenation by adjacency: "a" "b".
- Explicit concatenation: "a" %% "b".
- Regex match: ~ and !~.
Example:
BEGIN {
print "hi" %% "!"
if ("A" ~ /^[A-Z]$/) print "regex ok"
}
Logical Operators
- Logical AND/OR: &&, ||.
- Boolean results are numeric (0 or 1).
Example:
BEGIN {
if (1 && 0) print "no"; else print "ok"
}
Bitwise Operators
- Bitwise AND/OR: &, |.
- | also denotes pipes, so use parentheses when you mean bitwise OR.
- >> is also used for append redirection; use parentheses when you mean right shift.
Bitwise OR vs pipe example:
BEGIN {
print (1 | 2) # bitwise OR => 3
print 1 | 2 # pipe to external command "2"
}
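The same parenthesization rule applies to >>. A minimal sketch of the difference, assuming a writable scratch file; the path /tmp/hawk-shift-demo.txt is a made-up name used only for illustration:

```awk
BEGIN {
    print (8 >> 1)                          # parenthesized: right shift, prints 4
    print 8 >> "/tmp/hawk-shift-demo.txt"   # unparenthesized: appends the line "8" to the file
}
```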
Variables and Scope
- Variables are created on assignment.
- @local and @global declare scope explicitly.
Example:
@global g
BEGIN {
@local x
x = 1
g = 2
}
Arrays and Maps
Hawk supports arrays and maps.
- Arrays are indexed by numbers.
- Maps accept string and numeric keys.
- Constructors: @[], @{}, hawk::array(), hawk::map().
- All constructors accept initial values.
Example:
BEGIN {
arr = @["a", "b", "c"]
m = @{"k": "v", 10: "ten"}
arr[4] = "d"
m["x"] = 99
print arr[1], m["k"], m[10]
}
Functions
Define functions with function name(...) { ... }.
- Missing args are @nil.
- Use & for call-by-reference.
- Use ... for varargs and access them via @argc and @argv.
- Functions are first-class values and can be passed as parameters (e.g., a comparator for asort).
Example:
function inc(&x) { x += 1 }
function greet(name) { if (name == "") name = "world"; print "hi", name }
BEGIN { n = 1; inc(n); greet(); greet("hawk"); print n }
Varargs example:
function dump(...) {
@local i
for (i = 0; i < @argc; i++) print @argv[i]
}
BEGIN { dump("a", 10, "b") }
Function-parameter example:
function desc(a, b) { return b - a }
BEGIN {
@local a, b, i
a = @[3, 1, 2]
asort(a, b, desc)
for (i in b) print i, b[i]
}
Control Flow
Hawk supports standard awk control flow.
if / else
{ if ($1 > 0) print $1; else print "skip" }
while
BEGIN {
i = 1
while (i <= 3) { print i; i++ }
}
do ... while
BEGIN {
i = 0
do { print i; i++ } while (i < 3)
}
for
BEGIN {
for (i = 1; i <= 3; i++) print i
}
for (i in array)
BEGIN {
arr = @["x", "y"]
for (i in arr) print i, arr[i]
}
in operator (key existence)
Use x in b to test if a key/index exists in a map or array.
BEGIN {
b = @{"k": 1}
if ("k" in b) print "yes"
}
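The same test can be applied to an array index. A short sketch, assuming array indices start at 1 as in the other array examples in this document:

```awk
BEGIN {
    arr = @["x", "y"]
    if (1 in arr) print "index 1 exists"
    if (!(3 in arr)) print "index 3 does not exist"   # only two elements were stored
}
```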
switch
BEGIN {
x = 2
switch (x) {
case 1: print "one"; break;
case 2: print "two"; break;
default: print "other";
}
}
break / continue / return / exit
BEGIN {
for (i = 1; i <= 5; i++) {
if (i == 3) continue
if (i == 5) break
print i
}
exit 0
}
Note: Hawk allows return inside BEGIN and END blocks, in addition to functions.
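A small sketch of an early return from a BEGIN block. The variable n is only a placeholder for whatever condition a real script would check:

```awk
BEGIN {
    n = 0
    if (n == 0) {
        print "nothing to do"
        return                # leave this BEGIN block early
    }
    print "processing", n     # not reached while n is 0
}
```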
nextfile / nextofile
nextfile skips the rest of the current input file (standard awk behavior). nextofile advances to the next output file specified with -t.
Example:
$ hawk -t /tmp/1 -t /tmp/2 'BEGIN { print 10; nextofile; print 20 }'
This writes 10 to /tmp/1 and 20 to /tmp/2.
Input, Output, and Pipes
- getline reads records.
- getbline reads records as bytes.
- getline/getbline return 1 on success, 0 on EOF, and -1 on error.
- Redirection works with <, >, and >>.
- Pipes: cmd | getline var and print x | "cmd".
- Two-way pipes: |&
- CSV-style field splitting is supported when FS begins with ? followed by four characters (separator, escaper, left quote, right quote).
Example:
BEGIN {
while (("ls -laF" | getline x) > 0) print "\t", x;
close ("ls -laF");
}
Two-way pipe example:
BEGIN {
cmd = "sort";
data = hawk::array("hello", "world", "two-way pipe", "testing");
for (i = 1; i <= length(data); i++) print data[i] |& cmd;
close(cmd, "to");
while ((cmd |& getline line) > 0) print line;
close(cmd);
}
Redirection examples:
BEGIN {
while ((getline line < "input.txt") > 0) print line > "out.txt"
print "more" >> "out.txt"
}
Byte-record example:
BEGIN { getbline b < "bin.dat"; print str::tohex(b) }
CSV-style FS example:
BEGIN { FS="?,\"\"\""; }
{ for (i = 0; i <= NF; i++) print i, "[" $i "]"; }
This example splits hawk,can,read,"a ""CSV"" file",. into 5 fields:
- hawk
- can
- read
- a "CSV" file
- .
Built-in Variables
Common built-ins:
- NR, FNR, NF
- FS, RS, OFS, ORS
- FILENAME, OFILENAME
Example:
{ print NR, NF, $0 }
Built-in Functions
Hawk includes awk built-ins (e.g., length, substr, split, index) plus extensions in modules (see below).
Example:
BEGIN { print length("hawk"), substr("hawk", 2, 2) }
Pragmas
@pragma controls parser/runtime behavior. File-scope pragmas apply per file; global-scope pragmas appear once across all files.
| Name | Scope | Values | Default | Description |
|---|---|---|---|---|
| entry | global | function name | | change the program entry point |
| implicit | file | on, off | on | allow undeclared variables |
| multilinestr | file | on, off | off | allow a multiline string literal without continuation |
| rwpipe | file | on, off | on | allow the two-way pipe operator \|& |
| striprecspc | global | on, off | off | remove leading and trailing blank fields when splitting a record if FS is a regular expression matching all spaces |
| stripstrspc | global | on, off | on | trim leading and trailing spaces when converting a string to a number |
| numstrdetect | global | on, off | on | detect a numeric string and treat it as a number |
| stack_limit | global | number | 5120 | specify the runtime stack size measured in the number of values |
@pragma entry
Sets a custom entry function instead of the default BEGIN/pattern/END flow.
@pragma entry main;
function main () { print "hello, world"; }
Arguments passed on the command line are provided to the entry function:
@pragma entry main
function main(arg1, arg2) {
print "Arguments:", arg1, arg2
}
$ hawk -f main.hawk arg1_value arg2_value
If you don't know the number of arguments in advance, use ... and @argv/@argc:
@pragma entry main
function main(...) {
@local i
for (i = 0; i < @argc; i++) printf("%s:", @argv[i])
print ""
}
$ hawk -f main.hawk 10 20 30 40 50
Named arguments can be combined with ... to require a minimum number of parameters:
function x(a, b, ...) {
print "a=", a, "b=", b, "rest=", (@argc - 2)
}
BEGIN { x(1, 2, 3, 4) }
@pragma implicit
Controls implicit variable declaration. off requires @local/@global.
@pragma implicit off;
BEGIN {
a = 10; ## syntax error - undefined identifier 'a'
}
In the example above, the @pragma implicit off directive is used to turn off implicit variable declaration. As a result, attempting to use the undeclared variable a will result in a syntax error.
@pragma implicit off;
BEGIN {
@local a;
a = 10; ## syntax ok - 'a' is declared before use
}
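With implicit declaration turned off, a variable shared between functions and blocks can be declared with @global. This is a sketch, assuming the top-level @global form shown under Variables and Scope; total and add are illustrative names:

```awk
@pragma implicit off;
@global total

function add(n) { total += n }   # n is a parameter, so it needs no declaration

BEGIN {
    add(1); add(2);
    print total;                 # 1 + 2, so this prints 3
}
```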
@pragma striprecspc
When FS is a space-matching regex, this controls whether leading/trailing blank fields are removed.
- @pragma striprecspc on
$ echo ' a b c d ' | hawk '@pragma striprecspc on;
BEGIN { FS="[[:space:]]+"; }
{
print "NF=" NF;
for (i = 0; i < NF; i++) print i " [" $(i+1) "]";
}'
NF=4
0 [a]
1 [b]
2 [c]
3 [d]
- @pragma striprecspc off
$ echo ' a b c d ' | hawk '@pragma striprecspc off;
BEGIN { FS="[[:space:]]+"; }
{
print "NF=" NF;
for (i = 0; i < NF; i++) print i " [" $(i+1) "]";
}'
NF=6
0 []
1 [a]
2 [b]
3 [c]
4 [d]
5 []
@include and @include_once
@include inserts another file at parse time; the semicolon is optional. @include_once avoids duplicate inclusion.
function print_hello() { print "hello\n"; }
@include "hello.inc";
BEGIN { print_hello(); }
@include_once "hello.inc";
@include_once "hello.inc";
BEGIN { print_hello(); }
You can use them inside a block or at the top level:
BEGIN {
@include "init.inc";
...
}
Comments
Hawk supports a single-line comment that begins with a hash sign # and the C-style multi-line comment.
x = y; # assign y to x.
/*
this line is ignored.
this line is ignored too.
*/
Reserved Words
The following words are reserved and cannot be used as a variable name, a parameter name, or a function name.
- @abort
- @argc
- @argv
- @global
- @include
- @include_once
- @local
- @nil
- @pragma
- @reset
- BEGIN
- END
- break
- case
- continue
- default
- delete
- do
- else
- exit
- for
- function
- getbline
- getline
- if
- in
- next
- nextfile
- nextofile
- printf
- return
- while
- switch
However, some of these words not beginning with @ can be used as normal names in the context of a module call. For example, mymod::break. In practice, the predefined names used for built-in commands, functions, and variables are treated as if they are reserved since you can't create another definition with the same name.
More Examples
- Print the first 10 even numbers
BEGIN {
i = 0
n = 1
while (i < 10) {
if (n % 2 == 0) {
print n
i++
}
n++
}
}
- Prompt the user for a positive number
BEGIN {
do {
printf "Enter a positive number: "
getline num
} while (num <= 0)
print "You entered:", num
}
- Print the multiplication table
BEGIN {
for (i = 1; i <= 10; i++) {
for (j = 1; j <= 10; j++) {
printf "%4d", i * j
}
printf "\n"
}
}
- Print only the even numbers from 1 to 16
BEGIN {
for (i = 1; i <= 20; i++) {
if (i % 2 != 0) {
continue
}
print i
if (i >= 16) {
break
}
}
}
- Count the frequency of words in a file
{
n = split($0, words, /[^[:alnum:]_]+/)
for (i = 1; i <= n; i++) {
freq[words[i]]++
}
}
END {
for (w in freq) {
printf "%s: %d\n", w, freq[w]
}
}
Garbage Collection
Values are managed primarily by reference counting. Map and array values are additionally garbage-collected.
Modules
Hawk supports various modules.
Hawk
- hawk::array
- hawk::call
- hawk::cmgr_exists
- hawk::function_exists
- hawk::gc
- hawk::gc_get_threshold
- hawk::gc_set_threshold
- hawk::gcrefs
- hawk::hash
- hawk::isarray
- hawk::ismap
- hawk::isnil
- hawk::map
- hawk::modlibdirs
- hawk::type
- hawk::typename
- hawk::GC_NUM_GENS
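A sketch of how the type-inspection functions in this list might be used. The list above does not give signatures, so this assumes that each is* function takes one value and returns 1 or 0, and that hawk::typename takes one value and returns a type name as a string:

```awk
BEGIN {
    a = @["x", "y"]
    m = @{"k": 1}

    # assumed: one argument each, numeric true/false result
    print hawk::isarray(a), hawk::ismap(m), hawk::isnil(@nil)

    # assumed: one argument, returns the name of the value's type
    print hawk::typename(a), hawk::typename(m), hawk::typename("text"), hawk::typename(10)
}
```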
String
The str module provides an extensive set of string manipulation functions.
- str::frombase64 - decode a base64-encoded byte string
- str::fromcharcode
- str::fromhex
- str::gsub - equivalent to gsub
- str::index
- str::isalnum
- str::isalpha
- str::isblank
- str::iscntrl
- str::isdigit
- str::isgraph
- str::islower
- str::isprint
- str::ispunct
- str::isspace
- str::isupper
- str::isxdigit
- str::length - equivalent to length
- str::ltrim
- str::match - similar to match. The optional third argument is the search start index. The optional fourth argument is equivalent to the third argument to match().
- str::normspace
- str::printf - equivalent to sprintf
- str::rindex
- str::rtrim
- str::split - equivalent to split
- str::sub - equivalent to sub
- str::substr - equivalent to substr
- str::tobase64 - encode data to a base64 byte string
- str::tocharcode - get the numeric value of the first character
- str::tohex
- str::tolower - equivalent to tolower
- str::tonum - convert a string to a number. A numeric value passed as a parameter is returned as it is. The leading prefix of 0b, 0, and 0x specifies the radix of 2, 8, and 16 respectively. Conversion stops when the end of the string is reached or the first invalid character for conversion is encountered.
- str::toupper - equivalent to toupper
- str::trim
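The radix rules described for str::tonum can be sketched as follows. The values in the comments follow from the description above rather than from a captured run:

```awk
BEGIN {
    print str::tonum("0x1F")     # 0x prefix: radix 16, i.e. 31
    print str::tonum("0b101")    # 0b prefix: radix 2, i.e. 5
    print str::tonum("017")      # leading 0: radix 8, i.e. 15
    print str::tonum("12abc")    # conversion stops at the first invalid character, leaving 12
    print str::tonum(42)         # a numeric argument is returned as it is
    print str::toupper("hawk"), str::length("hawk")   # equivalent to toupper and length
}
```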
System
The sys module provides various functions concerning the underlying operating system.
- sys::basename
- sys::chmod
- sys::close
- sys::closedir
- sys::dirname
- sys::dup
- sys::errmsg
- sys::fork
- sys::getegid
- sys::getenv
- sys::geteuid
- sys::getgid
- sys::getpid
- sys::getppid
- sys::gettid
- sys::gettime
- sys::getuid
- sys::kill
- sys::mkdir
- sys::mktime
- sys::open
- sys::opendir
- sys::openfd
- sys::pipe
- sys::read
- sys::readdir
- sys::settime
- sys::sleep
- sys::strftime
- sys::system
- sys::unlink
- sys::wait
- sys::write
You can read a file as raw bytes.
BEGIN {
f = sys::open("/etc/sysctl.conf", sys::O_RDONLY);
if (f >= 0) {
while (sys::read(f, x, 10) > 0) printf (B"%s", x);
sys::close(f);
}
}
You can map a raw file descriptor to a handle created by this module and use it.
BEGIN {
a = sys::openfd(1);
sys::write(a, B"let me write something here\n");
sys::close(a, sys::C_KEEPFD); ## set C_KEEPFD to release 1 without closing it.
##sys::close(a);
print "done\n";
}
Creating pipes and sharing them with a child process is straightforward.
BEGIN {
if (sys::pipe(p0, p1, sys::O_CLOEXEC | sys::O_NONBLOCK) <= -1)
##if (sys::pipe(p0, p1, sys::O_CLOEXEC) <= -1)
##if (sys::pipe(p0, p1) <= -1)
{
print "pipe error";
return -1;
}
a = sys::fork();
if (a <= -1)
{
print "fork error";
sys::close (p0);
sys::close (p1);
}
else if (a == 0)
{
## child
printf ("child.... %d %d %d\n", sys::getpid(), p0, p1);
sys::close (p1);
while (1)
{
n = sys::read (p0, k, 3);
if (n <= 0)
{
if (n == sys::RC_EAGAIN) continue; ## nonblock but data not available
if (n != 0) print "ERROR: " sys::errmsg();
break;
}
print k;
}
sys::close (p0);
return 123;
}
else
{
## parent
printf ("parent.... %d %d %d\n", sys::getpid(), p0, p1);
sys::close (p0);
sys::write (p1, B"hello");
sys::write (p1, B"world");
sys::close (p1);
##sys::wait(a, status, sys::WNOHANG);
while (sys::wait(a, status) != a);
if (sys::WIFEXITED(status)) print "Exit code: " sys::WEXITSTATUS(status);
else print "Child terminated abnormally"
}
}
You can read standard output of a child process in a parent process.
BEGIN {
if (sys::pipe(p0, p1, sys::O_NONBLOCK | sys::O_CLOEXEC) <= -1)
{
print "pipe error";
return -1;
}
a = sys::fork();
if (a <= -1)
{
print "fork error";
sys::close (p0);
sys::close (p1);
}
else if (a == 0)
{
## child
sys::close (p0);
stdout = sys::openfd(1);
sys::dup(p1, stdout);
print B"hello world";
print B"testing sys::dup()";
print B"writing to standard output..";
sys::close (p1);
sys::close (stdout);
}
else
{
sys::close (p1);
while (1)
{
n = sys::read(p0, k, 10);
if (n <= 0)
{
if (n == sys::RC_EAGAIN) continue; ## nonblock but data not available
if (n != 0) print "ERROR: " sys::errmsg();
break;
}
print "[" k "]";
}
sys::close (p0);
sys::wait(a);
}
}
You can duplicate file handles as necessary.
BEGIN {
a = sys::open("/etc/inittab", sys::O_RDONLY);
x = sys::open("/etc/fstab", sys::O_RDONLY);
b = sys::dup(a);
sys::close(a);
while (sys::read(b, abc, 100) > 0) printf (B"%s", abc);
print "-------------------------------";
c = sys::dup(x, b, sys::O_CLOEXEC);
## assertion: b == c
sys::close (x);
while (sys::read(c, abc, 100) > 0) printf (B"%s", abc);
sys::close (c);
}
Directory traversal is easy.
BEGIN {
d = sys::opendir("/etc", sys::DIR_SORT);
if (d >= 0)
{
while (sys::readdir(d,a) > 0)
{
print a;
sys::stat("/etc/" %% a, b);
for (i in b) print "\t", i, b[i];
}
sys::closedir(d);
}
}
You can get information about a network interface.
BEGIN {
if (sys::getnwifcfg("lo", sys::NWIFCFG_IN6, x) <= -1)
print sys::errmsg();
else
for (i in x) print i, x[i];
}
Socket functions are available.
BEGIN {
s = sys::socket();
...
sys::close (s);
}
ffi
- ffi::open
- ffi::close
- ffi::call
- ffi::errmsg
BEGIN {
ffi = ffi::open();
if (ffi::call(ffi, r, @B"getenv", @B"s>s", "PATH") <= -1) print ffi::errmsg();
else print r;
ffi::close (ffi);
}
mysql
BEGIN {
mysql = mysql::open();
if (mysql::connect(mysql, "localhost", "username", "password", "mysql") <= -1)
{
print "connect error -", mysql::errmsg();
}
if (mysql::query(mysql, "select * from user") <= -1)
{
print "query error -", mysql::errmsg();
}
result = mysql::store_result(mysql);
if (result <= -1)
{
print "store result error - ", mysql::errmsg();
}
while (mysql::fetch_row(result, row) > 0)
{
ncols = length(row);
for (i = 0; i < ncols; i++) print row[i];
print "----";
}
mysql::free_result(result);
mysql::close(mysql);
}
sqlite
Assuming /tmp/test.db with the following schema,
sqlite> .schema
CREATE TABLE a(x int, y varchar(255));
you can retrieve all rows as shown below:
@pragma entry main
@pragma implicit off
function main() {
@local db, stmt, row, i, ncols;
db = sqlite::open();
if (db <= -1) {
print "open error -", sqlite::errmsg();
return;
}
if (sqlite::connect(db, "/tmp/test.db", sqlite::CONNECT_READWRITE) <= -1) {
print "connect error -", sqlite::errmsg();
sqlite::close(db);
return;
}
sqlite::exec(db, "begin transaction");
sqlite::exec(db, "delete from a");
for (i = 0; i < 10; i++) {
@local sql, fld;
if (sqlite::escape_string(db, ((i % 2)? @b"'STXETX'": "'␂␃'") %% (math::rand() * 100), fld) <= -1) {
print "escape_string error -", sqlite::errmsg();
sqlite::exec(db, "rollback");
sqlite::close(db);
return;
}
sql=sprintf("insert into a(x,y) values(%d,'%s')", math::rand() * 100, fld);
print sql;
if (sqlite::exec(db, sql) <= -1) {
print "exec error -", sqlite::errmsg();
sqlite::exec(db, "rollback");
sqlite::close(db);
return;
}
}
sqlite::exec(db, "commit");
stmt = sqlite::prepare(db, "select x,y from a where x>?");
if (stmt <= -1) {
print "prepare error -", sqlite::errmsg();
sqlite::close(db);
return;
}
if (sqlite::bind(stmt, 1, 10) <= -1) {
print "bind error -", sqlite::errmsg();
sqlite::finalize(stmt);
sqlite::close(db);
return;
}
ncols = sqlite::column_count(stmt);
printf ("TOTAL %d COLUMNS:\n", ncols);
for (i = 1; i <= ncols; i++) {
print "-", i, sqlite::column_name(stmt, i);
}
while (sqlite::fetch_row(stmt, row, sqlite::FETCH_ROW_ARRAY) > 0) {
print "[id]", row[1], "[name]", row[2];
}
sqlite::finalize(stmt);
sqlite::close(db);
}
Incompatibility with AWK
Parameter passing
In AWK, it is possible for the caller to pass an uninitialized variable as a function parameter and obtain a modified value if the called function sets it to an array.
function q(a) {
a[1] = 20;
a[2] = 30;
}
BEGIN {
q(x);
for (i in x)
print i, x[i];
}
In Hawk, to achieve the same effect, you can indicate call-by-reference by prefixing the parameter name with an ampersand (&).
function q(&a) {
a[1] = 20;
a[2] = 30;
}
BEGIN {
q(x);
for (i in x)
print i, x[i];
}
Alternatively, you may create an array or a map before passing it to a function.
function q(a) {
a[1] = 20;
a[2] = 30;
}
BEGIN {
x[3] = 99; delete (x[3]); ## x = hawk::array() or x = hawk::map() also will do
q(x);
for (i in x)
print i, x[i];
}
Positional variable expression
There are subtle differences in handling expressions for positional variables. In Hawk, many of the ambiguity issues can be resolved by enclosing the expression in parentheses.
| Expression | Hawk | AWK |
|---|---|---|
| $++$++i | syntax error | OK |
| $(++$(++i)) | OK | syntax error |
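The parenthesized form can be read from the inside out. A sketch for a single illustrative input record, 2 x y z, with i not yet set:

```awk
# Evaluation order for the first record "2 x y z":
#   ++i       makes i equal to 1
#   ++$(1)    increments the first field from 2 to 3
#   $(3)      selects the third field, which is "y"
{ print $(++$(++i)) }
```

Because i is never reset, later records start from a larger i, so the walk-through in the comments applies to the first record only.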
Return value of getline
Others
- return is allowed in BEGIN blocks, END blocks, and pattern-action blocks.
