Reorganized the directory structure
This commit is contained in:
304
doc/page/awk-embed.md
Normal file
@ -0,0 +1,304 @@
|
||||
QSEAWK Embedding Guide {#awk-embed}
|
||||
================================================================================
|
||||
|
||||
Overview
|
||||
---------
|
||||
|
||||
The QSEAWK library is divided into two layers: core and standard.
|
||||
The core layer is a skeleton implementation that requires various callbacks
|
||||
to be useful. The standard layer provides general-purpose versions of these callbacks.
|
||||
For example, qse_awk_open() in the core layer requires a set of primitive
|
||||
functions to be able to create an awk object while qse_awk_openstd() provides
|
||||
qse_awk_open() with a standard set of primitive functions.
|
||||
|
||||
The core layer is defined in <qse/awk/awk.h> while the standard layer is
|
||||
defined in <qse/awk/std.h>. Naming-wise, the name of a standard layer function
adds *std* to the name of its core layer counterpart.
|
||||
|
||||
Embedding QSEAWK involves the following steps in the simplest form:
|
||||
|
||||
- create a new awk object
|
||||
- parse in a source script
|
||||
- create a new runtime context
|
||||
- execute pattern-action blocks or call a function
|
||||
- decrement the reference count of the return value
|
||||
- destroy the runtime context
|
||||
- destroy the awk object
|
||||
|
||||
The sample below follows these steps using as many standard layer functions as
|
||||
possible for convenience. It simply prints *hello, world* to the console.
|
||||
|
||||
\includelineno awk01.c
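
The reference-count step in the list above can be pictured with a small standalone sketch. Everything in it (`demo_val_t`, `demo_makeintval()`, `demo_refupval()`, `demo_refdownval()`) is hypothetical and is not the QSEAWK API. It only assumes the common convention that a value starts with no holder and is freed when its last holder lets go.

```c
#include <stdlib.h>

/* Hypothetical value type; real QSEAWK values are richer than this. */
typedef struct demo_val_t
{
    long ref; /* number of holders */
    long num; /* payload */
} demo_val_t;

/* A new value starts with no holder (assumption of this sketch). */
demo_val_t* demo_makeintval (long num)
{
    demo_val_t* val = (demo_val_t*)malloc (sizeof(*val));
    if (val != NULL)
    {
        val->ref = 0;
        val->num = num;
    }
    return val;
}

/* Registers one more holder of the value. */
void demo_refupval (demo_val_t* val)
{
    val->ref++;
}

/* Drops one hold. Frees the value and returns 1 when the last hold is gone,
   returns 0 while other holders remain. */
int demo_refdownval (demo_val_t* val)
{
    if (--val->ref > 0) return 0;
    free (val);
    return 1;
}
```

A caller that receives a value it must release would pair every hold with exactly one `demo_refdownval()`, which is the role the decrement step plays in the list above.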
|
||||
|
||||
The awk object is separated from the runtime context so that you can reuse
the same parsed script over different data streams.
|
||||
More complex samples concerning this will be shown later.
|
||||
|
||||
Locale
|
||||
------
|
||||
|
||||
While QSEAWK can use a wide character type as the default character type,
|
||||
the hosting program still has to initialize the locale whenever necessary.
|
||||
All the samples to be shown from here down will call a common function
|
||||
init_awk_sample_locale(), use the qse_main() macro as the main function,
|
||||
and call qse_runmain() for cross-platform and cross-character-set support.
|
||||
|
||||
Here is the function prototype.
|
||||
|
||||
\includelineno awk00.h
|
||||
|
||||
Here goes the actual function.
|
||||
|
||||
\includelineno awk00.c
|
||||
|
||||
Note that these two files are not part of QSEAWK and are used for the samples
here only.
|
||||
|
||||
Customizing Console I/O
|
||||
-----------------------
|
||||
|
||||
The qse_awk_rtx_openstd() function implements I/O related callback functions
|
||||
for files, pipes, and the console. While you are unlikely to change the
|
||||
definition of files and pipes, the console is the most frequently customized
|
||||
I/O object. Most likely, you may want to feed the console with a string or
|
||||
something and capture the console output into a buffer. Though you can define
|
||||
your own callback functions for files, pipes, and the console, it is possible
|
||||
to override the callback functions implemented by qse_awk_rtx_openstd()
|
||||
partially. This sample redefines the console handler while keeping the file
and pipe handlers provided by qse_awk_rtx_openstd().
|
||||
|
||||
\includelineno awk02.c
|
||||
|
||||
Extension Area
|
||||
--------------
|
||||
|
||||
When creating an awk object or a runtime context object, you can ask for
a private extension area to be allocated with the main object. You can
|
||||
use this extension area to store data associated with the object.
|
||||
You can specify the size of the extension area when calling qse_awk_open(),
|
||||
qse_awk_rtx_open(), qse_awk_openstd(), and qse_awk_rtx_openstd().
|
||||
These functions initialize the area to zeros. You can get the pointer
|
||||
to the beginning of the area with qse_awk_getxtn() and qse_awk_rtx_getxtn().
|
||||
|
||||
|
||||
In the sample above, the string and the buffer used for I/O customization
|
||||
are declared globally. When you have multiple runtime contexts and independent
|
||||
console strings and buffers, you may want to associate a runtime context
|
||||
with an independent console string and buffer. The extension area that can
|
||||
be allocated on demand when you create a runtime context comes in handy.
|
||||
The sample below shows how to associate them through the extension area
|
||||
but does not create multiple runtime contexts for simplicity.
|
||||
|
||||
\includelineno awk03.c
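
The idea of an extension area can also be shown without QSEAWK at all. The sketch below is an illustration under stated assumptions, not how the library is implemented: `demo_obj_t`, `demo_open()`, `demo_getxtn()`, `demo_close()`, and `console_xtn_t` are hypothetical names. It allocates the object and its extension area as one zero-filled block and hands out a pointer to the bytes that follow the object. It relies on C11 `max_align_t` so that the area is suitably aligned for any data type.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object type; not a QSEAWK structure. */
typedef struct demo_obj_t
{
    size_t      xtnsize; /* size of the extension area in bytes */
    max_align_t xtn[];   /* extension area, aligned for any type */
} demo_obj_t;

/* Allocates the object and xtnsize extra bytes in one zero-filled block.
   Returns NULL on allocation failure. */
demo_obj_t* demo_open (size_t xtnsize)
{
    demo_obj_t* obj = (demo_obj_t*)calloc (1, sizeof(*obj) + xtnsize);
    if (obj != NULL) obj->xtnsize = xtnsize;
    return obj;
}

/* Returns the beginning of the extension area. */
void* demo_getxtn (demo_obj_t* obj)
{
    return (void*)obj->xtn;
}

/* Releases the object together with its extension area. */
void demo_close (demo_obj_t* obj)
{
    free (obj);
}

/* Per-object console data that a host program might keep in the area. */
typedef struct console_xtn_t
{
    const char* input;      /* string fed to the console */
    char        output[64]; /* buffer capturing console output */
    size_t      outlen;     /* number of bytes captured so far */
} console_xtn_t;
```

Because the data lives inside the block that holds the object, each object carries its own console string and buffer, and nothing needs to be declared globally.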
|
||||
|
||||
Entry Point
|
||||
-----------
|
||||
|
||||
A typical AWK program executes BEGIN, pattern-action, and END blocks. QSEAWK
provides a way to drive an AWK program in a different style: you can execute
a particular user-defined function on demand. This can be useful if you want
to drive an AWK program in an event-driven manner, though you are free to
choose the entry point that suits you. The qse_awk_rtx_call() function
used here is limited to user-defined functions. It cannot call built-in
functions like *gsub* or *index*.
|
||||
|
||||
\includelineno awk04.c
|
||||
|
||||
If you want to pass arguments to the function, you must create values with
value creation functions, update their reference counts, and pass them to
qse_awk_rtx_call(). The sample below creates 2 integer values with
qse_awk_rtx_makeintval() and passes them to the *pow* function.
|
||||
|
||||
\includelineno awk05.c
|
||||
|
||||
While qse_awk_rtx_call() looks up a function in the function table by name,
|
||||
you can find the function in advance and use the information found when
|
||||
calling it. qse_awk_rtx_findfun() and qse_awk_rtx_callfun() serve this purpose.
qse_awk_rtx_call() in the sample above can be translated
|
||||
into 2 separate calls to qse_awk_rtx_findfun() and qse_awk_rtx_callfun().
|
||||
You can reduce look-up overhead via these 2 functions if you are to execute
|
||||
the same function multiple times.
|
||||
|
||||
\includelineno awk06.c
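
The saving from looking a function up once can be illustrated with a plain C sketch. It is not QSEAWK code: the table, `demo_findfun()`, `demo_callfun()`, and `demo_call()` are hypothetical stand-ins that only mirror the find-then-call split described above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical function table entry. */
typedef long (*demo_fun_t) (long a, long b);

typedef struct demo_entry_t
{
    const char* name;
    demo_fun_t  fun;
} demo_entry_t;

static long demo_add (long a, long b) { return a + b; }
static long demo_mul (long a, long b) { return a * b; }

static const demo_entry_t demo_table[] =
{
    { "add", demo_add },
    { "mul", demo_mul }
};

/* Looks up a function by name. Returns NULL if it is not found. */
const demo_entry_t* demo_findfun (const char* name)
{
    size_t i;
    for (i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++)
    {
        if (strcmp (demo_table[i].name, name) == 0) return &demo_table[i];
    }
    return NULL;
}

/* Calls a function found earlier. No name comparison happens here. */
long demo_callfun (const demo_entry_t* entry, long a, long b)
{
    return entry->fun (a, b);
}

/* Calls by name: one lookup plus one call on every invocation.
   Returns 0 on success and -1 if the name is unknown. */
int demo_call (const char* name, long a, long b, long* result)
{
    const demo_entry_t* entry = demo_findfun (name);
    if (entry == NULL) return -1;
    *result = demo_callfun (entry, a, b);
    return 0;
}
```

When the same function runs many times, keeping the pointer returned by `demo_findfun()` and calling `demo_callfun()` in the loop pays for the name lookup only once.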
|
||||
|
||||
Similarly, you can pass a more complex value than a plain number or string.
|
||||
You can compose a map value with qse_awk_rtx_makemapval() or
|
||||
qse_awk_rtx_makemapvalwithdata(). The following sample demonstrates how to
|
||||
use qse_awk_rtx_makemapvalwithdata(), pass a created map value to
|
||||
qse_awk_rtx_call(), and traverse a map value returned with
|
||||
qse_awk_rtx_getfirstmapvalitr() and qse_awk_rtx_getnextmapvalitr().
|
||||
|
||||
\includelineno awk07.c
|
||||
|
||||
Built-in Global Variables
|
||||
--------------------------
|
||||
|
||||
QSEAWK predefines global variables such as *SUBSEP* and *ARGC*. You can add
|
||||
your own built-in variables in the global scope with qse_awk_addgbl(). You
|
||||
must add new variables before qse_awk_parse() or qse_awk_parsestd(). Later,
|
||||
you can get the values of the global variables using qse_awk_rtx_getgbl()
|
||||
with an ID returned by qse_awk_addgbl(). The IDs of the predefined global
|
||||
variables are available as the ::qse_awk_gbl_id_t type values.
|
||||
|
||||
\includelineno awk08.c
|
||||
|
||||
Built-in Functions
|
||||
------------------
|
||||
|
||||
QSEAWK predefines built-in functions like *match* and *gsub*. You can add your
|
||||
own built-in function with qse_awk_addfnc(). The following sample shows how to
|
||||
add a function named *basename* that gets the base file name part of a path name.
|
||||
|
||||
\includelineno awk09.c
|
||||
|
||||
In the sample above, the *basename* function returns the resulting string. If
the implementation encounters an error, it returns -1, which causes the runtime
context to abort with an error. To avoid this, you can redefine basename() to
return the resulting string via the second parameter and return 0 or -1 as the
return value. To pass an argument by reference, specify the letter *r* in the
*arg.spec* field at the position of that argument. That is, specifying *r* at
the second position in the *arg.spec* string means that the second argument is
passed by reference.
|
||||
|
||||
\includelineno awk10.c
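
The calling convention described above, with the result delivered through the second parameter and a status code as the return value, looks like this in plain C. The function `demo_basename()` is a hypothetical illustration and not part of QSEAWK. It assumes `/` is the only path separator and does not treat a trailing slash specially.

```c
#include <string.h>

/* Stores a pointer to the base name part of path in *base.
   Returns 0 on success and -1 if either argument is NULL.
   The caller decides how to react to -1, so a failure need not abort anything. */
int demo_basename (const char* path, const char** base)
{
    const char* slash;

    if (path == NULL || base == NULL) return -1;

    /* The base name starts right after the last slash, if there is one. */
    slash = strrchr (path, '/');
    *base = (slash != NULL) ? slash + 1 : path;
    return 0;
}
```

Separating the status from the result lets the caller test for success first and use the result only when the call returned 0.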
|
||||
|
||||
Customizing Other Behaviors
|
||||
---------------------------
|
||||
|
||||
QSEAWK comes with more trait options that you can use to change its
behavior. For instance, you have seen how to disable the standard BEGIN,
END, pattern-action blocks by turning off the #QSE_AWK_PABLOCK trait option
in several sample programs above.
|
||||
|
||||
The ::qse_awk_trait_t type defines various trait options that you can turn
|
||||
on or off using qse_awk_setopt() with #QSE_AWK_TRAIT. The following code
|
||||
snippet shows how to disable all built-in I/O statements like *getline*,
|
||||
*print*, *printf*, *close*, *fflush*, piping, and file redirection.
|
||||
Additionally, it disables the BEGIN, END, pattern-action blocks.
|
||||
|
||||
~~~~~{.c}
|
||||
qse_awk_getopt (awk, QSE_AWK_TRAIT, &opt);
|
||||
opt &= ~QSE_AWK_PABLOCK;
|
||||
opt &= ~QSE_AWK_RIO;
|
||||
qse_awk_setopt (awk, QSE_AWK_TRAIT, &opt);
|
||||
~~~~~
|
||||
|
||||
This way, you can make the QSEAWK language behave differently to suit your
own needs.
|
||||
|
||||
Multiple Instances
|
||||
------------------
|
||||
|
||||
The awk object and the runtime context object each reside in their own
allocated memory blocks and maintain related information in their own object
space. Multiple instances are independent of each other.
|
||||
|
||||
You can run a script over multiple data streams by creating multiple runtime
|
||||
context objects from a single awk object.
|
||||
|
||||
TBD.
|
||||
|
||||
Memory Pool
|
||||
-----------
|
||||
|
||||
You can confine the memory used for an awk object, including the related
runtime context objects, to a single memory pool.
|
||||
|
||||
TBD.
|
||||
|
||||
Writing Modules
|
||||
---------------
|
||||
|
||||
Modular built-in functions and variables reside in a shared object.
|
||||
|
||||
TBD.
|
||||
|
||||
Embedding in C++
|
||||
-----------------
|
||||
|
||||
The QSE::Awk class and the QSE::StdAwk class wrap the underlying C library routines
for better object-orientation. These two classes are defined in <qse/awk/Awk.hpp>
and <qse/awk/StdAwk.hpp> respectively. They simplify the embedding task at the cost
of slight performance overhead. The hello-world sample in C can be rewritten in
fewer lines in C++.
|
||||
|
||||
\includelineno awk21.cpp
|
||||
|
||||
Customizing the console I/O is not much different in C++. When using the
|
||||
QSE::StdAwk class, you can inherit the class and implement the following methods:
|
||||
|
||||
- int openConsole (Console& io);
|
||||
- int closeConsole (Console& io);
|
||||
- int flushConsole (Console& io);
|
||||
- int nextConsole (Console& io);
|
||||
- ssize_t readConsole (Console& io, char_t* data, size_t size);
|
||||
- ssize_t writeConsole (Console& io, const char_t* data, size_t size);
|
||||
|
||||
The sample below shows how to use a string as the console input
|
||||
and store the console output to a string buffer.
|
||||
|
||||
\includelineno awk22.cpp
|
||||
|
||||
Alternatively, you can choose to implement QSE::Awk::Console::Handler
|
||||
and call QSE::Awk::setConsoleHandler() with the implemented handler.
|
||||
This way, you do not need to inherit QSE::Awk or QSE::StdAwk.
|
||||
The sample here shows how to customize the console I/O by implementing
|
||||
QSE::Awk::Console::Handler. It also shows how to run the same script
|
||||
over two different data streams in a row.
|
||||
|
||||
\includelineno awk23.cpp
|
||||
|
||||
|
||||
Changes in 0.6.0
|
||||
----------------
|
||||
|
||||
### qse_awk_parsestd() ###
|
||||
|
||||
The second parameter of qse_awk_parsestd() specifies the input script.
|
||||
|
||||
In 0.5.6, it accepted a single script for input.
|
||||
|
||||
~~~~~{.c}
|
||||
qse_awk_parsestd_t psin;
|
||||
psin.type = QSE_AWK_PARSESTD_STR;
|
||||
psin.u.str.ptr = src;
|
||||
psin.u.str.len = qse_strlen(src);
|
||||
qse_awk_parsestd (awk, &psin, QSE_NULL);
|
||||
~~~~~
|
||||
|
||||
In 0.6.X, it accepts an array of scripts for input. To specify a single script,
|
||||
use an array of 2 elements whose last element is of the #QSE_AWK_PARSESTD_NULL
|
||||
type.
|
||||
|
||||
~~~~~{.c}
|
||||
qse_awk_parsestd_t psin[2];
|
||||
psin[0].type = QSE_AWK_PARSESTD_STR;
|
||||
psin[0].u.str.ptr = src;
|
||||
psin[0].u.str.len = qse_strlen(src);
|
||||
psin[1].type = QSE_AWK_PARSESTD_NULL;
|
||||
qse_awk_parsestd (awk, psin, QSE_NULL);
|
||||
~~~~~
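
Ending the array with a null-typed element lets the parser find the end without a separate count. The sketch below shows that pattern in isolation. Its types and names (`demo_src_t`, `DEMO_SRC_NULL`, `demo_count_sources()`) are hypothetical and only imitate the shape of the snippet above.

```c
#include <stddef.h>

/* Hypothetical script source types; the NULL type marks the end of an array. */
typedef enum demo_src_type_t
{
    DEMO_SRC_NULL = 0,
    DEMO_SRC_STR,
    DEMO_SRC_FILE
} demo_src_type_t;

typedef struct demo_src_t
{
    demo_src_type_t type;
    const char*     text; /* script text or file name; unused for the sentinel */
} demo_src_t;

/* Counts the sources that precede the terminating DEMO_SRC_NULL element. */
size_t demo_count_sources (const demo_src_t* src)
{
    size_t n = 0;
    while (src[n].type != DEMO_SRC_NULL) n++;
    return n;
}
```

An array of two elements whose last element is the sentinel therefore describes exactly one script, which matches the single-script case shown above.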
|
||||
|
||||
### 0 upon Opening ###
|
||||
I/O handlers can return 0 for success upon opening.
|
||||
|
||||
|
||||
|
||||
\skipline ---------------------------------------------------------------------
|
||||
\skipline the sample files are listed here for example list generation purpose.
|
||||
\skipline ---------------------------------------------------------------------
|
||||
\example awk01.c
|
||||
\example awk02.c
|
||||
\example awk03.c
|
||||
\example awk04.c
|
||||
\example awk05.c
|
||||
\example awk06.c
|
||||
\example awk07.c
|
||||
\example awk08.c
|
||||
\example awk09.c
|
||||
\example awk10.c
|
||||
\example awk21.cpp
|
||||
\example awk22.cpp
|
||||
\example awk23.cpp
|
||||
|
1571
doc/page/awk-lang.md
Normal file
File diff suppressed because it is too large
231
doc/page/installation.md
Normal file
@ -0,0 +1,231 @@
|
||||
Installation {#installation}
|
||||
================================================================================
|
||||
|
||||
Source Package
|
||||
--------------
|
||||
|
||||
You can download the source package from
|
||||
|
||||
http://code.google.com/p/qse/downloads/list
|
||||
|
||||
A source package is named in the format *qse-<version>.tar.gz*.
|
||||
|
||||
Alternatively, you can check out the latest source files from the subversion
|
||||
repository by executing the following command:
|
||||
|
||||
svn checkout http://qse.googlecode.com/svn/trunk/qse/
|
||||
|
||||
Building on Unix/Linux
|
||||
----------------------
|
||||
|
||||
The project uses the standard autoconf/automake generated script files for
|
||||
building. If you work on a system where these scripts can run, you can
|
||||
follow the standard procedures of configuring and making the project.
|
||||
|
||||
$ ./configure
|
||||
$ make
|
||||
$ make install
|
||||
|
||||
You can use this method of building for MinGW or Cygwin on Windows.
|
||||
|
||||
Cross-compiling for WIN32
|
||||
-------------------------
|
||||
|
||||
While the autoconf/automake scripts may not support your native compilers,
|
||||
you can cross-compile the project for WIN32/WIN64 with a cross-compiler. Get a
|
||||
cross-compiler installed first and run the *configure* script with a host
|
||||
and a target.
|
||||
|
||||
With MINGW-W64, you may run *configure* as shown below for a 32-bit target:
|
||||
|
||||
$ ./configure --host=i686-w64-mingw32 --target=i686-w64-mingw32
|
||||
$ make
|
||||
$ make install
|
||||
|
||||
With MINGW-W64, you may run *configure* as shown below for a 64-bit target:
|
||||
|
||||
$ ./configure --host=x86_64-w64-mingw32 --target=x86_64-w64-mingw32
|
||||
$ make
|
||||
$ make install
|
||||
|
||||
The actual host and target names may vary depending on the cross-compiler
|
||||
installed.
|
||||
|
||||
Native Makefiles
|
||||
----------------
|
||||
|
||||
The project provides makefiles for some selected compilers and platforms.
|
||||
The makefiles were generated with bakefile (www.bakefile.org) and can be
|
||||
found in the *bld* subdirectory.
|
||||
|
||||
- os2-watcom/makefile (Watcom C/C++ for OS/2)
|
||||
- win32-watcom/makefile (Watcom C/C++ for Windows)
|
||||
- win32-borland/makefile (Borland C/C++ for Windows)
|
||||
- win32-msvc/makefile (Microsoft Visual C/C++ for Windows)
|
||||
|
||||
You can execute your native make utility for building in each subdirectory.
|
||||
For example, to build for OS/2 with Watcom C/C++ in the release mode using
|
||||
the wide character type, you can execute this:
|
||||
|
||||
cd bld\os2-watcom
|
||||
wmake BUILD=release CHAR=wchar
|
||||
|
||||
Build Options
|
||||
-------------
|
||||
|
||||
The configure script and the native makefiles provide some options that you
|
||||
can use to change the build environment. The options presented here can be
|
||||
specified to the command line of the configure script or the native make
|
||||
utilities.
|
||||
|
||||
For the configure script, the options should be prefixed with double
hyphens and multiple options can be specified together. See this example:
|
||||
|
||||
./configure --enable-debug --disable-wchar
|
||||
|
||||
|
||||
For the native makefiles, the options can be appended to the end of the command
|
||||
line. See this example:
|
||||
|
||||
make BUILD=debug CHAR=mchar
|
||||
|
||||
### Build Mode ###
|
||||
|
||||
You can choose to build the project in the **release** mode or in the **debug**
|
||||
mode. The resulting libraries and programs in the **debug** mode contain
|
||||
extra information useful for debugging. The default mode is **release**.
|
||||
|
||||
value   | configure     | native makefile
--------|---------------|-----------------
debug   | enable-debug  | BUILD=debug
release | disable-debug | BUILD=release
|
||||
|
||||
### Character Type ###
|
||||
|
||||
You can choose between the wide character type and the multi-byte character
|
||||
type as a basic character type represented in the #qse_char_t type. The default
|
||||
character type is the wide character type.
|
||||
|
||||
value      | configure     | native makefile
-----------|---------------|-----------------
wide       | enable-wchar  | CHAR=wchar
multi-byte | disable-wchar | CHAR=mchar
|
||||
|
||||
If the wide character type is chosen:
|
||||
- #QSE_CHAR_IS_WCHAR is defined.
|
||||
- #qse_char_t maps to #qse_wchar_t.
|
||||
|
||||
If the multi-byte character type is chosen:
|
||||
- #QSE_CHAR_IS_MCHAR is defined.
|
||||
- #qse_char_t maps to #qse_mchar_t.
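
The effect of the character type selection on program code can be sketched with a compile-time switch. The names below (`DEMO_CHAR_IS_WCHAR`, `DEMO_CHAR_IS_MCHAR`, `demo_char_t`, `DEMO_T`, `demo_strlen`) are hypothetical and merely imitate the arrangement described above. They are not the library's definitions.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Define DEMO_CHAR_IS_WCHAR at compile time to select the wide character type.
   Otherwise the multi-byte character type is selected. */
#if defined(DEMO_CHAR_IS_WCHAR)
typedef wchar_t demo_char_t;
#   define DEMO_T(s)      L ## s
#   define demo_strlen(s) wcslen(s)
#else
#   define DEMO_CHAR_IS_MCHAR 1
typedef char demo_char_t;
#   define DEMO_T(s)      s
#   define demo_strlen(s) strlen(s)
#endif

/* Code written against demo_char_t compiles unchanged in either mode. */
size_t demo_hello_len (void)
{
    const demo_char_t* msg = DEMO_T("hello, world");
    return demo_strlen (msg);
}
```

Writing literals through a macro such as `DEMO_T` and calling string functions through a macro such as `demo_strlen` is what keeps one source tree usable with both character types.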
|
||||
|
||||
### Bundled Unicode Routines ###
|
||||
|
||||
You can choose to use the bundled character classification routines
|
||||
based on unicode. It is disabled by default.
|
||||
|
||||
value | configure               | native makefile
------|-------------------------|---------------------
on    | enable-bundled-unicode  | BUNDLED_UNICODE=on
off   | disable-bundled-unicode | BUNDLED_UNICODE=off
|
||||
|
||||
Enabling this option includes the routines defined in <qse/cmn/uni.h>
in the resulting library. It also makes some routines
defined in <qse/cmn/chr.h> use these bundled unicode routines.
|
||||
|
||||
### Character Encoding Conversion ###
|
||||
|
||||
You can include extra routines for character encoding conversion into
|
||||
the resulting library. This option is disabled by default.
|
||||
|
||||
value | configure      | native makefile
------|----------------|-----------------
on    | enable-xcmgrs  | XCMGRS=on
off   | disable-xcmgrs | XCMGRS=off
|
||||
|
||||
More #qse_cmgr_t instances are made available when this option is enabled.
|
||||
The UTF-8 conversion and the locale-based conversion are included regardless
|
||||
of this option.
|
||||
|
||||
### TCPV40HDRS ###
|
||||
|
||||
The option, when turned on, enables you to use *tcp32dll.dll* and *so32dll.dll*
|
||||
instead of *tcpip32.dll*. Doing so allows a resulting program to run on OS/2
|
||||
systems without the 32-bit TCP/IP stack. This option is off by default and
|
||||
available for the native makefile for Watcom C/C++ for OS/2 only.
|
||||
|
||||
wmake TCPV40HDRS=on
|
||||
|
||||
### C++ ###
|
||||
|
||||
C++ support is enabled by default as long as a C++ compiler is detected.
|
||||
If you want to disable it for any reason, you can specify `--disable-cxx`.
|
||||
|
||||
./configure --disable-cxx
|
||||
|
||||
### SCO UNIX System V/386 Release 3.2 ###
|
||||
|
||||
- If /usr/include/netinet and /usr/include/net are missing,
|
||||
check if there are /usr/include/sys/netinet and /usr/include/sys/net.
|
||||
If they exist, you can make symbolic links.
|
||||
|
||||
cd /usr/include
|
||||
ln -sf sys/netinet netinet
|
||||
ln -sf sys/net net
|
||||
|
||||
- Specify GREP if configure fails to find an acceptable grep.
|
||||
- Specify RANLIB to /bin/true.
|
||||
/bin/ranlib ended up like this: *ranlib: .libs/libqsecmn.a: not an archive*
|
||||
- Build in the source tree. Building outside the source tree is likely to fail
|
||||
because of a deficiency in the bundled make utility.
|
||||
- Do not include -g in CFLAGS.
|
||||
|
||||
./configure GREP=/bin/grep RANLIB=/bin/true CFLAGS=""
|
||||
|
||||
- Remove $(LIBLTDL) from LIBADD_LIB_COMMON in lib/awk/Makefile
|
||||
- Remove $(LIBLTDL) from libqsehttp_la_LIBADD in lib/http/Makefile
|
||||
|
||||
make
|
||||
|
||||
### Mac OS X/Darwin ###
|
||||
|
||||
No special consideration is required if you work with a moderately recent
version of the developer tools. Versions of the GCC compiler shipped by Apple
in the early 2000s or earlier have an option called `-no-cpp-precomp`.
|
||||
|
||||
\code
|
||||
% cc -E /tmp/a.c
|
||||
#1 "/tmp/a.c"
|
||||
|
||||
|
||||
int main ( )
|
||||
{
|
||||
Lxxxx ;
|
||||
return 0 ;
|
||||
}
|
||||
|
||||
% cc -E -no-cpp-precomp /tmp/a.c
|
||||
#1 "/tmp/a.c"
|
||||
|
||||
|
||||
int main ( )
|
||||
{
|
||||
Lxxxx ;
|
||||
return 0 ;
|
||||
}
|
||||
\endcode
|
||||
|
||||
Without the `-no-cpp-precomp` option, some preprocessing produces erroneous
|
||||
code. If your compiler behaves this way, you should specify `-no-cpp-precomp`
|
||||
to CFLAGS or CXXFLAGS when running configure. For instance,
|
||||
|
||||
$ ./configure --prefix=/usr/local --disable-cxx CFLAGS="-Wall -g -no-cpp-precomp"
|
||||
|
||||
|
||||
### More options ###
|
||||
|
||||
More options are available for the configure script. Execute this for more
|
||||
information:
|
||||
|
||||
./configure --help
|
||||
|
29
doc/page/mainpage.md
Normal file
@ -0,0 +1,29 @@
|
||||
QSE {#mainpage}
|
||||
================================================================================
|
||||
|
||||
\image html qse-logo.png
|
||||
|
||||
The QSE library implements AWK, SED, and Unix commands in an embeddable form
|
||||
and defines data types, functions, and classes that you can use when you embed
|
||||
them into an application. It also provides more fundamental data types and
|
||||
functions needed when you deal with memory, streams, and data structures.
|
||||
The interface has been designed to be flexible enough for the embedding
application and an embedded object to access various aspects of each other.
|
||||
|
||||
The library is licensed under the GNU Lesser General Public License version 3:
|
||||
http://www.gnu.org/licenses/
|
||||
|
||||
The project webpage: https://code.miflux.com/@qse or https://github.com/hyung-hwan/qse
|
||||
|
||||
For further information, contact:
|
||||
Chung, Hyung-Hwan <hyunghwan.chung@gmail.com>
|
||||
|
||||
See the subpages for more information.
|
||||
|
||||
- \ref installation
|
||||
- \ref awk-lang
|
||||
- \ref awk-embed
|
||||
- \ref sed-cmd
|
||||
- \ref sed-embed
|
||||
- \subpage mem "Memory Management"
|
||||
|
86
doc/page/mem.doc
Normal file
@ -0,0 +1,86 @@
|
||||
/** @page mem Memory Management
|
||||
|
||||
@section mem_overview Overview
|
||||
|
||||
A memory manager is an instance of a structure type #qse_mmgr_t. Creating
|
||||
and/or initializing an object requires a memory manager to be passed in.
|
||||
|
||||
The default memory manager is merely a wrapper to memory allocation functions
|
||||
provided by underlying operating systems: HeapAlloc/HeapReAlloc/HeapFree
|
||||
on _WIN32 and malloc/realloc/free on other platforms. You can get this default
|
||||
memory manager with qse_getdflmmgr() and can change it with qse_setdflmmgr().
|
||||
|
||||
Typically, the name of a function creating an object begins with @b qse_,
|
||||
ends with @b _open, and accepts a memory manager as the first parameter.
|
||||
See qse_mbs_open() for instance. So you can customize memory management
|
||||
at the per-object level.
|
||||
|
||||
Three types of special memory allocators are provided in the library.
|
||||
- #qse_xma_t - generic private heap allocator
|
||||
- #qse_fma_t - fixed-size block allocator
|
||||
- #qse_pma_t - pool-based block allocator
|
||||
|
||||
@section mem_xma Private Heap
|
||||
|
||||
While the default memory manager allocates memory from a system-wide heap,
|
||||
you can create a private heap and use it when you create an object.
|
||||
The #qse_xma_t type defines a private heap manager and its functions offer
|
||||
sufficient interface to form a memory manager over a private heap.
|
||||
|
||||
A typical usage is shown below:
|
||||
|
||||
@code
|
||||
qse_mmgr_t mmgr;
qse_xma_t* heap;

// Create a private heap using the default memory manager
heap = qse_xma_open (QSE_NULL, 0, 1024 * 1024); // 1M heap

// Initialize a memory manager with the heap
mmgr.alloc = (qse_mmgr_alloc_t)qse_xma_alloc;
mmgr.realloc = (qse_mmgr_realloc_t)qse_xma_realloc;
mmgr.free = (qse_mmgr_free_t)qse_xma_free;
mmgr.ctx = heap;
|
||||
|
||||
// You can pass 'mmgr' when you create/initialize a different object.
|
||||
....
|
||||
....
|
||||
|
||||
// Destroy the private heap
|
||||
qse_xma_close (heap);
|
||||
@endcode
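
The way a memory manager forwards requests to a heap through callbacks and a context pointer can be sketched without the library. The code below is a simplified illustration under stated assumptions: `demo_mmgr_t` copies only the field names used above, the callback signatures (context first) are assumed, and `demo_heap_t` is a trivial bump arena standing in for a real private heap. It never reclaims individual blocks.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical memory manager: three callbacks plus a context pointer.
typedef struct demo_mmgr_t
{
    void* (*alloc)   (void* ctx, size_t size);
    void* (*realloc) (void* ctx, void* ptr, size_t size);
    void  (*free)    (void* ctx, void* ptr);
    void* ctx;
} demo_mmgr_t;

// A bump arena over a caller-supplied buffer.
typedef struct demo_heap_t
{
    unsigned char* base;
    size_t capa;
    size_t used;
} demo_heap_t;

#define DEMO_ALIGN 16u
#define DEMO_ROUNDUP(n) (((n) + DEMO_ALIGN - 1) / DEMO_ALIGN * DEMO_ALIGN)

static void* demo_heap_alloc (void* ctx, size_t size)
{
    demo_heap_t* heap = (demo_heap_t*)ctx;
    unsigned char* blk;
    size_t need;

    if (size > heap->capa) return NULL;
    need = DEMO_ALIGN + DEMO_ROUNDUP(size); // header + payload
    if (need > heap->capa - heap->used) return NULL;

    blk = heap->base + heap->used;
    heap->used += need;
    memcpy (blk, &size, sizeof(size)); // remember the payload size in the header
    return blk + DEMO_ALIGN;
}

static void* demo_heap_realloc (void* ctx, void* ptr, size_t size)
{
    size_t oldsize;
    void* newptr;

    if (ptr == NULL) return demo_heap_alloc (ctx, size);
    memcpy (&oldsize, (unsigned char*)ptr - DEMO_ALIGN, sizeof(oldsize));
    if (size <= oldsize) return ptr; // the existing block is large enough

    newptr = demo_heap_alloc (ctx, size);
    if (newptr != NULL) memcpy (newptr, ptr, oldsize);
    return newptr;
}

static void demo_heap_free (void* ctx, void* ptr)
{
    // A bump arena releases everything at once when its buffer goes away.
    (void)ctx;
    (void)ptr;
}

// Binds a memory manager to a heap laid over buf, which should be
// suitably aligned (memory obtained from malloc is).
void demo_mmgr_init (demo_mmgr_t* mmgr, demo_heap_t* heap, void* buf, size_t size)
{
    heap->base = (unsigned char*)buf;
    heap->capa = size;
    heap->used = 0;

    mmgr->alloc   = demo_heap_alloc;
    mmgr->realloc = demo_heap_realloc;
    mmgr->free    = demo_heap_free;
    mmgr->ctx     = heap;
}
```

Code that receives the manager calls `mmgr->alloc (mmgr->ctx, size)` and never needs to know which heap serves the request, which is what makes per-object customization possible.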
|
||||
|
||||
Note that creating a private heap requires a memory manager, too. The example
|
||||
above used the default memory manager to create a private heap within the
|
||||
global heap. This means that you can split a heap into smaller subheaps.
|
||||
|
||||
@section mem_fma Fixed-size Block Allocator
|
||||
|
||||
If memory blocks to allocate share the same size, you can use #qse_fma_t
|
||||
for performance. It achieves fast memory allocation as it knows the block
|
||||
size in advance. The blocks allocated with this memory allocator
|
||||
don't outlive the memory allocator itself. That is, qse_fma_close() or
|
||||
qse_fma_fini() invalidates all the pointers allocated with qse_fma_alloc().
|
||||
|
||||
@code
|
||||
qse_fma_t* fma; int* ptr;
|
||||
fma = qse_fma_open (QSE_NULL, 0, sizeof(int), 10, 0); // create an allocator
|
||||
ptr = (int*)qse_fma_alloc (fma, sizeof(int)); // allocate a block
|
||||
*ptr = 20; // access the block
|
||||
qse_fma_free (fma, ptr); // free the block
|
||||
qse_fma_close (fma); // destroy the allocator
|
||||
@endcode
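
Knowing the block size in advance is what makes such an allocator fast. The standalone sketch below shows one common way to exploit it, a free list threaded through the unused blocks of a single slab. It is an illustration only. The names `demo_fma_t`, `demo_fma_init()`, `demo_fma_alloc()`, `demo_fma_free()`, and `demo_fma_fini()` are hypothetical, and nothing here is claimed about how #qse_fma_t is implemented.

```c
#include <stdlib.h>

typedef struct demo_fma_t
{
    unsigned char* slab;     // one contiguous allocation holding all blocks
    void*          freelist; // singly linked list threaded through free blocks
    size_t         blksize;  // block size after rounding
} demo_fma_t;

// Prepares nblks blocks of at least blksize bytes each.
// Returns 0 on success and -1 if the slab cannot be allocated.
int demo_fma_init (demo_fma_t* fma, size_t blksize, size_t nblks)
{
    size_t i;

    // Each free block stores a link pointer, so round the size up accordingly.
    if (blksize < sizeof(void*)) blksize = sizeof(void*);
    blksize = (blksize + sizeof(void*) - 1) / sizeof(void*) * sizeof(void*);

    fma->slab = (unsigned char*)malloc (blksize * nblks);
    if (fma->slab == NULL) return -1;

    fma->blksize = blksize;
    fma->freelist = NULL;
    for (i = nblks; i > 0; i--)
    {
        void* blk = fma->slab + (i - 1) * blksize;
        *(void**)blk = fma->freelist; // link the block in front of the list
        fma->freelist = blk;
    }
    return 0;
}

// Pops a block off the free list. Returns NULL when no block is left.
void* demo_fma_alloc (demo_fma_t* fma)
{
    void* blk = fma->freelist;
    if (blk != NULL) fma->freelist = *(void**)blk;
    return blk;
}

// Pushes a block back onto the free list.
void demo_fma_free (demo_fma_t* fma, void* blk)
{
    *(void**)blk = fma->freelist;
    fma->freelist = blk;
}

// Releases the slab. Every pointer handed out earlier becomes invalid.
void demo_fma_fini (demo_fma_t* fma)
{
    free (fma->slab);
    fma->slab = NULL;
    fma->freelist = NULL;
}
```

Allocation and deallocation are each a couple of pointer assignments with no searching, and releasing the slab in `demo_fma_fini()` is why blocks cannot outlive the allocator.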
|
||||
|
||||
@section mem_pma Simple Memory Pool Allocator
|
||||
|
||||
If you want to allocate blocks quickly but don't want to resize or
|
||||
deallocate the blocks individually, you can use #qse_pma_t.
|
||||
|
||||
@code
|
||||
qse_pma_t* pma; int* ptr;
|
||||
pma = qse_pma_open (QSE_NULL, 0); // create an allocator
|
||||
ptr = (int*)qse_pma_alloc (pma, sizeof(int)); // allocate a block
|
||||
*ptr = 20; // access the block
|
||||
qse_pma_close (pma); // destroy the allocator
|
||||
@endcode
|
||||
|
||||
*/
|
265
doc/page/sed-cmd.md
Normal file
@ -0,0 +1,265 @@
|
||||
QSESED Commands {#sed-cmd}
|
||||
================================================================================
|
||||
|
||||
Overview
|
||||
--------
|
||||
A stream editor is a non-interactive text editing tool commonly used
|
||||
in Unix environments. It reads text from an input stream, stores it in the
|
||||
pattern space, manipulates the pattern space by applying a set of editing
|
||||
commands, and writes the pattern space to an output stream. Typically, the
|
||||
input and output streams are a console or a file.
|
||||
|
||||
Commands
|
||||
--------
|
||||
|
||||
A sed command is composed of:
|
||||
|
||||
- line selector (optional)
|
||||
- ! (optional)
|
||||
- command code
|
||||
- command arguments (optional, dependent on command code)
|
||||
|
||||
A line selector selects input lines to apply a command to and has the following
|
||||
forms:
|
||||
- address - specify a single address
|
||||
- address,address - specify an address range
|
||||
- start~step - specify a starting line and a step.
|
||||
#QSE_SED_EXTENDEDADR enables this form.
|
||||
|
||||
An *address* is a line number, a regular expression, or a dollar sign ($)
|
||||
while *start* is a line number and *step* is a number of lines.
|
||||
|
||||
A regular expression for an address has the following form:
|
||||
- /rex/ - a regular expression *rex* is enclosed in slashes
|
||||
- \\CrexC - a regular expression *rex* is enclosed in \\C and *C*
|
||||
where *C* can be any character.
|
||||
|
||||
It treats the \\n sequence specially to match a newline character.
|
||||
|
||||
Here are examples of line selectors:
|
||||
- 10 - match the 10th line
|
||||
- 10,20 - match lines from the 10th to the 20th.
|
||||
- /^[[:space:]]*$/ - match an empty line
|
||||
- /^abc$/,/^def$/ - match all lines between *abc* and *def* inclusive
|
||||
- 10,$ - match the 10th line down to the last line.
|
||||
- 3~4 - match every 4th line from the 3rd line.
|
||||
|
||||
Note that an address range always selects the line matching the first address
|
||||
regardless of the second address. For example, 8,6 selects the 8th line.
|
||||
|
||||
The exclamation mark(!), when used after the line selector and before
|
||||
the command code, negates the line selection. For example, 1! selects all
|
||||
lines except the first line.
|
||||
|
||||
A command without a line selector is applied to all input lines.
A command with a single address is applied to an input line that matches
the address. A command with an address range is applied to all input
lines within the range, inclusive. A command with a start and a step is
applied to every <b>step</b>'th line starting from the line *start*.
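
For numeric addresses, the selection rules above reduce to a few comparisons. The two functions below are a hypothetical sketch, not QSESED code. Treating a step of 0 as selecting only the starting line is an assumption of this sketch.

```c
/* Returns 1 if line is selected by the numeric address range first,last. */
int demo_in_range (unsigned long first, unsigned long last, unsigned long line)
{
    /* The line matching the first address is always selected,
       so a reversed range such as 8,6 selects line 8 only. */
    if (last < first) return line == first;
    return line >= first && line <= last;
}

/* Returns 1 if line is selected by the start~step form. */
int demo_in_step (unsigned long start, unsigned long step, unsigned long line)
{
    if (line < start) return 0;
    if (step == 0) return line == start; /* assumption: a zero step selects start only */
    return (line - start) % step == 0;
}
```

With these rules, `3~4` selects lines 3, 7, 11, and so on, while `10,20` selects lines 10 through 20 inclusive.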
|
||||
|
||||
Here is the summary of the commands.
|
||||
|
||||
### # comment ###
|
||||
The text beginning from # to the line end is ignored; # in a line following
|
||||
<b>a \\</b>, <b>i \\</b>, and <b>c \\</b> is treated literally and does not
|
||||
introduce a comment.
|
||||
|
||||
### : label ###
|
||||
A label can be composed of letters, digits, periods, hyphens, and underscores.
|
||||
It remembers a target label for *b* and *t* commands and prohibits a line
|
||||
selector.
|
||||
|
||||
### { ###
|
||||
The left curly bracket forms a command group where you can nest other
|
||||
commands. It should be paired with an ending }.
|
||||
|
||||
### q ###
|
||||
Terminates the execution of commands. Upon termination, it prints the pattern
|
||||
space if #QSE_SED_QUIET is not set.
|
||||
|
||||
### Q ###
|
||||
Terminates the execution of commands quietly.
|
||||
|
||||
### a \\ \n text ###
|
||||
Stores *text* into the append buffer which is printed after the pattern
|
||||
space for each input line. If #QSE_SED_STRICT is on, an address range is not
|
||||
allowed for a line selector. If #QSE_SED_SAMELINE is on, the backslash and the
|
||||
text can be located on the same line without a line break.
|
||||
|
||||
### i \\ \n text ###
|
||||
Inserts *text* into an insert buffer which is printed before the pattern
|
||||
space for each input line. If #QSE_SED_STRICT is on, an address range is not
|
||||
allowed for a line selector. If #QSE_SED_SAMELINE is on, the backslash and the
|
||||
text can be located on the same line without a line break.
|
||||
|
||||
### c \\ \n text ###
|
||||
If a single line is selected for the command (i.e. no line selector, a single
|
||||
address line selector, or a start~step line selector is specified), it changes
|
||||
the pattern space to *text* and branches to the end of commands for the line.
|
||||
If an address range is specified, it deletes the pattern space and branches
|
||||
to the end of commands for all input lines but the last, and changes pattern
|
||||
space to *text* and branches to the end of commands. If #QSE_SED_SAMELINE is
|
||||
on, the backslash and the text can be located on the same line without a line
|
||||
break.
|
||||
|
||||
### d ###
|
||||
Deletes the pattern space and branches to the end of commands.
|
||||
|
||||
### D ###
|
||||
Deletes the first line of the pattern space. If the pattern space is emptied,
|
||||
it branches to the end of script. Otherwise, the commands from the first are
|
||||
reapplied to the current pattern space.
|
||||
|
||||
### = ###
|
||||
Prints the current line number. If #QSE_SED_STRICT is on, an address range is
|
||||
not allowed as a line selector.
|
||||
|
||||
### p ###
|
||||
Prints the pattern space.
|
||||
|
||||
### P ###
|
||||
Prints the first line of the pattern space.
|
||||
|
||||
### l ###
|
||||
Prints the pattern space in a visually unambiguous form.
|
||||
|
||||
### h ###
|
||||
Copies the pattern space to the hold space.
|
||||
|
||||
### H ###
|
||||
Appends the pattern space to the hold space.
|
||||
|
||||
### g ###
|
||||
Copies the hold space to the pattern space.
|
||||
|
||||
### G ###
|
||||
Appends the hold space to the pattern space.
|
||||
|
||||
### x ###
|
||||
Exchanges the pattern space and the hold space.
|
||||
|
||||
### n ###
|
||||
Prints the pattern space and reads the next line from the input stream to fill
|
||||
the pattern space.
|
||||
|
||||
### N ###
|
||||
Prints the pattern space and reads the next line from the input stream
|
||||
to append it to the pattern space with a newline inserted.
|
||||
|
||||
### b ###
|
||||
Branches to the end of commands.
|
||||
|
||||
### b label ###
|
||||
Branches to *label*.
|
||||
|
||||
### t ###
|
||||
Branches to the end of commands if a substitution (s//) has been made
|
||||
successfully since the last reading of an input line or the last *t* command.
|
||||
|
||||
### t label ###
|
||||
Branches to *label* if a substitution (s//) has been made successfully
|
||||
since the last reading of an input line or the last *t* command.
|
||||
|
||||
### r file ###
|
||||
Reads text from *file* and prints it after printing the pattern space but
|
||||
before printing the append buffer. Failure to read *file* does not cause an
|
||||
error.
|
||||
|
||||
### R file ###
|
||||
Reads a line of text from *file* and prints it after printing pattern space
|
||||
but before printing the append buffer. Failure to read *file* does not cause
|
||||
an error.
|
||||
|
||||
### w file ###
|
||||
Writes the pattern space to *file*.
|
||||
|
||||
### W file ###
|
||||
Writes the first line of the pattern space to *file*.
|
||||
|
||||
### s/rex/repl/opts ###
|
||||
Finds a matching substring with *rex* in pattern space and replaces it
|
||||
with *repl*. An ampersand(&) in *repl* refers to the matching substring.
|
||||
*opts* may be empty; you can combine the following options into *opts*:
|
||||
|
||||
- *g* replaces all occurrences of a matching substring with *rex*
|
||||
- *number* replaces the <b>number</b>'th occurrence of a matching substring
|
||||
with *rex*
|
||||
- *p* prints pattern space if a successful replacement was made
|
||||
- *w* file writes pattern space to *file* if a successful replacement
|
||||
was made. It, if specified, should be the last option.
|
||||
- *k* removes (kills) unmatched portions from the pattern space. It is
|
||||
useful for partial extraction.
|
||||
|
||||
### y/src/dst/ ###
|
||||
Replaces all occurrences of characters in *src* with characters in *dst*.
|
||||
*src* and *dst* must contain an equal number of characters.
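
The character mapping performed by *y* can be sketched in plain C. The helper below is hypothetical and works on single-byte characters only; it is meant to illustrate the positional mapping between *src* and *dst*, not to mirror how QSESED implements the command.

```c
#include <assert.h>
#include <string.h>

/* Replaces, in place, every character of text that occurs in src with the
 * character at the same position in dst. Hypothetical helper for
 * illustration; src and dst must have the same length. */
void transliterate(char *text, const char *src, const char *dst)
{
    assert(strlen(src) == strlen(dst));
    for (char *p = text; *p != '\0'; p++) {
        const char *hit = strchr(src, *p);   /* is *p listed in src? */
        if (hit != NULL) *p = dst[hit - src]; /* map to the same index in dst */
    }
}
```

Calling it with the text `hello`, a *src* of `el` and a *dst* of `ip` turns the buffer into `hippo`, which is what `y/el/ip/` is described as doing.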
|
||||
|
||||
### C/selector/opts ###
|
||||
Selects characters or fields from the pattern space as specified by the
|
||||
*selector* and updates the pattern space with the selected text. A selector
|
||||
is a comma-separated list of specifiers. A specifier is one of the following:
|
||||
|
||||
+ *d* specifies the input field delimiter with the next character. e.g) d:
|
||||
+ *D* specifies the output field delimiter with the next character. e.g) D;
|
||||
+ *c* specifies a position or a range of characters to select. e.g) c23-25
|
||||
+ *f* specifies a position or a range of fields to select. e.g) f1,f4-3
|
||||
|
||||
*opts* may be empty; you can combine the following options into *opts*:
|
||||
|
||||
+ *f* folds consecutive delimiters into one.
|
||||
+ *w* uses white spaces for a field delimiter regardless of the input
|
||||
delimiter specified in the selector.
|
||||
+ *d* deletes the pattern space if the line is not delimited by
|
||||
the input field delimiter
|
||||
|
||||
In principle, the *C* command can replace the *cut* utility.
|
||||
|
||||
Examples
|
||||
--------
|
||||
|
||||
Here are some samples.
|
||||
|
||||
### G;G;G ###
|
||||
Adds three empty lines after each input line. If #QSE_SED_QUIET is on, try <b>G;G;G;p</b>.
|
||||
It works because the hold space is empty unless something is copied to it.
|
||||
|
||||
### $!d ###
|
||||
Prints the last line. If #QSE_SED_QUIET is on, try <b>$p</b>.
|
||||
|
||||
### 1!G;h;$!d ###
|
||||
Prints input lines in the reverse order. That is, it prints the last line
|
||||
first and the first line last.
|
||||
|
||||
$ echo -e "a\nb\nc" | qsesed '1!G;h;$!d'
|
||||
c
|
||||
b
|
||||
a
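
To see why this script reverses its input, it can help to trace the pattern space and the hold space by hand. The sketch below simulates the three commands in plain C over an in-memory array of lines. The function name and the fixed buffer size are assumptions made for the illustration, and nothing here calls into QSESED.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SPACE_MAX 1024  /* illustrative buffer size, not a QSESED limit */

/* Simulates 1!G;h;$!d over nlines input lines and stores what the script
 * would print in out, which must hold SPACE_MAX bytes. */
void reverse_lines(const char *lines[], int nlines, char *out)
{
    char pattern[SPACE_MAX];
    char hold[SPACE_MAX] = "";

    assert(out != NULL);
    out[0] = '\0';
    for (int i = 0; i < nlines; i++) {
        /* read the next input line into the pattern space */
        snprintf(pattern, sizeof pattern, "%s", lines[i]);

        /* 1!G: on every line but the first, append a newline and the
         * hold space to the pattern space */
        if (i != 0) {
            size_t len = strlen(pattern);
            snprintf(pattern + len, sizeof pattern - len, "\n%s", hold);
        }

        /* h: copy the pattern space to the hold space */
        snprintf(hold, sizeof hold, "%s", pattern);

        /* $!d: delete the pattern space unless this is the last line;
         * only the last line reaches the end of the script and is printed */
        if (i == nlines - 1) snprintf(out, SPACE_MAX, "%s\n", pattern);
    }
}
```

After each line, the hold space contains all lines seen so far in reverse order, so the pattern space printed for the last line is the whole input reversed.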
|
||||
|
||||
### s/[[:space:]]{2,}/ /g ###
|
||||
Compacts each run of two or more whitespace characters into a single space if #QSE_SED_EXTENDEDREX is on.
|
||||
|
||||
### s/[0-9]/&/gk ###
|
||||
Extracts all digits.
|
||||
|
||||
$ echo "Q123Q456" | qsesed -r 's/[0-9]+/&/gk'
|
||||
123456
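
The combined effect of *g* and *k* in this example can be approximated outside of sed. The sketch below uses the POSIX `regexec()` interface rather than QSESED's own regular expression engine, and the function name is hypothetical. It models only what the example output implies: matched substrings are kept and everything else is dropped.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Copies only the substrings of input that match the extended regular
 * expression rex into out and drops the rest. Returns 0 on success and
 * -1 if rex does not compile or out is too small. */
int keep_matches(const char *rex, const char *input, char *out, size_t outsize)
{
    regex_t re;
    regmatch_t m;
    size_t used = 0;
    int flags = 0;

    assert(outsize > 0);
    if (regcomp(&re, rex, REG_EXTENDED) != 0) return -1;

    out[0] = '\0';
    while (*input != '\0' && regexec(&re, input, 1, &m, flags) == 0) {
        size_t len = (size_t)(m.rm_eo - m.rm_so);
        if (used + len >= outsize) { regfree(&re); return -1; }

        /* keep the matched text */
        memcpy(out + used, input + m.rm_so, len);
        used += len;
        out[used] = '\0';

        if (len == 0) {
            /* empty match: stop at the end, otherwise skip one character */
            if (input[m.rm_eo] == '\0') break;
            input += m.rm_eo + 1;
        } else {
            input += m.rm_eo;  /* continue after the match */
        }
        flags = REG_NOTBOL;    /* the rest is no longer at line start */
    }
    regfree(&re);
    return 0;
}
```

Given the expression `[0-9]+` and the input `Q123Q456`, the output buffer ends up holding `123456`, matching the sample above.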
|
||||
|
||||
### s/[0-9]+/&/2k ###
|
||||
Extracts the second number.
|
||||
|
||||
$ echo "Q123Q456" | qsesed -r 's/[0-9]+/&/2k'
|
||||
456
|
||||
|
||||
### C/d:,f3,1/ ###
|
||||
Prints the third field and the first field from a colon-separated text.
|
||||
|
||||
$ head -5 /etc/passwd
|
||||
root:x:0:0:root:/root:/bin/bash
|
||||
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
|
||||
bin:x:2:2:bin:/bin:/bin/sh
|
||||
sys:x:3:3:sys:/dev:/bin/sh
|
||||
sync:x:4:65534:sync:/bin:/bin/sync
|
||||
$ qsesed '1,3C/d:,f3,1/;4,$d' /etc/passwd
|
||||
0 root
|
||||
1 daemon
|
||||
2 bin
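
The field selection shown above can be sketched in plain C as follows. The helper is hypothetical and is not how QSESED implements the *C* command. The space used as the output delimiter is an assumption based on the sample output, and a requested field that does not exist is simply skipped.

```c
#include <assert.h>
#include <string.h>

/* Writes the requested 1-based fields of line, split on in_delim, into
 * out, joined by out_delim, in the order they are requested.
 * Returns 0 on success and -1 if out is too small. */
int select_fields(const char *line, char in_delim, char out_delim,
                  const int fields[], int nfields, char *out, size_t outsize)
{
    size_t used = 0;
    int emitted = 0;

    assert(outsize > 0 && in_delim != '\0');
    out[0] = '\0';
    for (int i = 0; i < nfields; i++) {
        const char *start = line;
        int index = 1;

        assert(fields[i] >= 1);
        /* walk to the start of the requested field */
        while (index < fields[i] &&
               (start = strchr(start, in_delim)) != NULL) {
            start++;
            index++;
        }
        if (start == NULL) continue;  /* the line has fewer fields */

        /* the field ends at the next delimiter or at the end of the line */
        const char *end = strchr(start, in_delim);
        size_t len = (end != NULL) ? (size_t)(end - start) : strlen(start);

        if (used + len + 1 >= outsize) return -1;
        if (emitted) out[used++] = out_delim;
        memcpy(out + used, start, len);
        used += len;
        out[used] = '\0';
        emitted = 1;
    }
    return 0;
}
```

Requesting fields 3 and 1 of `root:x:0:0:root:/root:/bin/bash` with `:` as the input delimiter and a space as the output delimiter yields `0 root`.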
|
101
doc/page/sed-embed.md
Normal file
@ -0,0 +1,101 @@
|
||||
QSESED Embedding Guide {#sed-embed}
|
||||
================================================================================
|
||||
|
||||
Overview
|
||||
--------
|
||||
|
||||
The QSESED library is divided into the core layer and the standard layer.
|
||||
The core layer is a skeleton implementation that requires various callbacks
|
||||
to be useful. The standard layer provides general-purpose versions of these callbacks.
|
||||
|
||||
You can find core layer routines in <qse/sed/sed.h> while you can find standard
|
||||
layer routines in <qse/sed/std.h>.
|
||||
|
||||
Embedding QSESED involves the following steps in the simplest form:
|
||||
|
||||
- create a new sed object
|
||||
- compile commands
|
||||
- execute commands
|
||||
- destroy the sed object
|
||||
|
||||
The sample here shows a simple stream editor that can accept a command string,
|
||||
and optionally an input file name and an output file name.
|
||||
|
||||
\includelineno sed01.c
|
||||
|
||||
You can call qse_sed_compstdfile() instead of qse_sed_compstdstr() to compile
|
||||
sed commands stored in a file. You can use qse_sed_compstd() or qse_sed_comp()
|
||||
for more flexibility.
|
||||
|
||||
Locale
|
||||
------
|
||||
|
||||
While QSESED can use a wide character type as the default character type,
|
||||
the hosting program still has to initialize the locale whenever necessary.
|
||||
All the samples shown in this page call a common function
|
||||
init_sed_sample_locale(), use the qse_main() macro as the main function,
|
||||
and call qse_runmain() for cross-platform and cross-character-set support.
|
||||
|
||||
Here is the function prototype.
|
||||
|
||||
\includelineno sed00.h
|
||||
|
||||
Here goes the actual function.
|
||||
|
||||
\includelineno sed00.c
|
||||
|
||||
Note that these two files are not part of QSESED and are used for samples
|
||||
here only.
|
||||
|
||||
Customizing Streams
|
||||
-------------------
|
||||
|
||||
You can use qse_sed_execstd() to customize the input and output streams.
|
||||
The sample below uses I/O resources of the #QSE_SED_IOSTD_STR type to use
|
||||
an argument as input data and let the output be dynamically allocated.
|
||||
|
||||
\includelineno sed02.c
|
||||
|
||||
You can use the core layer function qse_sed_exec() and implement the
|
||||
::qse_sed_io_impl_t interface for more flexibility. No samples will
|
||||
be provided here because the standard layer functions qse_sed_execstd()
|
||||
and qse_sed_execstdfile() are good samples themselves.
|
||||
|
||||
Accessing Pattern and Hold Space
|
||||
--------------------------------
|
||||
|
||||
The qse_sed_getspace() function allows you to get the pointer and the length
|
||||
of the pattern space and the hold space. It may not be so useful if you
|
||||
access them after execution is completed. The qse_sed_setopt()
|
||||
function called with #QSE_SED_TRACER lets you set up a hook function
|
||||
that can inspect various things during execution time.
|
||||
|
||||
The following sample prints the contents of the pattern space and
|
||||
hold space at each phase of execution.
|
||||
|
||||
\includelineno sed03.c
|
||||
|
||||
Embedding In C++
|
||||
----------------
|
||||
|
||||
The QSE::Sed and QSE::StdSed classes are provided for C++. The sample here shows
|
||||
how to embed QSE::StdSed for stream editing.
|
||||
|
||||
\includelineno sed21.cpp
|
||||
|
||||
The following sample shows how to inherit QSE::StdSed and create a
|
||||
customized stream editor.
|
||||
|
||||
\includelineno sed22.cpp
|
||||
|
||||
|
||||
|
||||
|
||||
\skipline ---------------------------------------------------------------------
|
||||
\skipline the sample files are listed here for example list generation purpose.
|
||||
\skipline ---------------------------------------------------------------------
|
||||
\example sed01.c
|
||||
\example sed02.c
|
||||
\example sed03.c
|
||||
\example sed21.cpp
|
||||
\example sed22.cpp
|