Reorganized the directory structure

2022-09-25 09:23:29 +09:00
parent 1bac167e2d
commit 84d1c4c55f
864 changed files with 11 additions and 12 deletions

304
doc/page/awk-embed.md Normal file

@ -0,0 +1,304 @@
QSEAWK Embedding Guide {#awk-embed}
================================================================================
Overview
---------
The QSEAWK library is divided into two layers: core and standard.
The core layer is a skeleton implementation that requires various callbacks
to be useful. The standard layer provides general-purpose versions of these
callbacks.
For example, qse_awk_open() in the core layer requires a set of primitive
functions to be able to create an awk object while qse_awk_openstd() provides
qse_awk_open() with a standard set of primitive functions.
The core layer is defined in <qse/awk/awk.h> while the standard layer is
defined in <qse/awk/std.h>. By convention, a standard layer name adds *std*
to its corresponding core layer name.
Embedding QSEAWK involves the following steps in the simplest form:
- create a new awk object
- parse in a source script
- create a new runtime context
- execute pattern-action blocks or call a function
- decrement the reference count of the return value
- destroy the runtime context
- destroy the awk object
The sample below follows these steps using as many standard layer functions as
possible for convenience. It simply prints *hello, world* to the console.
\includelineno awk01.c
The awk object is kept separate from the runtime context so that you can
reuse the same script over different data streams.
More complex samples concerning this will be shown later.
Locale
------
While QSEAWK can use a wide character type as the default character type,
the hosting program still has to initialize the locale whenever necessary.
All the samples to be shown from here down will call a common function
init_awk_sample_locale(), use the qse_main() macro as the main function,
and call qse_runmain() for cross-platform and cross-character-set support.
Here is the function prototype.
\includelineno awk00.h
Here goes the actual function.
\includelineno awk00.c
Note that these two files are not part of QSEAWK and are used only for the
samples here.
Customizing Console I/O
-----------------------
The qse_awk_rtx_openstd() function implements I/O related callback functions
for files, pipes, and the console. While you are unlikely to change the
definition of files and pipes, the console is the most frequently customized
I/O object. Typically, you want to feed the console from a string and capture
the console output into a buffer. Although you can define your own callback
functions for files, pipes, and the console, you can also override only some
of the callback functions implemented by qse_awk_rtx_openstd(). This sample
redefines the console handler while keeping the file and pipe handlers
provided by qse_awk_rtx_openstd().
\includelineno awk02.c
Extension Area
--------------
When creating an awk object or a runtime context object, you can ask for
a private extension area to be allocated with the main object. You can
use this extension area to store data associated with the object.
You can specify the size of the extension area when calling qse_awk_open(),
qse_awk_rtx_open(), qse_awk_openstd(), and qse_awk_rtx_openstd().
These functions initialize the area to zeros. You can get the pointer
to the beginning of the area with qse_awk_getxtn() and qse_awk_rtx_getxtn().
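The idea of an extension area can be sketched in plain C, independently of QSEAWK. The sketch below assumes a hypothetical object type `obj_t` with hypothetical functions `obj_open()`, `obj_getxtn()`, and `obj_close()`; none of these names belong to the QSEAWK API, and the library itself may lay out its memory differently. It allocates the object and a zero-filled extension area in a single block and returns a pointer to the start of the area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical object type (not a QSEAWK structure). The flexible array
// member marks where the extension area begins and keeps it suitably aligned.
typedef struct obj_t {
    int id;
    max_align_t xtn[];
} obj_t;

// Allocate an object followed by xtnsize bytes of extension area.
// calloc() zero-fills both the object and the extension area.
obj_t *obj_open(size_t xtnsize)
{
    return calloc(1, sizeof(obj_t) + xtnsize);
}

// Return a pointer to the beginning of the extension area.
void *obj_getxtn(obj_t *obj)
{
    assert(obj != NULL);
    return obj->xtn;
}

// Release the object together with its extension area.
void obj_close(obj_t *obj)
{
    free(obj);
}
```

A caller would typically define its own structure, pass its size to `obj_open()`, and cast the pointer returned by `obj_getxtn()` to that structure type.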
In the sample above, the string and the buffer used for I/O customization
are declared globally. When you have multiple runtime contexts and independent
console strings and buffers, you may want to associate a runtime context
with an independent console string and buffer. The extension area that can
be allocated on demand when you create a runtime context comes in handy.
The sample below shows how to associate them through the extension area
but does not create multiple runtime contexts for simplicity.
\includelineno awk03.c
Entry Point
-----------
A typical AWK program executes BEGIN, pattern-action, and END blocks. QSEAWK
provides a way to drive an AWK program in a different style: you can execute
a particular user-defined function on demand. This is useful if you want
to drive an AWK program in an event-driven manner, though you are free to
choose the entry point as you prefer. The qse_awk_rtx_call() function
used here is limited to user-defined functions. It cannot call built-in
functions like *gsub* or *index*.
\includelineno awk04.c
If you want to pass arguments to the function, you must create values with
value creation functions, update their reference counts, and pass them to
qse_awk_rtx_call(). The sample below creates 2 integer values with
qse_awk_rtx_makeintval() and passes them to the *pow* function.
\includelineno awk05.c
While qse_awk_rtx_call() looks up a function in the function table by name,
you can find the function in advance and use the information found when
calling it. qse_awk_rtx_findfun() and qse_awk_rtx_callfun() come into play
in this situation. qse_awk_rtx_call() in the sample above can be translated
into 2 separate calls to qse_awk_rtx_findfun() and qse_awk_rtx_callfun().
You can reduce look-up overhead via these 2 functions if you are to execute
the same function multiple times.
\includelineno awk06.c
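The benefit of separating the look-up from the call can be illustrated without QSEAWK. The sketch below is only an analogy: `fun_t`, `find_fun()`, `call_fun()`, and `call_by_name()` are hypothetical names invented for this example, and the table of two arithmetic functions stands in for a real function table. It shows a name being resolved once and the result being reused for repeated calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical function table entry: a name and its implementation.
typedef long (*fun_impl_t)(long a, long b);

typedef struct fun_t {
    const char *name;
    fun_impl_t  impl;
} fun_t;

static long fun_add(long a, long b) { return a + b; }
static long fun_mul(long a, long b) { return a * b; }

static const fun_t fun_table[] = {
    { "add", fun_add },
    { "mul", fun_mul },
};

// Look up a function by name. Returns NULL if the name is unknown.
const fun_t *find_fun(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof(fun_table) / sizeof(fun_table[0]); i++) {
        if (strcmp(fun_table[i].name, name) == 0) return &fun_table[i];
    }
    return NULL;
}

// Call a function found earlier, skipping the name look-up.
long call_fun(const fun_t *fun, long a, long b)
{
    assert(fun != NULL);
    return fun->impl(a, b);
}

// Look up and call in one step. Returns 0 on success, -1 if not found.
int call_by_name(const char *name, long a, long b, long *result)
{
    const fun_t *fun = find_fun(name);
    if (fun == NULL) return -1;
    *result = call_fun(fun, a, b);
    return 0;
}
```

Calling `find_fun()` once before a loop and `call_fun()` inside it avoids repeating the string comparison on every iteration, which is the saving the paragraph above describes.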
Similarly, you can pass a more complex value than a plain number or string.
You can compose a map value with qse_awk_rtx_makemapval() or
qse_awk_rtx_makemapvalwithdata(). The following sample demonstrates how to
use qse_awk_rtx_makemapvalwithdata(), pass a created map value to
qse_awk_rtx_call(), and traverse a map value returned with
qse_awk_rtx_getfirstmapvalitr() and qse_awk_rtx_getnextmapvalitr().
\includelineno awk07.c
Built-in Global Variables
--------------------------
QSEAWK predefines global variables such as *SUBSEP* and *ARGC*. You can add
your own built-in variables in the global scope with qse_awk_addgbl(). You
must add new variables before qse_awk_parse() or qse_awk_parsestd(). Later,
you can get the values of the global variables using qse_awk_rtx_getgbl()
with an ID returned by qse_awk_addgbl(). The IDs of the predefined global
variables are available as the ::qse_awk_gbl_id_t type values.
\includelineno awk08.c
Built-in Functions
------------------
QSEAWK predefines built-in functions like *match* and *gsub*. You can add your
own built-in function with qse_awk_addfnc(). The following sample shows how to
add a function named *basename* that gets the base file name part of a path name.
\includelineno awk09.c
In the sample above, the *basename* function returns the resulting string. If
the implementation encounters an error, it returns -1, which causes the runtime
context to abort with an error. To avoid this, you can redefine basename() to
return the resulting string via the second parameter and to return 0 or -1 as
its return value. To pass an argument by reference, specify the letter *r* in
the *arg.spec* field at the argument position. That is, specifying *r* at the
second position in the *arg.spec* string means that you want to pass the second
argument by reference.
\includelineno awk10.c
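The calling convention described above, a status code as the return value and the result delivered through a parameter, can be sketched in plain C. The helper `base_name()` below is hypothetical and does not use the QSEAWK function handler interface; it assumes `/` as the only path separator and only illustrates how an error can be reported to the caller without aborting anything.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Store the base file name part of 'path' into 'result' (capacity 'size').
// Returns 0 on success and -1 on error, leaving 'result' untouched on error.
int base_name(const char *path, char *result, size_t size)
{
    const char *base;
    size_t len;

    if (path == NULL || result == NULL) return -1;

    // The base name starts after the last slash, if there is one.
    base = strrchr(path, '/');
    base = (base != NULL) ? base + 1 : path;

    len = strlen(base);
    if (len >= size) return -1; // result buffer too small

    memcpy(result, base, len + 1); // copy including the terminating null
    return 0;
}
```

With this shape, the caller decides what to do when -1 comes back, for example by printing a message and continuing.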
Customizing Other Behaviors
---------------------------
QSEAWK comes with more trait options that you can use to change its
behavior. For instance, you have seen how to disable the standard BEGIN,
END, and pattern-action blocks by turning off the #QSE_AWK_PABLOCK trait option
in several sample programs above.
The ::qse_awk_trait_t type defines various trait options that you can turn
on or off using qse_awk_setopt() with #QSE_AWK_TRAIT. The following code
snippet shows how to disable all built-in I/O statements like *getline*,
*print*, *printf*, *close*, *fflush*, piping, and file redirection.
Additionally, it disables the BEGIN, END, pattern-action blocks.
~~~~~{.c}
qse_awk_getopt (awk, QSE_AWK_TRAIT, &opt);
opt &= ~QSE_AWK_PABLOCK;
opt &= ~QSE_AWK_RIO;
qse_awk_setopt (awk, QSE_AWK_TRAIT, &opt);
~~~~~
This way, you can make the QSEAWK language behave differently to suit your
own needs.
Multiple Instances
------------------
The awk object and the runtime context object each reside in their own
allocated memory blocks and maintain related information in their own object
space. Multiple instances are independent of each other.
You can run a script over multiple data streams by creating multiple runtime
context objects from a single awk object.
TBD.
Memory Pool
-----------
You can confine the information used for an awk object, including the related
runtime context objects, to a single memory pool.
TBD.
Writing Modules
---------------
Modular built-in functions and variables reside in a shared object.
TBD.
Embedding in C++
-----------------
The QSE::Awk and QSE::StdAwk classes wrap the underlying C library routines
for better object-orientation. These two classes are defined in <qse/awk/Awk.hpp>
and <qse/awk/StdAwk.hpp> respectively. They simplify the embedding task at the
cost of a slight performance overhead. The hello-world sample in C can be
rewritten in fewer lines in C++.
\includelineno awk21.cpp
Customizing the console I/O is not much different in C++. When using the
QSE::StdAwk class, you can inherit the class and implement these six methods:
- int openConsole (Console& io);
- int closeConsole (Console& io);
- int flushConsole (Console& io);
- int nextConsole (Console& io);
- ssize_t readConsole (Console& io, char_t* data, size_t size);
- ssize_t writeConsole (Console& io, const char_t* data, size_t size);
The sample below shows how to use a string as the console input
and store the console output in a string buffer.
\includelineno awk22.cpp
Alternatively, you can choose to implement QSE::Awk::Console::Handler
and call QSE::Awk::setConsoleHandler() with the implemented handler.
This way, you do not need to inherit QSE::Awk or QSE::StdAwk.
The sample here shows how to customize the console I/O by implementing
QSE::Awk::Console::Handler. It also shows how to run the same script
over two different data streams in a row.
\includelineno awk23.cpp
Changes in 0.6.0
----------------
### qse_awk_parsestd() ###
The second parameter of qse_awk_parsestd() specifies the input script.
In 0.5.6, it accepted a single script for input.
~~~~~{.c}
qse_awk_parsestd_t psin;
psin.type = QSE_AWK_PARSESTD_STR;
psin.u.str.ptr = src;
psin.u.str.len = qse_strlen(src);
qse_awk_parsestd (awk, &psin, QSE_NULL);
~~~~~
In 0.6.X, it accepts an array of scripts for input. To specify a single script,
use an array of 2 elements whose last element is of the #QSE_AWK_PARSESTD_NULL
type.
~~~~~{.c}
qse_awk_parsestd_t psin[2];
psin[0].type = QSE_AWK_PARSESTD_STR;
psin[0].u.str.ptr = src;
psin[0].u.str.len = qse_strlen(src);
psin[1].type = QSE_AWK_PARSESTD_NULL;
qse_awk_parsestd (awk, psin, QSE_NULL);
~~~~~
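The terminator convention used in 0.6.X can be sketched in plain C. The types `src_type_t` and `src_t` and the function `count_sources()` below are hypothetical stand-ins, not the QSEAWK definitions; the sketch only shows how a consumer can walk an array of inputs until it meets the element whose type marks the end.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical input descriptor; SRC_NULL marks the end of an array.
typedef enum { SRC_NULL = 0, SRC_STR } src_type_t;

typedef struct src_t {
    src_type_t  type;
    const char *ptr;
    size_t      len;
} src_t;

// Count the inputs before the SRC_NULL terminator and sum their lengths.
size_t count_sources(const src_t *in, size_t *total_len)
{
    size_t n = 0;
    assert(in != NULL && total_len != NULL);
    *total_len = 0;
    while (in[n].type != SRC_NULL) {
        *total_len += in[n].len;
        n++;
    }
    return n;
}
```

Because the array carries its own end marker, the caller does not need to pass a separate element count.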
### 0 upon Opening ###
I/O handlers can return 0 for success upon opening.
\skipline ---------------------------------------------------------------------
\skipline the sample files are listed here for example list generation purpose.
\skipline ---------------------------------------------------------------------
\example awk01.c
\example awk02.c
\example awk03.c
\example awk04.c
\example awk05.c
\example awk06.c
\example awk07.c
\example awk08.c
\example awk09.c
\example awk10.c
\example awk21.cpp
\example awk22.cpp
\example awk23.cpp

1571
doc/page/awk-lang.md Normal file

File diff suppressed because it is too large

231
doc/page/installation.md Normal file

@ -0,0 +1,231 @@
Installation {#installation}
================================================================================
Source Package
--------------
You can download the source package from
http://code.google.com/p/qse/downloads/list
A source package is named in the format *qse-<version>.tar.gz*.
Alternatively, you can check out the latest source files from the subversion
repository by executing the following command:
svn checkout http://qse.googlecode.com/svn/trunk/qse/
Building on Unix/Linux
----------------------
The project uses the standard autoconf/automake generated script files for
building. If you work on a system where these scripts can run, you can
follow the standard procedure of configuring and making the project.
$ ./configure
$ make
$ make install
You can use this method of building for MinGW or Cygwin on Windows.
Cross-compiling for WIN32
-------------------------
While the autoconf/automake scripts may not support your native compilers,
you can cross-compile it for WIN32/WIN64 with a cross-compiler. Get a
cross-compiler installed first and run the *configure* script with a host
and a target.
For 32-bit Windows with MINGW-W64, you may run *configure* as shown below:
$ ./configure --host=i686-w64-mingw32 --target=i686-w64-mingw32
$ make
$ make install
For 64-bit Windows with MINGW-W64, you may run *configure* as shown below:
$ ./configure --host=x86_64-w64-mingw32 --target=x86_64-w64-mingw32
$ make
$ make install
The actual host and target names may vary depending on the cross-compiler
installed.
Native Makefiles
----------------
The project provides makefiles for some selected compilers and platforms.
The makefiles were generated with bakefile (www.bakefile.org) and can be
found in the *bld* subdirectory.
- os2-watcom/makefile (Watcom C/C++ for OS/2)
- win32-watcom/makefile (Watcom C/C++ for Windows)
- win32-borland/makefile (Borland C/C++ for Windows)
- win32-msvc/makefile (Microsoft Visual C/C++ for Windows)
You can execute your native make utility for building in each subdirectory.
For example, to build for OS/2 with Watcom C/C++ in the release mode using
the wide character type, you can execute this:
cd bld\os2-watcom
wmake BUILD=release CHAR=wchar
Build Options
-------------
The configure script and the native makefiles provide some options that you
can use to change the build environment. The options presented here can be
specified on the command line of the configure script or the native make
utilities.
For the configure script, the options should be prefixed with two hyphens,
and multiple options can be specified together. See this example:
./configure --enable-debug --disable-wchar
For the native makefiles, the options can be appended to the end of the command
line. See this example:
make BUILD=debug CHAR=mchar
### Build Mode ###
You can choose to build the project in the **release** mode or in the **debug**
mode. The resulting libraries and programs in the **debug** mode contain
extra information useful for debugging. The default mode is **release**.
value | configure | native makefile
--------|----------------|-----------------
debug | enable-debug | BUILD=debug
release | disable-debug | BUILD=release
### Character Type ###
You can choose between the wide character type and the multi-byte character
type as the basic character type represented by the #qse_char_t type. The default
character type is the wide character type.
value | configure | native makefile
-----------|----------------|-----------------
wide | enable-wchar | CHAR=wchar
multi-byte | disable-wchar | CHAR=mchar
If the wide character type is chosen:
- #QSE_CHAR_IS_WCHAR is defined.
- #qse_char_t maps to #qse_wchar_t.
If the multi-byte character type is chosen:
- #QSE_CHAR_IS_MCHAR is defined.
- #qse_char_t maps to #qse_mchar_t.
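How a single build-time switch can select the basic character type may be sketched as follows. The names `MY_CHAR_IS_WCHAR`, `my_char_t`, `MY_T()`, and `my_strlen()` are hypothetical and chosen only for this illustration; they are not the QSE definitions, whose actual headers may be organized differently.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

// Hypothetical build switch: define MY_CHAR_IS_WCHAR to get wide characters.
#if defined(MY_CHAR_IS_WCHAR)
typedef wchar_t my_char_t;
#define MY_T(x) L ## x   // turn a literal into a wide literal
#else
typedef char my_char_t;
#define MY_T(x) x        // leave a literal as it is
#endif

// Length of a string in my_char_t units, whichever type was selected.
size_t my_strlen(const my_char_t *str)
{
    size_t len = 0;
    assert(str != NULL);
    while (str[len] != MY_T('\0')) len++;
    return len;
}
```

Code written against `my_char_t` and `MY_T()` compiles unchanged in either mode, which is the point of routing everything through one basic character type.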
### Bundled Unicode Routines ###
You can choose to use the bundled character classification routines
based on Unicode. This option is disabled by default.
value | configure | native makefile
-----------|--------------------------|-----------------
on | enable-bundled-unicode | BUNDLED_UNICODE=on
off | disable-bundled-unicode | BUNDLED_UNICODE=off
Enabling this option includes the routines defined in <qse/cmn/uni.h>
in the resulting library. It also makes some routines
defined in <qse/cmn/chr.h> use these bundled Unicode routines.
### Character Encoding Conversion ###
You can include extra routines for character encoding conversion into
the resulting library. This option is disabled by default.
value | configure | native makefile
-----------|-----------------|---------------------
on | enable-xcmgrs | XCMGRS=on
off | disable-xcmgrs | XCMGRS=off
More #qse_cmgr_t instances are made available when this option is enabled.
The UTF-8 conversion and the locale-based conversion are included regardless
of this option.
### TCPV40HDRS ###
The option, when turned on, enables you to use *tcp32dll.dll* and *so32dll.dll*
instead of *tcpip32.dll*. Doing so allows a resulting program to run on OS/2
systems without the 32-bit TCP/IP stack. This option is off by default and
available for the native makefile for Watcom C/C++ for OS/2 only.
wmake TCPV40HDRS=on
### C++ ###
C++ support is enabled by default as long as a C++ compiler is detected.
If you want to disable it for any reason, you can specify `--disable-cxx`.
./configure --disable-cxx
### SCO UNIX System V/386 Release 3.2 ###
- If /usr/include/netinet and /usr/include/net are missing,
check if there are /usr/include/sys/netinet and /usr/include/sys/net.
If they exist, you can make symbolic links.
cd /usr/include
ln -sf sys/netinet netinet
ln -sf sys/net net
- Specify GREP if configure fails to find an acceptable grep.
- Set RANLIB to /bin/true.
/bin/ranlib fails with this message: *ranlib: .libs/libqsecmn.a: not an archive*
- Build in the source tree. Building outside the source tree is likely to fail
because of deficiencies in the bundled make utility.
- Do not include -g in CFLAGS.
./configure GREP=/bin/grep RANLIB=/bin/true CFLAGS=""
- Remove $(LIBLTDL) from LIBADD_LIB_COMMON in lib/awk/Makefile
- Remove $(LIBLTDL) from libqsehttp_la_LIBADD in lib/http/Makefile
make
### Mac OS X/Darwin ###
No special consideration is required if you work with a moderately recent
version of the developer tools. The GCC compiler shipped by Apple in the early
2000s or before has an option called `-no-cpp-precomp`.
\code
% cc -E /tmp/a.c
#1 "/tmp/a.c"
int main ( )
{
Lxxxx ;
return 0 ;
}
% cc -E -no-cpp-precomp /tmp/a.c
#1 "/tmp/a.c"
int main ( )
{
Lxxxx ;
return 0 ;
}
\endcode
Without the `-no-cpp-precomp` option, some preprocessing produces erroneous
code. If your compiler behaves this way, you should add `-no-cpp-precomp`
to CFLAGS or CXXFLAGS when running configure. For instance,
$ ./configure --prefix=/usr/local --disable-cxx CFLAGS="-Wall -g -no-cpp-precomp"
### More options ###
More options are available for the configure script. Execute this for more
information:
./configure --help

29
doc/page/mainpage.md Normal file

@ -0,0 +1,29 @@
QSE {#mainpage}
================================================================================
\image html qse-logo.png
The QSE library implements AWK, SED, and Unix commands in an embeddable form
and defines data types, functions, and classes that you can use when you embed
them into an application. It also provides more fundamental data types and
functions needed when you deal with memory, streams, and data structures.
The interface has been designed to be flexible enough to let an embedding
application and an embedded object access various aspects of each other.
The library is licensed under the GNU Lesser General Public License version 3:
http://www.gnu.org/licenses/
The project webpage: https://code.miflux.com/@qse or https://github.com/hyung-hwan/qse
For further information, contact:
Chung, Hyung-Hwan <hyunghwan.chung@gmail.com>
See the subpages for more information.
- \ref installation
- \ref awk-lang
- \ref awk-embed
- \ref sed-cmd
- \ref sed-embed
- \subpage mem "Memory Management"

86
doc/page/mem.doc Normal file

@ -0,0 +1,86 @@
/** @page mem Memory Management
@section mem_overview Overview
A memory manager is an instance of a structure type #qse_mmgr_t. Creating
and/or initializing an object requires a memory manager to be passed in.
The default memory manager is merely a wrapper to memory allocation functions
provided by underlying operating systems: HeapAlloc/HeapReAlloc/HeapFree
on _WIN32 and malloc/realloc/free on other platforms. You can get this default
memory manager with qse_getdflmmgr() and can change it with qse_setdflmmgr().
Typically, the name of a function creating an object begins with @b qse_,
ends with @b _open, and accepts a memory manager as the first parameter.
See qse_mbs_open() for instance. So you can customize memory management
at the per-object level.
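The shape of a memory manager, a set of callbacks plus a context pointer, can be sketched in plain C. The type `my_mmgr_t` and the functions below are hypothetical and only modeled on the field names used later on this page; they are not the #qse_mmgr_t definition. The sketch wraps malloc/realloc/free and uses the context to count live blocks, which shows what the context pointer is for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical memory manager: three callbacks plus a context pointer.
typedef struct my_mmgr_t {
    void *(*alloc)(void *ctx, size_t size);
    void *(*realloc)(void *ctx, void *ptr, size_t size);
    void  (*free)(void *ctx, void *ptr);
    void  *ctx;
} my_mmgr_t;

// Allocate a block and count it in the counter that ctx points to.
static void *counting_alloc(void *ctx, size_t size)
{
    void *ptr = malloc(size);
    if (ptr != NULL) ++*(long *)ctx;
    return ptr;
}

// Resize a block. Resizing NULL is a fresh allocation and is counted.
static void *counting_realloc(void *ctx, void *ptr, size_t size)
{
    if (ptr == NULL) return counting_alloc(ctx, size);
    return realloc(ptr, size);
}

// Free a block and remove it from the count.
static void counting_free(void *ctx, void *ptr)
{
    if (ptr != NULL) --*(long *)ctx;
    free(ptr);
}

// Set up a manager whose context is the caller's live-block counter.
void my_mmgr_init(my_mmgr_t *mmgr, long *live_blocks)
{
    assert(mmgr != NULL && live_blocks != NULL);
    *live_blocks = 0;
    mmgr->alloc = counting_alloc;
    mmgr->realloc = counting_realloc;
    mmgr->free = counting_free;
    mmgr->ctx = live_blocks;
}
```

An object that receives such a manager at creation time performs all of its allocations through the callbacks, so each object can be given a different allocation policy.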
Three types of special memory allocators are provided in the library.
- #qse_xma_t - generic private heap allocator
- #qse_fma_t - fixed-size block allocator
- #qse_pma_t - pool-based block allocator
@section mem_xma Private Heap
While the default memory manager allocates memory from a system-wide heap,
you can create a private heap and use it when you create an object.
The #qse_xma_t type defines a private heap manager, and its functions provide
everything needed to form a memory manager over a private heap.
A typical usage is shown below:
@code
qse_mmgr_t mmgr;
qse_xma_t* heap;
// Create a private heap using the default memory manager
heap = qse_xma_open (QSE_NULL, 0, 1024 * 1024); // 1M heap
// Initialize a memory manager with the heap
mmgr.alloc = (qse_mmgr_alloc_t)qse_xma_alloc;
mmgr.realloc = (qse_mmgr_realloc_t)qse_xma_realloc;
mmgr.free = (qse_mmgr_free_t)qse_xma_free;
mmgr.ctx = heap;
// You can pass 'mmgr' when you create/initialize a different object.
....
....
// Destroy the private heap
qse_xma_close (heap);
@endcode
Note that creating a private heap requires a memory manager, too. The example
above used the default memory manager to create a private heap within the
global heap. This means that you can split a heap into smaller subheaps.
@section mem_fma Fixed-size Block Allocator
If memory blocks to allocate share the same size, you can use #qse_fma_t
for performance. It achieves fast memory allocation as it knows the block
size in advance. The blocks allocated with this memory allocator
don't outlive the memory allocator itself. That is, qse_fma_close() or
qse_fma_fini() invalidates all the pointers allocated with qse_fma_alloc().
@code
qse_fma_t* fma; int* ptr;
fma = qse_fma_open (QSE_NULL, 0, sizeof(int), 10, 0); // create an allocator
ptr = (int*)qse_fma_alloc (fma, sizeof(int)); // allocate a block
*ptr = 20; // access the block
qse_fma_free (fma, ptr); // free the block
qse_fma_close (fma); // destroy the allocator
@endcode
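Why knowing the block size in advance makes allocation fast can be seen in a small free-list sketch. The type `fpool_t` and its functions are hypothetical and written for this illustration only; this is not how #qse_fma_t is necessarily implemented. All blocks are carved out of one chunk up front, so allocating and freeing are each a single pointer exchange.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical fixed-size block pool.
typedef struct fpool_t {
    unsigned char *chunk;     // one contiguous chunk holding every block
    void          *free_list; // linked list threaded through free blocks
} fpool_t;

// Create a pool of nblks blocks, each able to hold blksize bytes.
fpool_t *fpool_open(size_t blksize, size_t nblks)
{
    fpool_t *pool;
    size_t i, align = sizeof(max_align_t);

    if (nblks == 0) return NULL;
    // A free block stores the link to the next free block inside itself.
    if (blksize < sizeof(void *)) blksize = sizeof(void *);
    // Round up so that every block in the chunk stays aligned.
    blksize = (blksize + align - 1) / align * align;

    pool = malloc(sizeof(*pool));
    if (pool == NULL) return NULL;
    pool->chunk = malloc(blksize * nblks);
    if (pool->chunk == NULL) { free(pool); return NULL; }

    // Push the blocks onto the free list, last block first.
    pool->free_list = NULL;
    for (i = nblks; i > 0; i--) {
        void *blk = pool->chunk + (i - 1) * blksize;
        *(void **)blk = pool->free_list;
        pool->free_list = blk;
    }
    return pool;
}

// Pop a block off the free list. Returns NULL when the pool is exhausted.
void *fpool_alloc(fpool_t *pool)
{
    void *blk;
    assert(pool != NULL);
    blk = pool->free_list;
    if (blk != NULL) pool->free_list = *(void **)blk;
    return blk;
}

// Push a block back onto the free list.
void fpool_free(fpool_t *pool, void *blk)
{
    assert(pool != NULL && blk != NULL);
    *(void **)blk = pool->free_list;
    pool->free_list = blk;
}

// Destroy the pool. Every block handed out becomes invalid at this point.
void fpool_close(fpool_t *pool)
{
    free(pool->chunk);
    free(pool);
}
```

Because all blocks live inside the single chunk, destroying the pool invalidates every pointer it handed out, which matches the lifetime rule stated above for this kind of allocator.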
@section mem_pma Simple Memory Pool Allocator
If you want to allocate blocks quickly but don't want to resize or
deallocate the blocks individually, you can use #qse_pma_t.
@code
qse_pma_t* pma; int* ptr;
pma = qse_pma_open (QSE_NULL, 0); // create an allocator
ptr = (int*)qse_pma_alloc (pma, sizeof(int)); // allocate a block
*ptr = 20; // access the block
qse_pma_close (pma); // destroy the allocator
@endcode
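The trade-off of a pool allocator, quick allocation in exchange for giving up individual deallocation, can be sketched as a simple bump allocator. The types `pool_t` and `chunk_t` and their functions are hypothetical and are not the #qse_pma_t implementation. Each allocation advances an offset inside the current chunk, and everything is released at once when the pool is finalized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// One chunk of pool memory. Allocations are carved from 'data' in order.
typedef struct chunk_t {
    struct chunk_t *next;
    size_t used;
    size_t cap;
    max_align_t data[]; // payload, suitably aligned for any object
} chunk_t;

// Hypothetical pool: a list of chunks and a default chunk size.
typedef struct pool_t {
    chunk_t *head;
    size_t   chunk_size;
} pool_t;

void pool_init(pool_t *pool, size_t chunk_size)
{
    assert(pool != NULL);
    pool->head = NULL;
    pool->chunk_size = chunk_size;
}

// Allocate size bytes. Blocks cannot be resized or freed individually.
void *pool_alloc(pool_t *pool, size_t size)
{
    size_t align = sizeof(max_align_t);
    chunk_t *c = pool->head;
    void *ptr;

    if (size == 0) size = 1;
    size = (size + align - 1) / align * align; // keep blocks aligned

    // Start a new chunk when the current one cannot hold the request.
    if (c == NULL || c->cap - c->used < size) {
        size_t cap = (size > pool->chunk_size) ? size : pool->chunk_size;
        c = malloc(sizeof(*c) + cap);
        if (c == NULL) return NULL;
        c->next = pool->head;
        c->used = 0;
        c->cap = cap;
        pool->head = c;
    }

    ptr = (unsigned char *)c->data + c->used;
    c->used += size;
    return ptr;
}

// Release every chunk, and with it every block, in one pass.
void pool_fini(pool_t *pool)
{
    while (pool->head != NULL) {
        chunk_t *next = pool->head->next;
        free(pool->head);
        pool->head = next;
    }
}
```

There is deliberately no function for freeing a single block; the only way to give memory back is `pool_fini()`.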
*/

265
doc/page/sed-cmd.md Normal file

@ -0,0 +1,265 @@
QSESED Commands {#sed-cmd}
================================================================================
Overview
--------
A stream editor is a non-interactive text editing tool commonly used
in Unix environments. It reads text from an input stream, stores it in the
pattern space, manipulates the pattern space by applying a set of editing
commands, and writes the pattern space to an output stream. Typically, the
input and output streams are a console or a file.
Commands
--------
A sed command is composed of:
- line selector (optional)
- ! (optional)
- command code
- command arguments (optional, dependent on command code)
A line selector selects input lines to apply a command to and has the following
forms:
- address - specify a single address
- address,address - specify an address range
- start~step - specify a starting line and a step.
#QSE_SED_EXTENDEDADR enables this form.
An *address* is a line number, a regular expression, or a dollar sign ($),
while *start* and *step* are numbers.
A regular expression for an address has the following form:
- /rex/ - a regular expression *rex* is enclosed in slashes
- \\CrexC - a regular expression *rex* is enclosed in \\C and *C*
where *C* can be any character.
It treats the \\n sequence specially to match a newline character.
Here are examples of line selectors:
- 10 - match the 10th line
- 10,20 - match lines from the 10th to the 20th.
- /^[[:space:]]*$/ - match an empty line
- /^abc$/,/^def$/ - match all lines between *abc* and *def* inclusive
- 10,$ - match the 10th line down to the last line.
- 3~4 - match every 4th line from the 3rd line.
Note that an address range always selects the line matching the first address
regardless of the second address. For example, 8,6 selects the 8th line.
The exclamation mark (!), when used after the line selector and before
the command code, negates the line selection. For example, 1! selects all
lines except the first line.
A command without a line selector is applied to all input lines.
A command with a single address is applied to an input line that matches
the address. A command with an address range is applied to all input
lines within the range, inclusive. A command with a start and a step is
applied to every <b>step</b>'th line starting from the line *start*.
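The selection rules for numeric selectors can be written down as two small predicates. The functions `step_selects()` and `range_selects()` below are hypothetical helpers for illustration and are not part of QSESED; regular-expression addresses and the dollar sign are left out. The handling of a step of 0 is an assumption borrowed from other sed implementations, since this page does not specify it.

```c
#include <assert.h>

// start~step: select 'start' and every step'th line after it.
// Assumption: a step of 0 selects only the line 'start' itself.
int step_selects(unsigned long line, unsigned long start, unsigned long step)
{
    assert(line >= 1); // line numbers begin at 1
    if (step == 0) return line == start;
    return line >= start && (line - start) % step == 0;
}

// first,last with line numbers: the first line is always selected,
// even when 'last' is smaller than 'first'.
int range_selects(unsigned long line, unsigned long first, unsigned long last)
{
    assert(line >= 1);
    if (line == first) return 1;
    return line > first && line <= last;
}
```

For the selector 3~4, `step_selects()` accepts lines 3, 7, 11, and so on; for 8,6, `range_selects()` accepts only line 8.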
Here is the summary of the commands.
### # comment ###
The text beginning from # to the line end is ignored; # in a line following
<b>a \\</b>, <b>i \\</b>, and <b>c \\</b> is treated literally and does not
introduce a comment.
### : label ###
A label can be composed of letters, digits, periods, hyphens, and underscores.
It marks a branch target for the *b* and *t* commands and does not accept a line
selector.
### { ###
The left curly bracket forms a command group where you can nest other
commands. It should be paired with an ending }.
### q ###
Terminates the execution of commands. Upon termination, it prints the pattern
space if #QSE_SED_QUIET is not set.
### Q ###
Terminates the execution of commands quietly.
### a \\ \n text ###
Stores *text* into the append buffer which is printed after the pattern
space for each input line. If #QSE_SED_STRICT is on, an address range is not
allowed for a line selector. If #QSE_SED_SAMELINE is on, the backslash and the
text can be located on the same line without a line break.
### i \\ \n text ###
Inserts *text* into an insert buffer which is printed before the pattern
space for each input line. If #QSE_SED_STRICT is on, an address range is not
allowed for a line selector. If #QSE_SED_SAMELINE is on, the backslash and the
text can be located on the same line without a line break.
### c \\ \n text ###
If a single line is selected for the command (i.e. no line selector, a single
address line selector, or a start~step line selector is specified), it changes
the pattern space to *text* and branches to the end of commands for the line.
If an address range is specified, it deletes the pattern space and branches
to the end of commands for all input lines in the range but the last; for the
last line, it changes the pattern space to *text* and branches to the end of
commands. If #QSE_SED_SAMELINE is on, the backslash and the text can be located
on the same line without a line break.
### d ###
Deletes the pattern space and branches to the end of commands.
### D ###
Deletes the first line of the pattern space. If the pattern space is emptied,
it branches to the end of script. Otherwise, the commands from the first are
reapplied to the current pattern space.
### = ###
Prints the current line number. If #QSE_SED_STRICT is on, an address range is
not allowed as a line selector.
### p ###
Prints the pattern space.
### P ###
Prints the first line of the pattern space.
### l ###
Prints the pattern space in a visually unambiguous form.
### h ###
Copies the pattern space to the hold space.
### H ###
Appends the pattern space to the hold space.
### g ###
Copies the hold space to the pattern space.
### G ###
Appends the hold space to the pattern space.
### x ###
Exchanges the pattern space and the hold space.
### n ###
Prints the pattern space and reads the next line from the input stream to fill
the pattern space.
### N ###
Prints the pattern space and reads the next line from the input stream
to append it to the pattern space with a newline inserted.
### b ###
Branches to the end of commands.
### b label ###
Branches to *label*.
### t ###
Branches to the end of commands if a substitution (s//) has been made
successfully since the last reading of an input line or the last *t* command.
### t label ###
Branches to *label* if a substitution (s//) has been made successfully
since the last reading of an input line or the last *t* command.
### r file ###
Reads text from *file* and prints it after printing the pattern space but
before printing the append buffer. Failure to read *file* does not cause an
error.
### R file ###
Reads a line of text from *file* and prints it after printing pattern space
but before printing the append buffer. Failure to read *file* does not cause
an error.
### w file ###
Writes the pattern space to *file*.
### W file ###
Writes the first line of the pattern space to *file*.
### s/rex/repl/opts ###
Finds a substring matching *rex* in the pattern space and replaces it
with *repl*. An ampersand (&) in *repl* refers to the matching substring.
*opts* may be empty. You can combine the following options into *opts*:
- *g* replaces all occurrences of a substring matching *rex*
- *number* replaces the <b>number</b>'th occurrence of a substring matching
*rex*
- *p* prints the pattern space if a successful replacement was made
- *w* file writes the pattern space to *file* if a successful replacement
was made. It, if specified, should be the last option.
- *k* removes (kills) unmatched portions from the pattern space. It is
useful for partial extraction.
### y/src/dst/ ###
Replaces all occurrences of characters in *src* with characters in *dst*.
*src* and *dst* must contain an equal number of characters.
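The character mapping performed by *y* can be sketched in a few lines of C. The function `translit()` is a hypothetical helper, not a QSESED routine, and it works on plain `char` strings only; it assumes that the first match in *src* wins if a character is listed twice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Replace each character of 'space' found in 'src' with the character at
// the same position in 'dst'. Returns -1 if the two lists differ in length.
int translit(char *space, const char *src, const char *dst)
{
    size_t i, n;

    assert(space != NULL && src != NULL && dst != NULL);
    n = strlen(src);
    if (n != strlen(dst)) return -1;

    for (; *space != '\0'; space++) {
        for (i = 0; i < n; i++) {
            if (*space == src[i]) {
                *space = dst[i];
                break; // map each character at most once
            }
        }
    }
    return 0;
}
```

Applied to the text `hello` with *src* `el` and *dst* `ip`, the sketch yields `hippo`.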
### C/selector/opts ###
Selects characters or fields from the pattern space as specified by the
*selector* and updates the pattern space with the selected text. A selector
is a comma-separated list of specifiers. A specifier is one of the following:
+ *d* specifies the input field delimiter with the next character. e.g. d:
+ *D* specifies the output field delimiter with the next character. e.g. D;
+ *c* specifies a position or a range of characters to select. e.g. c23-25
+ *f* specifies a position or a range of fields to select. e.g. f1,f4-3
*opts* may be empty. You can combine the following options into *opts*:
+ *f* folds consecutive delimiters into one.
+ *w* uses white spaces for a field delimiter regardless of the input
delimiter specified in the selector.
+ *d* deletes the pattern space if the line is not delimited by
the input field delimiter
In principle, the *C* command can replace the *cut* utility.
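Field selection of the kind the *f* specifier performs can be sketched in plain C. The functions `find_field()` and `select_fields()` are hypothetical and cover only single field positions, not ranges or character positions; treating a missing field as empty is an assumption made for this sketch, not documented QSESED behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Locate the 1-based field 'index' in 'line'. Returns its start and stores
// its length in *len, or returns NULL if the line has fewer fields.
static const char *find_field(const char *line, char delim, int index,
                              size_t *len)
{
    const char *start = line;
    int cur = 1;

    for (;;) {
        const char *end = strchr(start, delim);
        if (cur == index) {
            *len = (end != NULL) ? (size_t)(end - start) : strlen(start);
            return start;
        }
        if (end == NULL) return NULL;
        start = end + 1;
        cur++;
    }
}

// Copy the requested fields, in the requested order, into 'out', separated
// by 'outdelim'. Returns 0 on success and -1 if 'out' is too small.
int select_fields(const char *line, char indelim, char outdelim,
                  const int *fields, size_t nfields,
                  char *out, size_t outsize)
{
    size_t i, used = 0;

    for (i = 0; i < nfields; i++) {
        size_t len;
        const char *f = find_field(line, indelim, fields[i], &len);
        if (f == NULL) { f = ""; len = 0; } // assumption: missing is empty
        // Reserve room for a delimiter and the terminating null.
        if (used + len + 2 > outsize) return -1;
        if (i > 0) out[used++] = outdelim;
        memcpy(out + used, f, len);
        used += len;
    }
    if (used >= outsize) return -1;
    out[used] = '\0';
    return 0;
}
```

Selecting fields 3 and 1 from a colon-separated line with a space as the output delimiter gives the third field followed by the first.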
Examples
--------
Here are some samples.
### G;G;G ###
Triple-spaces input lines. If #QSE_SED_QUIET is on, use <b>G;G;G;p</b>.
It works because the hold space is empty unless something is copied to it.
### $!d ###
Prints the last line. If #QSE_SED_QUIET is on, try <b>$p</b>.
### 1!G;h;$!d ###
Prints input lines in the reverse order. That is, it prints the last line
first and the first line last.
$ echo -e "a\nb\nc" | qsesed '1!G;h;$!d'
c
b
a
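To see why this script reverses its input, the pattern space and the hold space can be simulated step by step. The function `reverse_lines()` below is a hypothetical simulation written for this page, not QSESED code; it uses small fixed-size buffers and assumes that the pattern space is printed automatically at the end of each cycle unless *d* deleted it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Simulate '1!G;h;$!d' over n input lines and store the output in 'out'.
// Returns 0 on success and -1 if a buffer is too small.
int reverse_lines(const char *lines[], size_t n, char *out, size_t outsize)
{
    char pattern[256];
    char hold[256] = "";
    size_t i;

    if (outsize == 0) return -1;
    out[0] = '\0';

    for (i = 0; i < n; i++) {
        // Read the next input line into the pattern space.
        if (strlen(lines[i]) >= sizeof(pattern)) return -1;
        strcpy(pattern, lines[i]);

        // 1!G: on every line but the first, append a newline and the
        // hold space, which holds the earlier lines in reverse order.
        if (i != 0) {
            if (strlen(pattern) + 1 + strlen(hold) >= sizeof(pattern))
                return -1;
            strcat(pattern, "\n");
            strcat(pattern, hold);
        }

        // h: copy the pattern space to the hold space.
        strcpy(hold, pattern);

        // $!d: delete the pattern space on every line but the last.
        // Only the last pattern space survives and gets printed.
        if (i == n - 1) {
            if (strlen(pattern) + 1 >= outsize) return -1;
            strcpy(out, pattern);
            strcat(out, "\n");
        }
    }
    return 0;
}
```

After each cycle the hold space contains all lines read so far, newest first, so the pattern space that is finally printed on the last line is the whole input in reverse.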
### s/[[:space:]]{2,}/ /g ###
Compacts whitespace if #QSE_SED_EXTENDEDREX is on.
### s/[0-9]/&/gk ###
Extracts all digits.
$ echo "Q123Q456" | qsesed -r 's/[0-9]+/&/gk'
123456
### s/[0-9]+/&/2k ###
Extracts the second number.
$ echo "Q123Q456" | qsesed -r 's/[0-9]+/&/2k'
456
### C/d:,f3,1/ ###
Prints the third field and the first field from a colon separated text.
$ head -5 /etc/passwd
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
$ qsesed '1,3C/d:,f3,1/;4,$d' /etc/passwd
0 root
1 daemon
2 bin

101
doc/page/sed-embed.md Normal file

@ -0,0 +1,101 @@
QSESED Embedding Guide {#sed-embed}
================================================================================
Overview
--------
The QSESED library is divided into the core layer and the standard layer.
The core layer is a skeleton implementation that requires various callbacks
to be useful. The standard layer provides general-purpose versions of these
callbacks.
You can find core layer routines in <qse/sed/sed.h> while you can find standard
layer routines in <qse/sed/std.h>.
Embedding QSESED involves the following steps in the simplest form:
- create a new sed object
- compile commands
- execute commands
- destroy the sed object
The sample here shows a simple stream editor that accepts a command string
and, optionally, an input file name and an output file name.
\includelineno sed01.c
You can call qse_sed_compstdfile() instead of qse_sed_compstdstr() to compile
sed commands stored in a file. You can use qse_sed_compstd() or qse_sed_comp()
for more flexibility.
Locale
------
While QSESED can use a wide character type as the default character type,
the hosting program still has to initialize the locale whenever necessary.
All the samples shown on this page call a common function
init_sed_sample_locale(), use the qse_main() macro as the main function,
and call qse_runmain() for cross-platform and cross-character-set support.
Here is the function prototype.
\includelineno sed00.h
Here goes the actual function.
\includelineno sed00.c
Note that these two files are not part of QSESED and are used only for the
samples here.
Customizing Streams
-------------------
You can use qse_sed_execstd() to customize the input and output streams.
The sample below uses I/O resources of the #QSE_SED_IOSTD_STR type to use
an argument as input data and let the output be dynamically allocated.
\includelineno sed02.c
You can use the core layer function qse_sed_exec() and implement the
::qse_sed_io_impl_t interface for more flexibility. No samples are
provided here because the standard layer functions qse_sed_execstd()
and qse_sed_execstdfile() themselves serve as good samples.
Accessing Pattern and Hold Space
--------------------------------
The qse_sed_getspace() function lets you get the pointer and the length
of the pattern space and the hold space. Accessing them after execution
has completed may not be very useful. The qse_sed_setopt()
function called with #QSE_SED_TRACER lets you set up a hook function
that can inspect various things during execution.
The following sample prints the contents of the pattern space and
hold space at each phase of execution.
\includelineno sed03.c
Embedding In C++
----------------
The QSE::Sed and QSE::StdSed classes are provided for C++. The sample here shows
how to embed QSE::StdSed for stream editing.
\includelineno sed21.cpp
The following sample shows how to inherit QSE::StdSed and create a
customized stream editor.
\includelineno sed22.cpp
\skipline ---------------------------------------------------------------------
\skipline the sample files are listed here for example list generation purpose.
\skipline ---------------------------------------------------------------------
\example sed01.c
\example sed02.c
\example sed03.c
\example sed21.cpp
\example sed22.cpp