updated sed docs
This commit is contained in:
@ -17,13 +17,13 @@ over its corresponding core layer name.
|
||||
|
||||
Embedding QSEAWK involves the following steps in the simplest form:
|
||||
|
||||
- open a new awk object
|
||||
- create a new awk object
|
||||
- parse in a source script
|
||||
- open a new runtime context
|
||||
- create a new runtime context
|
||||
- execute pattern-action blocks or call a function
|
||||
- decrement the reference count of the return value
|
||||
- close the runtime context
|
||||
- close the awk object
|
||||
- destroy the runtime context
|
||||
- destroy the awk object
|
||||
|
||||
The sample below follows these steps using as many standard layer functions as
|
||||
possible for convenience sake. It simply prints *hello, world* to the console.
|
||||
|
@ -24,30 +24,30 @@ forms:
|
||||
- address - specify a single address
|
||||
- address,address - specify an address range
|
||||
- start~step - specify a starting line and a step.
|
||||
#QSE_SED_STARTSTEP enables this form.
|
||||
#QSE_SED_EXTENDEDADR enables this form.
|
||||
|
||||
An @b address is a line number, a regular expression, or a dollar sign ($)
|
||||
while a @b start and a @b step is a line number.
|
||||
An *address* is a line number, a regular expression, or a dollar sign ($)
|
||||
while a *start* and a *step* is a line number.
|
||||
|
||||
A regular expression for an address has the following form:
|
||||
- /rex/ - a regular expression @b rex is enclosed in slashes
|
||||
- \\CrexC - a regular expression @b rex is enclosed in @b \\C and @b C
|
||||
where @b C can be any character.
|
||||
- /rex/ - a regular expression *rex* is enclosed in slashes
|
||||
- \\CrexC - a regular expression *rex* is enclosed in \\C and *C*
|
||||
where *C* can be any character.
|
||||
|
||||
It treats the @b \\n sequence specially to match a newline character.
|
||||
It treats the \\n sequence specially to match a newline character.
|
||||
|
||||
Here are examples of line selectors:
|
||||
- 10 - match the 10th line
|
||||
- 10,20 - match lines from the 10th to the 20th.
|
||||
- /^[[:space:]]*$/ - match an empty line
|
||||
- /^abc$/,/^def$/ - match all lines between @b abc and @b def inclusive
|
||||
- /^abc$/,/^def$/ - match all lines between *abc* and *def* inclusive
|
||||
- 10,$ - match the 10th line down to the last line.
|
||||
- 3~4 - match every 4th line from the 3rd line.
|
||||
|
||||
Note that an address range always selects the line matching the first address
|
||||
regardless of the second address; For example, 8,6 selects the 8th line.
|
||||
|
||||
The exclamation mark @b !, when used after the line selector and before
|
||||
The exclamation mark(!), when used after the line selector and before
|
||||
the command code, negates the line selection; For example, 1! selects all
|
||||
lines except the first line.
|
||||
|
||||
@ -55,172 +55,176 @@ A command without a line selector is applied to all input lines;
|
||||
A command with a single address is applied to an input line that matches
|
||||
the address; A command with an address range is applied to all input
|
||||
lines within the range, inclusive; A command with a start and a step is
|
||||
applied to every <b>step</b>'th line starting from the line @b start.
|
||||
applied to every <b>step</b>'th line starting from the line start.
|
||||
|
||||
Here is the summary of the commands.
|
||||
|
||||
- <b># comment</b>
|
||||
### # comment ###
|
||||
The text beginning from # to the line end is ignored; # in a line following
|
||||
<b>a \\</b>, <b>i \\</b>, and <b>c \\</b> is treated literally and does not
|
||||
introduce a comment.
|
||||
|
||||
- <b>: label</b>
|
||||
### : label ###
|
||||
A label can be composed of letters, digits, periods, hyphens, and underlines.
|
||||
It remembers a target label for @b b and @b t commands and prohibits a line
|
||||
It remembers a target label for *b* and *t* commands and prohibits a line
|
||||
selector.
|
||||
|
||||
- <b>{</b>
|
||||
### { ###
|
||||
The left curly bracket forms a command group where you can nest other
|
||||
commands. It should be paired with an ending @b }.
|
||||
commands. It should be paired with an ending }.
|
||||
|
||||
- <b>q</b>
|
||||
### q ###
|
||||
Terminates the exection of commands. Upon termination, it prints the pattern
|
||||
space if #QSE_SED_QUIET is not set.
|
||||
|
||||
- <b>Q</b>
|
||||
### Q ###
|
||||
Terminates the exection of commands quietly.
|
||||
|
||||
- <b>a \\ \n text</b>
|
||||
Stores @b text into the append buffer which is printed after the pattern
|
||||
### a \\ \n text ###
|
||||
Stores *text* into the append buffer which is printed after the pattern
|
||||
space for each input line. If #QSE_SED_STRICT is on, an address range is not
|
||||
allowed for a line selector. If #QSE_SED_SAMELINE is on, the backslash and the
|
||||
text can be located on the same line without a line break.
|
||||
|
||||
- <b>i \\ \n text</b>
|
||||
Inserts @b text into an insert buffer which is printed before the pattern
|
||||
### i \\ \n text ###
|
||||
Inserts *text* into an insert buffer which is printed before the pattern
|
||||
space for each input line. If #QSE_SED_STRICT is on, an address range is not
|
||||
allowed for a line selector. If #QSE_SED_SAMELINE is on, the backslash and the
|
||||
text can be located on the same line without a line break.
|
||||
|
||||
- <b>c \\ \n text</b>
|
||||
### c \\ \n text ###
|
||||
If a single line is selected for the command (i.e. no line selector, a single
|
||||
address line selector, or a start~step line selector is specified), it changes
|
||||
the pattern space to @b text and branches to the end of commands for the line.
|
||||
the pattern space to *text* and branches to the end of commands for the line.
|
||||
If an address range is specified, it deletes the pattern space and branches
|
||||
to the end of commands for all input lines but the last, and changes pattern
|
||||
space to @b text and branches to the end of commands. If #QSE_SED_SAMELINE is
|
||||
space to *text* and branches to the end of commands. If #QSE_SED_SAMELINE is
|
||||
on, the backlash and the text can be located on the same line without a line
|
||||
break.
|
||||
|
||||
- <b>d</b>
|
||||
### d ###
|
||||
Deletes the pattern space and branches to the end of commands.
|
||||
|
||||
- <b>D</b>
|
||||
### D ###
|
||||
Deletes the first line of the pattern space. If the pattern space is emptied,
|
||||
it branches to the end of script. Otherwise, the commands from the first are
|
||||
reapplied to the current pattern space.
|
||||
|
||||
- <b>=</b>
|
||||
### = ###
|
||||
Prints the current line number. If #QSE_SED_STRICT is on, an address range is
|
||||
not allowed as a line selector.
|
||||
|
||||
- <b>p</b>
|
||||
### p ###
|
||||
Prints the pattern space.
|
||||
|
||||
- <b>P</b>
|
||||
### P ###
|
||||
Prints the first line of the pattern space.
|
||||
|
||||
- <b>l</b>
|
||||
### l ###
|
||||
Prints the pattern space in a visually unambiguous form.
|
||||
|
||||
- <b>h</b>
|
||||
### h ###
|
||||
Copies the pattern space to the hold space
|
||||
|
||||
- <b>H</b>
|
||||
### H ###
|
||||
Appends the pattern space to the hold space
|
||||
|
||||
- <b>g</b>
|
||||
### g ###
|
||||
Copies the hold space to the pattern space
|
||||
|
||||
- <b>G</b>
|
||||
### G ###
|
||||
Appends the hold space to the pattern space
|
||||
|
||||
- <b>x</b>
|
||||
### x ###
|
||||
Exchanges the pattern space and the hold space
|
||||
|
||||
- <b>n</b>
|
||||
### n ###
|
||||
Prints the pattern space and read the next line from the input stream to fill
|
||||
the pattern space.
|
||||
|
||||
- <b>N</b>
|
||||
### N ###
|
||||
Prints the pattern space and read the next line from the input stream
|
||||
to append it to the pattern space with a newline inserted.
|
||||
|
||||
- <b>b</b>
|
||||
### b ###
|
||||
Branches to the end of commands.
|
||||
|
||||
- <b>b label</b>
|
||||
Branches to @b label
|
||||
### b label ###
|
||||
Branches to *label*
|
||||
|
||||
- <b>t</b>
|
||||
### t ###
|
||||
Branches to the end of commands if substitution(s//) has been made
|
||||
successfully since the last reading of an input line or the last @b t command.
|
||||
successfully since the last reading of an input line or the last *t* command.
|
||||
|
||||
- <b>t label</b>
|
||||
Branches to @b label if substitution(s//) has been made successfully
|
||||
since the last reading of an input line or the last @b t command.
|
||||
### t label ###
|
||||
Branches to *label* if substitution(s//) has been made successfully
|
||||
since the last reading of an input line or the last *t* command.
|
||||
|
||||
- <b>r file</b>
|
||||
Reads text from @b file and prints it after printing the pattern space but
|
||||
before printing the append buffer. Failure to read @b file does not cause an
|
||||
### r file ###
|
||||
Reads text from *file* and prints it after printing the pattern space but
|
||||
before printing the append buffer. Failure to read *file* does not cause an
|
||||
error.
|
||||
|
||||
- <b>R file</b>
|
||||
Reads a line of text from @b file and prints it after printing pattern space
|
||||
but before printing the append buffer. Failure to read @b file does not cause
|
||||
### R file ###
|
||||
Reads a line of text from *file* and prints it after printing pattern space
|
||||
but before printing the append buffer. Failure to read *file* does not cause
|
||||
an error.
|
||||
|
||||
- <b>w file</b>
|
||||
Writes the pattern space to @b file
|
||||
### w file ###
|
||||
Writes the pattern space to *file*
|
||||
|
||||
- <b>W file</b>
|
||||
Writes the first line of the pattern space to @b file
|
||||
### W file ####
|
||||
Writes the first line of the pattern space to *file*
|
||||
|
||||
- <b>s/rex/repl/opts</b>
|
||||
Finds a matching substring with @b rex in pattern space and replaces it
|
||||
with @b repl. @b & in @b repl refers to the matching substring. @b opts may
|
||||
be empty; You can combine the following options into @b opts:
|
||||
- @b g replaces all occurrences of a matching substring with @b rex
|
||||
- @b number replaces the <b>number</b>'th occurrence of a matching substring
|
||||
with @b rex
|
||||
- @b p prints pattern space if a successful replacement was made
|
||||
- @b w file writes pattern space to @b file if a successful replacement
|
||||
### s/rex/repl/opts ###
|
||||
Finds a matching substring with *rex* in pattern space and replaces it
|
||||
with *repl*. An ampersand(&) in *repl* refers to the matching substring.
|
||||
*opts* may be empty; You can combine the following options into *opts*:
|
||||
|
||||
- *g* replaces all occurrences of a matching substring with *rex*
|
||||
- *number* replaces the <b>number</b>'th occurrence of a matching substring
|
||||
with *rex*
|
||||
- *p* prints pattern space if a successful replacement was made
|
||||
- *w* file writes pattern space to *file* if a successful replacement
|
||||
was made. It, if specified, should be the last option.
|
||||
|
||||
- <b>y/src/dst/</b>
|
||||
Replaces all occurrences of characters in @b src with characters in @b dst.
|
||||
@b src and @b dst must contain equal number of characters.
|
||||
### y/src/dst/ ###
|
||||
Replaces all occurrences of characters in *src* with characters in *dst*.
|
||||
*src* and *dst* must contain equal number of characters.
|
||||
|
||||
- <b>c/selector/opts</b>
|
||||
### C/selector/opts ###
|
||||
Selects characters or fields from the pattern space as specified by the
|
||||
@b selector and update the pattern space with the selected text. A selector
|
||||
*selector* and update the pattern space with the selected text. A selector
|
||||
is a comma-separated list of specifiers. A specifier is one of the followings:
|
||||
<ul>
|
||||
<li>@b d specifies the input field delimiter with the next character. e.g) d:
|
||||
<li>@b D sepcifies the output field delimiter with the next character. e.g) D;
|
||||
<li>@b c specifies a position or a range of characters to select. e.g) c23-25
|
||||
<li>@b f specifies a position or a range of fields to select. e.g) f1,f4-3
|
||||
</ul>
|
||||
@b opts may be empty; You can combine the following options into @b opts:
|
||||
<ul>
|
||||
<li>@b f folds consecutive delimiters into one.
|
||||
<li>@b w uses white spaces for a field delimiter regardless of the input
|
||||
delimiter specified in the selector.
|
||||
<li>@b d deletes the pattern space if the line is not delimited by
|
||||
the input field delimiter
|
||||
</ul>
|
||||
|
||||
In principle, this can replace the @b cut utility.
|
||||
+ *d* specifies the input field delimiter with the next character. e.g) d:
|
||||
+ *D* sepcifies the output field delimiter with the next character. e.g) D;
|
||||
+ *c* specifies a position or a range of characters to select. e.g) c23-25
|
||||
+ *f* specifies a position or a range of fields to select. e.g) f1,f4-3
|
||||
|
||||
Let's see actual examples:
|
||||
- <b>G;G;G</b>
|
||||
*opts* may be empty; You can combine the following options into *opts*:
|
||||
|
||||
+ *f* folds consecutive delimiters into one.
|
||||
+ *w* uses white spaces for a field delimiter regardless of the input
|
||||
delimiter specified in the selector.
|
||||
+ *d* deletes the pattern space if the line is not delimited by
|
||||
the input field delimiter
|
||||
|
||||
In principle, this can replace the *cut* utility with the *C* command.
|
||||
|
||||
Examples
|
||||
--------
|
||||
|
||||
Here are some samples.
|
||||
|
||||
### G;G;G ###
|
||||
Triple spaces input lines. If #QSE_SED_QUIET is on, <b>G;G;G;p</b>.
|
||||
It works because the hold space is empty unless something is copied to it.
|
||||
|
||||
- <b>$!d</b>
|
||||
### $!d ###
|
||||
Prints the last line. If #QSE_SED_QUIET is on, try <b>$p</b>.
|
||||
|
||||
- <b>1!G;h;$!d</b>
|
||||
### 1!G;h;$!d ###
|
||||
Prints input lines in the reverse order. That is, it prints the last line
|
||||
first and the first line last.
|
||||
|
||||
@ -229,10 +233,10 @@ first and the first line last.
|
||||
b
|
||||
a
|
||||
|
||||
- <b>s/[[:space:]]{2,}/ /g</b>
|
||||
Compacts whitespaces if #QSE_SED_REXBOUND is on.
|
||||
### s/[[:space:]]{2,}/ /g ###
|
||||
Compacts whitespaces if #QSE_SED_EXTENDEDREX is on.
|
||||
|
||||
- <b>C/d:,f3,1/</b>
|
||||
### C/d:,f3,1/ ###
|
||||
Prints the third field and the first field from a colon separated text.
|
||||
|
||||
$ head -5 /etc/passwd
|
||||
|
@ -4,6 +4,30 @@ QSESED Embedding Guide {#sed-embed}
|
||||
Overview
|
||||
--------
|
||||
|
||||
The QSESED library is divided into the core layer and the standard layer.
|
||||
The core layer is a skeleton implmenetation that requires various callbacks
|
||||
to be useful. The standard layer provides these callbacks in a general respect.
|
||||
|
||||
You can find core layer routines in <qse/sed/sed.h> while you can find standard
|
||||
layer routines in <qse/sed/std.h>.
|
||||
|
||||
Embedding QSESED involves the following steps in the simplest form:
|
||||
|
||||
- create a new sed object
|
||||
- compile commands
|
||||
- execute commands
|
||||
- destroy the sed object
|
||||
|
||||
The sample here shows a simple stream editor than can accepts a command string,
|
||||
and optionally an input file name and an output file name.
|
||||
|
||||
\includelineno sed01.c
|
||||
|
||||
Accessing Pattern Space
|
||||
-----------------------
|
||||
|
||||
Accessing Hold Space
|
||||
--------------------
|
||||
|
||||
Embedding In C++
|
||||
----------------
|
||||
|
Reference in New Issue
Block a user