/** @page sed SED STREAM EDITOR

@section sed_contents CONTENTS

- \ref sed_intro
- \ref sed_command
- \ref sed_embed

@section sed_intro INTRODUCTION

The sed stream editor is a non-interactive text editing tool commonly used
in Unix environments. Sed reads text from an input stream, stores it in the
pattern space, manipulates the pattern space by applying a set of editing
commands, and writes the pattern space to an output stream. Typically, the
input and output streams are a console or a file.

@b QSE provides an embeddable stream editor that supports most of
the conventional sed commands and implements the input and output streams as
callback functions. It offers the following types:

- #qse_sed_t - C type that encapsulates a stream editor
- QSE::Sed - C++ class that wraps #qse_sed_t
- QSE::StdSed - C++ child class of QSE::Sed that implements standard input and output streams

@section sed_command COMMAND

A sed command is composed of:

- line selector (optional)
- ! (optional)
- command code
- command arguments (optional, dependent on command code)

A line selector selects the input lines to apply a command to and has the
following forms:
- address - specify a single address
- address,address - specify an address range
- start~step - specify a starting line and a step.
  #QSE_SED_STARTSTEP enables this form.

An @b address is a line number, a regular expression, or a dollar sign ($),
while a @b start and a @b step are line numbers.

A regular expression for an address has one of the following forms:
- /rex/ - a regular expression @b rex is enclosed in slashes
- \\CrexC - a regular expression @b rex is enclosed in @b \\C and @b C
  where @b C can be any character.

A regular expression treats the @b \\n sequence specially to match a newline
character.

Here are examples of line selectors:
- 10 - match the 10th line
- 10,20 - match lines from the 10th to the 20th.
- /^[[:space:]]*$/ - match an empty line or a line containing only whitespace
- /^abc$/,/^def$/ - match all lines between @b abc and @b def inclusive
- 10,$ - match the 10th line down to the last line.
- 3~4 - match every 4th line starting from the 3rd line.

Note that an address range always selects the line matching the first address
regardless of the second address; for example, 8,6 selects the 8th line.
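
The range rule above can be sketched in C for plain line-number addresses.
This is only an illustration under stated assumptions, not the QSE
implementation: the function @c in_line_range is a hypothetical name, and
regular-expression and $ addresses are left out. With it, 10,20 selects lines
10 through 20, while 8,6 selects line 8 only.

```c
#include <stdbool.h>

// Hypothetical helper, not part of QSE: decides whether `line` falls in the
// numeric address range first,second (1-based line numbers).
bool in_line_range(unsigned long first, unsigned long second,
                   unsigned long line)
{
    // The line matching the first address is always selected,
    // even when the second address is smaller (e.g. 8,6).
    if (line == first) return true;

    // Otherwise the line must lie after the first address and
    // no later than the second address.
    return line > first && line <= second;
}
```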

The exclamation mark @b !, when used after the line selector and before
the command code, negates the line selection; for example, 1! selects all
lines except the first line.

A command without a line selector is applied to all input lines.
A command with a single address is applied to each input line that matches
the address. A command with an address range is applied to all input
lines within the range, inclusive. A command with a start and a step is
applied to every <b>step</b>'th line starting from the line @b start.
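
As a rough sketch of the start~step form and the ! negation described above,
the following C helpers work on line numbers only. The names
@c step_selects and @c apply_negation are hypothetical and not part of QSE,
and a zero step is treated as out of scope because the text does not define it.
For 3~4, the helper selects lines 3, 7, 11, and so on.

```c
#include <assert.h>
#include <stdbool.h>

// Hypothetical helper, not part of QSE: reports whether `line` (1-based) is
// selected by start~step, i.e. line is start, start+step, start+2*step, ...
bool step_selects(unsigned long start, unsigned long step,
                  unsigned long line)
{
    assert(step > 0); // a zero step is outside the scope of this sketch
    if (line < start) return false;
    return (line - start) % step == 0;
}

// Applies the optional ! that negates a line selection.
bool apply_negation(bool selected, bool negated)
{
    return negated ? !selected : selected;
}
```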

Here is a summary of the commands.

- <b># comment</b>
  The text from # to the end of the line is ignored; a # in a line following
  <b>a \\</b>, <b>i \\</b>, or <b>c \\</b> is treated literally and does not
  introduce a comment.

- <b>: label</b>
  A label can be composed of letters, digits, periods, hyphens, and underscores.
  It marks a branch target for the @b b and @b t commands and does not accept
  a line selector.

- <b>{</b>
  The left curly bracket opens a command group in which you can nest other
  commands. It must be paired with a closing @b }.

- <b>q</b>
  Terminates the execution of commands. Upon termination, it prints the pattern
  space if #QSE_SED_QUIET is not set.

- <b>Q</b>
  Terminates the execution of commands quietly, without printing the pattern
  space.

- <b>a \\ \n text</b>
  Stores @b text in the append buffer, which is printed after the pattern
  space for each input line. If #QSE_SED_STRICT is specified, an address
  range is not allowed as the line selector.

- <b>i \\ \n text</b>
  Stores @b text in the insert buffer, which is printed before the pattern
  space for each input line. If #QSE_SED_STRICT is specified, an address
  range is not allowed as the line selector.

- <b>c \\ \n text</b>
  If a single line is selected for the command (i.e. no line selector, a single
  address line selector, or a start~step line selector is specified), it changes
  the pattern space to @b text and branches to the end of commands for the line.
  If an address range is specified, it deletes the pattern space and branches
  to the end of commands for all input lines in the range but the last; for the
  last line, it changes the pattern space to @b text and branches to the end
  of commands.

- <b>d</b>
  Deletes the pattern space and branches to the end of commands.

- <b>D</b>
  Deletes the first line of the pattern space. If the pattern space is emptied,
  it branches to the end of the script. Otherwise, the commands are reapplied
  from the first one to the current pattern space.

- <b>=</b>
  Prints the current line number. If #QSE_SED_STRICT is specified, an address
  range is not allowed in the line selector.

- <b>p</b>
  Prints the pattern space.

- <b>P</b>
  Prints the first line of the pattern space.

- <b>l</b>
  Prints the pattern space in a visually unambiguous form.

- <b>h</b>
  Copies the pattern space to the hold space.

- <b>H</b>
  Appends the pattern space to the hold space.

- <b>g</b>
  Copies the hold space to the pattern space.

- <b>G</b>
  Appends the hold space to the pattern space.

- <b>x</b>
  Exchanges the pattern space and the hold space.

- <b>n</b>
  Prints the pattern space and reads the next line from the input stream to fill
  the pattern space.

- <b>N</b>
  Reads the next line from the input stream and appends it to the pattern
  space, with a newline inserted between the two.

- <b>b</b>
  Branches to the end of commands.

- <b>b label</b>
  Branches to @b label.

- <b>t</b>
  Branches to the end of commands if a substitution (s//) has been made
  successfully since the last reading of an input line or the last @b t command.

- <b>t label</b>
  Branches to @b label if a substitution (s//) has been made successfully
  since the last reading of an input line or the last @b t command.

- <b>r file</b>
  Reads text from @b file and prints it after printing the pattern space but
  before printing the append buffer. Failure to read @b file does not cause an
  error.

- <b>R file</b>
  Reads a line of text from @b file and prints it after printing the pattern
  space but before printing the append buffer. Failure to read @b file does not
  cause an error.

- <b>w file</b>
  Writes the pattern space to @b file.

- <b>W file</b>
  Writes the first line of the pattern space to @b file.

- <b>s/rex/repl/opts</b>
  Finds a substring matching @b rex in the pattern space and replaces it
  with @b repl. @b & in @b repl refers to the matching substring. @b opts may
  be empty; you can combine the following options in @b opts:
  - @b g replaces all occurrences of a substring matching @b rex
  - @b number replaces the <b>number</b>'th occurrence of a substring matching
    @b rex
  - @b p prints the pattern space if a successful replacement was made
  - @b w file writes the pattern space to @b file if a successful replacement
    was made. If specified, it must be the last option.

- <b>y/src/dst/</b>
  Replaces each occurrence of a character in @b src with the character at the
  same position in @b dst.
  @b src and @b dst must contain an equal number of characters.
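
To make the character mapping of the @b y command concrete, here is a minimal
C sketch. It is not taken from QSE; @c transliterate is a hypothetical helper
that works on a plain NUL-terminated byte string and rejects @b src and
@b dst of unequal length, mirroring the rule stated above. For example,
mapping "lo" to "01" turns "hello world" into "he001 w1r0d".

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper, not part of QSE: rewrites `text` in place, replacing
// each character found in `src` with the character at the same position in
// `dst`. Returns 0 on success, or -1 if src and dst differ in length.
int transliterate(char *text, const char *src, const char *dst)
{
    size_t n = strlen(src);
    if (n != strlen(dst)) return -1;

    for (char *p = text; *p != '\0'; p++) {
        // Look the current character up in src; untouched if it is not there.
        const char *hit = memchr(src, *p, n);
        if (hit != NULL) *p = dst[hit - src];
    }
    return 0;
}
```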

Let's see actual examples:

- <b>G;G;G</b>
  Triple-spaces input lines. If #QSE_SED_QUIET is on, use <b>G;G;G;p</b>.
  It works because the hold space is empty unless something is copied to it.

- <b>$!d</b>
  Prints the last line. If #QSE_SED_QUIET is on, try <b>$p</b>.

- <b>1!G;h;$!d</b>
  Prints input lines in reverse order. That is, it prints the last line
  first and the first line last.

- <b>s/[[:space:]]{2,}/ /g</b>
  Compacts each run of two or more whitespace characters into a single space
  if #QSE_SED_REXBOUND is on.
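
The reversing script <b>1!G;h;$!d</b> is easier to follow when the pattern
space and the hold space are traced by hand. The C sketch below simulates just
those three commands on an array of lines. It is a hypothetical illustration,
not QSE code: @c reverse_lines, @c SPACE_MAX, and the fixed-size buffers are
assumptions made for this example, and #QSE_SED_QUIET is assumed to be off so
that the pattern space is printed when a cycle ends normally.

```c
#include <stdio.h>
#include <string.h>

#define SPACE_MAX 1024  // arbitrary buffer size chosen for this sketch

// Hypothetical simulation, not QSE code: runs 1!G;h;$!d over `lines` and
// writes what sed would print into `out` (at least SPACE_MAX bytes).
void reverse_lines(const char *const *lines, size_t count, char *out)
{
    char pattern[SPACE_MAX];
    char hold[SPACE_MAX] = "";  // the hold space starts out empty

    out[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        // A new cycle starts: the input line is read into the pattern space.
        snprintf(pattern, sizeof pattern, "%s", lines[i]);

        // 1!G - on every line but the first, append a newline and the
        // hold space to the pattern space.
        if (i != 0) {
            size_t used = strlen(pattern);
            snprintf(pattern + used, sizeof pattern - used, "\n%s", hold);
        }

        // h - copy the pattern space to the hold space.
        snprintf(hold, sizeof hold, "%s", pattern);

        // $!d - on every line but the last, delete the pattern space and
        // end the cycle without printing anything.
        if (i + 1 != count) continue;

        // Last line: the cycle ends normally and the pattern space is printed.
        snprintf(out, SPACE_MAX, "%s\n", pattern);
    }
}
```

Feeding it the lines a, b, and c leaves the hold space holding "a", then
"b\na", and the final pattern space "c\nb\na" is the only text printed.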
@section sed_embed HOW TO EMBED

In the simplest form,
- Create a stream editor - qse_sed_open()
- Compile editing commands - qse_sed_comp()
- Execute compiled commands - qse_sed_exec()
- Destroy the stream editor - qse_sed_close()

*/