QSESED Commands {#sed-cmd}
================================================================================
Overview
--------
A stream editor is a non-interactive text editing tool commonly used
in Unix environments. It reads text from an input stream, stores it in the
pattern space, manipulates the pattern space by applying a set of editing
commands, and writes the pattern space to an output stream. Typically, the
input and output streams are a console or a file.

Commands
--------
A sed command is composed of:

- line selector (optional)
- ! (optional)
- command code
- command arguments (optional, dependent on command code)

A line selector selects the input lines to apply a command to and has the
following forms:

- address - specify a single address
- address,address - specify an address range
- start~step - specify a starting line and a step.
  #QSE_SED_EXTENDEDADR enables this form.

An *address* is a line number, a regular expression, or a dollar sign ($),
while a *start* and a *step* are line numbers.

A regular expression for an address has the following forms:

- /rex/ - a regular expression *rex* is enclosed in slashes
- \\CrexC - a regular expression *rex* is enclosed in \\C and *C*
  where *C* can be any character.

A regular expression address treats the \\n sequence specially to match a
newline character.
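The \\CrexC form is handy when *rex* itself contains slashes. The following is a minimal sketch, not taken from the QSESED sources: `SED` is a hypothetical variable that defaults to the system `sed`, and the `-n` option is assumed to suppress automatic printing in the same way as #QSE_SED_QUIET. Set `SED=qsesed` to try it against QSESED.

```shell
# Assumption: SED is a hypothetical knob; it defaults to the system sed.
SED=${SED:-sed}

# Use "," as the regex delimiter so that the slashes in the pattern
# need no escaping. -n suppresses automatic printing; p prints matches.
paths=$(printf '/usr/bin\n/etc\n/usr/lib\n' | "$SED" -n '\,^/usr/,p')
printf '%s\n' "$paths"
```

With a standard sed this prints the two lines that begin with `/usr/`.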
Here are examples of line selectors:

- 10 - match the 10th line
- 10,20 - match lines from the 10th to the 20th.
- /^[[:space:]]*$/ - match an empty line or a line containing only whitespace
- /^abc$/,/^def$/ - match all lines between *abc* and *def* inclusive
- 10,$ - match the 10th line down to the last line.
- 3~4 - match every 4th line from the 3rd line.
Note that an address range always selects the line matching the first address
regardless of the second address; for example, 8,6 selects the 8th line.

The exclamation mark (!), when used after the line selector and before
the command code, negates the line selection; for example, 1! selects all
lines except the first line.

A command without a line selector is applied to all input lines. A command
with a single address is applied to each input line that matches the address.
A command with an address range is applied to all input lines within the
range, inclusive. A command with a start and a step is applied to every
<b>step</b>'th line starting from the line *start*.
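The selector rules above can be tried out with a short sketch. It is an illustration under assumptions: `SED` is a hypothetical variable that defaults to the system `sed` (set `SED=qsesed` to run it against QSESED), and `-n` is assumed to behave like #QSE_SED_QUIET. The start~step form is left out because it depends on #QSE_SED_EXTENDEDADR.

```shell
# Assumption: SED is a hypothetical knob; it defaults to the system sed.
SED=${SED:-sed}
input=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10)

single=$(printf '%s\n' "$input" | "$SED" -n '3p')      # single address
range=$(printf '%s\n' "$input" | "$SED" -n '4,6p')     # address range, inclusive
backward=$(printf '%s\n' "$input" | "$SED" -n '8,6p')  # first address always selected
negated=$(printf '%s\n' "$input" | "$SED" '1!d')       # delete all but line 1

printf 'single: %s\n' "$single"
printf 'range: %s\n' "$(printf '%s' "$range" | tr '\n' ' ')"
printf 'backward: %s\n' "$backward"
printf 'negated: %s\n' "$negated"
```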
Here is a summary of the commands.
### # comment ###
The text beginning from # to the line end is ignored; # in a line following
<b>a \\</b>, <b>i \\</b>, and <b>c \\</b> is treated literally and does not
introduce a comment.
### : label ###
Defines *label* as a branch target for the *b* and *t* commands. A label can
be composed of letters, digits, periods, hyphens, and underscores. A line
selector is not allowed for this command.
### { ###
The left curly bracket begins a command group in which you can nest other
commands. It must be paired with a closing }.
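As an illustration of a command group, the sketch below applies two commands to the lines of one address range only. It assumes a hypothetical `SED` variable that defaults to the system `sed` (set `SED=qsesed` to try QSESED) and a `-n` option equivalent to #QSE_SED_QUIET. The commands are separated by newlines, which is the most portable layout.

```shell
# Assumption: SED is a hypothetical knob; it defaults to the system sed.
SED=${SED:-sed}

# For lines 2 through 3 only: replace x with X, then print.
# Other lines are suppressed because of -n.
grouped=$(printf 'ax\nbx\ncx\ndx\n' | "$SED" -n '2,3{
s/x/X/
p
}')
printf '%s\n' "$grouped"
```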
### q ###
Terminates the execution of commands. Upon termination, it prints the pattern
space if #QSE_SED_QUIET is not set.
### Q ###
Terminates the execution of commands quietly, without printing the pattern
space.
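A common use of *q* is to stop after the first few lines, much like `head`. This is a small sketch that assumes a hypothetical `SED` variable defaulting to the system `sed`; set `SED=qsesed` to try it against QSESED. *Q* is not shown because not every sed implementation provides it.

```shell
# Assumption: SED is a hypothetical knob; it defaults to the system sed.
SED=${SED:-sed}

# Quit at line 2. The pattern space is still printed on termination,
# so the first two lines appear in the output.
first_two=$(printf 'a\nb\nc\nd\n' | "$SED" '2q')
printf '%s\n' "$first_two"
```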
### a \\ \n text ###
Stores *text* into the append buffer, which is printed after the pattern
space for each input line. If #QSE_SED_STRICT is on, an address range is not
allowed for a line selector. If #QSE_SED_SAMELINE is on, the backslash and the
text can be located on the same line without a line break.
### i \\ \n text ###
Stores *text* into the insert buffer, which is printed before the pattern
space for each input line. If #QSE_SED_STRICT is on, an address range is not
allowed for a line selector. If #QSE_SED_SAMELINE is on, the backslash and the
text can be located on the same line without a line break.
### c \\ \n text ###
If a single line is selected for the command (i.e. no line selector, a single
address line selector, or a start~step line selector is specified), it changes
the pattern space to *text* and branches to the end of commands for the line.
If an address range is specified, it deletes the pattern space and branches
to the end of commands for all input lines in the range but the last; for the
last line, it changes the pattern space to *text* and branches to the end of
commands. If #QSE_SED_SAMELINE is on, the backslash and the text can be located
on the same line without a line break.
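The three text commands can be combined in one script. The sketch below uses the traditional layout, with the text on the line after the backslash, so it does not rely on #QSE_SED_SAMELINE. `SED` is a hypothetical variable that defaults to the system `sed`; set `SED=qsesed` to try it against QSESED.

```shell
# Assumption: SED is a hypothetical knob; it defaults to the system sed.
SED=${SED:-sed}

# 1i inserts text before line 1, 2c replaces line 2,
# and $a appends text after the last line.
edited=$(printf 'one\ntwo\nthree\n' | "$SED" '1i\
HEADER
2c\
TWO
$a\
FOOTER')
printf '%s\n' "$edited"
```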
### d ###
Deletes the pattern space and branches to the end of commands.
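For instance, *d* combined with a regular expression address removes every matching line. This is a minimal sketch assuming a hypothetical `SED` variable that defaults to the system `sed`; set `SED=qsesed` to try it against QSESED.

```shell
# Assumption: SED is a hypothetical knob; it defaults to the system sed.
SED=${SED:-sed}

# Delete empty lines; all other lines are printed unchanged.
no_blank=$(printf 'a\n\nb\n\n\nc\n' | "$SED" '/^$/d')
printf '%s\n' "$no_blank"
```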
### D ###
Deletes the first line of the pattern space. If the pattern space is emptied,
it branches to the end of the script. Otherwise, the commands are reapplied
from the first command to the current pattern space.
### = ###
Prints the current line number. If #QSE_SED_STRICT is on, an address range is
not allowed as a line selector.
### p ###
Prints the pattern space.
### P ###
Prints the first line of the pattern space.
### l ###
Prints the pattern space in a visually unambiguous form.
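Among the printing commands, *=* is useful for counting lines or locating a match. The sketch below assumes a hypothetical `SED` variable that defaults to the system `sed` (set `SED=qsesed` to try QSESED) and a `-n` option equivalent to #QSE_SED_QUIET.

```shell
# Assumption: SED is a hypothetical knob; it defaults to the system sed.
SED=${SED:-sed}

# Print the line number of the last line, i.e. the number of input lines.
total=$(printf 'a\nb\nc\n' | "$SED" -n '$=')

# Print the line number of each line matching /b/.
where=$(printf 'a\nb\nc\n' | "$SED" -n '/b/=')

printf 'total=%s where=%s\n' "$total" "$where"
```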
### h ###
Copies the pattern space to the hold space.

### H ###
Appends the pattern space to the hold space.

### g ###
Copies the hold space to the pattern space.

### G ###
Appends the hold space to the pattern space.

### x ###
Exchanges the pattern space and the hold space.
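The hold space commands are easiest to follow in a small script. The sketch below swaps each pair of adjacent lines by using *h*, *n*, and *G*. It assumes a hypothetical `SED` variable that defaults to the system `sed` (set `SED=qsesed` to try QSESED) and a `-n` option equivalent to #QSE_SED_QUIET.

```shell
# Assumption: SED is a hypothetical knob; it defaults to the system sed.
SED=${SED:-sed}

# h  saves the current line to the hold space
# n  reads the next line into the pattern space (nothing is printed under -n)
# G  appends the saved line after it
# p  prints the pair in swapped order
swapped=$(printf 'a\nb\nc\nd\n' | "$SED" -n 'h;n;G;p')
printf '%s\n' "$swapped"
```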
### n ###
Prints the pattern space and reads the next line from the input stream to fill
the pattern space.
### N ###
Reads the next line from the input stream and appends it to the pattern space
with a newline inserted.
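A typical use of *N* is to join lines. The sketch below joins each pair of lines with a space. It assumes a hypothetical `SED` variable that defaults to the system `sed` (set `SED=qsesed` to try QSESED), and that \\n in the *s* command pattern matches the embedded newline, as it does in standard sed.

```shell
# Assumption: SED is a hypothetical knob; it defaults to the system sed.
SED=${SED:-sed}

# $!N appends the next line unless the current line is the last one;
# s/\n/ / then replaces the embedded newline with a space.
joined=$(printf 'a\nb\nc\nd\n' | "$SED" '$!N;s/\n/ /')
printf '%s\n' "$joined"
```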
### b ###
Branches to the end of commands.

### b label ###
Branches to *label*.

### t ###
Branches to the end of commands if a substitution (s//) has been made
successfully since the last reading of an input line or the last *t* command.

### t label ###
Branches to *label* if a substitution (s//) has been made successfully
since the last reading of an input line or the last *t* command.
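A label together with *t* forms a loop that repeats a substitution until it no longer matches. The following sketch collapses runs of spaces one step at a time. The label name `again` is arbitrary, and `SED` is a hypothetical variable that defaults to the system `sed`; set `SED=qsesed` to try it against QSESED.

```shell
# Assumption: SED is a hypothetical knob; it defaults to the system sed.
SED=${SED:-sed}

# Replace the first pair of spaces with a single space, then branch
# back to the label for as long as the substitution keeps succeeding.
squeezed=$(printf 'a    b  c\n' | "$SED" ':again
s/  / /
t again')
printf '%s\n' "$squeezed"
```

The commands are separated by newlines because some sed implementations read everything up to the end of the line as part of the label.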
### r file ###
Reads text from *file* and prints it after printing the pattern space but
before printing the append buffer. Failure to read *file* does not cause an
error.

### R file ###
Reads a line of text from *file* and prints it after printing the pattern space
but before printing the append buffer. Failure to read *file* does not cause
an error.
### w file ###
Writes the pattern space to *file*.

### W file ###
Writes the first line of the pattern space to *file*.
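As a sketch of *w*, the script below copies the matching lines into a separate file while printing nothing to the output stream. The temporary file comes from `mktemp`, `SED` is a hypothetical variable that defaults to the system `sed` (set `SED=qsesed` to try QSESED), and `-n` is assumed to be equivalent to #QSE_SED_QUIET.

```shell
# Assumption: SED is a hypothetical knob; it defaults to the system sed.
SED=${SED:-sed}
tmp=$(mktemp)

# Write every line starting with "keep" to the temporary file.
# The file name runs to the end of the command line.
printf 'keep 1\nskip\nkeep 2\n' | "$SED" -n "/^keep/w $tmp"

saved=$(cat "$tmp")
rm -f "$tmp"
printf '%s\n' "$saved"
```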
### s/rex/repl/opts ###
Finds a substring matching *rex* in the pattern space and replaces it
with *repl*. An ampersand (&) in *repl* refers to the matching substring.
*opts* may be empty; you can combine the following options into *opts*:

- *g* replaces all occurrences of a substring matching *rex*
- *number* replaces the <b>number</b>'th occurrence of a substring matching
  *rex*
- *p* prints the pattern space if a successful replacement was made
- *w* file writes the pattern space to *file* if a successful replacement
  was made. If specified, it should be the last option.
- *k* removes (kills) unmatched portions from the pattern space. It is
  useful for partial extraction.
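The options shared with standard sed can be compared side by side. This sketch assumes a hypothetical `SED` variable that defaults to the system `sed` (set `SED=qsesed` to try QSESED) and a `-n` option equivalent to #QSE_SED_QUIET. The *k* option is specific to QSESED and is therefore not exercised here.

```shell
# Assumption: SED is a hypothetical knob; it defaults to the system sed.
SED=${SED:-sed}

second=$(echo 'a-b-c-d' | "$SED" 's/-/+/2')              # 2nd occurrence only
all=$(echo 'a-b-c-d' | "$SED" 's/-/+/g')                 # every occurrence
wrapped=$(echo 'id 42' | "$SED" 's/[0-9][0-9]*/<&>/')    # & is the matched text
printed=$(printf 'x1\ny\nx2\n' | "$SED" -n 's/x/X/p')    # print only changed lines

printf '%s\n' "$second" "$all" "$wrapped" "$printed"
```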
### y/src/dst/ ###
Replaces all occurrences of characters in *src* with the corresponding
characters in *dst*. *src* and *dst* must contain an equal number of
characters.
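A short sketch of the character mapping, assuming a hypothetical `SED` variable that defaults to the system `sed`; set `SED=qsesed` to try it against QSESED.

```shell
# Assumption: SED is a hypothetical knob; it defaults to the system sed.
SED=${SED:-sed}

# Map e to i and l to p; other characters are left unchanged.
mapped=$(echo 'hello' | "$SED" 'y/el/ip/')
printf '%s\n' "$mapped"
```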
### C/selector/opts ###
Selects characters or fields from the pattern space as specified by the
*selector* and updates the pattern space with the selected text. A selector
is a comma-separated list of specifiers. A specifier is one of the following:

+ *d* specifies the input field delimiter with the next character. e.g. d:
+ *D* specifies the output field delimiter with the next character. e.g. D;
+ *c* specifies a position or a range of characters to select. e.g. c23-25
+ *f* specifies a position or a range of fields to select. e.g. f1,f4-3

*opts* may be empty; you can combine the following options into *opts*:

+ *f* folds consecutive delimiters into one.
+ *w* uses white spaces for a field delimiter regardless of the input
  delimiter specified in the selector.
+ *d* deletes the pattern space if the line is not delimited by
  the input field delimiter.

In principle, the *C* command can replace the *cut* utility.

Examples
--------
Here are some samples.
### G;G;G ###
Inserts three blank lines after each input line. If #QSE_SED_QUIET is on, use
<b>G;G;G;p</b>. It works because the hold space is empty unless something is
copied to it.
### $!d ###
Prints the last line. If #QSE_SED_QUIET is on, try <b>$p</b>.
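Both variants can be checked side by side. This is a sketch under assumptions: `SED` is a hypothetical variable that defaults to the system `sed` (set `SED=qsesed` to try QSESED), and the `-n` option is assumed to turn on #QSE_SED_QUIET.

```shell
# Assumption: SED is a hypothetical knob; it defaults to the system sed.
SED=${SED:-sed}

# Delete every line except the last one.
last=$(printf 'a\nb\nc\n' | "$SED" '$!d')

# Quiet mode: print only the last line explicitly.
last_quiet=$(printf 'a\nb\nc\n' | "$SED" -n '$p')

printf '%s %s\n' "$last" "$last_quiet"
```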
### 1!G;h;$!d ###
Prints input lines in the reverse order. That is, it prints the last line
first and the first line last.

    $ echo -e "a\nb\nc" | qsesed '1!G;h;$!d'
    c
    b
    a

### s/[[:space:]]{2,}/ /g ###
Compacts each run of two or more whitespace characters into a single space if
#QSE_SED_EXTENDEDREX is on.
### s/[0-9]+/&/gk ###
Extracts all digits.

    $ echo "Q123Q456" | qsesed -r 's/[0-9]+/&/gk'
    123456

### s/[0-9]+/&/2k ###
Extracts the second number.

    $ echo "Q123Q456" | qsesed -r 's/[0-9]+/&/2k'
    456

### C/d:,f3,1/ ###
Prints the third field and the first field from colon-separated text.

    $ head -5 /etc/passwd
    root:x:0:0:root:/root:/bin/bash
    daemon:x:1:1:daemon:/usr/sbin:/bin/sh
    bin:x:2:2:bin:/bin:/bin/sh
    sys:x:3:3:sys:/dev:/bin/sh
    sync:x:4:65534:sync:/bin:/bin/sync
    $ qsesed '1,3C/d:,f3,1/;4,$d' /etc/passwd
    0 root
    1 daemon
    2 bin