Regular Expressions
Regular expression syntax for ME.
[1] char Matches itself, unless it is a special character
(meta-character):
. [ ] * + ^ $
If case-fold-search is TRUE, char will match both upper and
lower case.
[2] . Matches any character.
[3] \ Matches the character following it, except when followed by
one of: ()1234567890<> adnwW (see [7] - [15]). It is used as an
escape character for all other meta-characters, and itself.
When used in a set ([4]), it is treated as an ordinary
character.
[4] [set] Matches one of the characters in the set. If the first
character in the set is ^, it matches a character NOT in the
set. A shorthand S-E is used to specify a set of characters S
up to E, inclusive. Note that case-fold-search has no effect
on sets.
To include - in a set: [-...], [^-...] or [...-]
To include ] in a set: []...], [^]...] or [^-]...]
    examples       matches
    [a-z]          Any lowercase alpha
    [^]-]          Any char except ] and -
    [^A-Z]         Any char except uppercase alpha
    [a-zA-Z0-9]    Any alphanumeric
[5] * Any regular expression form [1] to [4], followed by the closure
character (*), matches zero or more occurrences of that form.
[6] + Same as [5], except it matches one or more.
[7] \( A regular expression in the form [1] to [10], enclosed as
\(form\) matches what form matches. The enclosure creates a
set of tags, used for [8] and for pattern substitution. The
tagged forms are numbered starting from 1.
[8] \1 ... \9 A \ followed by a digit 1 to 9 matches whatever a
previously tagged regular expression ([7]) matched.
[9] \< Matches the beginning of a word.
\> Matches the end of a word.
See (modify-syntax-entry) for what a word is.
[10] \a Matches an alpha character (same as [a-zA-Z]).
[11] \d Matches a digit (same as [0-9]).
[12] \n Matches an alphanumeric character: [a-zA-Z0-9]
[13] \<blank> Matches whitespace.
[14] \w Matches a word character (as defined by the syntax tables).
[15] \W Matches a non-word character (as defined by the syntax
tables).
[16] A composite regular expression xy where x and y are in the form
of [1] to [10] matches the longest match of x followed by a
match for y.
[17] ^ $ A regular expression starting with a ^ character and/or
ending with a $ character is anchored to the beginning of the
line and/or the end of the line, respectively. Elsewhere in
the pattern, ^ and $ are treated as ordinary characters.
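For readers more used to a modern regex engine, the rules above can be
sketched as a translation into Python's re syntax. This is only an
illustration, not ME's code: me_to_python is a hypothetical helper, the
whitespace class is assumed to be space and tab, and "word" is assumed to
mean Python's \w rather than ME's syntax tables. case-fold-search is not
modelled.

```python
import re

# Assumed equivalents for ME's backslash forms. \w, \W, \< and \> really
# depend on ME's syntax tables; the whitespace set is a guess.
CLASSES = {"a": "[a-zA-Z]", "d": "[0-9]", "n": "[a-zA-Z0-9]", " ": "[ \\t]",
           "w": r"\w", "W": r"\W",
           "<": r"\b(?=\w)", ">": r"\b(?<=\w)",
           "(": "(", ")": ")"}


def me_to_python(pattern):
    """Translate an ME-style pattern into Python re syntax (hypothetical helper)."""
    out = []
    i, n = 0, len(pattern)
    while i < n:
        c = pattern[i]
        if c == "\\" and i + 1 < n:
            nxt = pattern[i + 1]
            if nxt in CLASSES:
                out.append(CLASSES[nxt])
            elif nxt in "123456789":
                # Wrapped so that a digit following the reference stays literal.
                out.append("(?:\\" + nxt + ")")
            else:
                out.append(re.escape(nxt))      # escaped meta-character or \\
            i += 2
        elif c == "[":
            j = i + 1
            body = "["
            if j < n and pattern[j] == "^":     # negated set
                body += "^"
                j += 1
            if j < n and pattern[j] == "]":     # a leading ] is an ordinary member
                body += "\\]"
                j += 1
            while j < n and pattern[j] != "]":
                ch = pattern[j]
                # \ is ordinary inside an ME set, so it must be escaped for Python.
                body += "\\" + ch if ch in "\\[" else ch
                j += 1
            out.append(body + "]")
            i = j + 1
        elif c in ".*+" or (c == "^" and i == 0) or (c == "$" and i == n - 1):
            out.append(c)                       # meta-characters shared by both syntaxes
            i += 1
        else:
            out.append(re.escape(c))            # ordinary character, including mid-pattern ^ and $
            i += 1
    return "".join(out)
```

With these assumptions, me_to_python(r"\(foo\)[1-3]\1") gives
(foo)[1-3](?:\1), which Python matches against foo2foo.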
RE substitutions
In the replace string, the following characters have special meaning:
[1] & Substitute the entire matched string in the destination.
[2] \n Substitute the substring matched by a tagged subpattern
numbered n, where n is from 1 to 9, inclusive.
[3] \char Treat the next character literally, unless the character
is a digit (see [2]). All other text is inserted verbatim.
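The three replace-string rules can be sketched in Python as follows. This
is an illustration under assumptions, not ME's implementation:
expand_replacement and me_replace are hypothetical names, and the search
pattern is given in Python's re syntax rather than ME's.

```python
import re


def expand_replacement(template, match):
    """Expand an ME-style replace string against a match (hypothetical helper)."""
    out = []
    i = 0
    while i < len(template):
        c = template[i]
        if c == "&":
            out.append(match.group(0))          # rule [1]: the entire matched string
        elif c == "\\" and i + 1 < len(template):
            i += 1
            nxt = template[i]
            if nxt in "123456789":
                # Rule [2]: tagged subpattern n. Raises IndexError if the
                # pattern has no such tag.
                out.append(match.group(int(nxt)) or "")
            else:
                out.append(nxt)                 # rule [3]: next character taken literally
        else:
            out.append(c)                       # everything else is inserted verbatim
        i += 1
    return "".join(out)


def me_replace(pattern, template, line):
    """Replace every match of pattern in line, using ME-style replace rules."""
    return re.sub(pattern, lambda m: expand_replacement(template, m), line)
```

For example, me_replace(r"(\w+)@(\w+)", r"\2 at \1 [&] \&", "user@host")
returns "host at user [user@host] &".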
EXAMPLES
foo*.* matches: fo foo fooo foobar fobar foxx ...
fo[ob]a[rz] matches: fobar fooar fobaz fooaz
foo\\+ matches: foo\ foo\\ foo\\\ ...
\(foo\)[1-3]\1 (same as foo[1-3]foo, but takes less internal space)
matches: foo1foo foo2foo foo3foo
\(fo.*\)-\1 matches: foo-foo fo-fo fob-fob foobar-foobar ...
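The examples above can be checked against a modern engine. The sketch
below hand-translates each ME pattern into Python's re syntax (the only
change needed here is \( \) becoming ( )) and tests the listed strings as
whole-string matches. EXAMPLES and check_examples are names made up for
this illustration, and Python's matcher is assumed to agree with ME on
these simple cases.

```python
import re

# (ME pattern from the examples, hand-translated Python pattern, listed matches)
EXAMPLES = [
    (r"foo*.*", r"foo*.*", ["fo", "foo", "fooo", "foobar", "fobar", "foxx"]),
    (r"fo[ob]a[rz]", r"fo[ob]a[rz]", ["fobar", "fooar", "fobaz", "fooaz"]),
    (r"foo\\+", r"foo\\+", ["foo\\", "foo\\\\", "foo\\\\\\"]),
    (r"\(foo\)[1-3]\1", r"(foo)[1-3]\1", ["foo1foo", "foo2foo", "foo3foo"]),
    (r"\(fo.*\)-\1", r"(fo.*)-\1", ["foo-foo", "fo-fo", "fob-fob", "foobar-foobar"]),
]


def check_examples():
    """Return the (ME pattern, string) pairs that do not match in full."""
    failures = []
    for me_pat, py_pat, samples in EXAMPLES:
        for s in samples:
            if re.fullmatch(py_pat, s) is None:
                failures.append((me_pat, s))
    return failures
```

An empty list from check_examples() means every listed string matched its
pattern.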
DIAGNOSTICS
No previous regular expression, Empty closure, Illegal closure,
Cyclical reference, Undetermined reference, Unmatched (, Missing
], Null pattern inside \(\), Null pattern inside \<\>, Too many
\(\) pairs, Unmatched \).
AUTHOR: Ozan S. Yigit (oz)
MODIFIER: Craig Durland
BUGS
The internal storage for the compiled regular expression is not
checked for overflows. Currently, it is 512 bytes. If your RE's
are not much longer than 80 characters, you will not have any
problems.
A pattern will not cross lines.
If a line of the file, or a line being replaced, is longer than 256
characters, ME will go down in flames.
[8] only works if the referenced tagged RE is made of constants (i.e.
no RE's inside the tag), and case matters no matter what
case-fold-search is set to. Yes, pretty worthless and should be fixed.
Others, no doubt.