home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
The Datafile PD-CD 2
/
DATAFILE_PDCD2.iso
/
utilities3
/
gnu_sed_rx
/
README
< prev
next >
Wrap
Text File
|
1993-08-06
|
2KB
|
67 lines
This directory contains an experimental release of GNU sed and GNU rx.
See the file INSTALL for compilation and installation instructions.
* The reason for this release
Older versions of GNU sed have used GNU regex, a featurful but slow
text pattern matcher. GNU rx is a reimplementation of the features in
GNU regex. Rx is much faster than its predecessor for many
interesting patterns. In (almost) no case should it be slower.
See the file ABOUT.RX for more information...about Rx.
* The status of GNU sed.
It has long been noted that GNU sed is much slower than other
implementations. Profiling of reported performance problems revealed
that the GNU regex was the first order bottle-neck.
Now that that has been fixed, GNU sed is still somewhat slower than
other implementations. Replacing GNU regex with Rx revealed a
second-order bottle-neck in the i/o habits of GNU sed (it reads input
files using `getc').
So, there will follow a new version of GNU sed with even more
performance improvements.
ABOUT BUGS
Before reporting a bug, please check the list of oft-reported non-bugs
(below).
Bugs and comments may be sent to bug-gnu-utils@prep.ai.mit.edu.
NONBUGS
* `sed -n' and `s/regex/replace/p'
Some versions of sed ignore the `p' (print) option of an `s' command
unless the `-n' command switch has been specified. Other versions
always honor the `p' option. GNU sed is the latter sort.
* regexp syntax clashes
GNU sed uses the Posix basic regular expression syntax. According to
the standard, the meaning of some escape sequences is undefined in
this syntax; notably `\|' and `\+'.
As in all GNU programs that use Posix basic regular expressions, sed
interprets these escape sequences as meta-characters. So, `x\+'
matches one or more occurences of `x'. `abc\|def' matches either
`abc' or `def'.
This syntax may cause problems when running scripts written for other
seds. Some sed programs have been written with the assumption that
`\|' and `\+' match the literal characters `|' and `+'. Such scripts
must be modified by removing the spurious backslashes if they are to
be used with GNU sed.
[If you have need of a free sed that understands the regexp
syntax of your choice, the source to GNU sed may be a good place to
start. Consider changing the call to re_set_syntax in function main
in `sed.c'. The file regex.h contains an explanation of the
supported syntax options.]