home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Fresh Fish 8
/
FreshFishVol8-CD2.bin
/
bbs
/
gnu
/
libg++-2.6.2.lha
/
libg++-2.6.2
/
librx
/
DOC
< prev
next >
Wrap
Text File
|
1994-06-03
|
4KB
|
110 lines
Most of the interfaces are covered by the documentation for GNU regex.
This file will accumulate factoids about other interfaces until
somebody writes a manual.
* RX_SEARCH
For an example of how to use rx_search, you can look at how
re_search_2 is defined (in rx.c). Basicly you need to define three
functions. These are GET_BURST, FETCH_CHAR, and BACK_REF. They each
operate on `struct rx_string_position' and a closure of your design
passed as void *.
struct rx_string_position
{
const unsigned char * pos; /* The current pos. */
const unsigned char * string; /* The current string burst. */
const unsigned char * end; /* First invalid position >= POS. */
int offset; /* Integer address of current burst */
int size; /* Current string's size. */
int search_direction; /* 1 or -1 */
int search_end; /* First position to not try. */
};
On entry to GET_BURST, all these fields are set, but POS may be >=
END. In fact, STRING and END might both be 0.
The function of GET_BURST is to make all the fields valid without
changing the logical position in the string. SEARCH_DIRECTION is a
hint about which way the matcher will move next. It is usually 1, and
is -1 only when fastmapping during a reverse search. SEARCH_END
terminates the burst.
typedef enum rx_get_burst_return
(*rx_get_burst_fn) (struct rx_string_position * pos,
void * app_closure,
int stop);
The closure is whatever you pass to rx_search. STOP is an argument to
rx_search that bounds the search. You should never return a string
position from with SEARCH_END set beyond the position indicated by
STOP.
enum rx_get_burst_return
{
rx_get_burst_continuation,
rx_get_burst_error,
rx_get_burst_ok,
rx_get_burst_no_more
};
Those are the possible return values of get_burst. Normally, you only
ever care about the last two. An error return indicates something
like trouble reading a file. A continuation return means suspend the
search and resume by retrying GET_BURST if the search is restarted.
GET_BURST is not quite as trivial as you might hope. If you have a
fragmented string, you really have to keep two adjacent fragments at
all times, even though the GET_BURST interface looks like you only
need one. This is because of operators like `word-boundry' that try
to look at two adjacent characters. Such operators are implemented
with FETCH_CHAR.
typedef int (*rx_fetch_char_fn) (struct rx_string_position * pos,
int offset,
void * app_closure,
int stop);
That takes the same closure passed to GET_BURST. It returns the
character at POS or at one past POS according to whether OFFSET is 0
or 1.
It is guaranteed that POS + OFFSET is within the string being searched.
The last function compares characters at one position with characters
previously matched by a parenthesized subexpression.
enum rx_back_check_return
{
rx_back_check_continuation,
rx_back_check_error,
rx_back_check_pass,
rx_back_check_fail
};
typedef enum rx_back_check_return
(*rx_back_check_fn) (struct rx_string_position * pos,
int lparen,
int rparen,
unsigned char * translate,
void * app_closure,
int stop);
LPAREN and RPAREN are the integer indexes of the previously matched
characters. The comparison should translate both characters being
compared by mapping them through TRANSLATE. POS is the point at which
to begin comparing. It should be advanced to the last character
matched during backreferencing.