Text File | 1993-05-30 | 4KB | 97 lines
\ PARATAG.S
\ This table tags the end of paragraphs in uncoded ASCII text.
\ It would be a good idea to run COMPRESS.S on your file before running
\ this table on it. That will make it easier for this table to identify
\ paragraphs.
\ Throughout these equations, flag 0 is turned ON if the type of line ending
\ has not been resolved, and turned OFF if it has. This allows other
\ equations to do further checking as to what kind of line ending should be
\ attached.
\ *************************************************************************
\ The first two equations handle the case where a line is only up to 40
\ characters long. We want to tag such lines with an end of paragraph.
\v\^*(39)\v\0d\0a*00=\p0\^1<EP>\0d\0a*00
\v\^*(39)\v\0d\0a*01=<EP>\0d\0a\p0\^1<EP>\0d\0a*00
\ In the above two equations, the first character is a printable wild card
\ code because search equations can't begin with a variable-length string.
\ Technically, then, these equations will match only lines that have at
\ least 1 printable character in them (a space, a letter, a number, etc.).
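\ As a rough illustration only (Python, not table syntax), the short-line
\ rule can be approximated with a regular expression. The name
\ tag_short_lines, the ASCII range 0x20-0x7E for "printable", and the
\ anchoring at the start of a line are all assumptions made for this sketch.
\ The flag 0 handling is left out.
```python
import re

# One printable character followed by up to 39 more, then CRLF:
# an approximation of \v\^*(39)\v\0d\0a under the assumptions above.
SHORT_LINE = re.compile(r"^([\x20-\x7e]{1,40})\r\n", re.MULTILINE)


def tag_short_lines(text: str) -> str:
    """Append <EP> to every line of 1 to 40 printable characters."""
    return SHORT_LINE.sub(lambda m: m.group(1) + "<EP>\r\n", text)
```
\ Like the equations, the sketch leaves an empty line (a bare CRLF) alone,
\ because at least one printable character is required.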
\ *************************************************************************
\ The next two equations handle lines longer than 40 characters, up to
\ about 80. Such lines are left untagged and flag 0 is turned ON to indicate
\ that the line ending has not been resolved. If an unresolved line is
\ followed by another line of up to 80 characters, the preceding line is
\ resolved with an end of paragraph.
*(40)\v\^*(40)\v\0d\0a*00=*(40)\p0\^1*01
*(40)\v\^*(40)\v\0d\0a*01=<EP>\0d\0a*(40)\p0\^1*01
\ Notice in the above two equations that we are checking for exactly 40
\ printable characters before we check for up to 40 more. This is because an
\ equation that reads \^*(80)\v will match the same lines as \^*(40)\v for
\ lines under 40 characters! Why? Because the variable-length code \^ means
\ "UP TO (xx) characters", and a 30-character line is certainly less than 80
\ characters just as it is less than 40 characters, hence there's no
\ distinction between the two for short lines.
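\ The same overlap can be seen with ordinary regular expressions. This is a
\ Python analogy, not table syntax; the three patterns below are stand-ins
\ chosen for this sketch, with "printable" assumed to mean ASCII 0x20-0x7E.
```python
import re

UP_TO_40 = re.compile(r"[\x20-\x7e]{1,40}")
UP_TO_80 = re.compile(r"[\x20-\x7e]{1,80}")
# Forty required characters first, then at least one more: long lines only.
OVER_40 = re.compile(r"[\x20-\x7e]{40}[\x20-\x7e]{1,40}")

short = "x" * 30
long_line = "x" * 60

# Both "up to" patterns accept the 30-character line, so they cannot
# tell a short line from a long one.
both_match_short = bool(UP_TO_40.fullmatch(short)) and bool(UP_TO_80.fullmatch(short))

# Requiring 40 characters up front excludes the short line.
over_40_rejects_short = OVER_40.fullmatch(short) is None
over_40_accepts_long = OVER_40.fullmatch(long_line) is not None
```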
\ *************************************************************************
\ The next equations handle the case where a long line is followed by a lower
\ case letter. Such lines are continued as a paragraph, unless a period or
\ closing parenthesis follows the letter, in which case the letter is assumed
\ to be the beginning of an outline-style point such as a) or a.
\y*01= \p0*00 \ an unresolved line followed by a lower case letter
\0d\0a\y= \p2 \ a carriage return followed by a lower case letter
\y. *01=<EP>\0d\0a\p0\p1\p2*00 \ unless it looks like an outline
\0d\0a\y. =<EP>\0d\0a\p2\p3\p4
\y) *01=<EP>\0d\0a\p0\p1\p2*00
\0d\0a\y) =<EP>\0d\0a\p2\p3\p4
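\ A minimal Python sketch of the decision these equations describe for a
\ line that starts with a lower case letter. The function name and its
\ return strings are hypothetical, and the sketch looks only at the first
\ three characters of the following line.
```python
def join_for_lowercase(next_line: str) -> str:
    """Return the text placed between the previous line and next_line.

    Only covers next lines that start with a lower case letter.
    """
    if not next_line[:1].islower():
        raise ValueError("sketch only handles lower case starts")
    # "a. " or "a) " looks like an outline point: start a new paragraph.
    if next_line[1:3] in (". ", ") "):
        return "<EP>\r\n"
    # Otherwise the sentence continues: join with a single space.
    return " "
```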
\ *************************************************************************
\ The following equations handle lines that begin with an upper case or
\ numbered outline point, such as A. or 1). Such lines are new paragraphs,
\ so the preceding line must be tagged.
\n. *01=<EP>\0d\0a\p0\p1\p2*00
\u. *01=<EP>\0d\0a\p0\p1\p2*00
\0d\0a\n. =<EP>\0d\0a\p2\p3\p4
\0d\0a\u. =<EP>\0d\0a\p2\p3\p4
\n) *01=<EP>\0d\0a\p0\p1\p2*00
\u) *01=<EP>\0d\0a\p0\p1\p2*00
\0d\0a\n) =<EP>\0d\0a\p2\p3\p4
\0d\0a\u) =<EP>\0d\0a\p2\p3\p4
\ A run of capital characters followed by a colon is a new paragraph.
\u\^*(20)\x: *01=<EP>\0d\0a\p0\^1: *00
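\ The outline-point test in this section can be pictured with a small Python
\ check. It is an approximation under stated assumptions: \u is taken to
\ mean one ASCII capital letter and \n one digit, and starts_outline_point
\ is a name invented for this sketch. The capitals-and-colon rule is not
\ covered.
```python
import re

# One capital letter or digit, then "." or ")", then a space.
OUTLINE_POINT = re.compile(r"[A-Z0-9][.)] ")


def starts_outline_point(line: str) -> bool:
    """True if the line opens with a one-character outline marker."""
    return OUTLINE_POINT.match(line) is not None
```
\ As in the equations, only a single leading character is examined, so a
\ two-digit marker such as 12. is not recognized by this sketch.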
\ *************************************************************************
\ These equations tag an unresolved line followed by any printable character
\ (except a lower case letter, which is handled separately above) with an end
\ of paragraph. You can change the meaning of the equations by removing the
\ <EP> codes and replacing them with a word space. This would make all
\ unresolved lines default to a continuation.
\v*01=<EP>\0d\0a\p0*00
\u*01=<EP>\0d\0a\p0*00 \ *** CHANGE THESE TWO IF YOU ARE GETTING
\n*01=<EP>\0d\0a\p0*00 \ TOO MANY MISMARKED PARAGRAPH ENDINGS ***
\ *************************************************************************
\ And finally, a CRLF by itself is always a paragraph end.
\0d\0a=<EP>\0d\0a*00