.lm 2
.rm 77
.ce 1
LIT TEXT UTILITY MANUAL
.ce 2
Version 2.0, 11/19/86
Copyright (C) 1986 Donald J. Irving
Lit is a command line text utility which
filters a text file to stdout, printing printable characters as
they are
and showing every non-printable character in
one or more of three representation formats.
The only character interpreted (acted upon) by lit is the line feed
character, which causes lit to issue a line feed.
The inspiration for lit came from the "l" command in many of the
UNIX line editors. Lit is not quite the same as any of
these, however. For one thing, lit output is never ambiguous.
Here is an example of what lit does:
.nf
Say the file 'myfile' consists of the following ascii characters:
HT, HT, h, e, l, l, o, space, w, o, r, l, d, BEL, LF
Saying 'lit myfile' would produce the following output:
\t\thello world\007\n
And saying 'lit myfile [various options]' might produce any of:
\t\thello world^G\n
^I^Ihello world^G^J
\011\011hello world\007\012
\09\09hello world\07\0A
\009\009hello world\007\010
.fi
.ne 13
You control the output with optional command line arguments which provide:
.nf
1. The name of the file to read as input.
2. What subset of the file lines to print.
3. In which format(s) to represent non-printable characters.
4. Which number base to use for numeric representations.
If you do not supply these, they default (in the original version) to:
1. Stdin.
2. The whole file.
3. Backslash constructs if possible else numeric representations.
4. Octal.
.fi
Here is the command line template. The arguments may be specified in any
order. The -bcanohd options may be stacked after one minus sign, or they
may appear as separate arguments.
lit [<filename>] [-s<linenum>] [-p<numlines>] [-[bcan][ohd]]
.ne 5
.ce 1
THE NAME OF THE INPUT FILE
The first command line argument
encountered which does not start with a minus
sign is considered to be the input file name. Any subsequent
command line argument which does not start with a minus sign is
considered to be an error.
If no command line argument is found which does not start with a minus
sign, lit uses <stdin> for input.
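As a rough illustration of the rule above, the following C sketch scans
the argument list for the input file name. It is not taken from lit's
source; the function name find_input_name and its return convention are
assumptions made for this example.

.nf
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not lit's actual code.
 * Scans argv for the input file name.
 * Returns 0 on success and stores the name in *name (NULL means stdin);
 * returns -1 if a second argument without a leading minus is found. */
int find_input_name(int argc, char **argv, const char **name)
{
    int i;

    assert(name != NULL);
    *name = NULL;                  /* default: read stdin */
    for (i = 1; i < argc; i++) {
        if (argv[i][0] == '-')
            continue;              /* an option, handled elsewhere */
        if (*name != NULL)
            return -1;             /* second file name: an error */
        *name = argv[i];
    }
    return 0;
}
```
.fi

A caller would treat a NULL name as a request to read stdin and a return
value of -1 as a usage error.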
.ne 10
.ce 1
PRINTING A SUBSET OF LINES OF THE FILE
Lit prints the whole file by default. You can tell it on which line
in the file to start printing and/or how many lines to print
by supplying either or both of these command line arguments:
.nf
-s<linenum>     lit will start printing at line <linenum>
-p<numlines>    lit will print <numlines> lines
.fi
There is no space between the 's' or 'p' and the number. There
is no validity checking on the number values.
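The selection rule can be sketched in C as follows. This is an
illustration only, not lit's actual code: the helper names are
hypothetical, line numbers are assumed to start at 1, and a count of
zero or less is assumed to mean "no limit". The sketch reads the number
with atoi(), which performs no validity checking of its own.

.nf
```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: parse the number glued to -s or -p,
 * for example "-s12" gives 12. No validity checking is done. */
int option_number(const char *arg)
{
    assert(arg[0] == '-' && (arg[1] == 's' || arg[1] == 'p'));
    return atoi(arg + 2);
}

/* Hypothetical helper: decide whether line number lineno is printed.
 * Assumptions: lines are numbered from 1; count <= 0 means no limit. */
int line_selected(long lineno, long start, long count)
{
    if (lineno < start)
        return 0;                  /* before the starting line */
    if (count > 0 && lineno >= start + count)
        return 0;                  /* past the requested number of lines */
    return 1;
}
```
.fi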
.ne 4
.ce 1
FORMATS FOR REPRESENTING NON-PRINTABLE CHARACTERS
There are three formats in which non-printable characters may be
represented: C Language style backslash representations such as \n,
control character representations such as ^J, and numeric value
representations such as \012.
.ne 12
C Language Backslash Representations
The form is a backslash followed by a lower case letter. Here is the list
of the applicable characters:
.nf
line feed           \n
horizontal tab      \t
backspace           \b
carriage return     \r
form feed           \f
.fi
The ascii NUL character representation \0 is omitted.
NUL is represented by its control
character representation or as a numeric value.
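The table above maps naturally onto a small lookup function. The sketch
below is an assumption about how such a lookup might be written in C,
not lit's own source; the name backslash_letter is hypothetical.

.nf
```c
/* Hypothetical helper: returns the letter that follows the backslash
 * for the five characters listed in the manual, or 0 if the character
 * has no backslash representation. NUL deliberately has no entry. */
char backslash_letter(unsigned char c)
{
    switch (c) {
    case '\n': return 'n';   /* line feed */
    case '\t': return 't';   /* horizontal tab */
    case '\b': return 'b';   /* backspace */
    case '\r': return 'r';   /* carriage return */
    case '\f': return 'f';   /* form feed */
    default:   return 0;     /* no backslash representation */
    }
}
```
.fi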
.ne 4
Control Character Representations
The form is a caret followed by another symbol, where the second symbol
is the key that, typed together with Control, produces the character
being represented.
For example, the ascii line feed character is represented as ^J.
The ascii character DEL has an arbitrarily assigned
representation of ^?.
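One compact way to compute the second symbol, shown here as a sketch
rather than as lit's actual method, is to flip the bit worth 64 in the
character's value. The function name control_symbol is hypothetical.

.nf
```c
/* Hypothetical helper: returns the symbol that follows the caret,
 * or 0 if c has no control character representation.
 * Flipping the bit worth 64 maps the values 0..31 onto '@'..'_'
 * and DEL (127) onto '?'. */
char control_symbol(unsigned char c)
{
    if (c < 32 || c == 127)
        return (char)(c ^ 0x40);
    return 0;
}
```
.fi

With this rule line feed (10) yields J, BEL (7) yields G, and DEL
yields ?.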
.ne 4
ASCII Numeric Value Representations
The representation is in the form \num where num is the character's
numeric value (the unsigned integer value of its eight bits)
displayed
in any of the three number bases octal, decimal, or hexadecimal.
For octal representations,
num is exactly three octal digits; for
hex representations, num is exactly two hexadecimal digits; and for
decimal representations, num is exactly three decimal digits. Num is
zero-padded
on the left if necessary to make up the required number of digits.
For example, the ESC char is represented as \033, \027, or \1B in octal,
decimal, and hex respectively. NUL would be \000, \000, or \00.
This format is not limited to ascii characters; any eight bits can be
represented. Numbers of \200 (octal), \128 (decimal), \80 (hex), or
greater are byte values beyond the upper end of the ascii character set.
The largest byte value (all bits on) is \377 (octal), \255 (decimal), or
\FF (hex).
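The fixed widths described above correspond to zero-padded printf
conversions. The following C sketch assumes a hypothetical helper named
numeric_rep and uses snprintf, which may differ from what lit's source
actually does.

.nf
```c
#include <stdio.h>

/* Hypothetical helper: writes the numeric representation of byte c
 * into buf, which must hold at least 5 bytes.
 * base is 'o', 'd', or 'h', mirroring the -o, -d, and -h options;
 * any other value falls back to octal. */
void numeric_rep(unsigned char c, char base, char *buf, size_t size)
{
    switch (base) {
    case 'h': snprintf(buf, size, "\\%02X", (unsigned)c); break;
    case 'd': snprintf(buf, size, "\\%03u", (unsigned)c); break;
    default:  snprintf(buf, size, "\\%03o", (unsigned)c); break;
    }
}
```
.fi

For ESC (value 27) this produces \033, \027, and \1B for the three
bases, matching the example above.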
.ne 4
.ce 1
COMMAND LINE ARGUMENTS FOR SELECTING REPRESENTATION FORMATS
You tell lit which representation format or combination of formats to use
for non-printable characters by supplying one of the command line
arguments -b, -c, -a, or -n. If you supply none of
these, then -b is selected by default.
If you supply more than one, the last one given takes effect.
.ne 11
.nf
-b    use backslash representations such as \n
      if possible, else use numeric representations.
-c    use control char representations such as ^J
      if possible, else use numeric representations.
-a    all; use backslash reps if possible, else use control
      char reps if possible, else use numeric representations.
-n    use numeric representations only.
.fi
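The fallback order of the four options can be summarized in a few lines
of C. This is a sketch under stated assumptions, not lit's source: the
enum, the helper names, and the use of a single format letter are all
invented for the illustration. It is meant to be called for
non-printable bytes only.

.nf
```c
/* Hypothetical names for the three representation formats. */
enum rep_kind { REP_BACKSLASH, REP_CONTROL, REP_NUMERIC };

/* True for the five characters that have a backslash letter. */
static int has_backslash(unsigned char c)
{
    return c == '\n' || c == '\t' || c == '\b' || c == '\r' || c == '\f';
}

/* True for the characters that have a caret representation. */
static int has_control(unsigned char c)
{
    return c < 32 || c == 127;
}

/* format is 'b', 'c', 'a', or 'n', mirroring the options above. */
enum rep_kind choose_rep(unsigned char c, char format)
{
    if ((format == 'b' || format == 'a') && has_backslash(c))
        return REP_BACKSLASH;
    if ((format == 'c' || format == 'a') && has_control(c))
        return REP_CONTROL;
    return REP_NUMERIC;            /* always possible for any byte */
}
```
.fi

Under -a, for instance, line feed takes the backslash form while BEL,
which has no backslash letter, falls through to the control form.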
.ne 4
You tell lit which number base to use for numeric representations by
providing one of the command line
arguments -o, -h, or -d. If you supply none of these, then -o is selected
by default.
If you supply more than one, the last one given takes effect.
.nf
.ne 3
-o    octal
-h    hexadecimal
-d    decimal
.fi
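Stacked options and the "last one wins" rule could be handled as in the
C sketch below. The struct lit_opts and the function apply_flags are
hypothetical names chosen for this example, and the -s and -p arguments
are assumed to be handled separately.

.nf
```c
#include <assert.h>
#include <string.h>

/* Hypothetical settings record; the field names are illustrative. */
struct lit_opts {
    char format;   /* 'b', 'c', 'a', or 'n' */
    char base;     /* 'o', 'h', or 'd' */
};

/* Apply one option argument such as "-c" or "-ah".
 * Later letters overwrite earlier ones, so the last one takes effect.
 * Returns 0 on success, -1 on an unrecognized letter. */
int apply_flags(struct lit_opts *o, const char *arg)
{
    const char *p;

    assert(arg[0] == '-');
    for (p = arg + 1; *p != '\0'; p++) {
        if (strchr("bcan", *p) != NULL)
            o->format = *p;
        else if (strchr("ohd", *p) != NULL)
            o->base = *p;
        else
            return -1;
    }
    return 0;
}
```
.fi

Starting from the defaults (format 'b', base 'o'), applying "-ch"
selects control character representations with hexadecimal as the
number base.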
.ne 10
.ce 1
EXCEPTIONAL CHARACTERS
Two characters have special meaning in lit output. The backslash
character \ always has special meaning. The caret character ^ has
special meaning whenever control character representations are
enabled.
.ne 4
The Backslash Character \
As already described, the \ character in lit output signals
the beginning of either a special letter representation such as \n or a
numeric representation such as \012. The \ is also used to
relieve a subsequent \ or ^ of its special meaning.
\\ represents the actual character \, and
(when control character representations
are enabled) \^ represents the actual character ^.
.ne 4
The Caret Character ^
When control character
representations are enabled, a ^ signals the beginning of a control
character representation such as ^J. Note the implication therefore
that ^^ means Control caret (ascii RS), and ^\ means Control
backslash (ascii FS). In both of these cases the second character is
relieved of its special meaning because it is part of the control
character representation.
If control character representations are not enabled, then ^ is
just another printable character.
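The escaping rules for these two printable characters can be sketched
in C as follows. The function escape_printable is a hypothetical helper
written for this illustration; control_enabled stands for "the -c or -a
option is in effect".

.nf
```c
#include <stddef.h>

/* Hypothetical helper: writes the output form of printable character c
 * into buf (at least 3 bytes) and returns the number of characters
 * written. A backslash is always escaped; a caret is escaped only when
 * control character representations are enabled. */
size_t escape_printable(char c, int control_enabled, char *buf)
{
    size_t n = 0;

    if (c == '\\' || (c == '^' && control_enabled))
        buf[n++] = '\\';           /* relieve c of its special meaning */
    buf[n++] = c;
    buf[n] = '\0';
    return n;
}
```
.fi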
.ne 4
.ce 1
CONCLUSION
Lit fills the gap between text editors, which usually interpret special
characters in special ways, and hex dump utilities, which make terrible
reading for text files.
One of lit's greatest strengths is that it interprets nothing
but the linefeed character; everything else is just represented to the
output stream.
Although lit provides a variety of output formats, perhaps its main
usefulness is in quickly locating U.F.O.s (Unidentified File Objects)
that have gotten into your text files,
like that ESC char that's weirding out your printer.
For this purpose, the default options are adequate, and, for C
programmers at least, already familiar.
.ne 7
.nf
Donald J. Irving
9812 Gardenwood Way
Sacramento, CA 95827
(916) 366-3225
CIS: 73547,1335
PLINK: ops158
.fi
.bp
Post scripts:
**
One convenient way of getting to know lit is to use the default
input file stdin. Just say 'lit [-options]' with no file name. Now you
can type in lines one at a time and have lit filter them back to you.
Try typing control characters to see how they come back.
Keep in mind that in this configuration, the CLI is still trapping
and interpreting (acting upon) what you type, so
screen control characters like
form feed, and tab, for example, actually cause form feeds and tabs to
occur on the screen before lit has a chance to send you its output.
This may make the screen look a little messy, but
at least if the CLI is interpreting everything it can tell when you
type Control C to break out.
**
Want to have lit give you a Usage statement? Say 'lit lskdmlsdm' where
lskdmlsdm is any string of garbage which doesn't add up to the name of
a real file.
**
Why not use \0 to represent NUL?
Consider the following character sequence:
BEL, space, NUL, 0, 7
Using \0 for NUL would yield the output '\007 \007'. To avoid this
ambiguity,
the \0 construct is not included in the backslash representations.
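The ambiguity can be demonstrated with a small C sketch. It is a
simplified stand-in for lit, not its real code: the function render is
hypothetical, it handles only octal output, and it skips the escaping
of printable backslashes for brevity. The flag use_nul_escape turns on
the rejected \0 construct.

.nf
```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical, simplified renderer: printable bytes are copied,
 * everything else becomes a three-digit octal representation.
 * If use_nul_escape is nonzero, NUL is written as \0 instead. */
void render(const unsigned char *in, size_t n, int use_nul_escape,
            char *out, size_t size)
{
    size_t i, used = 0;

    out[0] = '\0';
    /* Each step writes at most 4 characters plus the terminator. */
    for (i = 0; i < n && used + 5 <= size; i++) {
        if (in[i] == 0 && use_nul_escape)
            used += (size_t)snprintf(out + used, size - used, "\\0");
        else if (isprint(in[i]))
            used += (size_t)snprintf(out + used, size - used, "%c", in[i]);
        else
            used += (size_t)snprintf(out + used, size - used, "\\%03o",
                                     (unsigned)in[i]);
    }
}
```
.fi

With the \0 construct enabled, the sequences BEL, space, NUL, 0, 7 and
BEL, space, BEL both render as '\007 \007'. With it disabled, the first
sequence renders as '\007 \00007', which can be read back in only one
way because octal representations are always exactly three digits.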
**
Why use ^? for DEL? Keyboard control characters are always 64 places
higher in the ascii table than the non-printable characters they represent.
DEL is at the high end of the ascii character set, however,
so there's no keyboard character to represent it.
We need to arbitrarily choose some character.
The ? seems to make at least some sense as a choice; it is 64 places
less than DEL, and that kind of satisfies one's desire for
symmetry in the world. (Besides, some of the UNIX world tools already do
it that way.)
**
If you don't like the default option settings, they are very simple to
change in the C source. If you don't have a C compiler, and can't live with
the settings, I will be willing to recompile it with your desired option
settings. Send me a disk in a protective mailer and include return
postage. I will return your disk in the same mailer.