Text File | 1989-03-14 | 16KB | 397 lines
ISPELL(local) DOMAIN/IX Reference Manual (MIT) ISPELL(local)
NAME
ispell - Correct spelling for a file
munchlist - Combine suffixes in a spelling list
isexpand - Expand suffixes in a spelling list
SYNOPSIS
ispell [ -t | -x | -S | -d file | -p file | -w chars ] file ...
ispell [ -t | -d file | -p file | -w chars ] -l
ispell [ -t | -d file | -p file ] { -a | -A }
ispell [ -w chars ] -c
ispell -v
munchlist [ -d file | -e | -w chars ] [ files ]
isexpand [ files ]
DESCRIPTION
Ispell is fashioned after the spell program from ITS (called
ispell on Twenex systems). The most common usage is "ispell
filename". In this case, ispell will display each word
which does not appear in the dictionary, and allow you to
change it. If there are "near misses" in the dictionary
(words which differ by only a single letter, a missing or
extra letter, or a pair of transposed letters), then they
are also displayed. If you think the word is correct as it
stands, you can type either "Space" to accept it this one
time, or "I" to accept it and put it in your private
dictionary. If one of the near misses is the word you want,
type the corresponding number. (If there are more than 10
choices, you may have to type a carriage return to complete
a single-digit number.) Finally, if none of these choices
is right, you can type "R" and you will be prompted for a
replacement word. If you want to see a list of words that
might be close using wildcard characters, type "L" to look up
a word in the system dictionary.
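The "near miss" definition above can be illustrated with a short
Python sketch. This is not ispell's own code; the function name
near_misses, the lowercase-only alphabet, and the reading of
"transposed" as "adjacent letters swapped" are assumptions made
for the example.

```python
import string


def near_misses(word, dictionary, alphabet=string.ascii_lowercase):
    """Return dictionary words that are one small edit away from `word`."""
    candidates = set()
    for i in range(len(word)):
        # Extra letter in the input: drop one character.
        candidates.add(word[:i] + word[i + 1:])
        # Wrong letter in the input: substitute one character.
        for c in alphabet:
            candidates.add(word[:i] + c + word[i + 1:])
    for i in range(len(word) + 1):
        # Missing letter in the input: insert one character.
        for c in alphabet:
            candidates.add(word[:i] + c + word[i:])
    for i in range(len(word) - 1):
        # Transposed pair: swap two adjacent characters (assumption).
        candidates.add(word[:i] + word[i + 1] + word[i] + word[i + 2:])
    candidates.discard(word)
    return sorted(candidates & set(dictionary))
```

For instance, with a toy dictionary containing "the", "ten", "tech",
"tea" and "he", the input "teh" yields every entry except "he",
which is two edits away.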
When a misspelled word is found, it is printed at the top of
the screen. Any near misses will be printed on the following
lines, and finally, two lines containing the word are
printed at the bottom of the screen. If your terminal can
display reverse video, the word itself is highlighted.
The -v option causes ispell to print its current version
identification on the standard output and exit.
The -l or "list" option to ispell is used to produce a list
of misspelled words from the standard input.
The -a option is intended to be used from other programs
through a pipe. In this mode, ispell expects the standard
input to consist of lines containing single words. Each
word is read, and a single line is written to the standard
output. If the word was found in the main dictionary, or
your personal dictionary, then the line contains only a '*'.
If the word was found through suffix removal, then the line
contains a '+', a space, and the root word. If the word is
not in the dictionary, but there are near misses, then the
line contains an '&', a space, and a list of the near misses
separated by spaces. Also, each near miss is capitalized
the same as the input word unless such capitalization is
illegal; in the latter case each near miss is capitalized
correctly according to the dictionary. Finally, if the word
does not appear in the dictionary, and there are no near
misses, then the line contains only a '#'. This mode is
also suitable for interactive use when you want to figure
out the spelling of a single word. (These characters are
the same as the codes that the real spell program uses.)
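A program reading the -a output might decode each reply line as
in the following Python sketch. The function name parse_reply and
the labels it returns are hypothetical; only the four leading
characters and their meanings come from the description above.

```python
def parse_reply(line):
    """Decode one reply line from "ispell -a" into (kind, words)."""
    line = line.rstrip("\n")
    if line == "*":
        return ("found", [])            # in main or personal dictionary
    if line == "#":
        return ("unknown", [])          # not found, no near misses
    if line.startswith("+ "):
        return ("root", [line[2:]])     # found through suffix removal
    if line.startswith("& "):
        return ("near_misses", line[2:].split())
    raise ValueError("unrecognized reply: %r" % line)
```

A caller would write one word per line to ispell's standard input
and pass each line read back to parse_reply.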
The -A option works just like -a, except that if a line
begins with the string "&Include_File&", the rest of the
line is taken as the name of a file to read for further
words. Input returns to the original file when the include
file is exhausted. Inclusion may be nested up to five deep.
The key string may be changed with the environment variable
INCLUDE_STRING (the ampersands, if any, must be included).
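The inclusion behavior of -A can be sketched in Python as follows.
This is an illustration, not ispell's code: the names read_words,
MAX_DEPTH, opener and key are hypothetical, stripping whitespace
around the file name is an assumption, and raising an error at the
nesting limit is also an assumption, since the text above does not
say what happens beyond five levels.

```python
import os

MAX_DEPTH = 5  # "Inclusion may be nested up to five deep."


def read_words(path, depth=0, opener=open, key=None):
    """Yield input lines from `path`, following include lines."""
    if key is None:
        # The key string may be overridden through the environment.
        key = os.environ.get("INCLUDE_STRING", "&Include_File&")
    with opener(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(key):
                if depth >= MAX_DEPTH:
                    raise RuntimeError("includes nested too deeply")
                # The rest of the line names the file to read next.
                included = line[len(key):].strip()
                yield from read_words(included, depth + 1, opener, key)
            else:
                yield line
```

The opener parameter exists only so the sketch can be exercised
without real files; by default it uses the built-in open.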
When in the -a mode, ispell will also accept lines of single
words prefixed with either a '*' or a '@'. A line starting
with '*' tells ispell to insert the word into the user's
dictionary (similar to the I command). A line starting with
'@' causes ispell to accept this word in the future (similar
to the A command).
The -x option causes ispell to remove the .bak file that it
normally leaves. The .bak file contains the pre-corrected
text. If there are file opening / writing errors, the .bak
file may be left for recovery purposes even with the -x
option.
The -S option suppresses ispell's normal behavior of sorting
the list of possible replacement words. Some people may
prefer this, since it somewhat enhances the probability that
the correct word will be low-numbered.
The -t option selects TeX/LaTeX input mode. TeX/LaTeX mode
is also automatically selected if an input file has the
extension ".tex". In this mode, whenever a backslash ("\")
is found, ispell will skip to the next whitespace. Thus,
for example, given
\chapter {This is a Ckapter} \cite{SCH86}
ispell will find "Ckapter" but will not look for SCH. The -t
option does not recognize the TeX comment character "%".
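The skipping rule of -t mode can be approximated with a regular
expression, as in this Python sketch. The function name tex_words
is hypothetical, and treating a word as a run of ASCII letters is
a simplifying assumption that ignores the -w option.

```python
import re


def tex_words(line):
    """Return the words that would be checked in -t mode."""
    # Drop each backslash and everything up to the next whitespace.
    visible = re.sub(r"\\\S*", " ", line)
    # Assumption: a word is a run of ASCII letters.
    return re.findall(r"[A-Za-z]+", visible)
```

Applied to the example line above, the sketch returns "This", "is",
"a" and "Ckapter"; the text of \cite{SCH86} is skipped because it
contains no whitespace after the backslash. A line starting with
"%" is scanned like any other, matching the stated limitation.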
The -d option is used to specify an alternate hashed
dictionary file, other than the default. If the filename does
not begin with a "/", the library directory for the default
dictionary file is prefixed. This is useful to allow
dictionaries which prefer alternate British spellings
("centre", "tyre", etc.), or add lists of special-purpose jargon
and acronyms for subclasses of documents. There are some
shortcomings in attempting to provide foreign-language
dictionaries, but something like "-d french" could be made to
work somewhat. The -d option may specify /dev/null, in
which case the dictionary is limited to the personal one.
This may be useful for certain private dictionaries.
The -p option is used to specify an alternate personal
dictionary file. If the file name does not begin with "/",
$HOME is prefixed. Also, the shell variable WORDLIST may be
set, which renames the personal dictionary in the same
manner. The command line overrides any WORDLIST setting. If
neither is present, "~/.ispell_words" is used.
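The precedence just described (the -p option, then WORDLIST, then
the default name) can be written out as a small Python sketch. The
function name personal_dictionary and its parameters are
hypothetical; the sketch only restates the rules above and does
not read ispell's own code.

```python
import os


def personal_dictionary(p_option=None, environ=os.environ):
    """Resolve the personal dictionary path from -p, WORDLIST, or the default."""
    # The command line wins over WORDLIST; the default name is last.
    name = p_option or environ.get("WORDLIST") or ".ispell_words"
    if name.startswith("/"):
        return name                      # absolute names are used as given
    return environ.get("HOME", "") + "/" + name
```

With HOME set to /u/fred, for example, no option and no WORDLIST
gives /u/fred/.ispell_words, while "-p mywords" gives
/u/fred/mywords even if WORDLIST is also set.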
The -w option may be used to specify characters other than
alphabetics which may also appear in words. For instance,
-w "&" will allow "AT&T" to be picked up. Underscores are
useful in many technical documents. There is an admittedly
crude provision in this option for 8-bit international
characters. Non-printing characters may be specified in the
usual way by inserting a backslash followed by the octal
character code; e.g., "\014" for a form feed.
Alternatively, if "n" appears in the character string, the (up to)
three characters following are a DECIMAL code 0 - 255, for
the character. For example, to include bells and form feeds
in your words (an admittedly silly thing to do, but aren't
most pedagogical examples?):
n007n012
Numeric digits other than the three following "n" are simply
numeric characters. Use of "n" does not conflict with
anything because actual alphabetics have no meaning; alphabetics
are already accepted. Ispell will typically be used
with input from a file, meaning that preserving parity for
possible 8 bit characters from the input text is OK. If you
specify the -l option, and actually type text from the
terminal, this may create problems if your stty settings
preserve parity.
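The character-string syntax accepted by -w can be illustrated by
the Python sketch below. The function name parse_w_chars is
hypothetical, and two details are assumptions because the text
does not settle them: a backslash or "n" with no digits after it
is ignored, and a code above 255 is rejected with an error.

```python
def parse_w_chars(spec):
    """Expand a -w character string into the set of extra word characters."""
    chars = set()
    i = 0
    while i < len(spec):
        c = spec[i]
        i += 1
        if c in ("\\", "n"):
            # Backslash introduces an octal code, "n" a decimal code.
            base, digits = (8, "01234567") if c == "\\" else (10, "0123456789")
            j = i
            while j < len(spec) and j - i < 3 and spec[j] in digits:
                j += 1
            if j > i:
                code = int(spec[i:j], base)
                if code > 255:
                    raise ValueError("character code out of range: %d" % code)
                chars.add(chr(code))
            i = j  # with no digits, the marker itself is ignored (assumption)
        else:
            chars.add(c)
    return chars
```

Under these assumptions, the example string n007n012 expands to
the bell and form feed characters, the same form feed that "\014"
names in octal.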
The -c option is primarily intended for use by the munchlist
shell script. In this mode, a list of words is read from
the standard input. For each word, a list of possible root
words and suffixes will be written to the standard output.
Some of the root words will be illegal and must be filtered
from the output by other means; the munchlist script does
this. As an example, the command "echo BOTHER | ispell -c"
produces:
BOTH
BOTHE/R
BOTH/R
Unless it has been installed without the feature by your
system administrator, ispell is aware of the correct
capitalizations of words in the dictionary and in your personal
dictionary. As well as recognizing words that must be
capitalized (e.g., George) and words that must be all-capitals
(e.g., NASA), it can also handle words with "unusual"
capitalization (e.g., "ITCorp" or "TeX"). If a word is
capitalized incorrectly, the list of possibilities will include all
acceptable capitalizations. (More than one capitalization
may be acceptable; for example, my dictionary lists both
"ITCorp" and "ITcorp".) Normally, this feature will not
cause you surprises, but there is one circumstance you need
to be aware of. If you add a word to your dictionary that
is at the beginning of a sentence (e.g., the first word of
this paragraph if "unless" were not in the dictionary), it
will be marked as "capitalization required". On a subsequent
usage of this word without capitalization (e.g., the quoted
word in the previous sentence), ispell will object and
suggest the capitalized version. You must then compare the
actual spellings by eye, and then type "I" to add the
uncapitalized variant to your personal dictionary.
The rules for capitalization are as follows:
(1) Any word may appear in all capitals, as in headings.
(2) Any word that is in the dictionary in all-lowercase
form may appear either in lowercase or capitalized (as
at the beginning of a sentence).
(3) Any word that has "funny" capitalization (i.e., it
contains both cases and there is an uppercase character
besides the first) must appear exactly as in the
dictionary, except as permitted by rule (1). If the word
is acceptable in all-lowercase, it must appear thus in
a dictionary entry.
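The three rules can be restated as a Python sketch. This is one
reading of the rules, not ispell's implementation; the function
name capitalization_ok is hypothetical, and entries stands for the
spellings of the word as listed in the dictionary.

```python
def capitalization_ok(word, entries):
    """Check `word` against the dictionary spellings in `entries`."""
    matches = [e for e in entries if e.lower() == word.lower()]
    if not matches:
        return False            # not in the dictionary at all
    if word.isupper():
        return True             # rule (1): all capitals is always allowed
    for entry in matches:
        if word == entry:
            return True         # exact match, including rule (3) words
        if entry.islower() and word == entry.capitalize():
            return True         # rule (2): lowercase entry, capitalized use
    return False
```

Under this reading, "Unless" and "UNLESS" are accepted against an
entry "unless", while "unless" is rejected against an entry
"Unless", which is the "capitalization required" case described
earlier.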
The munchlist shell script is used to reduce the size of
dictionary files, primarily personal dictionary files. It
is also capable of combining dictionaries from various
sources. The given files are read (standard input if no
arguments are given), reduced to a minimal set of roots and
suffixes that will match the same list of words, and written
to standard output.
Normally, words that are in the default dictionary are
removed by munchlist during processing. If the list is to
be used with a different dictionary, the -d option can be
used to specify an alternate (hashed) dictionary file
containing words to be removed from the output list. If a
dictionary file of /dev/null is specified, no words will be
removed from the output; this is useful when munching the
primary dictionary file.
The -w option is passed on to ispell. The -e ("efficient")
option causes the script to use a slower algorithm that uses
somewhat less space in TMPDIR (normally /usr/tmp).
The isexpand shell script is used to expand the various
suffix flags in an ispell word list. This script can be used
when looking words up in the dictionary, or to verify that a
particular suffix flag actually produces the expected
result.
It is possible to install ispell in such a way as to only
support ASCII range text if desired.
ENVIRONMENT
WORDLIST        Personal dictionary file name
INCLUDE_STRING  Code for file inclusion under the -A option
TMPDIR          Directory used for some of munchlist's temporary files
FILES
$HOME/.ispell_words    user's private dictionary
/usr/dict/words        list of words for the Lookup function
/tools/sources/ispell  directory for the following files:
  ispell.hash          hashed dictionary for ispell
  isexp[1-4].sed       sed scripts for expanding suffixes
  icombine             program for combining suffix flags
  munchlist            munchlist program
  isexpand             isexpand program
  makedict*            making your own dictionaries
  ispell.4.hlp         description of dictionary entries
  ispell.el            Sample GNU Emacs interface to ispell
SEE ALSO
spell(1), egrep(1), look(1), ispell(4)
BUGS
It takes about five seconds for ispell to read in the hash
table.
The hash table is stored as a quarter-megabyte (or larger)
array, so a PDP-11 version does not seem likely.
Ispell should understand more troff syntax, and deal more
intelligently with contractions.
While alternate dictionaries for foreign languages could be
defined, and the international characters included in words,
rules concerning word endings / pluralization accommodate
English only.
When the -x flag is specified, ispell will unlink any
existing .bak file.
Munchlist requires tremendous amounts of temporary file
space for large dictionaries. It does respect the TMPDIR
environment variable, so this space can be redirected.
However, a lot of the temporary space it needs is for sorting,
so TMPDIR is only a partial help on systems with an
uncooperative sort(1). As a benchmark, the 15000-word
dict.191 takes about 1200 blocks in TMPDIR, and 2000 in
sort's temporary directories. Munching dict.191 with
/usr/dict/words (28000 words output) took another 1500
blocks or so, and ran for the better part of an hour.
AUTHOR
Pace Willisson (pace@mit-vax)
Collected, revised, and enhanced for the Usenet by Walt
Buehring.
Further enhanced and debugged by Isaac Balbin, Stewart
Clamen, Mark Davies, Steve Dum, Gary Johnson, Don Kark, Steve
Kelem, Jim Knutson, Geoff Kuenning, Evan Marcus, Dave Mason,
Rob McMahon, Bob McQueer, David Neves, Joe Orost, Israel
Pinkas, Gary Puckering, Bill Randle, Marc Ries, Rich Salz,
Greg Schaffer, Joel Shprentz, George Sipe, Perry Smith,
Stefan Taxhet, Andrew Vignaux, Johan Widen, James Woods, and
Ken Yap.
ISPELL-6 Printed 7/9/87