COMPARE - Compare Two Textfiles 03 Jan 79
Compare - Compare Two Textfiles and Report Their Differences
James F. Miner
Social Science Research Facilities Center
Andy Mickel
University Computer Center
University of Minnesota
Minneapolis, MN 55455 USA
Copyright (c) 1977, 1978.
What COMPARE Does
-----------------
COMPARE is used to display the differences between two similar
texts (referred to as "FILEA" and "FILEB"). Such textfiles could be
Pascal source programs, character data, documentation, etc.
COMPARE is line-oriented, meaning the smallest unit of comparison
is the text line (ignoring trailing blanks). COMPARE generates a
report of differences (mismatches or extra text) between the two
textfiles. The criterion for determining the locality of differences
is the number of consecutive lines on each file which must match after
a prior mismatch, and can be selected as a parameter.
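As a rough illustration of what "line-oriented, ignoring trailing blanks" means, the Python sketch below treats two lines as equal when they agree once trailing blanks are dropped. The function names and the optional width argument (modelled on the W option described under "How to Use COMPARE") are assumptions made for this example and are not taken from COMPARE itself. Truncating before stripping is a guess at the order of operations.

```python
def significant(line: str, width: int = 120) -> str:
    """Return the part of a line that takes part in the comparison.

    Assumption: only the first `width` columns count, and trailing
    blanks within those columns are ignored.
    """
    return line[:width].rstrip(" ")


def lines_match(a: str, b: str, width: int = 120) -> bool:
    """True when two lines are equal for comparison purposes."""
    return significant(a, width) == significant(b, width)
```

Under these assumptions, "BEGIN" and "BEGIN   " match, while "BEGIN" and "  BEGIN" do not, because leading blanks remain significant.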
By selecting other parameters, you can direct COMPARE to restrict
the comparison to various linewidths, mark column-wise the differences
in pairs of mismatched lines, generate text-editor directives to be
used to convert FILEA into FILEB, or generate a listing which will
flag lines on FILEB indicating their addition or deletion as a result
of the application of the editor directives.
How to Use COMPARE
------------------
COMPARE is available as an operating system control statement on
CDC 6000/Cyber 70,170 computer systems. The general form of the
control statement is:
COMPARE(a,b,list,modfile/options)
COMPARE. means COMPARE(FILEA,FILEB,OUTPUT,MODS/C6,D,W120)
"FILEA" and "FILEB" are the names of the two textfiles being
compared, "OUTPUT" is the report file, and "MODS" is the file name for
the generation of text-editor directives if the "M" option is
selected--see below. The various options are: C, D, F, M, P, and W.
Cn   Match Criterion (1 <= n <= 100).
     C determines the number of consecutive lines on each file
     which must match in order that they be considered as
     terminating a prior mismatch. C therefore affects COMPARE's
     "sensitivity" to the "locality" of differences. Setting C to
     a large value tends to produce fewer (but longer) mismatches
     than does a small value. C6 appears to give good results on
     Pascal source files, but may be inappropriate for other
     applications.
     Default: C6.

D    Report Differences.
     D directs COMPARE to display mismatches and extra text
     between FILEA and FILEB in a clearly annotated report. Only
     one of D, F, or M can be explicitly selected at one time.
     Default: selected.

F    Select Flag-form output.
     F directs COMPARE to list FILEB annotated with lines prefixed
     by an "A" or "D" indicating "additions" or "deletions"
     respectively. Such modifications could have been generated
     with the M option. Only one of D, F, or M can be explicitly
     selected at one time.
     Default: not selected.

M    Produce MODS file.
     M directs COMPARE to produce a file of "INSERT" or "DELETE"
     directives ready for the CDC MODIFY or UPDATE text editors
     (an "IDENT" directive must be added). The insertions and
     deletions will convert FILEA into FILEB. FILEA and FILEB
     should be files with sequencing appearing in columns beyond
     the linewidth specified by the W option. This is true of
     MODIFY and UPDATE "COMPILE" files (W72 is recommended).
     Sequence numbers are of the form:
          {Blanks} IdentName {Blanks} UnsignedInteger.
     Only one of D, F, or M can be explicitly selected at one
     time.
     Default: not selected.

P    Mark Pairs of mismatched lines.
     P alters the action of the D directive by marking differing
     columns in pairs of lines which mismatch in sections of equal
     length. This is especially useful for comparing packed data
     files.
     Default: not selected.

Wn   Specify significant line Width (length) (10 <= n <= 150).
     W determines the fixed number of columns of each line which
     will be compared. W is ideal to use when sequence
     information is present at the right edge of the text file.
     Default: W120.
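The option rules above (the defaults, the numeric ranges for C and W, and the restriction that only one of D, F, or M may be explicitly selected) can be summarised in a short Python sketch. The function name, the dictionary layout, and the error handling are hypothetical and chosen only for illustration. They do not describe how COMPARE itself parses its control statement.

```python
# Permitted ranges for the numeric options, as stated in the option list.
RANGES = {"C": (1, 100), "W": (10, 150)}


def parse_options(text: str = "") -> dict:
    """Parse an option string such as "C1,D,P" into a settings dict."""
    # Documented defaults: C6, D selected, P not selected, W120.
    opts = {"C": 6, "W": 120, "output": "D", "P": False}
    explicit = set()  # output forms (D, F, M) named explicitly
    for item in text.upper().split(","):
        item = item.strip()
        if not item:
            continue
        key, arg = item[0], item[1:]
        if key in RANGES:
            low, high = RANGES[key]
            if not arg.isdigit() or not low <= int(arg) <= high:
                raise ValueError(f"{key} needs a value in {low}..{high}")
            opts[key] = int(arg)
        elif key in ("D", "F", "M") and not arg:
            explicit.add(key)
        elif key == "P" and not arg:
            opts["P"] = True
        else:
            raise ValueError(f"unknown option: {item}")
    if len(explicit) > 1:
        raise ValueError("only one of D, F, or M may be selected")
    if explicit:
        opts["output"] = explicit.pop()
    return opts
```

For instance, parse_options("C1,D,P") yields a match criterion of 1 with the D report and pair marking enabled, while parse_options("D,F") is rejected.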
Example
-------
Suppose FILEA is:
    PROGRAM L2U(INPUT, OUTPUT);
    (* CONVERT CDC 6/12-ASCII LOWER-CASE
       LETTERS TO UPPER CASE. *)
    BEGIN
      WHILE NOT EOF(INPUT) DO
      BEGIN
        WHILE NOT EOLN(INPUT) DO
        BEGIN
          IF INPUT^ <> CHR(76) THEN WRITE(INPUT^);
          GET(INPUT)
        END;
        READLN;
        WRITELN
      END;
      (*ALL DONE.*)
    END.
and FILEB is:
    PROGRAM U2L(INPUT, OUTPUT);
    (* CONVERT CDC ASCII UPPER-CASE LETTERS
       TO 6/12 LOWER CASE. *)
    BEGIN
      WHILE NOT EOF(INPUT) DO
      BEGIN
        WHILE NOT EOLN(INPUT) DO
        BEGIN
          IF INPUT^ IN ['A'..'Z'] THEN WRITE(CHR(76));
          WRITE(INPUT^);
          GET(INPUT)
        END;
        READLN;
        WRITELN
      END;
    END.
then a report from COMPARE looks like this:
COMPARE,L2U,U2L,LIST/C1,D,P. 78/12/31. 20.23.25.
COMPARE VERSION 3.0 CDC (78/12/19)
OUTPUT OPTION = DIFFERENCES.
INPUT LINE WIDTH = 120 CHARACTERS.
MATCH CRITERION = 1 LINES.
FILEA: L2U
FILEB: U2L
***********************************
MISMATCH: L2U LINES 1 THRU 3 <NOT EQUAL TO> U2L LINES 1 THRU 3:
A  1. PROGRAM L2U(INPUT, OUTPUT);
B  1. PROGRAM U2L(INPUT, OUTPUT);
              ^ ^
A  2. (* CONVERT CDC 6/12-ASCII LOWER-CASE
B  2. (* CONVERT CDC ASCII UPPER-CASE LETTERS
                     ^^^^^^^^^^^^^^^^^^^^^^^^
A  3.    LETTERS TO UPPER CASE. *)
B  3.    TO 6/12 LOWER CASE. *)
         ^^^^^^^ ^ ^^^^^^^^^^^^ ^^
***********************************
MISMATCH: L2U LINE 9 <NOT EQUAL TO> U2L LINES 9 THRU 10:
A  9.       IF INPUT^ <> CHR(76) THEN WRITE(INPUT^);
B  9.       IF INPUT^ IN ['A'..'Z'] THEN WRITE(CHR(76));
B 10.       WRITE(INPUT^);
***********************************
EXTRA TEXT ON L2U, BETWEEN LINES 15 AND 16 OF U2L
A 15.   (*ALL DONE.*)
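The caret lines printed under each A/B pair in the report come from the P option, which marks the columns where two paired lines differ. A minimal Python sketch of that marking follows. The function name is hypothetical, and padding the shorter line with blanks is an assumption about how lines of unequal length are treated.

```python
def mark_columns(a: str, b: str) -> str:
    """Return a marker line with "^" under each column where a and b differ."""
    width = max(len(a), len(b))
    # Assumption: the shorter line is treated as if padded with blanks.
    a, b = a.ljust(width), b.ljust(width)
    marks = "".join("^" if x != y else " " for x, y in zip(a, b))
    return marks.rstrip()


first = mark_columns("PROGRAM L2U(INPUT, OUTPUT);",
                     "PROGRAM U2L(INPUT, OUTPUT);")
print(repr(first))  # carets fall under the "L" and "U" of the program name
```

Applied to the text of line 3 of each file (without its leading blanks), the same function produces the groups of 7, 1, 12, and 2 carets seen in the report.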
How COMPARE Works
-----------------
COMPARE employs a simple backtracking-search algorithm to isolate
mismatches from their surrounding matches. It requires dynamic
storage roughly proportional to the size of the largest mismatch, and
each mismatch takes time roughly proportional to the square of its
size. Thus it may not be feasible to use COMPARE on files with very
long mismatches.
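This document does not spell out the search itself, so the Python sketch below is only one plausible way to resynchronise after a mismatch, not COMPARE's actual code. Starting from the first pair of differing lines, it tries ever larger combined skips on the two files until it finds a point where c consecutive lines match (c playing the role of the C option). All names are hypothetical, whole files are held in memory as lists, and lines are compared exactly, without the trailing-blank and width handling of the real program.

```python
def run_matches(a, i, b, j, c):
    """True if c consecutive lines match starting at a[i] and b[j].

    Assumption: both files ending at the same offset counts as a match.
    """
    for k in range(c):
        a_end, b_end = i + k >= len(a), j + k >= len(b)
        if a_end or b_end:
            return a_end and b_end
        if a[i + k] != b[j + k]:
            return False
    return True


def find_resync(a, b, i, j, c):
    """Search outward from (i, j) for the nearest resynchronisation point."""
    for d in range(1, (len(a) - i) + (len(b) - j) + 1):
        for x in range(d + 1):          # skip x lines of a, d - x lines of b
            ii, jj = i + x, j + d - x
            if ii <= len(a) and jj <= len(b) and run_matches(a, ii, b, jj, c):
                return ii, jj
    return len(a), len(b)


def compare(a, b, c=6):
    """Return mismatches as (a_start, a_end, b_start, b_end), 0-based, half-open."""
    i = j = 0
    mismatches = []
    while i < len(a) or j < len(b):
        if i < len(a) and j < len(b) and a[i] == b[j]:
            i, j = i + 1, j + 1
            continue
        ii, jj = find_resync(a, b, i, j, c)
        mismatches.append((i, ii, j, jj))
        i, j = ii, jj
    return mismatches
```

The number of (x, d - x) pairs examined grows with the square of the distance to the resynchronisation point, which is consistent with the quadratic cost described above. Run on the L2U and U2L example files with c=1, this sketch isolates the same three regions as the sample report.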
History
-------
COMPARE was developed as a portable-Pascal software tool by James
Miner of the Social Science Research Facilities Center at the
University of Minnesota, in early 1977. It was written in standard
Pascal and developed initially under CDC 6000 Pascal. Although the
original version simply reported differences in a textfile, COMPARE
was designed to fit naturally into a larger text-editing system.
From the beginning, COMPARE was planned to accommodate later
enhancements for generating text-editor directives. In the summer of
1977, John Strait at the University of Minnesota Computer Center
adapted COMPARE to generate such a modifications file and added
flag-form output and user-selectable options.
COMPARE has been distributed to several Pascal enthusiasts in the
United States who have made it operational on other Pascal
implementations. See Pascal News #12, May, 1978, pages 20-23. In late
1978, Willett Kempton of the Anthropology Department at the University
of California, Berkeley, installed COMPARE (with no changes required
whatsoever) under Berkeley UNIX Pascal on a PDP 11/70 computer system.
He later adapted the program to note column-wise differences in pairs
of different lines and made minor changes to the format of the report.
Rick Marcus and Andy Mickel at the University of Minnesota
Computer Center made minor enhancements to COMPARE and fully
documented it for Release 3 of Pascal 6000 in December, 1978.
COMPARE is a model program in many respects. It serves to
illustrate just how powerful and flexible such a comparison program
can be.