home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Usenet 1994 October
/
usenetsourcesnewsgroupsinfomagicoctober1994disk2.iso
/
misc
/
volume34
/
flogger
/
part01
/
README
< prev
next >
Wrap
Text File
|
1992-12-18
|
8KB
|
173 lines
VERSION
This is version 0 of the the Sort Flogger.
OVERVIEW
This is a small collection of some of the more popular sort
algorithms, including:
bubble sort -- any 1st semester course in CS
heap sort -- contributed by der Mouse
insertion sort -- any 1st semester course in CS
merge sort -- roughly patterned after Knuth Vol. 3
quick sort -- C.A.R. Hoare's as given in K&R 2 pg 87
shell sort -- D.L. Shell as given in K&R 2 pg 62
bogo sort -- worst-case sort by Richard Krehbiel
A slightly weird test routine is provided which calculates some
performance measurements on the above algorithms and also on the
qsort() function provided with your system, as well as verifying that
the sort actually worked and that it didn't modify anything immediately
above or below the array, that it doesn't leak memory, and whether
or not it's stable. These tests are carried out in a variety of
situations designed to highlight worst-case behavior.
These implementations of the sort functions are all compatible with
qsort()'s parameter list, so if you think your application spends too
much time sorting, you can just s/qsort/merge_sort/ and link in the
sort library and give it a try. If you think it doesn't spend *enough*
time sorting, try bubble_sort instead.
The merge sort declares a global int called _maybe_sorted which
tells merge sort that the data in the array may already be mostly in
order. It is advisory--the algorithm will produce the correct results
regardless of _maybe_sorted's value; however the execution time for
merge_sort will be quite a bit less if this is set and the array is
already mostly sorted. If the array isn't sorted at all, this option
extracts about 2% speed penalty.
The heapsort was contributed by mouse@larry.mcrcim.mcgill.edu
aka "der Mouse".
The merge_sort() function is the only one I've taken any trouble to
optimize.
INSTALLATION
See the INSTALLATION document which should be nearby.
USAGE
Just type "flogger" or "flogger <number>" and watch what happens.
DESCRIPTION OF OUTPUT
It seems like it doesn't really do anything but spit out numbers,
and rather slowly at that. The version and patch level of the program
is printed; the resolution of the clock is printed; the size of the
elements is reported; and the number of elements to be sorted is
reported.
For each sort algorithm, a table of measurements is reported. The
rows describe the situation the sort algorithm was presented with for
those measurements and the columns represent the type of measurement
taken. The rows (situations) are:
random: The contents of the array to be sorted are pseudo
random numbers generated by rand(), with two special
values excluded for the purpose of detecting overwrites
at the ends of the array.
ascend: The array is already completely in order. To simply
verify this fact is all that's required of the sort
algorithm; ironically, bubble sort outperforms all
the rest in this case.
descend: The array is exactly in reverse order. Theoretically,
an sort function could detect this and swap ends in
linear time, but it hardly seems useful.
fib asc: The contents of the array are irrelevant; the comparison
function passed to the routine always says that its
first argument is less than its second.
fib desc: Similarly, the comparison function always says that its
first argument is greater than its second.
surprise: A pseudorandom result is returned from the comparison
function in an attempt to confound the algorithms' efforts
at sorting.
mostly: Similar to ascend, but the first and last elements
are reversed. This is meant to simulate an index which
is pretty much already sorted and only a small number
of new items need to be inserted.
equiv: The comparison function reports that all elements
compare equal. The data is arranged so that a
check can be made for whether the algorithm is
stable; that is the property that records with
equal keys stay in the same relative order in the
array that they were in before the sort began.
The columns refer to these measurements:
compares: The number of calls to the comparison routine during
the sort. I consider this very important, as most
of the sorting bottlenecks I've experience center
around the comparison function.
stack: (Estimated) Some semi-non-portable attempts to record
the growth of the stack are made and a number comes
out. I consider O(log n) use of the stack acceptable
on modern machines and none of these algorithms are
worse than that.
heap: Amount of malloc'd storage as reported by mallinfo().
If your C library doesn't support mallinfo(), the
compiler will probably barf on flogger.c. The flogger
checks for memory leaks.
user: Amount of user mode CPU time as reported by times().
This includes time for both the operation of the
sort function and the comparison function; as these
are only marginally related, this number is not
useful for estimating how long it might take to
sort useful data in some other program.
system: Amount of system CPU time as reported by times().
Generally 0.00 or very small. If a sort algorithm
uses lots of system time it's probably a Trojan horse
trying to break into the passwd file or delete your
home directory or something so maybe you should
start shuffling through your desk looking for a recent
backup tape. I may remove this in a future version
(the report of system time, not the Trojan horse)
in favor of another metric, perhaps wall clock time.
But it isn't enough simply to be fast, a sort algorithm also has
to be correct. To that end, the flogger routine verifies that
the array was actually sorted, at least when the comparison routine
wasn't lying. It also checks that magic values in the memory locations
immediately above and below the array were not accidentally accessed
or overwritten.
The bogo sort run is skipped if n is sufficiently large that
it's execution would run into trillions of cycles (approx n = 15).
MISCELLANEOUS
When running the test routine, some sorts take so long that you
might lose patience. Keyboard interrupt (aka Control-C) will abort the
current sort test and move on to the next. If you really want the
program to end, give it a quit signal (Control-\) or send it a signal
with the kill command. You can also background it (Control-Z) from
the C-shell then kill it.
The size of the elements being sorted is hard-coded at the moment,
but it can be changed by going into flogger.c and changing the typedefs for
#define TEST_TYPE double
#define TEST_FUNC double_compare
into
#define TEST_TYPE int
#define TEST_FUNC int_compare
or back again. Or add a new type and a new comparison function.
Several of the sort algorithms are not re-entrant which means
you shouldn't call the sort from within the very same comparison
function you passed into that sort in the first place.
COPYRIGHT
Since none of the sort algorithms are mine and anyone with a copy
of Knuth Vol 3 and a little knowledge of C could type them in as easily
as I did, there isn't much point in claiming a copyright, much less
applying for a patent. I wanted to disclaim liability though, so I put
a few minor restrictions on the code, as set forth in each of the
source files.