EQANDA COPYRIGHT 1995-1996 horio shoichi EQANDA
NAME
eqanda.txt - Expected Questions And Answers
CONTENTS
This section contains questions likely to be raised in using
concache.exe and its family of programs, the DOS disk cache
programs, and their answers. The contents of this section are
as follows.
Why And How Cache Programs Speed Up Disk Io ?
What Are The Elements To Limit Concurrency ?
How Much Memory Should Be Prepared For Cache ?
How Concache.exe Can Be Tuned, In Terms Of Conventional Memory ?
How Concache.exe Can Be Tuned, In Terms Of Performance ?
Is There Anything To Note With Relation To Serial Communications Software ?
Troubleshooting
QUESTION
Why And How Cache Programs Speed Up Disk Io ?
ANSWER
Actually, disk cache programs don't speed up disk io.
Instead, they reduce the number of disk io operations, while
appearing to the user program as if disk io completed almost
immediately. They buffer disk data in a large memory area
called the disk cache buffer (hereafter simply termed the cache).
For read requests, if the data to be read reside in the
cache, the data are supplied from the cache. Also, data likely
to be read next by user programs are read in advance and stored
in the cache. This method of speedup is called "read ahead" or
"preread". For write requests, the data to be written are
copied into the cache, and user programs "think" the data
are really written to disk. The data are actually written at
the cache program's convenience. This method of speeding up
write requests is called "delay write", "write behind",
"write after", or "postwrite".
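The preread and postwrite behavior can be sketched as a toy model
(illustrative Python only, not concache.exe's actual mechanism; the
"disk" is just a list of sectors and READ_AHEAD is an invented
parameter):

```python
# Toy model of a disk cache with preread ("read ahead") and postwrite
# ("delay write"). Disk io is counted per operation, not per sector,
# so the savings are visible.

READ_AHEAD = 2  # extra sectors fetched past each requested sector

class ToyCache:
    def __init__(self, disk):
        self.disk = disk        # backing store: a list of sector contents
        self.cache = {}         # sector number -> data
        self.dirty = set()      # sectors written to cache but not to disk
        self.disk_reads = 0     # read operations actually issued
        self.disk_writes = 0    # write operations actually issued

    def read(self, sector):
        if sector not in self.cache:
            # miss: one disk operation fetches the sector plus a few more
            end = min(sector + 1 + READ_AHEAD, len(self.disk))
            for s in range(sector, end):
                self.cache[s] = self.disk[s]
            self.disk_reads += 1
        return self.cache[sector]

    def write(self, sector, data):
        # postwrite: the caller "thinks" the data are already on disk
        self.cache[sector] = data
        self.dirty.add(sector)

    def flush(self):
        # the real writes happen at the cache program's convenience;
        # repeated writes to one sector collapse into a single write
        for s in self.dirty:
            self.disk[s] = self.cache[s]
            self.disk_writes += 1
        self.dirty.clear()
```

In this model, reading sectors 0 through 5 sequentially costs only two
disk operations instead of six, and ten rewrites of the same sector
flush as a single disk write.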
The first generation of PC cache programs was generally
reluctant to use postwrite; it was thought of as too special
a luxury. Data to be written were written to disk as soon as
requested. This way of handling writes is called
"write-through".
When cache programs using postwrite arrived on the market,
they were found to more than double the speed of writes.
This is because the disk allocation table, known as the FAT,
is located at the top of the disk while the data space lies at the
opposite corner. Every write request first writes to the FAT,
marking clusters as used, then turns the head to the allocated
area and writes the data sectors. Postwrite in effect eliminates
the repeated writes to the FAT by presenting to DOS the
yet-unwritten FAT. So not only is the actual number of write
operations reduced, but most head movements are eliminated,
since the head need not actually go back and forth to the FAT
area.
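The arithmetic behind that claim can be sketched with a toy count of
device operations (assumed numbers, not measurements; the model simply
charges one FAT write and one long seek pair per appended cluster under
write-through):

```python
# Toy count of device writes and long head seeks when appending
# n clusters to a file on a FAT file system. Assumed model: every
# cluster allocation dirties the FAT sector at the top of the disk,
# then the data are written far away in the data area.

def write_through(n_clusters):
    fat_writes = n_clusters          # FAT rewritten for every cluster
    data_writes = n_clusters
    long_seeks = 2 * n_clusters - 1  # head shuttles FAT <-> data area
    return fat_writes + data_writes, long_seeks

def postwrite(n_clusters):
    fat_writes = 1                   # dirty FAT flushed once at the end
    data_writes = n_clusters
    long_seeks = 1                   # one final trip to the FAT area
    return fat_writes + data_writes, long_seeks
```

For a 100 cluster file this gives 200 writes and 199 long seeks against
101 writes and a single seek; the eliminated seeks are where most of
the speedup comes from.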
When working on floppies, you might have experienced severe
performance degradation when the buffers= statement in the
config.sys file is inadequately written. You might also have
observed writes slow down as your program proceeds. What cache
programs do, up to this generation, is extend the config.sys
buffers= statement into a large cache buffer.
Next came the so called "advanced" cache programs, which
attempt to write data back concurrently with user programs.
These cache programs don't wait for keyboard idle time, for
example, to write back cached data. This means the conception
common to traditional DOS programs, that because disk writes
are slow they must be held in the application program's
buffers until an absolute need arises to write them back, is
wrong. Writing data as required is in fact faster and, perhaps
less importantly, eliminates the need for huge buffers in each
application program. In addition, because data are written as
they are produced, there is less chance of accidental data
loss. Programs become faster, safer, and leaner.
It might be fair to say that disk speedup really begins with
this generation.
Concache.exe belongs to this generation, and adds another
generality: it allows concurrency as far as there is no reason
to refrain from it. The result is that one floppy, one BIOS
disk, and as many SCSI disks as are configurable into DOS can
be driven concurrently with DOS/user programs.
QUESTION
What Are The Elements To Limit Concurrency ?
ANSWER
From the hardware point of view, floppies cannot perform io
concurrently with each other, due to the floppy controller
design. Neither can IDE disks. SCSI disks can perform io in
parallel, as seen on many multiprogramming operating systems.
At this level, one floppy, one IDE disk, and the SCSI disks
can operate concurrently.
The next level to consider is the BIOS support for io
operations. As far as the published BIOS listing is concerned,
there is no reason a floppy and an IDE disk cannot operate
concurrently. SCSI drivers are usually written to do io
asynchronously.
Next comes the BIOS capability to distinguish disk events. The
standard BIOS handles only two "type"s of disks, which is
sufficient for the floppy and IDE disk environments found in
most PC configurations. Fortunately, the ASPI (advanced SCSI
programming interface) specification, now broadly employed,
supports a mechanism effectively similar to BIOS disk event
notification, called command posting. (See the appropriate
manual about this.) This allows the events of each individual
disk to be handled.
At this level, nothing about the concurrency situation
changes.
The next limiting factor is device driver non-reentrancy.
Even if a device driver manages several disks, it expects its
requests to come serially, not while previous requests are in
progress. In fact, most known device drivers lose the
reentrancy necessary for concurrency within the very first two
steps of driver code execution.
Also, io.sys handles int13, through which almost every disk
device call passes, in a non-reentrant way. So you may think
that if a third party device driver is used, for example
io.sys for floppies and the third party driver for the other
disk devices, then at least the combination of one floppy and
one hard disk should work concurrently. But no. If both share
int13, they don't work concurrently.
Next comes DOS drive letter availability. If, for example, a
SCSI disk is split into two partitions, for any of many good
reasons, the user spends two drive letters on one disk. These
two partitions cannot overlap their io operations.
These constitute the inherent limitations on concurrency. In
practice, there are also resource limitations for programs
under DOS. For example, ASPI drivers may limit the number of
packets they can accept at once.
Likewise, ccdisk.exe can limit the concurrency of SCSI disks
from its command line.
Finally, concache.exe can limit concurrency in two ways.
1) The concurrency= option limits the number of concurrent
devices.
2) The io_buffers= option may specify too few io buffers to
let devices work concurrently.
QUESTION
How Much Memory Should Be Prepared For Cache ?
ANSWER
There are certainly optimal points of cache size.
Unfortunately, those points depend heavily on the application
and the job mix. There is no clear way to estimate the size
and performance of a cache.
Fortunately, concache.exe allows the cache size to be changed
on the fly, so you can observe the performance of various
cache sizes. If adding memory doesn't improve performance,
then probably your mix needs still more memory, or you may
decide to decrease the cache size without degrading
performance.
A "pathetic" looking example is presented below. This kind of
anomaly is not uncommon in practice.
Consider the following hypothetical example. I edit, compile,
link, and debug programs, cyclically repeating these steps.
For simplicity, assume each step requires exactly one
megabyte, and that each step needs a set of files completely
unrelated to the other steps (unrealistic? but think of it
this simply for now). Now let's have a 3 megabyte cache. How
will these 3 mb be used?
The first three steps load the editor and source files into
the first megabyte, the compiler, header, source, and object
files into the next megabyte, and finally the linker,
libraries, object and exe files into the last megabyte.
The fourth step finds no free megabyte, so it must select one
from among the three. Now the familiar algorithm takes its
turn. Since the content of the first megabyte is the least
recently used, it is considered unlikely to be needed very
soon. So the algorithm loads the exe file, debugger, and test
data into, you see, the first megabyte.
I go back to the editor. It is not in the first megabyte, as
you have just witnessed. The editor etc. must be loaded into
the second megabyte amid similar fuss. This purges the
compiler and so on from the second megabyte. ...
In this example the cache performs no better than if I had
used only a one megabyte cache. If I added another megabyte,
performance would improve by a jump, but adding any
more does no good. If your job mix consists of five mutually
unrelated steps, each requiring one megabyte, and the cache
size is four megabytes, then four megabytes of space are no
better than one megabyte.
This extreme behavior comes out of the commonly used LRU
algorithm together with extremely simplistic assumptions about
the usage pattern. The least recently used space should be
unlikely to be needed very soon, but in this case it actually
is needed. So, to pick a victim out of the three already used
megabytes, let us select one randomly. The probability that
the next needed megabyte survives is then 0.67, and cache
performance improves by that much, doesn't it?
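The pathology, and the gain from random victim selection, can be
reproduced with a short simulation (a sketch only: the four one
megabyte working sets become four cache keys, and the three megabyte
cache a three-entry cache):

```python
import random
from collections import OrderedDict

def run(evict, accesses=10000, capacity=3, items=4):
    """Cycle through `items` working sets with a cache of `capacity`;
    return the hit count under the given eviction policy."""
    cache = OrderedDict()   # key -> None, ordered oldest to newest use
    hits = 0
    for i in range(accesses):
        key = i % items     # edit, compile, link, debug, edit, ...
        if key in cache:
            hits += 1
            cache.move_to_end(key)               # refresh recency
        else:
            if len(cache) >= capacity:
                if evict == "lru":
                    cache.popitem(last=False)    # evict oldest entry
                else:
                    victim = random.choice(list(cache))
                    del cache[victim]            # evict random entry
            cache[key] = None
    return hits

random.seed(1)
lru_hits = run("lru")       # LRU always evicts exactly the set needed
rnd_hits = run("random")    # next set survives with probability 2/3
```

Under LRU the hit count is exactly zero, because the cyclic pattern
always needs the entry LRU just threw out; random eviction scores a
substantial fraction of hits.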
A similar situation arises when copying a large file. Records
that will never be read or written again flow continually into
the cache data area, erasing useful data from it. So a cache
area of more than double the file size would be necessary to
keep the important data cached.
In practice, however, the situation is not that bad. Even in
file copying, the FAT and directory images are repeatedly
referenced from the cache data area, so disk head movements,
as well as repeated reads and writes to these areas on disk,
are avoided, improving the speed of the copy operation. For
file copying, a rather small cache area works as well as a
large one.
QUESTION
How Concache.exe Can Be Tuned, In Terms Of Conventional Memory ?
ANSWER
An inevitable penalty of concurrency is the memory
requirement. Each concurrently driven device needs its own io
buffer, control and stack space to switch to and fro, a
request packet to organize io, and, for ccdisk.exe, a SCSI
control block, in addition to the descriptors needed for the
drives managed by concache.exe.
The following describes how to save the memory space used by
concache.exe.
First, you can load concache.exe into upper memory, either
through config.sys as a device driver or through
autoexec.bat as a TSR (terminate and stay resident).
Second, the io buffer size can be reduced with the
buffer_size= option, though this can slow down data
transfers. Note the size
must be at least the size of the largest sector to be
cached.
Third, the number of io buffers can be changed. This change
can affect the io performance of concache.exe, so experiments
are needed.
Fourth, directory space can be reduced to the minimum needed
for the concurrency you want.
Fifth, if the full stack space, currently 440 - 500 bytes, is
not used, it can be reduced to a bare minimum of 320 bytes,
provided no SCSI disks are used. However, this may be affected
by other external interrupt devices, so experiments may be
needed. (After all, under DOS, the proof of the stack is in
the eating.)
Finally, on the ccdisk.exe command line, the concurrency
requirements can be reduced to somewhere near the bare
minimum. If unfortunately concurrency mode cannot be used,
then saying "concurrency=1" would save hundreds of bytes.
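Put together, the memory-saving points above might look like the
following config.sys fragment (a sketch only: the paths and the
devicehigh loading are assumptions, the option values are deliberately
left as placeholders, and the real syntax should be taken from
concache.txt and ccdisk.txt):

```
rem load concache.exe into upper memory rather than conventional memory
devicehigh=c:\concache\concache.exe buffer_size=... io_buffers=... stacksize=...
rem when concurrency mode is not used, trim the concurrency requirements too
device=c:\concache\ccdisk.exe concurrency=1
```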
QUESTION
How Concache.exe Can Be Tuned, In Terms Of Performance ?
ANSWER
Speedup is gained either by making each io efficient or by
taking maximal concurrency.
First, make the tick_delay= value larger, to avoid clashes
between DOS and concache.exe write back actions. This costs
almost nothing.
Second, make the io buffer size or the number of io buffers
larger. The options for these two factors work almost
synonymously, since concache.exe doesn't do io in fixed size
buffers. This improves each io time and, if the number of
buffers is sufficiently large, also allows concurrent actions.
Third, the cache data area is split into multiple units of
8kb, which is fairly large compared to the cluster size many
people prefer, so if the drives are heavily fragmented, a
large amount of space can be wasted in the cache area. Note
that drive fragmentation is far from the least influential
factor on performance, and this is not particular to
concache.exe but common to all disk cache programs that work
on FAT oriented file systems.
Fourth, splitting files across disks in a scheme where io
overlapping is possible would avoid io clashes.
Fifth, although preread improves performance in most cases,
it can degrade overall performance in certain cases; if the
read pattern is random, preread is not only useless but slows
things down further through access clashes. If such files are
frequently accessed, it might be better to move them to a
partition that does not preread. If the cache data area is of
marginal size, preread can purge still useful data from it and
read in yet unnecessary data instead.
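The third point, about 8kb cache units and fragmentation, can be
illustrated with a little arithmetic (a toy calculation assuming 2kb
clusters and aligned 8kb units; neither number is taken from
concache.exe):

```python
# Toy illustration of cache-unit waste from fragmentation, assuming
# the cache keeps data in aligned 8 KB units over a 2 KB cluster disk.

UNIT = 8 * 1024
CLUSTER = 2 * 1024

def units_touched(cluster_numbers):
    """Distinct 8 KB cache units needed to hold the given clusters."""
    return len({(c * CLUSTER) // UNIT for c in cluster_numbers})

# A 32 KB file (16 clusters) stored contiguously fills 4 units exactly,
contiguous = units_touched(range(16))
# while the same file scattered one cluster per unit-sized gap touches
# 16 units, caching four times the memory for the same data.
scattered = units_touched(range(0, 16 * 4, 4))
```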
QUESTION
Is There Anything To Note With Relation To Serial Communications Software ?
ANSWER
Serial communications are notorious for their severe timing
requirements. For example, when the communication speed is
38.4 kbps and the communication device is a model that lacks a
buffer, each character received through it must be handled
within one character time, roughly 260 microseconds. Failing
to handle the received character within that interval results
in the overrun error familiar to programmers. Note this
problem is particular to the receive side; a few delays on the
send side usually cause no severe problems.
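The deadline arithmetic can be checked directly (assuming the common
10-bit character frame of one start bit, 8 data bits, and one stop
bit):

```python
# Character deadline arithmetic for an unbuffered UART.

BITS_PER_CHAR = 10   # start + 8 data + stop (assumed framing)
BAUD = 38400

# One character frame at 38400 bits per second: about 260 microseconds.
char_time_us = BITS_PER_CHAR / BAUD * 1_000_000

# An NS16550 programmed to use its 16-byte receive FIFO stretches the
# deadline roughly 16-fold before an overrun can occur.
fifo_deadline_us = char_time_us * 16
```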
On the other hand, since concache.exe works asynchronously
with serial io, disk io is initiated and completed
concurrently with character transmissions. This means
concache.exe causes various housekeeping tasks in the DOS
context to be performed within that short interval, which is
almost impossible on most PCs other than recent high
performance ones.
Alleviations do exist, fortunately. The following lists
several possible ways.
write after mode
Using write after mode instead of concurrency mode avoids
overlapping operations with serial transfers, so the severe
timing problem disappears.
buffered controller
If the controller used for serial communication has a receive
buffer, the short interval can be extended to several times
its length. For example, the NS16550 chip, when properly
programmed, lengthens the interval 16 times.
hardware flow control
If this is possible on your PC and on its counterpart, it
prevents receiving when there is no room to do so,
thus the short interval is (unlimitedly ?) extended.
Troubleshooting
In the following, common conflicts such as irq, dma, memory,
and SCSI option settings are not discussed. They are treated
in the respective manufacturer's manual, and are (probably)
not particular to concache.exe per se.
First, the stack issue must be examined, as it causes the
most obscure effects on the workings of DOS programs.
Concache.exe is designed to work in a stacks=0,0 environment.
However, because of the variety of BIOS manufacturers and the
existence of so many BIOS versions, it is not certain that the
estimate of concache's own stack requirements is enough in
every environment it encounters. In addition, there may exist
programs which expect a large stack space to be available at
any time. For testing purposes, first try an "extremely
wasteful" stack space in config.sys. If this solves the
problem, your remaining task is to find the best values for
the config.sys line.
Alternatively, the stacksize= option of concache.exe can be
tried, to find out whether concache.exe is experiencing stack
overflow.
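A testing sequence in config.sys might look like this (a sketch; DOS
accepts at most 64 stack frames of 512 bytes each, while the path and
the stacksize= value are invented placeholders — see concache.txt for
the actual syntax and limits):

```
rem step 1: try an extremely wasteful stack setting to rule stacks in or out
stacks=64,512
rem step 2: if that cures the problem, shrink toward the target stacks=0,0
stacks=9,256
rem alternatively, enlarge concache.exe's own stack to test for overflow
device=c:\concache\concache.exe stacksize=...
```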
Let's discuss the problems in each mode of concache.exe. The
respective mode is given by option or by drive description.
Fail On Stop Mode
If concache.exe fails in stop mode, there are two cases to
consider.
The CPU overhead concache.exe incurs can be the problem. See
the section on the relation to communications software. There
is no general solution whatsoever.
The conflict can be with third party device drivers or with
hardware. The gnaw_interrupt option of concache.exe may help
in some cases.
Write Through Mode Doesn't Work
The complexity write through mode adds over stop mode is the
actual access to the memory manager and the device driver.
Empirically, conflicts with memory managers are very rare,
except for pre-'90 EMS managers.
Some device drivers may not be prepared for recent device
driver conventions.
Write After Mode Doesn't Work
Concurrency problems start with this mode. A variety of
assumptions about the single-taskness of DOS programs, in
which io actions are enclosed within the DOS context, begin to
cause conflicts.
Interrupt intensive applications can fail due to the
switching overhead caused by concache.exe. If this might be
the case, try write through mode. Slowing down is far better
than losing data.
Concurrency Mode Fails
If write after mode works but concurrency mode doesn't, most
of the problems seem to be synchronization problems. One of
the cases encountered while testing compatibility was due to
improper int2a8x handling.
For example, a network program ignores int2a8x critical
section interrupts while within an int13 period, which is
exactly when concache.exe is going to issue them.
Consequently, the program miscounts int2a8x and erroneously
identifies the DOS idle period.
Another example: there are certain periods during which
concache.exe does not want to be interrupted and reentered. In
such cases it issues the DOS synchronization interrupt and
warns not to call DOS. Unfortunately, the interrupt is ignored
or ill-treated, thus causing a hang.
SEE ALSO
ccdisk.txt, concache.txt, floppies.txt, overview.txt.
Concache 1.10 Last Update: 19 June 1996