Part 1 of 2
Is it worth setting aside memory for caches and buffers? Howard Carson
takes a look at hard drive management...
What's a cache? What's a buffer? What's the difference between a cache
and a buffer? Are all these questions related - and do we have to care?
If dialogs like these bring you out in a cold sweat you need to read
this article!
Most of us spend lots of time optimising our systems. There's always
the chance the next minor setup change will unleash untold power! One
of the areas shrouded in mystery is the use of caches and buffers, so
now is the time to learn how they work - and how you can use them to
make worthwhile improvements to your system's performance.
At the outset, it's worth pointing out that caches (of any type) are
much more important to PC users than to Atari enthusiasts. On a PC,
portions of programs are constantly loading and unloading. On our
platform entire programs are loaded into RAM in one complete,
accessible chunk. However, it's still important to cache the
directories and file allocation tables (FATs), otherwise significant
hard drive slowdowns occur as the drive fills up - especially if you're
using an older TOS version.
The following scenarios provide some useful and surprising details...
If you have a large, newish, SCSI hard drive (made after 1993), and if
the drive has an onboard cache (or cache buffer) of at least 128Kb, and
if the average seek time is faster than 17ms (milliseconds), and if
your machine is fitted with TOS v2.05 or higher, the chances are you
won't gain much benefit from a software cache.
If you're using TOS v1.62 or earlier with an Atari Megafile hard
drive, or even one of the SCSI drives manufactured prior to 1993
(typically with access times in the 20-40ms range), the chances are a
medium-sized software cache will speed up your work considerably.
A combination of a modern SCSI drive and an older TOS version means a
small to medium-sized software cache is probably worthwhile.
Modern drives boast spectacular seek times and data transfer rates (see
the throughput definition) which knock the stuffing out of machines
equipped with pre-v2 TOS. If you're still using TOS v1.0/1.02 you'll need
FATSPEED.PRG (or other suitable software) in your Auto folder to obtain
reasonable drive access rates. Even TTs and Falcons slow to a crawl
when asked to access old drives - the throughput from the older drives
leaves the TT and Falcon data and address buses with time to kill. A
software cache will help, but often needs to be so large (512Kb or
more) that you may be better off upgrading the drive.
For example, take a TOS v4.02 Falcon with 4Mb of memory and the three-
year-old 65Mb Conner IDE internal hard disk, running Calamus SL. A
256Kb software cache works extremely well, but switching to 640x480 in
256 colours (occupying around 3Mb of memory plus the 256Kb cache) and
adding a few Auto folder programs and desktop accessories will take up
most of the free memory - and even that will be fragmented (i.e. broken
up into smaller chunks). Calamus SL hates fragmented memory, and the
chances are you'll run out of memory - or run into trouble fast.
Before you begin testing cache software consider how your system fits
into the overall scheme of things. Hopefully the information here will
help you decide whether replacing or upgrading components is worthwhile
and help you evaluate cache performance later on. Getting the right
combination of hardware and software together can produce some
delightful speed increases so it is well worth the effort.
Caches
Hardware and software caches work in pretty much the same way. On
receiving a request to read data from disk, the operating system checks
to see if the data is already in the cache. If the data is there
(called a "hit") the cache sends the data to the program without
accessing the disk and it's party time down at the racetrack.
TOS v1.0/1.02 users were left back on the starting grid because TOS
didn't incorporate any system calls to automatically look for the
presence of a hardware or software cache - this means the software
cache utilities have to be smart enough to intercept calls for data at
a low level (i.e. early in the process) to be effective. Such caches
are called read (or read-through) caches and are capable of dramatic
performance increases.
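If you fancy seeing the idea in code, here is a minimal read cache
sketch in C. It is purely illustrative - the direct-mapped sector table
and names such as cache_read and read_sector_from_disk are invented for
this example, and real Atari cache utilities do the equivalent job by
intercepting the low-level disk calls described above.

#include <string.h>

#define SECTOR_SIZE 512
#define CACHE_SLOTS 256                       /* 256 x 512 bytes = 128Kb */

static long slot_sector[CACHE_SLOTS];         /* sector held in each slot */
static int  slot_valid[CACHE_SLOTS];          /* 1 once a slot is filled  */
static char slot_data[CACHE_SLOTS][SECTOR_SIZE];

/* Stand-in for the real hardware/BIOS sector read */
static void read_sector_from_disk(long sector, char *buf)
{
    (void)sector;
    memset(buf, 0, SECTOR_SIZE);
}

void cache_read(long sector, char *buf)
{
    int slot = (int)(sector % CACHE_SLOTS);   /* simple direct-mapped lookup */

    if (slot_valid[slot] && slot_sector[slot] == sector) {
        /* Hit: the data is already in memory - no disk access needed */
        memcpy(buf, slot_data[slot], SECTOR_SIZE);
        return;
    }

    /* Miss: fetch from the drive and keep a copy for next time */
    read_sector_from_disk(sector, slot_data[slot]);
    slot_sector[slot] = sector;
    slot_valid[slot]  = 1;
    memcpy(buf, slot_data[slot], SECTOR_SIZE);
}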
Write (or writeback) caching stores data to be written to disk in
memory until the disk becomes idle, or until a preset amount of time
has passed without any other input, at which point the data is written
to disk. Although write caching can improve performance, the results
are not as dramatic as read caching and it carries some risk! If the
computer crashes, locks up or there's a power failure before the data
is written to disk, the data is permanently lost - even after a save in
some cases! If you're going to use write caching, invest in an
Uninterruptible Power Supply (UPS).
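For comparison, here is the writeback idea in the same hypothetical
style: writes are queued in RAM and only flushed when the queue fills
or the system has been quiet for a couple of seconds. Anything still
sitting in the queue when the machine crashes is precisely the data the
warning above is about.

#include <string.h>
#include <time.h>

#define SECTOR_SIZE   512
#define MAX_PENDING   32
#define FLUSH_AFTER_S 2                 /* flush after 2 idle seconds */

struct pending_write {
    long sector;
    char data[SECTOR_SIZE];
};

static struct pending_write pending[MAX_PENDING];
static int    pending_count = 0;
static time_t last_write    = 0;

/* Stand-in for the real hardware/BIOS sector write */
static void write_sector_to_disk(long sector, const char *buf)
{
    (void)sector;
    (void)buf;
}

void flush_pending(void)
{
    int i;
    for (i = 0; i < pending_count; i++)
        write_sector_to_disk(pending[i].sector, pending[i].data);
    pending_count = 0;
}

void cached_write(long sector, const char *buf)
{
    if (pending_count == MAX_PENDING)   /* queue full: force a flush */
        flush_pending();

    pending[pending_count].sector = sector;
    memcpy(pending[pending_count].data, buf, SECTOR_SIZE);
    pending_count++;
    last_write = time(NULL);
}

void check_idle(void)
{
    /* Called regularly from the main loop: once FLUSH_AFTER_S seconds  */
    /* have passed with no new writes, commit everything to disk.       */
    if (pending_count > 0 && time(NULL) - last_write >= FLUSH_AFTER_S)
        flush_pending();
}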
In most ST systems, software caches tend to provide faster system
performance, primarily because data transfer is faster across the
memory bus than across the data bus. In high-end systems such as TTs
and Falcons, some reliance on hardware caching (found on the larger
hard drives) is likely to provide reasonable benefits.
Allocating system memory to a cache is a trial-and-error process - the
objective being a trade-off between hits and memory. A cache somewhere
around twice the size of the largest file you normally load is a good
starting point. So, if the largest text file you work with is 50Kb, a
100Kb cache should suffice and isn't too much of a burden on a 2.5Mb
system. If you have the luxury of 4Mb of memory (or more) you can
consider allocating up to one sixteenth of your system memory to a
cache (i.e. 256Kb in a 4Mb system, 0.5Mb in an 8Mb system, 750Kb in a
14Mb Falcon etc.) but do carry out some tests.
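Those two rules of thumb (twice the largest file, capped at roughly one
sixteenth of RAM) boil down to a few lines of C - the helper name is
made up purely for illustration:

/* Suggest a starting cache size in Kb from the two rules of thumb */
long suggest_cache_kb(long largest_file_kb, long total_ram_kb)
{
    long by_file = largest_file_kb * 2;   /* e.g. 50Kb file    -> 100Kb */
    long ceiling = total_ram_kb / 16;     /* e.g. 4Mb (4096Kb) -> 256Kb */

    return (by_file < ceiling) ? by_file : ceiling;
}

Treat the result as a starting point only - as ever, carry out some
tests before settling on a size.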
Buffers
These are typically found onboard hard drive controllers. In contrast
to a cache, which stores data that has been read or written, a buffer
stores data from the sectors adjacent to the data that was requested,
in an attempt to anticipate what might be requested next.
Segmented buffers are divided into smaller sections in order to store
more adjacent (or consecutive) sector data, in an effort to improve the
hit rate. There are also adaptive segmented buffers, which can expand
or shrink the number of segments used to store sector information
depending on the average demand during a series of reads. Buffers tend
to be smaller than caches, because of the nature of the data they
store, and can be effective from 64Kb up to around 512Kb.
Another fundamental difference between caches and buffers is that a
cache can differentiate between directories/FATs and data whereas a
buffer cannot normally tell the difference.
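In code, the read-ahead idea behind a buffer looks something like the
following sketch (again, the names and the simple single-range design
are invented for illustration): on a miss the requested sector plus the
next few are fetched, so a later request for an adjacent sector becomes
a hit.

#include <string.h>

#define SECTOR_SIZE 512
#define READ_AHEAD  8            /* extra sectors fetched beyond the request */

static long buf_first = -1;      /* first sector currently held, -1 = empty  */
static char buffer[READ_AHEAD + 1][SECTOR_SIZE];

/* Stand-in for the real hardware sector read */
static void raw_read(long sector, char *dst)
{
    (void)sector;
    memset(dst, 0, SECTOR_SIZE);
}

void buffered_read(long sector, char *dst)
{
    long i;

    if (buf_first >= 0 && sector >= buf_first &&
        sector <= buf_first + READ_AHEAD) {
        /* Hit: this sector was fetched speculatively on an earlier read */
        memcpy(dst, buffer[sector - buf_first], SECTOR_SIZE);
        return;
    }

    /* Miss: read the requested sector plus the next READ_AHEAD sectors */
    for (i = 0; i <= READ_AHEAD; i++)
        raw_read(sector + i, buffer[i]);
    buf_first = sector;
    memcpy(dst, buffer[0], SECTOR_SIZE);
}

A segmented buffer effectively keeps several such ranges at once, and
an adaptive segmented buffer varies the number and size of the ranges
according to demand.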
The information in this article applies across all computer platforms
and should help you make an informed hard disk purchase. Next time
we'll concentrate on Atari specific requirements and software.
Hard disk terminology
Seek time
The time the drive takes to move its read/write heads across the
platters to a requested track. Smaller, older SCSI drives (between
20-60Mb) typically have average seek times of between 20 and 40
milliseconds (ms), which looks poor compared with current large
capacity drives (540Mb upwards), which are capable of turning in times
under 10ms.
Latency
Specifies the average time it takes to spin the platters until the
requested portions of a track are spinning under the head. Typical
latency times are around 5.6ms for 540Mb drives dropping to around
4.2ms for drives of 1Gb or larger.
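Latency is simply half a revolution at the drive's spindle speed, so
those figures fall straight out of the arithmetic - the little C
program below uses typical spindle speeds rather than the specification
of any particular drive:

#include <stdio.h>

int main(void)
{
    double rpms[] = { 3600.0, 5400.0, 7200.0 };   /* typical spindle speeds */
    int i;

    for (i = 0; i < 3; i++)
        printf("%5.0f RPM -> %.1f ms average latency\n",
               rpms[i], 60000.0 / rpms[i] / 2.0);  /* half a revolution in ms */

    /* 5400 RPM gives roughly 5.6ms and 7200 RPM roughly 4.2ms - the    */
    /* values quoted above for 540Mb and 1Gb-class drives.              */
    return 0;
}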
Average access time
This is the average seek and latency times added together: a drive with
a 12ms average seek time and 5.6ms latency, for example, has an average
access time of around 17.6ms. Somewhere between 12-20ms is about right
for modern SCSI drives. It's worth mentioning that most hard drive ads
quote the faster seek time instead of the average access time, so be
careful when comparing performance figures.
Data transfer rate (DTR) or throughput
This specifies the rate at which data is read from or written to the
drive, once the heads are in position. Applications which mainly read
data sequentially (business/graphics applications) are most affected by
the DTR. The rate is specified in either megabytes or megabits per
second. There are two main kinds of DTR: Burst DTR (also called
external DTR) and sustained DTR (also called internal DTR). Burst
specifies the rate at which data is read from the hardware cache (see
below). SCSI 1 and SCSI 2 drives have burst DTR of between 8 and
40Mb/sec. Sustained DTR specifies the performance when the hardware
cache is not being used. Despite spectacular DTR figures, getting data
from a drive into system memory is often much slower. The data has to
be funnelled into memory across the data bus in your computer. STs do
not operate anywhere near 8Mb/sec, although Falcons, TTs etc. have
higher capacity buses which can more closely match the DTRs of current
drives.
Hardware Cache
An area of memory (typically between 64Kb and 4Mb) which is integrated
with the hard disk controller (HDC) and usually part of the onboard
memory buffer of the controller.
Software Cache
A chunk of your system memory, reserved for disk caching and controlled
by utility software (HD Util, TCache, Master Cache, Cachennn, etc.).
Hard/floppy disks
Hard disks store data on thin round aluminium plates, called platters,
with a magnetised oxide coating. Floppy disks are similar except the
plates are made of flexible plastic instead of aluminium.
When a disk is formatted, the coating is divided into concentric rings
called tracks. Each circular track is divided into sections called
sectors. A typical double-sided, double density (DD) floppy is
formatted to 80 tracks, with each track divided into nine sectors. Each
sector can accommodate 512 bytes of data, so 9 sectors x 80 tracks x
512 bytes x 2 sides yields storage space for 737,280 bytes. After
allocating a few thousand bytes
for the boot sector and the file allocation table (FAT) what's left is
typically around 730Kb. High density (HD) floppy disks double the
storage capacity by dividing each of the 80 tracks into 18 sectors
instead of nine. Hard disk capacity is calculated in a similar manner,
except the formatting method stores more sectors on the tracks nearest
the edge of each platter, which results in a higher storage capacity
and more efficient use of the available surface area on each platter.
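If you want to check the sums, the same sides x tracks x sectors x
bytes-per-sector multiplication takes only a couple of lines of C:

#include <stdio.h>

int main(void)
{
    long dd = 2L * 80 * 9  * 512;   /* double density:   737,280 bytes */
    long hd = 2L * 80 * 18 * 512;   /* high density:   1,474,560 bytes */

    printf("DD floppy: %ld bytes before boot sector and FATs are deducted\n", dd);
    printf("HD floppy: %ld bytes before boot sector and FATs are deducted\n", hd);
    return 0;
}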
Data is recorded on disks magnetically via a tiny read/write head
(similar to, but much smaller than, the record/playback heads found in
audio cassette machines). The location of data (text files, images and MIDI
recordings etc.) is maintained in the file allocation table or FAT.
Every formatted disk has a FAT. If you wipe the FATs any data on the
disk is inaccessible.
The read/write head (in a hard drive) moves in an arc across the
platters, from the outer edge to the centre, and back again in much the
same way as a tone arm traverses a record except the read/write head
never actually touches the surface of the disk. The head (or heads, in
a multi-platter mechanism), is driven by an actuator motor, so-called
because it is actuated by read/write commands. The platters are mounted
on a central spindle and spun by another motor. The two-way action
between the head (traversing the disk) and the spinning platter makes
it possible to quickly access any area of the disk's surface. Floppy
disks spin at around 310 revolutions per minute (RPM) and can be
traversed by the head at up to 12 metres per second. Hard drives spin
at between 4500 and 7200 RPM depending on the age and size of the
individual model. This spindle (or rotational) speed is measured at
the spindle, not at the outer edge of the platters.