14. Ultra-fast Searching with Indexes
FileFlex excels at index files. Letting FileFlex use indexes is like giving
a prized racehorse the chance to just let it all out, throttling up a perfectly
tuned race car, or punching a fighter jet into afterburner. When we say
that FileFlex can locate any record in a sea of billions of records faster
than the blink of an eye, we're talking FileFlex indexes.
Whenever possible, we recommend you use indexed-based searching. Indexes
work by algorithm, rather than brute force. FileFlex looks at the string
you're search for, does a mathematical calculation that basically tells
it how far into the file to move, and boom, it's on the record you need.
By contrast, both DBQuery and the full-text search DBFindMemo scan on a
record by record basis. This means that if the data you're looking for is
at the end of the file, DBQuery and DBFindMemo must individually check all
the preceding records prior to finding the match. DBSeek (the interface
to indexes) just does a calculation and whammo! You're on the record.
Indexes are not without their price, however. Since indexes can't check
every record and rely instead on complex offset calculations, they don't
support complex queries. But they're ideal for most queries you'll need.
Need to find someone's address? Construct an index combining last name and
first name, do a DBSeek, and--poof!--you're on the record. Need to find
everyone in the Southeast Region who's booked over $1.25 million and who
hasn't gotten a recent raise? Use DBQuery and be prepared to wait a while.
Indexes also take disk space, often quite a lot. Each index is it's own
file and that file contains the complete data of the field it's indexing,
as well as some overhead space used internally by FileFlex. So, if you've
got this handy name and address database and you want to index on last name,
followed by first name, you've got one index file. If you want to index
based on zipcode, you've got another index file, and so forth. But what
the heck. CD-ROMs are big, text is small, and new hard drives are cheap.
Use indexes and rejoice in the raw speed!
Note: It is possible to use DBSeek and DBQuery incorrectly and get
dog-poor performance. These tools provide you with the capabilities. But
it's up to you to design something that works efficiently. Just because
you've got a hot database engine is no excuse for poor application design.
Index Files Supported
FileFlex supports the use and updating of dBASE III-compatible index files
only. Other index file architectures such as FoxPro indexes cannot be used.
However, FileFlex can reindex a file using the dBASE index file structures.
This makes it easy to use files which have been indexed using other methods.
When Indexes are Updated Automatically
Any time you have opened one or more index files related to an open database
file and you make changes to that database file, FileFlex automatically
updates those indexes to reflect the new file contents.
Opening and Using Index Files
Among the functions you may wish to perform on an index file are the following:
- open a specific index file
- check an open index file to be sure it matches the file's contents
- select from among two or more open index files to make one current
- close an open index file
Opening an Index File
Use the FileFlex DBUseIndex function to open an index file for use. Supply
the index file's name as an argument. Assign the result of this function
to a variable (usually global) because you'll need to refer to the index
file's ID in other scripts and handlers. Here's an example of the use of
this function:
put DBUseIndex("STARS") into StarIndex
Note that the name of the index file should include any extension the file
might have. In the Macintosh environment, extensions are normally omitted,
but in DOS environments, the file extension ".NDX" will generally
be used for dBASE III-compatible index files.
We regularly get tech support calls from customers who get index file errors.
The most common reason is that the database the index file indexes must
be open and selected (DBUse and DBSelect) before executing a DBUse. One
good practice is to do all your DBUse's and DBUseIndex's at the start of
your application's life (for instance, in a startMovie handler). Use DBUse
to open a data file and assign an ID to a global variable. Then call DBUseIndex
repeatedly, opening all the index files for that data file, again assigning
the IDs to globals. Repeat this until all your data and index files are
open.
Note: Windows 3.1/DOS users may need to check the FILES= option in
your CONFIG.SYS to make sure you're allowed to open this many files. If
not, you'll need another file opening strategy, like only opening those
files in use by a given module.
Checking an Index File for Currency
If you suspect that an index file may not be "in sync" with its
database file, you can confirm its currency with the DBCheckIndex function.
This function returns an error code of -8 if its contents do not match those
of the database file. Otherwise, it returns the normal result code of 0.
This function would be most likely used in conjunction with the DBReindex
function described below in a script line that looks like this:
if DBCheckIndex(indexID) = -8 then
put DBReindex(indexID) into temp
end if
The value of indexID would, of course, be the result of the previous call
to the DBUseIndex function as described above.
Selecting an Open Index File
You must inform FileFlex which of (potentially) several open index files
it should use to access the current file with the DBSelectIndex function.
Supply the ID of the index file (returned by the DBUseIndex function) as
an argument:
put DBSelectIndex(StarIndex) into dbResult
Only one index file can be active at any time. It dictates the order in
which record retrieval takes place. However, all open index files are updated
each time a record is added or permanently.
If you're jumping between databases as well as indexes, don't forget to
execute a DBSelect for the appropriate database file prior to calling DBSelectIndex
on it's associated indexes.
Closing an Index File
Once you have opened an index file for use, you must also close it as part
of cleaning up after your application. The FileFlex DBCloseIndex function
closes a designated index file:
put DBCloseIndex(StarIndex) into dbResult
To close an index file use the DBCloseIndex function along with its associated
ID (returned by the DBUseIndex function) to close it.
Finding a Record by Index
One of the main reasons for using indexes is to enable the database to find
a record by the specific content of a specific field, namely, the one on
which the index is based. The other reason is to keep the file in a specific
order, or at least to give the appearance of retrieving information in that
specific order. Remember that FileFlex never physically reorders your data.
Indexes are the FileFlex equivalent of sort...but an equivalent that lets
you switch between an unlimited number of sort orders instantly.
To locate a record which matches a specific expression in the indexed field,
using the currently open index file, use the FileFlex DBSeek function, supplying
a parameter containing the expression to be matched in the indexed field:
put DBSeek("Fred Jones") into foundRec
The seekExpr should contain the value for which you wish to search the indexed
field. If this command fails to find the record (i.e., the search moves
beyond the end of the file), the function returns a value of 3. You can
then retrieve the contents of the located record.
Once you have found a record that matches the index key given as an argument,
subsequent uses of the function will move the current record pointer to
succeeding records in the file until a value of 3 is returned, indicating
no more records match the criterion.
Building a Seek Expression
Seek expressions can be complex. One important thing to know is that the
seek expression must match the exact number of characters as the data stored.
So, in the example above with Fred Jones, DBSeek would have indicated an
exact match only if the field being searched was defined to contain exactly
ten characters and the contents was "Fred Jones".
But what if you're not looking for a name that exactly fits the field width?
The short answer is you need to pad out the string to the right number of
characters. Let's assume we're searching a name field defined as 15 characters
wide. We'd need the following search expression to get an exact match ('~'
indicates a space character):
put DBSeek("Fred Jones~~~~~") into foundRec
New in 2.0: In previous versions of FileFlex, you had to hand-construct
a function to pad the string to the appropriate number of spaces. But in
FileFlex 2.0, we introduce a new utility function called DBBuildSeekExpr.
This function requires as it's parameters the ID of the index to be searched,
as well as one search expression for each field in the index. It returns
a constructed search string you can then pass to DBSeek.
Let's assume, for example, that the index we're using has the ID of 1. We'd
call the following to get a perfectly constructed seek expression:
put DBBuildSeekExpr(1,"Fred Jones") into theExpr
put DBSeek(theExpr) into foundRec
DBBuildSeekExpr is very smart. If the index has been defined to convert
all data to upper case, the DBBuildSeekExpr will convert your search string
to upper case. If you pass a number as a string, it will appropriate pad
the spaces and decimal points to be sure that your search string is in the
appropriate order. Further, DBBuildSeekExpr will construct complex queries
with exactly the right spacing for use in complex searches.
Say, for example, that you created a multi-field index, searching last name,
then first name. Using the following code would generate a perfectly balanced
and sized search string:
put DBBuildSeekExpr(1,"Jones", "Fred") into theExpr
put DBSeek(theExpr) into foundRec
Getting Index File Information
New in 2.0: In previous versions of FileFlex, once you created an
index, you better darn well name it clearly because there was no way to
find out the original index expression. But now, you can get at the index
expression in two ways: DBIndexExpr and DBListIndexFields.
DBIndexExpr
New in 2.0: The new DBIndexExpr returns the index expression that
was used to create the index file. Make sure the index file in question
has been opened, then pass the index ID as the sole parameter to DBIndexExpr.
The function will return the actual string used to create the index:
put DBIndexExpr(1) into theExpr
-- theExpr contains the following:
UPPER(NAME)+STR(SALARY,6,2)
DBListIndexFields
New in 2.0: The new DBListIndexFields takes two parameters, the index
file ID and a delimiter and returns a list of fields indexed by the specified
index, separated by the delimiter. Here's an example:
put DBListIndexFields(1,RETURN) into theResult
-- theResult contains
NAME
SALARY
Creating New Indexes and Updating Old Ones
FileFlex includes functions that enable you to create a new index on a database
file and to re-index a file whose index may be out of date. Like with creating
a database, FileFlex 2.0 includes the FileFlex Database Designer that takes
much of the work out of creating index files. Simply launch the Database
Designer (in Director 5, it's under the Tools menu) and choose Index. Drag
and drop your way into an easily designed index.
Creating a New Index
If you have a database file open and you wish to use programming to index
it on a field for which no index is presently available, you can create
a new index file with the FileFlex DBCreateIndex function. This function
requires four arguments, as explained below, and follows this format:
put DBCreateIndex(fileName, indexExpr, unique, safety)
into dbResult
The four arguments required are:
- fileName, which is the name of the index file to be created (including
a path name if you wish to place it somewhere specific)
- indexExpr, which is the name of the field on which you wish to create
the index
- "0", reserved for future use in FileFlex
- safety, which is a logical value that is set to 0 if you want to overwrite
an existing file of the same name if one exists, 1 if you wish to be prohibited
from doing so
Re-Indexing a File
There are times that an index file and its associated data file can become
unsynchronized. This might happen in a mixed-platform database environment,
for example, where many people are using FileFlex to access a central database
but a database administrator is using Access or FoxPro to maintain the files
separately. You can determine if a particular index file is synchronized
with the database in use by using the FileFlex DBCheckIndex function described
earlier. If you find that it is not synchronized, you can use the FileFlex
function DBReindex with the following syntax:
put DBReindex(indexID) into dbResult
Note that this operation uses an existing and previously opened index file
and replaces it with the newly generated index.
Note: From a performance perspective, it's often not wise to do a
DBCheckIndex and then a DBReindex if the index is out of sync. DBCheckIndex
will scan each record until it finds a mismatch, and this takes time. It's
often easier just to run a DBReindex and go for a cup of coffee, go out
to lunch, or go home for the night (depending on the size of the database).
Multi-Field Indexes
FileFlex will permit you to index a single file on multiple fields. To create
such an index, simply list the names of all the fields you wish to use in
the index field group as a single string, with a plus sign concatenating
them. Here's an example:
put DBCreateIndex("TERRSALE","TERRITORY+SALES",0,0) into dbResult
Fields should be listed from primary key to secondary key. Indexing will
take place in major-minor order from left to right. In the above example,
the file will end up being sorted by the field called TERRITORY and, within
TERRITORY, by SALES.
There is no theoretical limit to the number of fields that can be combined
into an index provided, of course, that the length of the argument to the
DBCreateIndex command does not exceed the limitations of the environment
in which you are running FileFlex, usually 256 characters.
Remember, once you create a multi-field index, you're going to want to search
for data. The easiest way is to create a multi-field index seek expression
using DBBuildSeekExpr as described above.
Note: Multi-field indexes only work on character fields. All fields
must be character fields, so use the intrinsic functions to convert appropriately.
Intrinsic Functions in Indexes
To create an index using an intrinsic function, simply include it inside
the string defining the index expression and call DBCreateIndex. Here's
an example:
put DBCreateIndex("TERRSALE","UPPER(SALES)",0,0) into dbResult
The above would index the field SALES, but would sort as though all the
letters in the SALES field were converted to upper case.
DBCreateIndex supports the following intrinsic functions:
DTOC
DEL
DELETED
RECNO
STR
SUBSTR
UPPER
And yes, you can actually do bizarre index expressions like:
"UPPER(LAST)+UPPER(FIRST)+STR(AGE)"
Intrinsic functions are documented in detail in the Intrinsic
Function Reference.
You Must Index on a String Data Type
Because FileFlex was originally designed for use with scripting environments,
it expects all its parameters to be strings.
Even so, if you were to index a FileFlex file by a numeric field, the indexing
process would work. Problem is, you'd never be able to seek out any data.
That's because FileFlex expects to receive string values when running DBSeek.
Indexing on a numeric value would require FileFlex to somehow convert the
string it always receives to a numeric value. It doesn't do this. So, if
you want to index something in numeric or date order, use the intrinsic
functions DTOC and STR.
[Previous Chapter] [Table of Contents] [Next Chapter]
Copyright (c) 1996 David Gewirtz under license to Component Software Corp.
All rights reserved worldwide.