Text File | 1990-05-10 | 13KB | 337 lines
OCR_VEC Copyright (C) by Ron Mignery, 1990
Introduction:
OCR_VEC is a trainable optical character recognition system for
the PC. It enables the computer to scan a graphics screen
image in order to recognize and to translate whatever text it
finds. The graphics screen image would normally come from a
hardware scanner device, though its actual source is not
important to the software. Test images could be generated with
a paint program, for example.
This package is intended only as a demonstration of a novel
approach to optical character recognition. The classical
approach follows the "eye" model. The entire character image
is compared against a library of stored images to find a
match. OCR_VEC's approach follows a "Braille" model. The
computer feels the shape of the character and matches the
vectors generated against a library of stored vectors. Vectors
are perhaps more easily scaled and abstracted than images and thus
this system is somewhat more tolerant than the classical approach
to variation in letter shapes and sizes.
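The following is a minimal sketch of why vectors scale more easily
than images. This document does not specify how OCR_VEC generates or
stores its vectors, so everything here is an assumption made for
illustration: the function name normalize, the (dx, dy) step
representation, and the use of the granularity value (described under
Operation) as a cutoff of 1/granularity of the letter size.

```python
def normalize(vectors, granularity=8.0):
    """Turn outline steps (dx, dy) into a size-independent signature.

    Hypothetical helper, not OCR_VEC's actual method. Steps no longer
    than 1/granularity of the letter size are ignored; the rest are
    rescaled so the letter size equals `granularity` units.
    """
    # Walk the outline once to measure the letter's bounding box.
    x = y = 0
    xs, ys = [0], [0]
    for dx, dy in vectors:
        x += dx
        y += dy
        xs.append(x)
        ys.append(y)
    size = max(max(xs) - min(xs), max(ys) - min(ys))
    if size == 0:
        return ()

    threshold = size / granularity
    signature = []
    for dx, dy in vectors:
        if max(abs(dx), abs(dy)) <= threshold:
            continue  # too small to count as a feature
        signature.append((round(dx * granularity / size),
                          round(dy * granularity / size)))
    return tuple(signature)


# The outline of a small letter L and the same outline three times larger.
small_l = [(0, 10), (6, 0), (0, -2), (-4, 0), (0, -8), (-2, 0)]
large_l = [(3 * dx, 3 * dy) for dx, dy in small_l]
```

Under these assumptions, small_l and large_l produce the same
signature, whereas a pixel-by-pixel comparison of the two images would
first require resampling one of them.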
The software can be trained on any font or fonts. When the
computer scans an unrecognized letter, it displays the letter
and prompts for user input. The user may then identify the
letter and add its vectors to the lookup library. In
subsequent encounters, the computer will then recognize that
letter.
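The training behavior described above can be sketched as a lookup
with a fallback to the user. This is an illustration only: the names
recognize, library and ask_user are hypothetical, and a plain
dictionary stands in for the shape library file, whose real format is
not described in this document.

```python
def recognize(signature, library, ask_user):
    """Return the letter for a signature, training the library on a miss.

    `library` maps a shape signature to a letter. `ask_user` stands in
    for the on-screen prompt and may return None to decline.
    """
    letter = library.get(signature)
    if letter is None:
        letter = ask_user(signature)
        if letter is not None:
            library[signature] = letter  # later encounters now match
    return letter


# Usage: the user is asked only the first time a shape is seen.
questions = []

def always_a(signature):
    questions.append(signature)
    return "A"

shapes = {}
first = recognize(("some", "shape"), shapes, always_a)
second = recognize(("some", "shape"), shapes, always_a)
```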
The OCR_VEC package:
The OCR_VEC package contains the following files:
OCR_VEC.DOC    This file
OCR_VEC.EXE    The OCR executive program
CGA_CAP.EXE    A TSR utility to capture CGA screens
EGA_CAP.EXE    A TSR utility to capture EGA screens
HGA_CAP.EXE    A TSR utility to capture Hercules screens
VGA_CAP.EXE    A TSR utility to capture VGA screens
OCR_SHP.SHP    A sample letter shape library file
OCR_LET.LET    A sample letter description file
OCR_SCR.BIN    A sample graphics screen capture file
*.BGI          Turbo C graphics drivers
Operation:
Optical Character Recognition:
1. Enter OCR_VEC at the command line. The program will then prompt:
Enter name of shape file (default=OCR_SHP.SHP)
2. Press the ENTER key to accept the default name or type
in the filename of your choice and then press ENTER.
The known letter shapes are read from this file. If the file
does not exist, the file will be created but no known shapes
will be loaded. All new letter shapes encountered during the
scan will be written to this file.
If the shape file is empty, the program next prompts:
Lookup file not found!
Enter granularity (default=8.0)
3. Press the ENTER key to accept the default value or type
in the value of your choice and then press ENTER.
The granularity value determines how precisely the computer
feels the shape of each letter. If this value is too large,
the shape library becomes too large and performance is
compromised. If this value is too small, ambiguities in letter
recognition occur; the program may be unable to distinguish Q
from O, for example. A value of 8.0 means that any feature of a
letter must be larger than 1/8 the size of the entire letter to
be considered significant. Empirically, this value is sufficient
for normal fonts. Use a larger number if ambiguities in
character recognition occur.
The program next prompts:
Enter name of letter file (default=OCR_LET.LET)
4. Press the ENTER key to accept the default name or type
in the filename of your choice and then press ENTER.
This file contains information about each letter in the ASCII
set and about user-specified doublets and character fragments,
described below in the Adding a Character section.
The program next prompts:
Enter screen image file (default=OCR_SCR.BIN)
5. Press the ENTER key to accept the default name or type
in the filename of your choice and then press ENTER.
This file must exist and be in the format required by OCR_VEC.
A sample file is included with the system. To generate your
own files from your own graphics, use the capture programs
described below.
The program then loads the requested graphic screen and
presents the prompt:
Scan entire screen (Y/N)?
6. If you want to scan only a portion of the screen, press N and
follow the instructions to mark off the portion of the screen
to be used.
The program next begins scanning for characters. The progress
of the scan is indicated by screen inversion. If the upper
left pixel is off, the program assumes a black background and
scans for non-black characters. If it is set, the program scans
for black characters. Scanning stops when a blank line is
encountered following non-blank lines. The program then
highlights a character and prompts as follows:
If the character is found in the shape library:
Is it _ ? (Y/N/P/A/D/ESC)
Else:
Not found. Add to list (Y/N/A/D/ESC)?
7. For the first prompt:
Type Y if the suggested character matches the highlighted
character. Note that only a portion of the scanned
character may be highlighted, and that the highlighted
portion may be identical in shape and location to the
suggested character (the / in %, for example). In that
case, type Y if / is suggested.
Type N otherwise. Do this only if the highlighted
part differs in shape or location from the
suggested character.
Type P to purge the indicated letter from the
shape library.
For the second prompt:
Type Y to add its shape to the shape library.
Follow the directions for Adding a Character
outlined in the next section.
Type N otherwise.
For both prompts:
Type A to toggle auto mode. When auto mode is on, the
first prompt is bypassed. Use this mode when all
characters in a font have been loaded into the shape
library.
Type D if the line being scanned has no letters with
descenders (letters that extend below the baseline).
This rare occurrence can confuse the system. An asterisk
will appear in the prompt box when this mode is active.
Type ESC to quit.
8. Repeat step 7 until the entire text line has been scanned.
The program will then present the text of the line just scanned
and the following prompt:
<Text of line just scanned>
Press any key to continue...(A/O/D/ESC)
9. Type A to toggle auto mode as described above.
Type O to append the line of text to the file OCR_VEC.OUT.
Type D to rescan the line in no descenders mode as described
above.
Type ESC to quit.
Type any other key to continue.
10. Continue this procedure until the entire screen has been
scanned.
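The scanning rules given in step 6 (background chosen from the upper
left pixel, scan ended by a blank line that follows non-blank lines)
can be sketched as below. This is one reading of those rules, not the
program's actual code: the function name text_rows is hypothetical,
and the screen is assumed to be a list of pixel rows holding 0 (off)
or 1 (set).

```python
def text_rows(image):
    """Return (first_row, last_row) of the first text line, or None."""
    background = image[0][0]  # upper left pixel decides the polarity
    first = last = None
    for r, row in enumerate(image):
        blank = all(pixel == background for pixel in row)
        if not blank:
            if first is None:
                first = r
            last = r
        elif first is not None:
            break  # a blank row after non-blank rows ends the scan
    return None if first is None else (first, last)


# Light text on a dark screen, and the same screen inverted.
screen = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
]
inverted = [[1 - pixel for pixel in row] for row in screen]
```

Because the background is read from the image itself, both screen and
inverted give the same pair of rows.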
Adding a Character:
When you type Y in response to the Add to list
(Y/N/A/D/ESC)? prompt described above, the program then
prompts:
Enter letter that was scanned.
You should then enter the highlighted letter in the correct
case. Since OCR_VEC identifies a character as any group
of non-background color pixels that is completely surrounded by
background color pixels, the program will sometimes identify
parts of characters or multiple characters as a single
character.
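The rule above (a character is any group of non-background pixels
completely surrounded by background pixels) amounts to finding
connected groups of pixels. The sketch below uses a flood fill and
assumes that diagonally adjacent pixels belong to the same group; the
function name find_blobs and the bounding-box output are hypothetical,
and OCR_VEC itself may work differently.

```python
def find_blobs(image, background=0):
    """Return bounding boxes (top, left, bottom, right) of pixel groups.

    Groups are reported in the order a top-to-bottom, left-to-right
    scan first touches them.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == background or seen[r][c]:
                continue
            # Flood-fill one group, treating diagonal neighbours as touching.
            stack = [(r, c)]
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while stack:
                y, x = stack.pop()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny in range(y - 1, y + 2):
                    for nx in range(x - 1, x + 2):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] != background
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# A dotted stem on the left and a two-pixel-wide block on the right.
sample = [
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 1],
    [0, 1, 0, 1, 1],
]
blobs = find_blobs(sample)
```

Here the dot is reported as its own group and ahead of the stem below
it, which is why letter fragments and touching letters need the
special handling described next.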
In particular, the dot above i and j will be treated as a
separate character and will be identified before the body of the
letter, since the program scans downward. The dot and all other
letter fragments should be treated as special cases and
identified to the program through control characters. That is,
if the program highlights the dot above i or j, then ctrl-a
should be entered as the letter scanned. You will then be
able to enter two letters to correspond to the highlighted
character. Similarly, the tops of the two-part characters ?
and ! should be entered as ctrl-b and ctrl-c respectively. The
dots above : and ; may or may not be identified as the same as
the dots above i and j depending on the font scanned. If they
are not the same, then ctrl-d should be used. These
assignments are arbitrary and are only suggestions. The point
is to avoid assigning multiple values to a single shape. Later,
when the program scans the second part of the character, it
will recognize the first part above it and combine the
information to generate a single character output. Parts are
combined only when one part completely overlaps another part.
Characters that partially overlap will not be combined; thus,
even a kerned font can potentially be read.
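The complete-overlap rule can be illustrated with bounding boxes. The
sketch assumes that "completely overlaps" refers to the horizontal
extent of the two parts, which this document does not state outright;
the function name fully_overlaps and the (top, left, bottom, right)
box layout are hypothetical.

```python
def fully_overlaps(a, b):
    """True if one box's column span lies entirely inside the other's.

    Boxes are (top, left, bottom, right). Only the horizontal extent
    is compared, since the parts being joined sit one above the other.
    """
    _, a_left, _, a_right = a
    _, b_left, _, b_right = b
    return ((b_left <= a_left and a_right <= b_right)
            or (a_left <= b_left and b_right <= a_right))


dot = (0, 2, 1, 3)        # dot of an i
stem = (3, 1, 9, 4)       # body of the same i, directly below
kerned_a = (0, 0, 9, 5)   # two neighbouring letters whose
kerned_b = (0, 4, 9, 10)  # boxes share only columns 4 and 5
```

With these values the dot and stem would be combined, while the two
kerned letters, which overlap only partially, would stay separate.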
Some legitimate fonts have no whitespace between certain letter
combinations like fl or YZ. OCR_VEC will highlight two
characters in such a case. Again, these combinations can be
handled with control characters. Enter an unused control
character to correspond to the doublet. You may then enter two
letters to represent the scanned character. A little awkward
but it works. If a font has no whitespace between lines,
however, you are out of luck.
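The doublet workaround comes down to a lookup from one control
character to two letters. In the sketch below the table, the choice of
ctrl-e for the fl doublet, and the function name expand are all
hypothetical; the real letter file format is not described here.

```python
# Hypothetical assignment: ctrl-e (ASCII 5) stands for the doublet "fl".
DOUBLETS = {"\x05": "fl"}


def expand(recognized):
    """Replace each doublet control character with its two letters."""
    return "".join(DOUBLETS.get(ch, ch) for ch in recognized)


line = expand("re\x05ex")
```

Characters without an entry in the table pass through unchanged, so
ordinary letters are not affected.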
Graphics Screen Capture:
OCR_VEC requires screen image files in a special format. Files
in this format are generated by the ?GA_CAP.EXE programs
included in the package. These programs are TSRs (terminate
and stay resident) that are invoked after loading by the
simultaneous press of the Alt, Ctrl and S keys.
To capture a screen into the file OCR_SCR.BIN:
1. To capture CGA graphics screens, enter CGA_CAP at the DOS prompt.
To capture EGA graphics screens, enter EGA_CAP at the DOS prompt.
To capture VGA graphics screens, enter VGA_CAP at the DOS prompt.
To capture Hercules graphics screens, enter HGA_CAP at the DOS
prompt.
Load only one TSR at a time. These TSRs are quick and dirty
little affairs and must not have other TSRs loaded after them.
Use of Mark and Release utilities is recommended for their
management.
2. Run whatever program you use to display the graphic screen
you want OCR_VEC to scan.
3. Simultaneously press the Alt, Ctrl and S keys.
4. The file OCR_SCR.BIN is created in the current directory.
Copyrights:
OCR_VEC Copyright (C) Ron Mignery 1990.
Created using Turbo C, Copyright (C) Borland 1987, 1988.
Graphics drivers included as permitted by Borland's
No-Nonsense License statement.
TSRs created using the KyCorp Memory Resident Library Version
2.01, Copyright (C) KyCorp Information Group, Inc. 1987.
Disclaimers:
This software and instructions are provided "as is" without
warranty of any kind either expressed or implied including but
not limited to fitness for a particular purpose. The entire
risk as to the results and performance of the software is
assumed by the user.
Distribution:
The OCR_VEC package may be copied, distributed (free of
charge), and used non-commercially provided that it is not
modified in any way.
Anyone interested in this approach to optical character
recognition may contact the author at the following address:
Ron Mignery
85 Bartlett Street
Somerville, MA 02145
(617) 628-0206
(or via GENIE ron mignery)
***end of OCR_VEC.DOC***