1st Canadian Shareware Disc

Text File | 1990-05-10 | 13KB | 337 lines

OCR_VEC
Copyright (C) by Ron Mignery, 1990


Introduction:

OCR_VEC is a trainable optical character recognition system for the PC.
It enables the computer to scan a graphics screen image in order to
recognize and translate whatever text it finds. The graphics screen
image would normally come from a hardware scanner device, though its
actual source is not important to the software. Test images could be
generated with a paint program, for example.

This package is intended only as a demonstration of a novel approach to
optical character recognition. The classical approach follows the "eye"
model: the entire character image is compared against a library of
stored images to find a match. OCR_VEC's approach follows a "Braille"
model: the computer feels the shape of the character and matches the
vectors generated against a library of stored vectors. Vectors are
perhaps more easily scaled and abstracted than images, and thus this
system is somewhat more tolerant than the classical approach of
variation in letter shapes and sizes.

The software can be trained on any font or fonts. When the computer
scans an unrecognized letter, it displays the letter and prompts for
user input. The user may then identify the letter and add its vectors
to the lookup library. In subsequent encounters, the computer will
recognize that letter.


The OCR_VEC package:

The OCR_VEC package contains the following files:

   OCR_VEC.DOC    This file
   OCR_VEC.EXE    The OCR executive program
   CGA_CAP.EXE    A TSR utility to capture CGA screens
   EGA_CAP.EXE    A TSR utility to capture EGA screens
   HGA_CAP.EXE    A TSR utility to capture HERC screens
   VGA_CAP.EXE    A TSR utility to capture VGA screens
   OCR_SHP.SHP    A sample letter shape library file
   OCR_LET.LET    A sample letter description file
   OCR_SCR.BIN    A sample graphics screen capture file
   *.BGI          Turbo C graphics drivers


Operation:

Optical Character Recognition:

1. Enter OCR_VEC at the command line. The program will then prompt:

      Enter name of shape file (default=OCR_SHP.SHP)

2. Press the ENTER key to accept the default name, or type in the
   filename of your choice and then press ENTER. The known letter
   shapes are read from this file. If the file does not exist, it will
   be created, but no known shapes will be loaded. All new letter
   shapes encountered during the scan will be written to this file.
   If the shape file is empty, the program next prompts:

      Lookup file not found!
      Enter granularity (default=8.0)

3. Press the ENTER key to accept the default value, or type in the
   value of your choice and then press ENTER. The granularity value
   determines how precisely the computer feels the shape of each
   letter. If this value is too large, the shape library becomes too
   large and performance is compromised. If this value is too low,
   ambiguities in letter recognition occur; the program may be unable
   to distinguish Q from O, for example. A value of 8.0 means that any
   feature of a letter must be larger than 1/8 the size of the entire
   letter to be considered significant. Empirically, this value is
   sufficient for normal fonts. Use a larger number if ambiguities in
   character recognition occur. The program next prompts:

      Enter name of letter file (default=OCR_LET.LET)

4. Press the ENTER key to accept the default name, or type in the
   filename of your choice and then press ENTER. This file contains
   information about each letter in the ASCII set and about the
   user-specified doublets and character fragments described below in
   the Adding a Character section. The program next prompts:

      Enter screen image file (default=OCR_SCR.BIN)

5. Press the ENTER key to accept the default name, or type in the
   filename of your choice and then press ENTER. This file must exist
   and be in the format required by OCR_VEC. A sample file is included
   with the system. To generate your own files from your own graphics,
   use the capture programs described below. The program then loads
   the requested graphics screen and presents the prompt:

      Scan entire screen (Y/N)?

6. If you want to scan only a portion of the screen, press N and
   follow the instructions to mark off the portion of the screen to be
   used. The program next begins scanning for characters. The progress
   of the scan is indicated by screen inversion. If the upper left
   pixel is off, the program assumes a black background and scans for
   non-black characters. If it is set, the program scans for black
   characters. Scanning stops when a blank line is encountered
   following non-blank lines. The program then highlights a character
   and prompts as follows:

   If the character is found in the shape library:

      Is it _ ? (Y/N/P/A/D/ESC)

   Else:

      Not found. Add to list (Y/N/A/D/ESC)?

7. For the first prompt:

      Type Y if the character highlighted is the character scanned.
      Note that only a portion of the scanned character may be
      highlighted, and that the highlighted portion may be identical
      in shape and location to the suggested character: the / in %,
      for example. Type Y if / is suggested.

      Type N otherwise. Do this only if the highlighted part is of a
      different shape or location than the suggested character.

      Type P to purge the indicated letter from the shape library.

   For the second prompt:

      Type Y to add its shape to the shape library. Follow the
      directions for Adding a Character outlined in the next section.

      Type N otherwise.

   For both prompts:

      Type A to toggle auto mode. When auto mode is on, the first
      prompt is bypassed. Use this mode when all characters in a font
      have been loaded into the shape library.

      Type D if the line being scanned has no letters with descenders
      (letters that extend below the baseline). This rare occurrence
      can confuse the system. An asterisk will appear in the prompt
      box when this mode is active.

      Type ESC to quit.

8. Repeat step 7 until the entire text line has been scanned. The
   program will then present the text of the line just scanned and
   the following prompt:

      <Text of line just scanned>
      Press any key to continue...(A/O/D/ESC)

9. Type A to toggle auto mode as described above.
   Type O to append the line of text to the file OCR_VEC.OUT.
   Type D to rescan the line in no-descenders mode as described above.
   Type ESC to quit.
   Type any other key to continue.

10. Continue this procedure until the entire screen has been scanned.


Adding a Character:

When you type Y in response to the

      Add to list (Y/N/A/D/ESC)?

prompt described above, the program then prompts:

      Enter letter that was scanned.

You should then enter the highlighted letter in the correct case.

Since OCR_VEC identifies a character as any group of non-background
color pixels that is completely surrounded by background color pixels,
the program will sometimes identify parts of characters or multiple
characters as a single character. In particular, the dot above i and j
will be treated as a character of its own and will be identified before
the body of the letter, since the program scans downward.

The dot and all other letter fragments should be treated as special
cases and identified to the program through control characters. That
is, if the program highlights the dot above i or j, then ctrl-a should
be entered as the letter scanned. You will then be able to enter two
letters to correspond to the highlighted character. Similarly, the tops
of the two-part characters ? and ! should be entered as ctrl-b and
ctrl-c respectively. The dots above : and ; may or may not be
identified as the same as the dots above i and j, depending on the font
scanned. If they are not the same, then ctrl-d should be used. These
assignments are arbitrary and are only suggestions. The point is not to
assign multiple values to a single shape.

Later, when the program scans the second part of the character, it will
recognize the first part above it and combine the information to
generate a single character output. Parts are combined only when one
part completely overlaps another part.
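
The grouping and combining rules described above can be illustrated
with a short sketch. This is not the author's code (OCR_VEC was written
in Turbo C and its source is not part of this package); it is a minimal
Python illustration under stated assumptions. The names find_parts,
completely_overlaps and IMAGE are hypothetical. The sketch assumes that
pixels touching diagonally belong to the same group, and it reads
"completely overlaps" as "the horizontal extent of one part lies
entirely within the horizontal extent of the other"; the document does
not spell out either detail.

```python
from collections import deque

# Hypothetical test image: '#' is a character pixel, '.' is background.
# It shows a lowercase "i": a dot with the body of the letter below it.
IMAGE = [
    "......",
    "..#...",
    "......",
    "..#...",
    "..#...",
    "..#...",
    "......",
]


def find_parts(rows, background="."):
    """Return bounding boxes (left, top, right, bottom) of pixel groups.

    A group is a set of non-background pixels completely surrounded by
    background. Assumption: diagonal neighbours count as connected.
    Boxes are returned in scan order (top to bottom, left to right).
    """
    height, width = len(rows), len(rows[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if rows[y][x] == background or seen[y][x]:
                continue
            # Flood-fill the group that contains (x, y).
            left = right = x
            top = bottom = y
            seen[y][x] = True
            queue = deque([(x, y)])
            while queue:
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and not seen[ny][nx]
                                and rows[ny][nx] != background):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes


def completely_overlaps(a, b):
    """True if one box's horizontal extent lies entirely within the other's.

    Assumption: this is only one plausible reading of the rule in the text.
    """
    a_covers_b = a[0] <= b[0] and b[2] <= a[2]
    b_covers_a = b[0] <= a[0] and a[2] <= b[2]
    return a_covers_b or b_covers_a


if __name__ == "__main__":
    dot, body = find_parts(IMAGE)
    print("dot:", dot, "body:", body)
    print("combine into one letter:", completely_overlaps(dot, body))
```

Because the image is scanned from the top, the dot is found before the
body, which matches the order in which the program is described as
identifying the two parts of i and j.
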
Characters that partially overlap will not be combined; thus, even a
kerned font can potentially be read.

Some legitimate fonts have no whitespace between certain letter
combinations like fl or YZ. OCR_VEC will highlight the two characters
as one in such a case. Again, these combinations can be handled with
control characters. Enter an unused control character to correspond to
the doublet. You may then enter two letters to represent the scanned
character. A little awkward, but it works. If a font has no whitespace
between lines, however, you are out of luck.


Graphics Screen Capture:

OCR_VEC requires screen image files in a special format. Files in this
format are generated by the ?GA_CAP.EXE programs included in the
package. These programs are TSRs (terminate-and-stay-resident programs)
that are invoked after loading by the simultaneous press of the Alt,
Ctrl and S keys. To capture a screen into the file OCR_SCR.BIN:

1. To capture CGA graphics screens, enter CGA_CAP at the DOS prompt.
   To capture EGA graphics screens, enter EGA_CAP at the DOS prompt.
   To capture VGA graphics screens, enter VGA_CAP at the DOS prompt.
   To capture HERCULES graphics screens, enter HGA_CAP at the DOS
   prompt.

   Load only one TSR at a time. These TSRs are quick and dirty little
   affairs and must not have other TSRs loaded after them. Use of Mark
   and Release utilities is recommended for their management.

2. Run whatever program you use to display the graphics screen you
   want OCR_VEC to scan.

3. Simultaneously press the Alt, Ctrl and S keys.

4. The file OCR_SCR.BIN is created in the current directory.


Copyrights:

OCR_VEC Copyright (C) Ron Mignery 1990.

Created using Turbo C, Copyright (C) Borland 1987, 1988. Graphics
drivers included as permitted by Borland's No-Nonsense License
statement.

TSRs created using the KyCorp Memory Resident Library Version 2.01,
Copyright (C) KyCorp Information Group, Inc. 1987.


Disclaimers:

This software and instructions are provided "as is" without warranty of
any kind, either expressed or implied, including but not limited to
fitness for a particular purpose. The entire risk as to the results and
performance of the software is assumed by the user.


Distribution:

The OCR_VEC package may be copied, distributed (free of charge), and
used non-commercially provided that it is not modified in any way.

Anyone interested in this approach to optical character recognition may
contact the author at the following address:

   Ron Mignery
   85 Bartlett Street
   Somerville, MA 02145
   (617) 628-0206
   (or via GENIE ron mignery)

***end of OCR_VEC.DOC***