Text File | 1994-07-01 | 24KB | 498 lines
XTXT 1.01 User's Manual.
1. Program description.
XTXT extracts strings of characters from any type of file, for
instance the text embedded in a program. A front end with menus,
dialog boxes, etc. makes it easy to use.
The program works as follows: it searches a file for runs of at
least n successive characters. These characters are chosen from
predefined sets or selected one by one. You can display the result
of the extraction on the screen (in the extraction buffer) or save
it in a text file.
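As an illustration only (this manual does not show how XTXT is
written), the extraction rule described above can be sketched in
Python. The name extract_runs, the sample data and the choice of
printable ASCII as the selected set are assumptions made for the
example.

```python
def extract_runs(data: bytes, allowed: set, min_len: int = 5) -> list:
    """Return every run of at least min_len consecutive allowed bytes."""
    runs = []
    current = bytearray()
    for byte in data:
        if byte in allowed:
            current.append(byte)
        else:
            # A rejected byte ends the current run; keep it if long enough.
            if len(current) >= min_len:
                runs.append(bytes(current))
            current.clear()
    if len(current) >= min_len:
        runs.append(bytes(current))
    return runs


# Hypothetical selection: printable ASCII, codes 32 to 126.
PRINTABLE = set(range(32, 127))

# Hypothetical file content mixing text and binary noise.
sample = b"\x00\x01Usage: XTXT file\x00ab\x02Help\x00"
print(extract_runs(sample, PRINTABLE))
```

With the default minimum of 5 characters, only the long message is
kept; lowering min_len also returns the short fragments.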
This makes it easy to recover help screens embedded in an executable
file, examine messages, or look for keywords. You can also extract
the text from a document formatted by a word processor, a database,
etc.
Moreover, XTXT can display any text file (its own files for instance)
or any binary file in a specific window (reading buffer).
These two buffers (extraction and reading) use Borland's Turbo-Vision
interface. Its commands are described in an annex at the end of this
manual.
Finally, ASCII / ANSI text conversion functions are included.
2. Working with XTXT.
2.1. "File" menu.
- Change Dir. Changes the current directory.
- TXT Extraction. Displays a dialog box for choosing the file
you want to examine and performs the text extraction. The
resulting text is displayed in a window.
- You can save the result in a text file for later reading.
This file, whose extension is ".XTX", is created in the same
directory as the source file. If an "XTX" file already
exists, it is replaced by the new one.
If you extract to the screen and to a file at the same time,
the extraction does not stop when one of them is full. For
instance, if you run out of free memory, the extraction
continues to the output file (unless the disk is full,
naturally).
- TXT Reading. Displays any text file (an "XTX" file for
instance). WARNING: text files must not contain lines
longer than 255 characters.
- Binary Reading. Displays the content of any file (programs,
pictures, etc.). Each line begins with its address in
hexadecimal. The line length, which can vary from 16 to 240,
is always a multiple of 16. If you specify a line length in
the "Output" dialog box, it is adjusted automatically.
Each buffer is limited to 32767 lines. The size of the
output file is limited only by the amount of free space on the
disk.
Up to 8 windows can be open at the same time (extraction plus
reading windows).
- ASCII -> ANSI and ANSI -> ASCII conversion. XTXT uses the
international ASCII character set for text output (default
set). Some programs, word processors under WINDOWS for
instance, use another set: the ANSI one. The difference
concerns the characters between 128 and 255, accented
letters for the most part. A good many of them are present in
both sets, so accented text can be converted fairly well from
one set to the other (see the conversion table in the annex:
file CONVERT.DOC).
WARNING: Most documents that come from a word processor, a
spreadsheet or a database contain control characters.
Translating these characters makes no sense, so the conversion
program replaces them with blank spaces.
Keep in mind that ASCII / ANSI conversion concerns only the
standard character sets, and certainly not all character sets
(look at the "Wingdings" font in WINDOWS WRITE to understand
the problem).
The converted file has the same size as the original document
but a different extension (".ASC" for ANSI -> ASCII and ".ANS"
for ASCII -> ANSI conversions).
- DOS Shell. Temporarily leaves the program so that you can
execute a DOS command. Typing "Exit" brings you back to the
program.
- Exit. Ends the program.
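The line layout of the "Binary Reading" command can be sketched as
follows. This is a rough Python illustration, not XTXT's code: the
function names are hypothetical, rounding the requested length DOWN
to a multiple of 16 is an assumption (the manual only says the length
is adjusted), and so is showing non-printable bytes as points.

```python
def adjust_line_length(requested: int) -> int:
    """Clamp to 16..240, then round down to a multiple of 16.

    The rounding direction is an assumption made for this sketch.
    """
    clamped = max(16, min(240, requested))
    return clamped - clamped % 16


def hex_dump(data: bytes, line_length: int = 16) -> list:
    """Return display lines, each starting with its address in hexadecimal."""
    width = adjust_line_length(line_length)
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Printable bytes are shown as they are, everything else as a point.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08X}  {text}")
    return lines


for line in hex_dump(b"MZ\x90\x00" + b"A" * 14, 16):
    print(line)
```

A requested length of 78, for example, becomes 64 under this
assumption, and any value above 240 becomes 240.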
2.2. "Options" menu.
- Research area. Displays an ASCII table with the characters
you have selected (white) and not selected (black). You can
modify this selection with the keyboard or the mouse.
- With the keyboard, move around the ASCII table with the
arrow keys, or type the letter you want to select directly.
You can also select the letter under the cursor with the
"ENTER" key.
- With the mouse, just click on the character you want to select.
There are some predefined research areas. You can select
them with the function keys Fx (F2 to F7) or unselect
them with Alt-Fx. You can also select these areas
by clicking on Fx or Alt-Fx (on the right of the table).
F6 and F7 are useful for French text only.
Remark. Three characters cannot be displayed as they are:
ASCII 0, 32 and 255, which would appear as blank spaces,
are replaced by points. This is the only way to show whether
they are selected or not. Naturally, in the output file
or in the display window, they are represented as
they actually are.
- Output filters. The menu items are :
- Number of contiguous characters. Defines the minimum number
of characters in a "word". The default value (5) eliminates
the stray characters scattered through a file and gives the
best text extraction from an exe file.
On the other hand, if you wish to extract data from a
word processor or editor document, the best result is
obtained with a value of 1.
This number must be between 1 and 254.
- Line feed. If you choose this option, every ASCII $0A
character starts a new line, even if this character is
deselected in the research area. This option is especially
useful when you extract help text from an exe file. This
type of text is very often formatted, and the "line feed"
option preserves its layout.
- Empty line. When you check this box, empty or too short
lines are retained. It is preferable to activate this
option when you extract text from a text file (editor,
word processor or database) in order to preserve its
layout. If you extract text from an exe file, however,
this option creates large blank areas in the extraction
buffer.
- Accents conversion. As seen above, some programs use
ANSI characters for accented letters. For instance,
"à" (ASCII 133) is coded as ANSI 224, which is "α" in
ASCII. The same problem arises when you get texts from
non-PC computers (VAX for example).
If the conversion is off, these characters are displayed
just as they are (i.e. as Greek letters such as α, Φ, Θ).
If you wish to perform a "binary" extraction (to view
a whole exe file for example), it is better to turn the
conversion off.
- Control-Z. If you check this box, the ASCII 26 character
(end of text file) is included in the research area.
WARNING: If you include this character in the output file,
you will not be able to read the file fully: reading stops
at the first occurrence of this character.
By default, Control-Z is excluded.
- Control-M and Control-J (carriage return and line feed).
These characters control the text file format. If you keep
them, the display of the output file may be corrupted.
- Printer commands. Some characters (form feed, "Escape"
codes, etc.) are used as control codes when printing. It is
better to eliminate them if you wish to print the output
file correctly. These characters are ASCII 0 to 31 and
ASCII 127.
When you modify the research area, these control characters
cannot be selected if the corresponding option is off.
A final piece of advice for this chapter: if you wish to avoid
cutting words at the end of a line (frequent when you extract
data from a text file), remove the ASCII 32 character (space)
from the research area.
- Result output.
- Output. Selects the output device: file and/or screen.
The default output is the screen.
- File format. If you generate an output file, you can
format it in pages of 60 lines ("page" mode). This option is
useful if you wish to print a listing. Watch out for
control codes ("Escape" codes for instance). If
you choose "continuous" (default), the output file is left
just as it is.
- Information. Adds a few lines at the top of the display
(or output file) that recall the extraction parameters.
- Line size. Sets the width of the line in the output buffer
or file. This number must be between 2 and 255.
The line size must be greater than the number of
contiguous characters. The default setting is 78, which
is the width of the extraction buffer.
- File extension. The default extension is ".XTX". It was
chosen to reduce the risk of overwriting an important
file. Nevertheless, you can customize this
extension. When you read a file (TXT Reading), this
extension is used as a filter.
- Complete name. The default name of the output file is source
path + source file name + output extension. You can
change the path and name if you wish by typing them in
this box. Naturally, this option is used only if an
output file is created. WARNING: If you wish to change
only the path, don't forget to end it with the
"\" character.
- File conversion.
This menu lets you choose the parameters used in the ASCII /
ANSI conversion. It works just like the previous menu.
- Save config.
Every parameter you can change in the dialog boxes of the
"Options" menu (except the complete name of the output file) can
be saved in the XTXT.CFG file. This file is created in the same
directory as XTXT.EXE. If the configuration file is missing,
the default settings are used.
WARNING: the name of the configuration file is made of the program
name + ".CFG". The two programs XTXT.EXE and XTXT386.EXE
therefore generate two different configuration files,
XTXT.CFG and XTXT386.CFG. If you rename XTXT.EXE to
ZGLURB.EXE for instance, the configuration file will be named
ZGLURB.CFG.
- Default config.
Restores the default values of the parameters. WARNING: if you
have already saved a configuration file, it will be used in
the next session. There are two ways to get rid of the old
configuration: erase the XTXT.CFG file, or activate "Default
config." and then "Save config.".
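The ANSI -> ASCII direction of the accents conversion described in
this section can be sketched in Python. The sketch ASSUMES that
"ASCII" means DOS code page 437 and "ANSI" means Windows code page
1252; the manual does not name the code pages, but its own example
(133 and 224 for "à", and the Greek letters α, Φ, Θ) agrees with
that pair. The function name is hypothetical, and keeping carriage
return and line feed untouched is also an assumption; the real table
is the one in CONVERT.DOC.

```python
def ansi_to_ascii(data: bytes) -> bytes:
    """Convert cp1252 ("ANSI") bytes to cp437 ("ASCII"), one byte per byte."""
    out = bytearray()
    for byte in data:
        if byte in (10, 13):
            out.append(byte)  # assumption: line breaks are kept
        elif byte < 32 or byte == 127:
            out.append(0x20)  # control characters become blank spaces
        else:
            try:
                char = bytes([byte]).decode("cp1252")
                out += char.encode("cp437")
            except UnicodeError:
                out.append(0x20)  # no counterpart in the target set
    return bytes(out)


converted = ansi_to_ascii(b"d\xe9j\xe0 vu")
print(converted, len(converted))
# Without conversion, ANSI 224 shows up as a Greek letter in the DOS set.
print(bytes([224]).decode("cp437"))
```

Because every input byte produces exactly one output byte, the
converted data has the same size as the original, as stated for the
conversion files.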
2.3. "Tools" menu.
- Memory. Indicates the amount of free or used memory and the
size of each buffer.
- Find.
- Search again.
Classic string-search tools. The search string is
the one placed under the cursor (if there is one).
The search string cannot be longer than 255
characters.
- Go to line number.
Jumps to the desired line. The "search again"
command starts from the line you have jumped to. If the line
number is greater than the buffer size, the cursor jumps to
the last line.
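The way "Go to line number" and "Search again" work together can be
pictured with a short Python sketch. The function names are
hypothetical, the buffer is modelled as a plain list of strings, and
whether the line under the cursor itself is searched is an
assumption.

```python
def go_to_line(buffer_lines: list, line_number: int) -> int:
    """Return the 1-based line the cursor lands on, clamped to the buffer."""
    return max(1, min(line_number, len(buffer_lines)))


def search_again(buffer_lines: list, needle: str, cursor_line: int):
    """Return the first line number, from cursor_line on, containing needle.

    Returns None when the string is not found.
    """
    if len(needle) > 255:
        raise ValueError("search string longer than 255 characters")
    for number in range(cursor_line, len(buffer_lines) + 1):
        if needle in buffer_lines[number - 1]:
            return number
    return None


lines = ["alpha", "beta", "gamma", "beta again"]
print(go_to_line(lines, 99))
print(search_again(lines, "beta", go_to_line(lines, 3)))
```

Asking for line 99 of this four-line buffer lands on line 4, and a
search started after jumping to line 3 skips the earlier match on
line 2.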
2.4. "Window" menu.
- The "tile" and "cascade" commands arrange the windows so that they
are all visible.
3. Information warnings.
- Erasing a file.
When you create an output file, XTXT informs you when another
file with the same name already exists and asks you to confirm that
it should be overwritten.
- Buffer full.
Each buffer is limited to 32767 lines. This size is quite sufficient
for almost every case. If you try to go beyond this limit, the
program warns you that the display will be incomplete.
- Not enough memory.
If you extract too much data or read too large a file, memory
can run out. XTXT itself reserves about 15 KB (8086
version) or 250 KB (386 version) for its own operation.
- Unable to run the DOS shell.
If there is not enough memory, you cannot run a DOS session
from the program.
- Packed files.
It is generally impossible to extract any relevant data from a
packed file. XTXT warns you when it detects one. Be careful: this
program will not find ALL packed files (that is not exactly its
purpose), but it can give you a hint in some cases. For instance,
".LZH", ".ARJ" and ".ZIP" files, which are usually produced by the
LHARC, ARJ and PKZIP/UNZIP programs, are detected.
WARNING: In some cases, the packer's name can correspond to several
programs. Example: PKLITE may be reported for files produced by
PKLITE (obviously), SQZ or ZIP2EXE (and maybe others).
Some packers generate self-extracting archives (exe files). XTXT
detects most of them by their header signature.
Unfortunately, I have not been able to catalogue all these programs,
so these tests are not exhaustive. In any case, running
these archives will tell you something about their type.
Self-extracting programs deserve a special mention (for instance
Fabrice Bellard's excellent LZEXE, or Microsoft's EXEPACK).
XTXT warns you when it detects one of these programs.
In that case, unless you have the appropriate unpacker, there is only
one way to scan the program's content: take a "snapshot" while it
is running. On the disk you will find a tool (SNAPMEM.EXE), a
resident program whose purpose is to save a "picture" of the DOS
memory (less than 1 MB) to a file. This tool also lets you read
encrypted exe files: the original file is encrypted, but its image
in memory obviously is not.
- Empty file.
If you try to extract data from a zero-byte file, XTXT warns
you.
- Probably not text file.
When you use the "TXT Reading" command, the program checks whether
the file is a text file. If it is not (an executable file for
instance), text reading is not allowed (risk of program crash
and garbled display). In this case, you should use the
"Binary Reading" command.
4. Error messages.
- Syntax.
Every syntax error in the parameters is reported.
- Line size.
The line size must be between 2 and 255 characters.
- Number of contiguous characters.
The number of contiguous characters must be between 1 and 254.
In addition, the line size must be greater than the number of
contiguous characters.
If these limits are exceeded, an error message is displayed.
- No research area defined.
If no character is selected in the research area table, XTXT
displays an error message.
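The checks behind these error messages can be summed up in a small
Python sketch. It uses the limits given in section 2.2 (line size 2
to 255, contiguous characters 1 to 254). The function name and the
message texts are hypothetical; they are not XTXT's actual messages.

```python
def check_parameters(line_size: int, min_chars: int, selected: set) -> list:
    """Return the list of parameter errors (empty when all is valid)."""
    errors = []
    if not 2 <= line_size <= 255:
        errors.append("line size must be between 2 and 255")
    if not 1 <= min_chars <= 254:
        errors.append("number of contiguous characters must be between 1 and 254")
    if line_size <= min_chars:
        errors.append("line size must be greater than the number of contiguous characters")
    if not selected:
        errors.append("no character selected in the research area")
    return errors


# Default settings (line size 78, 5 contiguous characters) with one
# selected character: no error is expected.
print(check_parameters(78, 5, {65}))
# A line shorter than the minimum run, with an empty research area.
print(check_parameters(4, 5, set()))
```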
5. Miscellaneous errors.
- If there is not enough free memory, you cannot run the DOS shell.
In this case, close a window to free some memory.
- A number of messages report read or write errors
(disk full, write-protected disk, etc.). They need no further
explanation.
ANNEX 1. Turbo-Vision front end.
To make the best use of XTXT, it is important to know the basic
commands for using windows, dialog boxes, etc.
1. Menu line. You activate this line by clicking on it, or by
pressing the F10 key and then the arrow keys. Menu items are
displayed in grey when they are disabled. The first letter (the
red one) is the activation key for the related item (press Alt +
letter to run it directly, or Menu + letter if no hotkey is
defined).
2. Status line (last line). This line recalls the most used commands.
Error messages are sometimes displayed there as well.
3. Display window. This window appears as follows :
   a                b        c                     d
╔═[■]════════════ Extr. 386SPART.PAR ═════════════[|]═╗
║                                                     █ e
║                                                     ▒
║                                                     ▒ f
║                                                     ▒
║                                                     ■ g
║                                                     ▒
║                                                     ▒
║                                                     ▒
║                                                     ▒
║                                                     █
╚═══21:10═══█▒▒▒▒▒▒▒▒▒▒▒▒■▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒█─┘ h
      l     k            j             i
The buffer type (extraction or reading) is indicated in "b".
The file name is given in "c".
The cursor position Line:Column is displayed in "l".
The relative position of the text is indicated by the marks "g" and
"j" in the scrollers "f" and "i".
                    Mouse           Keyboard
a : closing box     Click           Ctl-F3
d : zoom box        Click           F5
e : vertical        Click on        Up and down arrows
    scrolling       vertical        (line feed)
                    scroller        PgUp and PgDwn (next page)
                                    Ctl-PgUp and Ctl-PgDwn
                                    (top and bottom of text)
k : horizontal      Click on        Right and left arrows
    scrolling       horizontal      (character)
                    scroller        "Home" and "End"
                                    (start and end of line)
Next window         Click on        F6
                    the window
Resize window       Click on "h"    Ctl-F5 then Shift-arrow
                    and move        "Enter" to end.
Move window         Click on the    Ctl-F5 then arrow keys
                    title bar       "Enter" to end.
4. Dialog box. You activate an option by clicking on it or by typing
the hotkey Alt + letter (the hotkey letter is shown in
yellow). In several cases, there is a little
box close to the input line (on the right). This is the "memo"
box, where the previous inputs are kept. You open
the memo by pressing the down arrow.
You move from one item to the next by pressing "Tab"
or "Shift-Tab".
Dialog boxes are moved in the same way as windows: with
the mouse, click on the title bar and drag. With the keyboard,
press Ctl-F5, then the arrow keys.
5. A special dialog box is used for entering file names.
It looks as follows:
     a                           b
  ╔═[■]═══════════════ Open file ══════════════════╗
  ║                                                ║
  ║ Name                                           ║
j ║ [*.*                      ] [|]   [Open    ]   ║ c
  ║                                                ║
  ║ Files                             [Cancel  ]   ║ d
  ║ │AUTOEXEC.BAT   │               │              ║
  ║ │CONFIG.SYS     │               │              ║
i ║ │NDOS.COM       │               │              ║
  ║ │386SPART.PAR   │               │              ║
h ║ │ ..\           │               │              ║
  ║ │               │               │              ║
g ║ █▒▒▒▒▒▒▒▒■▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒█              ║
  ║                                                ║
f ║ C:\*.*                                         ║
e ║ AUTOEXEC.BAT     1254  Oct 29, 1994  19:00am   ║
  ╚════════════════════════════════════════════════╝
                        Mouse           Keyboard
a : closing box         Click           Escape
b : memo box            Click           Down arrow key
c : Open button         Click           Alt-O or Enter
d : Cancel button       Click           Escape or Enter
g : scroller            Click           Arrow keys
h : Previous directory  Double click    Arrow keys + Enter
i : File selection      Double click    Arrow keys + Enter
The selected name is displayed in "j". The current directory and
the file information are shown in "f" and "e".