home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
CP/M
/
CPM_CDROM.iso
/
simtel
/
sigm
/
vols000
/
vol010
/
rez.doc
< prev
next >
Wrap
Text File
|
1984-04-29
|
26KB
|
780 lines
REZ
A disassembler for 8080 programs
by Ward Christensen as modified for Z-80
with TDL mnemonics by William Earnest
with Zilog mnemonics by Hank Kee
CP/M U.G. 1/80
Suggestions? Call me eve's at (312) 849-6279
W. Earnest is at (215) 398-1634
----------------
REZ commands are inconsistent at best. - REZ is a
kludge based on years of disassembler experience and hacking,
and was never "planned" - just coded sitting at a tube, and
modified over 2 years before being contributed to the CP/M UG.
For example, to kill a symbol: k.label but to kill a
control value: caddr,k and to kill a comment: ;addr,
but REZ does the job like no other I have seen.
----------------
N-O-T-E: Pardon the editorial, but I feel hardware without
good software is useless to 99% of us. Most good software
has to be paid for. I strongly support the legitimate purchase
of licensed software. I do not regularly use any programs
which I have not purchased. (Yes, I do occasionally "try"
one, but then buy it if I plan on using it). I have been
asked by software businesses to NOT distribute REZ -
because of it's ability to produce good .asm source quickly.
But, there are so many disassemblers out, why not a good,
conversational one? Please use it in the spirit in which it
was contributed: to enlarge your understanding of the micro-
computer world around you, and to allow you to customize
programs which you legitimately own, for your own use.
"Semper non rippus offus"
----
NOTE: any command taking a hex address (Dnnnn, etc)
may take a value in the form .label but arithmetic
may not be performed. (i.e. d.start is ok, but d.start+8 not)
----------------
Overall structure of REZ: It is a .COM file,
which runs at 100H. It goes thru 1B00 or so, then the stack.
At 1C00 is a 512 entry table for control commands. Each is
3 bytes long, and controls the format of the re-sourced list,
i.e. is it .BYTE, .WORD, .BLKB instructions, etc.
At 2200 is the start of the symbol table. It has no
defined length as such. If it is not used, it occupies only
2 bytes.
If you want to re-source something which is in memory,
such as a PROM, a program previously loaded in high memory,
"CP/M itself", or whatever, you can just do so.
However, typically you want to disassemble a program
which runs at 100H, which is were REZ runs. Bob
Van Valzah would have solved that by making resource
relocatable and moving itself up under BDOS. I wasn't that
industrious.
Instead, REZ uses the concept of an "invisible"
OFFSET. After all, what do you care where it is as long
as it LOOKS like it's at 100h?
So, you set an offset. O2F00 sets it to 2F00 Hex.
Reading a .COM file (RFOO.COM) causes it to come into 3000 on.
If you say D100 or L100 it dumps or lists what LOOKS like your
program. Internally, REZ is adding the offset to the
D and L addresses.
What should you set the offset to? Well, that depends
upon how many symbols you will use. O2400 will load the
program at 2400, thus allowing only 2200-23FF for symbols,
i.e. 512 bytes or about 50-60 labels. If you didn't leave
enough space, then used B to build a default symbol table,
the table could run into and clobber your .com file! (easy
recovery, however: just change the offset to being higher,
and read in the .COM file again) Each entry takes 3 bytes
+ the symbol length, and if you like 7 byte labels like I
do, that means 10 bytes/label. An offset of 3300 should
be adequate.
If you want to put comments into the disassembled
program, you will have to designate an area to Use for the
comments. The U command (e.g. U4000) specifies what area
is to be used.
Before issuing the O (offset) command, do:
L5 7
which will show the JMP to BDOS, which is the highest memory
you may use. (Note if you have, for example, an empty 4K
memory board in high memory, you can Use THAT for comments).
Let's take an example: You have an 8K file, FOO.COM
which you want to disassemble. It will have about 300 labels.
300 x 10 is 3000, or call it 4K (what's a K unless your
tight). The symbol table starts at 2200. 4K more is 3200.
Let's load the .COM at 3200, so since it normally starts
at 100H, the offset is 3100. O3100 is the command.
We then RFOO.COM to read it in. It says 5200 2100 which
means it came into actual memory to 5200, but 2100 if we
are talking with respect to loading at 100.
Thus, we could set our comments table up after
the .COM program - say at 5400: U5400
The ? command shows the memory utilization for
control, symbol, and comments entries. (no, I never put
in anything to keep track of the .COM - you'll just have
to do that yourself).
If you ever want to dump real memory, you'll have
to reset the offset to 0: O0 but then set it back.
If you are not sure what it is, typing O will tell the
current offset.
Hoo, boy! Hope this kludge of documentation
is enough to get you going - hmmm, better give you
some of the gotcha's I've discovered...
---- WATCH FOR ----
* Symbols overflowing into the .COM.
(Use ? command to see how full symbol table is)
* Control entries overflowing into .SYM (altho I
can't believe anyone will have a program with
more than 512 control entries!!!)
* Comments overflowing into BDOS (ug!!)
* Using an offset which is not in free memory
and overlaying BDOS or whatever.
* The B(uild) command gobbling up too much when building
a .BYTE: "B" will take a .BYTE 'GOBBELDY GOOK' followed
by LXI H,FOO and take the LXI as a '!' (21H) so
you'll have to manually stick a new "I" control
entry in at the address of the LXI. You might
also delete the incorrect "I" entry which REZ
stuck in (typically at the second byte of the LXI)
* Trying to dump real memory without setting the
offset back to 0. (then forgetting to set it back
to its proper value)
* Forgetting how big the .COM file you are disassembling
was.
* Using REZ to rip off software (yes, I know, you
heard that before, but only 3 in 100 needed to be
told, and 2 in 100 needs to be told again, and 1 in
100 doesn't give a rat's fuzzy behind anyway!!)
* Forgetting to take checkpoints when disassembling
large files. You may even want to rotate the names
under which things are saved:
STEMP1.SYM
STEMP1.CTL
STEMP1.DOC
* Missing a label: Suppose you have a control entry
for a .WORD, resulting in:
DFLT: ;172C
.WORD 100H
but somewhere in the program, the following exists:
LDA 172DH
Even if you did a B and have a label L172D, it won't
show up since it's in the middle of a .WORD. Instead,
do this:
K.l172d kill the old label
e172d,.dflt+1 put in the new label as a displacement
off the beginning.
* improperly disassembling .WORD's (see previous item).
You might be tempted to make DFLT a .BYTE so that
DFLT: ;172C
.BYTE 0
L172D: ;172D
.BYTE 1
Note that while this disassembles and reassembles
properly, it is not "as correct" as the technique
used in the previous item.
* Having the "B" command overlay your "E" control entry.
What? Well, "B"uild is pretty dumb. If he finds 8
.BYTE type characters in a row, he fires off a .BYTE from
then on until he runs out of those characters. Suppose
your program was 200 long (ended at 3FF), and you
had zeroed (aha! Nice .BYTE candidates) memory there
(there meaning at your offset address + whatever).
Then you QB100,400 and viola!! REZ overlaid
your "E" control with a "B".
----------------
REZ is relatively complete. (well, actually,
the phrase "rampant featureitis" has been "mentioned").
...But there's always another day, and another K...
SO... Here's my "wish list"
..it might save you telling me YOU think such-and-such
would be nice...
* Targets of LHLD, SHLD should automatically be flagged
as type .WORD in the control table. Ditto LDA and STA
as .BYTE or as second half of .WORD. Ditto targets of LXI
as .BYTE (?).
* E5C,.FCB
followed by
E6C,.FCB+
should automatically calculate the appropriate
displacement, and put it in the symbol table.
* The comments facility should be enhanced to allow
total SUBSTITUTION of entire line(s) of the code,
i.e. at address such-and-such, replace the next 3
bytes with the following arbitrary line. This
would help those "how do I explain what was being
done" cases such as: LXI H,BUFFER AND 0FF00H
* Add the ability to, in one instruction, rename
a default (LXXXX) label to a meaningful name.
----------------
REZ types an "*" prompt when it is loaded. You may
then enter any of the following commands. Each command
is a single letter followed by operands. Commas are shown
as the delimiter, but a space will also work.
----------------
----------------
; Put comments into the program.
(must execute 'u' command first, to assign area
for comments to be placed)
;addr,comment enter a comment
;addr lists existing comment
; lists entire comments table
;addr, deletes existing comment
note that '\' is treated as a new line, i.e.
\test\ will be formatted:
;
;TEST
;
----------------
Attempt to find .BYTE's while listing the program.
- This command works just like 'L', but attempts
to find .BYTE's of 8 chars or longer.
(see 'L' command for operand formats)
----------------
Build default sym tbl (LXXXX) labels for each
- 2 byte operand encountered. Note 'B' is
identical to 'L' except labels are built.
(see 'L' command for operand formats)
----------------
Control table usage:
- c dump ctl tbl
cnnnn dump from starting
cnnnn,x define format from nnnn
to next entry. values of x:
B = .BYTE (attempts .ASCII if
printable, 0DH, 0AH)
W = .WORD (attempts label)
S = .BLKB to next ctl entry
I = instructions
K = kill this ctl entry
E = end of disassembly
NOTE every control entry causes a "control break"
(NO, REZ was NOT written in RPG) which means
a new line will be started. Thus if you have a
string in memory which disassembles as:
.BYTE 'Invalid operand',0DH
.BYTE 0AH
You might want to change it putting the 0DH,0AH
together on the second line - just enter a "B"
control entry for the address of the 0DH.
The same technique could be used to make
.ASCII 'TYPESAVEDIR ERA REN '
appear as
.ASCII 'TYPE'
.ASCII 'SAVE'
.ASCII 'DIR '
.ASCII 'ERA '
.ASCII 'REN '
----------------
dump:
- dxxxx Dumps 80H from xxxx on
daaaa,bbbb Dumps from aaaa thru bbbb
d,bbbb Continues, thru bbbb
d Continues, 100H more
NOTE 100H is the default dump length. If you have
a larger display, you can change the default via:
d=nn nn is the HEX new default.
For example, a 16 line tube could display 80H:
d=80 or..
d=80,200 Defaults to 80, dumps 200-27f
----------------
enter symbol:
- ennnn,.symbol symbol may be of any length,
and contain any char A-Z or 0-9,
or "+" or "-". This allows:
E5D,.FCB+1. Note the "+" is not
checked, i.e. E5D,.FCB+2 would be
wrong (assuming FCB is at 5C) but
would be allowed to be entered.
Note if you enter two symbols for the same address,
whichever one is first alphabetically will show up
on the disassembled listing. If you have a label
which has the wrong address, you need not explicitly
kill the old one before entering the new. A label
which is spelled exactly the same as an existing one
will replace the existing one even if the addresses
are different.
----------------
Find occurrence of address or label. Note this function
- runs until interrupted (press any key).
fnnnn,ssss find address nnnn in memory. Start
the search at ssss. Runs forever.
Press any key to stop.
f continue previous find command
fnnnn find nnnn starting at address you last
stopped at in the f command
----------------
kill symbol from table
- k.symbol
----------------
list (disassemble). This command is used to list the
- file, or to list it to disk after enabling
the .ASM file save via 'SFILENAME.ASM' command
l lists 23 lines from prev pc
lssss,eeee lists from ssss to eeee
l,eeee lists from current pc to eeee
lssss lists 23 lines at ssss
Note that if you have a control 'e' entry, then the
list will stop when that address is found. This allows
you to 'lstart,ffff'.
The 23 line default may be changed via:
L=nn where nn is a HEX line count, e.g.
L=0f set to 15 lines/screen
You can change the default and list, e.g.
L=9,100 Dflt to 9 lines, list at 100.
NOTE when using L to list the .ASM program to disk,
you should either list the entire program at once
using: Lssss,eeee or, you can list small pieces
at a time. As long as you list again without
specifying a starting address, (L or L,nnnn) then
the output file will continue uninterrupted.
You may do dump commands, and others, without
affecting what is being written to disk.
----------------
offset for disassembly
- o print current offset
onnnn establish new offset
(note the offset is always added to any
address specified in an a, b, d, or l command.
to dump real memory, the offset must be reset to
0 (O0) before the dump.)
----------------
prolog generation - this routine generates an
- .LOC instruction, and equates for any label
outside of a given low-hi address pair.
(the start and end addresses of your program).
e.g. if disassembling from 100 to 3ff, it will
generate 'fcb = 5ch' if FCB is in the symbol
table. In typical use, you would 'sfilename.asm'
then use the P command to write the prolog, then
the L command to write the program itself.
Pstart addr,end addr
quiet command: any command which is preceeded by a q
- will be done 'quietly'. For example, to save
a .asm program, you could just do:
ql100,3ff or ql100,ffff if you have
set the 'e' control in the control table.
Another use is to build a default symbol table
by taking a pass thru the program: QB100,xxxx
----------------
read .com, .ctl, .sym, or .doc file
- rfilename.com reads in at offset+100h
rfilename.ctl loads the ctl table
rfilename.sym loads the sym file
rfilename.doc loads the comments table (note
'u' command must have been issued)
----------------
save .asm, .ctl, .sym, or .doc file
- sfilename.asm use 'l' command to write, z to end
sfilename.CTL saves the CTL table
stablename.sym saves the sym file
sfilename.doc saves the comments table
----------------
use area of memory for comments table
- unnnn such as ud000 if you had an
open board at 0d000h
----------------
purge sym tbl and CTL tbl
x
-
----------------
close .asm file (note that a preferred way to close the
.asm file is to have specified a control entry
for the end address (e.g. c1ff,e))
z
-
--------------------------------------------
--------------------------------------------
Here is a sample of the REZ usage.
Given: a COM file (lets say test.com) which runs at 100
(as any good COM file should), and goes thru 2FF.
lines preceeded with ---> are typed by you.
---> REZ
---> o2600 set the offset to 2600, which means the
program will read into 2600 + 100 = 2700.
---> rtest.com reads the com file into memory. system says:
2900 0300 which is the actual hi load addr,
(2900) and the original hi load addr (300)
REMEMBER this address (300) because you might
want to put a "E" (end of assembly) control
entry there.
<<<<note>>>> that all 'L' (disassembly list) and 'D' (dump)
commands work with the offset added. Thus, you
should learn to forget that the disassembler is
in memory, and think of it as if your program were
actually at 100. D100 will dump your program.
also note: if the program being "REZd" will
have a fairly large symbol table, then you will
have to set the offset higher: o3300 or some such.
(the ? command will show symbol table usage: if your
symbol table is nearing the .com file, then just
set a new offset (higher) and re-load the .com)
if you want to dump r-e-a-l memory, you would have
to reset the offset to 0: o0 (but don't forget to
reset it to 1f00 before continuing with your program.)
If you are disassembling something which is in memory at
it's correct address (such as looking at ccp) then don't
set the offset. It defaults to 0 when dis is first loaded.
---> l100 list your program - lists "about" 10 lines.
---> d100 do a dump of your program.
NOTE that typically here are the steps to disassembling
a program which has just been read into memory:
Use the dump command to find the .ASCII areas.
Note that the 'a' command may be used to automatically
find the .BYTE's, but you must then check them to insure
that they don't extend too far. All printable characters,
0dh, 0ah are considered candidates for .ASCII .
At least 8 characters in a row must be found to make sure
that long sequences of mov instructions won't be taken
as .BYTE's.
Use the cnnnn,k command to kill erronious entries put
in the control table by the a command, but then immediately
put in the right address, such as via cnnnn,i
if you wanted to scan the program for .ASCIIs yourself,
use the 'c' (control) command to set the beginning and
end of ascii areas. For example, a program
which starts out:
0100 jmp start
0103 .ASCII 'copyright .....'
0117 start .....
would show up in the dump as:
0100 c3170144 4f50xxxx xxxxxxxx xxxxxxxx *...copyr ight....*
0110 xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx *xxxxxxxx ........*
thus you would want to instruct the disassembler to switch
to .ASCII mode at 103, and back to instruction mode at 117, thus:
c103,b
c117,i
Continue doing this, bracketing every .ascii which is in
the middle of instructions, by a b control instruction
and an i control instruction. Note that multiple db's in
a row need not have separate cnnnn,b instructions, but that
these do cause a 'line break', i.e. if you have a table
of ascii commands, for example:
02e5 .ASCII 'load'
02e9 .ASCII 'save'
the disassembler would disassemble these as:
02e4 .ASCII 'loadsave'
you could put in an additional control entry: c2e9,b, which
would cause the disassembler to generate:
02e4 .ASCII 'load'
02e8 .ASCII 'save'
which is much more readable and realistic.
Note that before generating each byte of a db, a symbol
table lookup is done to determine if there is a label
at that location, and if so, a new line is started.
Thus if 'loadlit' and 'savelit' were in the symbol table,
as the labels on the 'load' and 'save' above, no separate
'b' control instruction would be required as the label
would cause the break.
<<<<NOTE>>>> that at this time the automatic label checking
is n-o-t done for .BLKB instructions. Make sure that each
ds instrucion references only up to the next label. This
means that multiple .BLKB's in a row must each be explicitly
entered into the control table. Presence of a label is
not sufficient.
----
steps, continued:
After building the control entries with cnnnn,b and cnnnn,i
put in a control entry cnnnn,e which defines the address of
the end of your program. The l command will then automatically
stop there, and in addition, if you are in 'save xxx.asm'
mode, the output .asm file will be closed. If you do
mot define a control 'e' entry, then you will have to use
the break facility to stop the l command (don't use control-c
as that will re-boot cp/m). If you were writing an .asm
file, you would have to user the z command to close the
file.
Next, you would list your program to determine how it looks.
when you recognize a routine by it's function, insert a label.
For example, if you saw that location 7ef was a character
out routine (type) then enter a label into the symbol table:
E7EF,.TYPE
NOTE that all symbols start with a '.', so as to be dis-
tinguished from hex data.
NOTE that if you want the disassembler to make default labels
for you, use b (for build labels) instead of l (for list
program). The b commands causes lnnnn default labels to
be inserted in the symbol table for every 2 byte operand
encountered (LXI, SHLD, JMP, etc). It will undoubtedly
make some you don't want, such as L0000. You will have to:
K.L0000 kill label L0000 from the table.
When you encounter data reference instructions, try
to determine what type of area the instruction points to.
Typically,, LXI instructions may point to a work area
which should be defined as a .BLKB, or to an ASCII string,
in which case we will have already made it a 'b' control
instruction. Operands of LHLD and SHLD instructions
should be made .WORD instructions. For example if you
encounter LHLD 0534H, then issue a control instruction:
C534,W
NOTE that whatever mode you are last in will remain in
effect. Therefore if 534,w is the last entry in the
control table, all data from there on will be taken to be
.WORD's.
Suppose that you determine that address 7cf is a 128 byte
buffer for disk I/O. You want it to disassemble to:
DKBUF ;07CF
.BLKB 80H
You do this as follows:
C7CF,S to start the .BLKB
C84F,B to define it's end, and
E7CF,.DKBUF to put the symbol in the table.
Continue, iteratively using the 'l' command and the 'c'
and 'e' commands until you have the listing in a nice
format. You will then probably want to save the control
symbol, and comments tables. Or, you could have been
saving them at checkpoint times (so if you make a
major mistake you could go back to a previous one).
To save a control file:
sfilename.CTL (any filename, may include a: or b:)
To save a symbol file:
sfilename.sym
To save a comments file:
sfilename.doc (not ".com" of course)
NOTE that the filetypes must be used as shown, but
that any legal filename (or disk:filename such as b:xxxx.CTL)
may be used.
You could now control-c to return to CP/M, and come back
later to resume your disassembly:
REZ
o2200
rtemp.com
rtemp.sym
rtemp.ctl
uxxxx (such as u4000)
rtemp.doc
This will take you back exactly where you left off.
If you want to save a .asm file out to disk, do the following:
Make sure that there is a control entry defining the end
of the program (such as c200,e) or else you will have to
specify the ending address and manually type a z command to
close the file.
sfilename.asm
A message will indicate that the file is opened. Any
subsequent a, b, or l command will have whatever is listed
written to disk. Encountering a 'e' control, or typing a z
command will then close the .asm file. The listing may
be interrupted, and continued. Since the l command types
only 23 lines, use laddr,ffff to list thru the end of the
assembly.
If this is the 'final' save of the .asm file, you will
probably want to put an 'org' at the beginning of the
output file, as well as generate equ instructions for
any references outside of the program. For example, a
typical cp/m program will have references to:
bdos at 5
fcb at 5ch
tbuff at 80h
the 'p' (for prologue) command generates the org, then
scans the symbol table and generates equates:
BDOS = 05H
FCB = 05CH (etc.)
If you have a "e" control entry in your file, you can
list as follows: laddr,ffff - the listing will continue
until the "e" control entry is found
additional commands:
if you entered a label in the symbol table but now want
to get rid of it:
k.symbol
note to rename a symbol, such as when you had a system-
assigned lnnnn label but now want to make it meaningful:
k.l0334
e334,.type
you could even:
e.l0334,.type
k.l0334
but that takes more typing.
here are some more commands:
? prints statistics on symbol and control table
usage, etc.
c prints the entire control table
cnnnn prints the control table starting
at address nnnn
ds dumps the symbol table. Interrupt it
by typing any key.
ds.symbol starts dumping at the specified symbol,
or the nearest symbol. thus "ds.f" starts
the dump at the first label starting
with the letter 'f'.
t Toggles format mode such that labels are
placed on the same line with the next
instruction, or are on their own line.
....have fun, and let me know of any problems or
suggested improvements
------------------------
REZ
"Quick" command summary
Any address may be replaced by .symbol i.e. D.START
;addr,comment Enter a comment
;addr Lists existing comment
; Lists entire comments table
;addr, Deletes existing comment
A(see "L" for operands) Attempt to find .BYTE's
B(see "L" for operands) Build default sym tbl (Lxxxx)
C Dump ctl tbl
Cnnnn Dump ctl starting at nnnn
Cnnnn,x Define format from nnnn (B,E,I,S,W)
Dxxxx Dumps 80H from xxxx on
Daaaa,bbbb Dumps from aaaa thru bbbb
D,bbbb Dump thru bbbb
D Dump 100H more
D=nn nn= Hex dump size default.
Ds Dumps the symbol table.
Ds.symbol Sym dump starting at .symbol
Ennnn,.symbol Enter symbol into table
Fnnnn,ssss Find address nnnn starting at ssss
F Continue previous find command
Fnnnn Find nnnn
K.symbol Kill symbol from symbol table
L Lists 23 lines from prev pc
Lssss,eeee Lists from ssss to eeee
L,eeee Lists from current pc to eeee
Lssss Lists 23 lines at ssss
L=nn nn is hex list default # of lines
O Print current offset
Onnnn Establish new offset
Pstart addr,end addr Generate program prolog
Q Before any command suppresses
console output: QB100,200
Rfilename.COM Reads in at offset+100h
Rfilename.CTL Loads the ctl table
Rfilename.SYM Loads the sym file
Rfilename.DOC Loads the comments table (note
Sfilename.ASM Save .ASM file. Write w/L, Z to end
Sfilename.CTL Saves the CTL table
Sfilename.SYM Saves the sym file
Sfilename.DOS Saves the comments table
T Toggles trimmed format mode
Unnnn Use nnnn for comments table
X Purge all symbols and control
Z Write eof to .ASM file (
? Prints statistics (sym, ctl, comments)