A68k - a freely distributable assembler for the Amiga
by Charlie Gibbs
with special thanks to
Brian R. Anderson and Jeff Lydiatt
(Version 1.02 - September 9, 1987)
Note: This program is NOT Public Domain. Permission is given to freely
distribute this program provided no fee is charged, and this
documentation file is included with the program.
This assembler is based on Brian R. Anderson's 68000 cross-assembler
published in Dr. Dobb's Journal, April through June 1986. I have converted
it to produce AmigaDOS-format object modules, and have made many enhancements,
such as macros and include files.
My first step was to convert the original Modula-2 code into C.
I did this for two reasons. First, I had access to a C compiler, but
not a Modula-2 compiler. Second, I like C better anyway.
The executable code generator (GetObjectCode and MergeModes) is
essentially the same as in the original article, aside from its translation
into C. I have almost completely rewritten the remainder of the code,
however, in order to remove restrictions, add enhancements, and adapt it to
the AmigaDOS environment. Since the only reference book available to me
was the AmigaDOS Developer's Manual (Bantam, February 1986), the assembler
and the remainder of this document work in terms of that book.
RESTRICTIONS
Let's get these out of the way first. There are a few things that I
have not yet implemented, and some outright bugs that would take too long
to correct for this version.
o The verification file (-v) option is not supported. Diagnostic
messages always appear on the console. They also appear in the
listing file, however (see extensions below).
o The directory names in the include directory list (-i) must be separated
by commas. The list may not be enclosed in quotes.
o Labels assigned by EQUR and REG directives are case-sensitive.
o The following directives are not supported, and will be flagged as
invalid op-codes:
RORG
OFFSET
NOPAGE
LLEN
PLEN
NOOBJ
FAIL
FORMAT
NOFORMAT
MASK2
I feel that NOPAGE, LLEN, and PLEN should not be defined within a
source module. It doesn't make sense to me to have to change your
program just because you want to print your listings on different
paper. The command-line option "-p" (see below) can be used as a
replacement for PLEN.
EXTENSIONS
Now for the good stuff:
o Labels can be any length that will fit onto one source line
(currently 127 bytes maximum). Since labels are stored on the
heap, the number of labels that can be processed is limited only
by available memory, which can be increased by using the "-w"
option (see below).
o Since section data and user macro definitions are stored on the
same heap as the symbol table (see above), they too are limited
only by available memory. (Actually, there is a hard-coded limit
of 32767 sections, but I doubt anyone will run into that one.)
o The only values a label cannot take are the register names - the
assembler can distinguish between the same name used as a label,
instruction name, macro name, directive, or section name.
o Section and user macro names appear in the symbol table dump, and
will also be cross-referenced. Their names can be the same as any
label (see above); the assembler can sort them out.
o Includes and macro calls can be nested indefinitely, limited only
by available memory. The message "Secondary heap overflow -
assembly terminated" will be displayed if memory is exhausted.
You can increase the size of this heap using the -w parameter
(see below). Recursive macros are supported; recursive includes
will, of course, result in a loop that will be broken only when
the heap overflows.
o The EVEN directive forces alignment on a word (2-byte) boundary.
It does the same thing as CNOP 0,2.
(This one is left over from the original code.)
o Branch (Bcc) instructions to a previously-defined label will be
automatically converted to short form if possible. This feature is
not available for forward branches, since in pass 1 the assembler
doesn't yet know how far the branch must go.
o If a MOVEM instruction only specifies one register, it is converted
to the corresponding MOVE instruction. Instructions of the form
MOVEM D0-D0,label will not be converted, however.
o ADD, SUB, and MOVE instructions will be converted to ADDQ, SUBQ,
and MOVEQ respectively if possible. Instructions coded explicitly
as (for example) ADDA or ADDI will not be converted.
o ADD, CMP, SUB, and MOVE to an address register are converted to
ADDA, CMPA, SUBA, and MOVEA respectively, except if an ADD, SUB,
or MOVE instruction has already been converted to quick form.
o ADD, AND, CMP, EOR, OR, and SUB of an immediate value are converted
to ADDI, ANDI, CMPI, EORI, ORI, and SUBI respectively (unless the
address register or quick conversion above has already been done).
o If both operands of a CMP instruction are postincrement mode, the
instruction is converted to CMPM.
o The SECTION directive allows a third parameter. This can be
specified as either CHIP or FAST (upper- or lower-case). If this
parameter is present, the hunk will be written with the MEMF_CHIP
or MEMF_FAST bit set. This allows you to produce "pre-ATOMized"
object modules.
o The synonyms DATA and BSS are accepted for SECTION directives
starting data or BSS hunks. A section name is mandatory for
all non-CODE hunks.
o The ability to produce Motorola S-records is retained from the
original code. The -s option causes the assembler to produce
S-format instead of AmigaDOS format. Relocatable code cannot be
produced in this format.
o Error messages include the name of the source, macro, or include
module that contains the statement in error, plus the line number
within the module of the offending line. If a statement has
multiple errors, this information appears only on the first
error message for the statement.
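The "if possible" conditions in the quick-form and short-branch conversions above come down to operand ranges fixed by the 68000 instruction set. The C sketch below shows those range checks only. The function names are hypothetical and are not taken from the A68k source, and the sketch does not try to reproduce the other conditions the assembler checks (operand size, destination register type, and so on).

```c
#include <assert.h>
#include <stdbool.h>

/* ADDQ and SUBQ encode an immediate value from 1 to 8. */
bool fits_addq_subq(long value)
{
    return value >= 1 && value <= 8;
}

/* MOVEQ carries a signed 8-bit immediate (sign-extended to 32 bits). */
bool fits_moveq(long value)
{
    return value >= -128 && value <= 127;
}

/* A short Bcc holds a signed 8-bit displacement measured from the address
   of the branch plus 2. A displacement byte of zero is reserved: it means
   that a 16-bit displacement follows. */
bool fits_short_branch(long branch_addr, long target_addr)
{
    long disp;

    /* 68000 instructions always start on a word boundary. */
    assert((branch_addr & 1) == 0 && (target_addr & 1) == 0);
    disp = target_addr - (branch_addr + 2);
    return disp != 0 && disp >= -128 && disp <= 127;
}
```

For a backward branch the displacement is always negative, so the zero case never arises there. It would matter only if the same check were reused for forward branches.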
HOW TO USE IT
The command-line syntax to run the assembler is as follows:
a68k <source file>
     [-e<equate file>]
     [-h<header file>]
     [-i<include dirlist>]
     [-l<listing file>]
     [-o<object file>]
     [-p<page depth>]
     [-d]
     [-s]
     [-w[<primary-heap-size>][,<secondary-heap-size>]]
     [-x]
These options can be given in any order, so if you like to specify your
switches first, you can. Option values, if any, must immediately follow
the keyword with no intervening spaces.
If the -o keyword is omitted, the object file will be given a default
name. It is created by replacing all characters after the last period in
the source file name with "o". For example, if the source file name is
"myprog.asm", the object file name defaults to "myprog.o". A source name
of "my.new.prog.asm" produces a default object file name of "my.new.prog.o".
If the source file name does not contain a period, ".o" is appended to it
to produce the default object file name.
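The naming rule above can be sketched in C as follows. The helper default_name is hypothetical, not the routine A68k actually uses. It applies the rule literally to the whole string, so a period in a directory name would also be treated as the "last period".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: build a default output name by replacing everything
   after the last period in `source` with `ext`, or by appending "." and
   `ext` if `source` contains no period.
   Returns 0 on success, -1 if `out` is too small. */
int default_name(const char *source, const char *ext,
                 char *out, size_t outsize)
{
    const char *dot;
    size_t stem;

    assert(source != NULL && ext != NULL && out != NULL);
    dot = strrchr(source, '.');
    stem = dot ? (size_t)(dot - source) : strlen(source);

    /* stem + '.' + extension + terminating null must fit */
    if (stem + 1 + strlen(ext) + 1 > outsize)
        return -1;

    memcpy(out, source, stem);
    out[stem] = '.';
    strcpy(out + stem + 1, ext);
    return 0;
}
```

With this sketch, default_name("my.new.prog.asm", "o", buf, sizeof buf) leaves "my.new.prog.o" in buf, and a source name of "myprog" gives "myprog.o".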
The default value for the listing file name is arrived at in the same
way as the object file name, except that the resulting extension is ".lst"
instead of ".o".
If you don't specify this parameter, no listing file will be produced.
If you specify -x (see below), -l (with the default name) is assumed,
although you can still use this parameter if you wish.
The default value for the equate file name is arrived at in the same
way as the object file name, except that the resulting extension is ".equ"
instead of ".o".
The include directory list is a list of directory names separated by
commas. No embedded blanks are allowed. For example, the specification
-imylib,df1:another.lib
will cause include files to be searched for first in the current directory,
then in "mylib", then in "df1:another.lib".
The -d keyword causes symbol table entries (hunk_symbol) to be written
to the object module for the use of symbolic debuggers.
The -p keyword causes the page depth to be set to the specified value.
If omitted, a default of 60 lines (-p60) is assumed.
The -s keyword, if specified, causes the object file to be written in
Motorola S-record format. If omitted, AmigaDOS format will be produced.
The default name for an S-record file has ".s" appended to the source name,
rather than ".o"; this can still be overridden with the -o keyword, though.
The -w keyword specifies the size of the heaps used. The primary heap
stores the symbol table, user macro text, relocation information, and
cross-reference information. The secondary heap stores information for
nested macro calls and include files. The primary heap size defaults to
32768 bytes, which should be enough for all but the largest assemblies.
The secondary heap size defaults to 1024 bytes, which should be enough
unless you use very deeply nested macros and/or include files with long
path names. You can specify either or both parameters. For example:
-w40000         secondary heap size remains at 1024 bytes
-w,2000         primary heap size remains at 32768 bytes
-w40000,2000    increases the size of both heaps
If you're really tight for memory, and are assembling small modules, you
can use this keyword to shrink the heaps below their default sizes.
At the end of an assembly, a message will be displayed giving the
amount of heap space actually used, in the form of the -w command
you would have to enter to allocate the minimum heap space.
See below for a layout of the heaps.
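The three -w forms shown above can be handled by a small parser like the one sketched here. It is only an illustration: parse_heap_sizes is a hypothetical name, and treating zero or negative sizes as errors is an assumption, since this document does not say how A68k reacts to them.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical parser for the text that follows "-w", which has the form
   "[primary][,secondary]". A size that is left out keeps the value already
   stored in *primary or *secondary (the caller presets the defaults).
   Returns 0 on success; on malformed input returns -1 and changes nothing. */
int parse_heap_sizes(const char *arg, long *primary, long *secondary)
{
    long p, s;
    char *end;

    assert(arg != NULL && primary != NULL && secondary != NULL);
    p = *primary;
    s = *secondary;

    if (*arg != '\0' && *arg != ',') {     /* primary size present */
        p = strtol(arg, &end, 10);
        if (end == arg || p <= 0)
            return -1;
        arg = end;
    }
    if (*arg == ',') {                     /* secondary size present */
        s = strtol(arg + 1, &end, 10);
        if (end == arg + 1 || s <= 0)
            return -1;
        arg = end;
    }
    if (*arg != '\0')                      /* trailing garbage */
        return -1;

    *primary = p;
    *secondary = s;
    return 0;
}
```

Starting from the defaults of 32768 and 1024, passing ",2000" changes only the secondary size, which matches the second example above.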
The -x keyword will produce a symbol table dump, including
cross-reference information. If you haven't also specified -l (with
or without a file name), -l with the default file name will be assumed.
If you wish to override the default object and (optionally) listing
file names, you can omit the -o and -l keywords. The assembler interprets
the first three parameters without leading hyphens as the source, object,
and listing file names respectively. Anything over three file names is an
error, as is attempting to respecify a file name with the -o or -l keywords.
The primary heap is built from both ends. Symbol table entries
(including labels) and macro text are stored during pass 1. Cross-reference
data is stored during pass 2. Relocation information is also stored during
pass 2, but is cleared at the end of each SECTION. Since it is no longer
needed once dumped, the space is freed for re-use by the next section's
relocation information. The expression parser also uses the primary heap
to store its working stacks - this space is freed as soon as an expression
has been evaluated.
The fixed portion of each symbol table entry occupies 16 bytes. The
labels and macro text occupy just enough space to hold their strings
(including the end-of-string delimiter) - they are all pointed to by fixed
symbol table entries. Relocation entries occupy 10 bytes each.
Cross-reference entries are 12 bytes long - each holds four references to
one symbol. The expression parser creates temporary entries for terms
(10 bytes each) and operators (4 bytes each). Since terms are combined
as soon as possible, the parser almost never needs to store the entire
expression on the heap.
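As a rough worked example of the sizes just given, the sketch below estimates the primary heap space taken by one symbol. The constants come from the text; the function name symbol_cost and the assumptions behind it are hypothetical. It assumes that every reference is held in a 12-byte cross-reference entry, and that a partly filled entry still costs the full 12 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Entry sizes quoted in the text. */
enum {
    SYM_FIXED_BYTES = 16,   /* fixed part of a symbol table entry     */
    XREF_ENTRY_BYTES = 12,  /* one cross-reference entry              */
    REFS_PER_XREF = 4       /* references held by each such entry     */
};

/* Hypothetical estimate of the heap bytes used by one symbol whose name is
   `name_len` characters long and which is referenced `refs` times. */
long symbol_cost(size_t name_len, long refs)
{
    long xref_entries;

    assert(refs >= 0);
    /* round up: a partly filled entry is assumed to cost the full size */
    xref_entries = (refs + REFS_PER_XREF - 1) / REFS_PER_XREF;

    return SYM_FIXED_BYTES
         + (long)name_len + 1          /* name plus end-of-string delimiter */
         + xref_entries * XREF_ENTRY_BYTES;
}
```

Under these assumptions a four-character label referenced five times costs 16 + 5 + 2 * 12 = 45 bytes.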
The diagram below illustrates the layout of the primary heap. High
memory addresses are at the top of the diagram, while low addresses are
at the bottom. The names on the left of the diagram are the names of the
pointers to the various tables within the heap.
Heap + maxheap ------------->  ___________________________
                              |                           |
                              |       Symbol table        |
struct SymTab *SymStart ----> |___________________________|
                              |                           |
                              |     Symbol references     |
struct Ref *RefStart -------> |___________________________|
                              |                           |
                              |      (unused space)       |
char *HeapLim --------------> |___________________________|
                              |                           |
                              |      Relocation data      |
struct RelTab *RelStart ----> |___________________________|
                              |                           |
                              |   Labels and macro text   |
char *Heap -----------------> |___________________________|
Note that the pointers are to various types. This makes for
lots of interesting casts. (Ain't C fun?) Since the relocation
data is cleared at the end of each section, HeapLim will move up and
down. The "high-water mark" is stored in char *HighHeap, which is
used solely to produce the memory usage message at the end of the
assembly. Note that a program may consist of a section containing
many relocatable references, followed by a section with fewer
relocatable references but lots of symbol references. In this case,
RefStart might end up below HighHeap, and the final message would
indicate that more heap space was used than was available. This is
not an error - only if RefStart hits HeapLim will an error be reported.
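The behaviour described above (two regions growing toward each other, a lower region that can be released, and a high-water mark that the upper region may later pass) can be sketched in C as follows. This is a simplified model, not A68k's code: the struct and function names are hypothetical, there are only two regions instead of four, and alignment is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-ended heap. `low` plays the role of HeapLim, `high`
   the role of RefStart, and `high_water` the role of HighHeap. */
struct TwoEndedHeap {
    char *base;        /* lowest address of the heap                  */
    char *low;         /* first free byte above the bottom region     */
    char *high;        /* lowest byte used by the top region          */
    char *high_water;  /* highest value `low` has ever reached        */
};

void heap_init(struct TwoEndedHeap *h, char *memory, size_t size)
{
    h->base = h->low = h->high_water = memory;
    h->high = memory + size;
}

/* Allocate n bytes growing upward; NULL if the regions would collide. */
void *alloc_low(struct TwoEndedHeap *h, size_t n)
{
    char *p = h->low;

    if ((size_t)(h->high - h->low) < n)
        return NULL;
    h->low += n;
    if (h->low > h->high_water)
        h->high_water = h->low;
    return p;
}

/* Allocate n bytes growing downward; NULL if the regions would collide. */
void *alloc_high(struct TwoEndedHeap *h, size_t n)
{
    if ((size_t)(h->high - h->low) < n)
        return NULL;
    h->high -= n;
    return h->high;
}

/* Free everything in the bottom region above `mark`, as happens to the
   relocation data at the end of each section. */
void release_low(struct TwoEndedHeap *h, char *mark)
{
    assert(mark >= h->base && mark <= h->low);
    h->low = mark;
}
```

In a 100-byte heap, allocating 60 bytes at the bottom, releasing them, and then allocating 70 bytes at the top leaves `high` at offset 30, below the high-water mark of 60. No collision ever occurred, yet the two peak figures add up to more than the heap size, which is the situation the paragraph above describes.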
The secondary heap is also built from both ends, but it grows and
shrinks according to how many macros and include files are currently open.
At all times there will be at least one entry on the heap, for the original
source code file.
The bottom of the heap holds the names of the source code file and
any macro or include files that are currently open. The full path is
given. A null string is stored for user macros. Macro arguments are
stored as additional strings, one for each argument in the macro call line.
All strings are stored in minimum space, similar to the labels and user
macro text on the primary heap. File names are pointed to by the fixed
table entries (see below) - macro arguments are accessed by stepping past
the macro name to the desired argument, unless NARG would be exceeded.
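Stepping through strings stored in minimum space, as described above, can be sketched like this. The function macro_arg is a hypothetical name, and returning an empty string when the requested argument number exceeds NARG is an assumption; this document does not say what A68k substitutes in that case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup. `name` points at the stored macro name, a
   null-terminated string (empty for user macros) that is followed
   directly by `narg` null-terminated argument strings.
   Returns argument number n (counting from 1), or "" if n is outside
   the range 1..narg. */
const char *macro_arg(const char *name, int narg, int n)
{
    const char *p = name;
    int i;

    assert(name != NULL && narg >= 0);
    if (n < 1 || n > narg)
        return "";

    /* Step past the name, then past the first n-1 arguments. */
    for (i = 0; i < n; i++)
        p += strlen(p) + 1;
    return p;
}
```

Because each string occupies only its own length plus one delimiter byte, finding argument n costs a short linear scan instead of a table lookup.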
The fixed portion of the heap is built down from the top. Each entry
occupies 16 bytes. Enough information is stored to return to the proper
position in the outer file once the current macro or include file has been
completely processed.
The diagram below illustrates the layout of the secondary heap.
Heap2 + maxheap2 ----------->  ___________________________
                              |                           |
                              |     Input file table      |
struct InFCtl *InF ---------> |___________________________|
                              |                           |
                              |   Parser operator stack   |
struct OpStack *Ops --------> |___________________________|
                              |                           |
                              |      (unused space)       |
struct TermStack *Term -----> |___________________________|
                              |                           |
                              |     Parser term stack     |
char *NextFNS --------------> |___________________________|
                              |                           |
                              |   Input file name stack   |
char *Heap2 ----------------> |___________________________|
The "high-water mark" for NextFNS is stored in char *High2,
and the "low-water mark" (to stretch a metaphor) for InF is stored
in struct InFCtl *LowInF. Again, these figures are used only to
determine the maximum heap usage.
Please send me any bug reports, flames, etc. I can be reached on
Dorean BBS (604/432-8579), Mind Link (604/533-2312), at any Panorama
(PAcific NORthwest AMiga Association) meeting, or via Jeff Lydiatt
or Larry Phillips. (I don't have the time or money to live on
Usenet or CompuServe, etc.)
Charlie Gibbs
#21 - 21555 Dewdney Trunk Road
Maple Ridge, B.C. CANADA
V2X 3G6
P.S. I plan to add 68010/68020 support in the future. Stay tuned.