Super8 Cross Assembler, v1.01
-------------------------------
Overview
--------
The Super8 Cross assembler is the first of the META family of
assemblers. These are very fast, fairly simple & reliable soft-
ware tools based on a tested common core. META assemblers are
absolute, therefore they will not produce relocatable code.
Given that assemblies run at 6,000 to 10,000 lines/minute, this
is not much of a penalty, as it is faster than many linkers.
Invocation:
-----------
The source file name and all options are extracted from the
command line when the cross assembler is invoked. The extension
on the source file must be <.S8>. The command line format is as
follows:
A>xs8 <filename> [/options]
The options, as one might expect, are optional. If no options
are specified, the file <filename>.S8 is assembled, but produces
no output other than to list errors on the console (this is the
fastest way to check for assembly errors). The options are each
a single letter, and they may be used in any combination:
L Generate listing
F Direct listing to <filename>.prn
P Direct listing to printer
S Generate a symbol table
H Generate Intel hex record file to <filename>.hex
T Generate TEK HEX file to <filename>.tek
For example, the line "xs8 mon/LSPH" would take the source
file "mon.s8", produce a listing and a symbol table to the CP/M
LST: device, and generate an output file of "mon.hex". Note that
whenever a listing is generated (via L), it is sent to the con-
sole as well as any other specified devices. F and P are useful
without the L option to list just assembly errors. Note that T
and H are mutually exclusive options. In the case where both are
specified, the right-most determines the type of output file.
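The option rules above can be modelled in a few lines of Python. This is only a sketch of the documented behaviour, not the assembler's own code: the function name parse_options and the settings dictionary are made up for illustration, and accepting lower case letters and rejecting unknown letters are assumptions.

```python
def parse_options(opts: str) -> dict:
    """Model the XS8 option letters described above (illustrative only)."""
    settings = {
        "listing": False,     # L: generate listing
        "to_file": False,     # F: direct listing to <filename>.prn
        "to_printer": False,  # P: direct listing to printer
        "symbols": False,     # S: generate a symbol table
        "output": None,       # H or T: type of output file, if any
    }
    for letter in opts.upper():  # assumption: case does not matter
        if letter == "L":
            settings["listing"] = True
        elif letter == "F":
            settings["to_file"] = True
        elif letter == "P":
            settings["to_printer"] = True
        elif letter == "S":
            settings["symbols"] = True
        elif letter in ("H", "T"):
            # H and T are mutually exclusive; the right-most one wins.
            settings["output"] = "hex" if letter == "H" else "tek"
        else:
            # assumption: the real assembler may treat this differently
            raise ValueError(f"unknown option letter: {letter!r}")
    return settings


print(parse_options("LSPH"))           # the "xs8 mon/LSPH" example
print(parse_options("HT")["output"])   # tek
```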
Symbols:
-------
Symbols (labels and equates) always start in column one, may
optionally be followed by a colon, and may be up to 32 charac-
ters in length (all characters are significant, although the
symbol table will only print the first ten characters). Inter-
nally, everything is upcased when defining or referring to a
symbol, therefore case has no significance other than aesthetics.
Symbols may not start with a digit (0-9), an address mode
specifier (@ or #), or a base specifier (% or $), but it is
permissible to embed these characters. Symbols should not contain
characters which would cause them to be evaluated as expressions,
such as +, -, * or / (e.g. lbl*2 is a bad name). Sticking with
letters and numbers is generally safest.
Note that internally, each symbol table entry requires three
bytes plus the length of the symbol. A typical CP/M system
will have 50-54k of memory available for the symbol table. This
means the longer the symbols, the fewer can be stored. Prac-
tically speaking, though, the difference between 5000 seven char-
acter symbols vs 2500 17 character symbols is academic, as a
program that size would be a little too big for a Super8.
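The arithmetic behind those two figures can be checked with a short Python sketch. It assumes nothing beyond the three-bytes-plus-name-length rule stated above; the names ENTRY_OVERHEAD and table_bytes are illustrative only.

```python
ENTRY_OVERHEAD = 3  # bytes per entry in addition to the symbol name


def table_bytes(count: int, name_length: int) -> int:
    """Memory used by `count` symbols of `name_length` characters each."""
    return count * (ENTRY_OVERHEAD + name_length)


print(table_bytes(5000, 7))    # 50000
print(table_bytes(2500, 17))   # 50000
```

Both cases come to 50,000 bytes, which is why either one just about fills the memory a typical system has available.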
All internal register names on the Super8 are pre-defined sym-
bols. These symbols are reserved, and attempting to create a
symbol with the same name will result in a 'Previously defined
Symbol' error.
Numbers
-------
Numeric values always start with a digit (0-9) or a base speci-
fier ($ for hex, % for binary). This first digit is used to help
the assembler differentiate numbers and symbols. The pseudo op
".RADIX" may be used to change the default number base, which de-
faults to decimal. Unless preceded by a base specifier, hex
numbers (or any base > 10) starting with A-F must be preceded by
a 0. A trailing base specifier ('h' or 'H') may be used to in-
dicate hexadecimal (e.g. 80h).
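As a rough illustration of these rules, here is a Python sketch of a number parser. It is not the assembler's actual routine: parse_number is a hypothetical name, the order in which the checks are made is an assumption, and error handling is reduced to Python's ValueError.

```python
def parse_number(text: str, radix: int = 10) -> int:
    """Parse a numeric literal following the rules described above."""
    if not text:
        raise ValueError("empty literal")
    if text[0] == "$":                 # lead $ means hex
        return int(text[1:], 16)
    if text[0] == "%":                 # lead % means binary
        return int(text[1:], 2)
    if text[0] not in "0123456789":
        # Without a lead digit the assembler would see a symbol, which is
        # why hex values starting with A-F need a leading 0.
        raise ValueError(f"{text!r} does not start with a digit")
    if text[-1] in "hH":               # trailing h or H means hex
        return int(text[:-1], 16)
    return int(text, radix)            # otherwise use the default base


print(parse_number("$AA55"))           # 43605
print(parse_number("%1010"))           # 10
print(parse_number("80h"))             # 128
print(parse_number("0FF", radix=16))   # 255
```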
Strings
-------
Strings are used to embed text using one of the four pseudo ops
db, .ascii, .ascil or .ascic. Either single or double quotes may
be used as delimiters, however the lead and trailing delimiters
must match. The other delimiter character may be used inside the
string (eg. "jeff's code").
C-style escape sequences are allowed inside strings to embed
special characters. The following are supported:
\q The string delimiter (' or ")
\n Carriage return/line feed
\r Carriage return
\f Form feed
\t Tab
\b Backspace
\z Clear screen (control Z)
\' The single quote character (')
\" The double quote character (")
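The table above translates into the following Python sketch, which turns the body of a string (the text between the delimiters) into bytes. The byte values are the standard ASCII control codes. The name decode_string is hypothetical, and the handling of an unknown escape or a trailing backslash is an assumption, since the text does not say what XS8 does in those cases.

```python
ESCAPES = {
    "n": b"\r\n",   # carriage return/line feed
    "r": b"\r",     # carriage return
    "f": b"\x0c",   # form feed
    "t": b"\t",     # tab
    "b": b"\x08",   # backspace
    "z": b"\x1a",   # control Z (clear screen)
    "'": b"'",      # single quote
    '"': b'"',      # double quote
}


def decode_string(body: str, delimiter: str) -> bytes:
    """Expand the escape sequences listed above inside a string body."""
    out = bytearray()
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            code = body[i + 1]
            if code == "q":
                out += delimiter.encode("ascii")  # \q is the delimiter in use
            elif code in ESCAPES:
                out += ESCAPES[code]
            else:
                # assumption: the real assembler may behave differently
                raise ValueError(f"unsupported escape: \\{code}")
            i += 2
        else:
            out += ch.encode("ascii")
            i += 1
    return bytes(out)


print(decode_string(r"hi\n", '"'))        # b'hi\r\n'
print(decode_string(r"say \qok\q", '"'))  # b'say "ok"'
```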
Expressions, (Demo Version)
---------------------------
Expressions work on values (symbols or numbers) and operators.
All math is 16-bit binary. In the demo version, expressions
may be in only two forms:
1) <unary op> <value> [eg: -label]
2) <value><binary op><value> [eg: label+4]
Unary operators are:
+ unary plus (eg. +value)
- unary minus (eg. -value)
^HB high byte (eg. ^HB value)
^LB low byte (eg. ^LB value)
^C complement (eg. ^C value)
Binary operators are:
+ add (eg. value+23)
- subtract (eg. $-value)
A special symbol is '$', which is used to reference the cur-
rent value of the location counter (thus 'jr $' would be an end-
less loop).
Expressions, (Commercial Version)
----------------------------------
A major subset of the asmS8 operators are included in XS8.
Arithmetic expressions can be as long as 16 bits, and are parsed
left to right. Unary operators must be followed by either
another unary operator, a number or symbol, or by an expression
in parentheses.
Operator   Precedence   Function
--------   ----------   --------
^HB            2        ; high byte
^LB            2        ; low byte
^C             2        ; 1's complement
^REV           2        ; byte reverse
*              4        ; Multiply
/              4        ; Divide (unsigned)
MOD            4        ; Modulo
SHL            4        ; Shift left
SHR            4        ; Shift right
+              5        ; Addition
-              5        ; Subtraction
AND            6        ; bitwise and
&              6        ; bitwise and
OR             7        ; bitwise or
|              7        ; bitwise or
XOR            7        ; bitwise exclusive or
=              8        ; Equal
>              8        ; Greater than
<              8        ; Less than
>=             8        ; Greater than or equal
<=             8        ; Less than or equal
<>             8        ; Not equal
Parentheses may be used to alter precedence of evaluation. The
forms [expression] and {expression} are also valid. Note that
with some instructions (ld, lde, ldc), the presence of a '('
indicates base index addressing. In such cases, use one of the
other forms. Parentheses may be nested without limit.
One character operators and all relational operators do not
require surrounding white space (e.g. gork+3 is equivalent to
gork + 3). Operators starting with ^ require no leading space.
All relational operators output 0 for false, -1 for true.
For more details on usage, see the example source file MATH.S8.
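To make the 16-bit behaviour concrete, here is a small Python model of a few of the operators. It follows the descriptions in this section only; the function names are invented for the sketch and it has not been compared against the assembler's output.

```python
MASK = 0xFFFF  # all math is 16-bit


def high_byte(value: int) -> int:    # ^HB
    return (value & MASK) >> 8


def low_byte(value: int) -> int:     # ^LB
    return value & 0xFF


def complement(value: int) -> int:   # ^C, 1's complement
    return ~value & MASK


def relational(result: bool) -> int:
    """Relational operators give 0 for false and -1 for true."""
    return MASK if result else 0     # -1 is $FFFF in 16 bits


print(hex(high_byte(0xAA55)))        # 0xaa
print(hex(low_byte(0xAA55)))         # 0x55
print(hex(complement(0x00FF)))       # 0xff00
print(hex(relational(3 < 4)))        # 0xffff
```

Because true is all ones, a relational result can be combined with AND to keep or discard a value: relational(3 < 4) & 0x1234 gives 0x1234, while a false result gives 0.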
**************
* Pseudo ops *
**************
Note that pseudo ops can not start in column one, or they will
be considered labels.
.ASCIC <string>
---------------
Similar in function to .ASCII, but sets bit 7 in the last char-
acter in the string expression. If the statement has more than
one string, the last character in each will have bit 7 set.
.ASCII <string>
---------------
Define ascii string. .ASCII allows one or more ascii strings
to be defined in memory. See the section on strings for infor-
mation about escape sequences and delimiters. More often than
not, it is easier to use the DB statement.
.ASCIL <string>
---------------
Yet another variation of .ASCII, .ASCIL precedes the string
with a length byte equal to the number of characters in the fol-
lowing string. As with .ASCIC, multiple strings in one .ASCIL
statement will act as if they were each in their own .ASCIL
statement (i.e. each will be preceded by a length byte).
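The difference between the two variants is easiest to see in the bytes they produce. The Python sketch below models the descriptions of .ASCIC and .ASCIL given above; the function names are illustrative, and skipping an empty string in ascic is an assumption.

```python
def ascic(*strings: str) -> bytes:
    """.ASCIC: set bit 7 in the last character of each string."""
    out = bytearray()
    for s in strings:
        data = bytearray(s.encode("ascii"))
        if data:                      # assumption: empty strings emit nothing
            data[-1] |= 0x80
        out += data
    return bytes(out)


def ascil(*strings: str) -> bytes:
    """.ASCIL: precede each string with a one-byte length."""
    out = bytearray()
    for s in strings:
        data = s.encode("ascii")
        out.append(len(data))         # raises ValueError past 255 characters
        out += data
    return bytes(out)


print(ascic("OK").hex())        # 4fcb
print(ascil("OK", "A").hex())   # 024f4b0141
```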
.LIST
-----
See .XLIST.
.RADIX <value>
-------------
Sets the default number base. If <value> is a number, it will
be treated as decimal regardless of current system number base.
.XLIST
-----
If a listing is being generated, .XLIST will suspend the
listing until a .LIST is encountered.
DB <string or expressions>
--------------------------
Define Byte. Used to define data areas in memory. One or more
numeric expressions (separated by commas) may be defined. Exp-
ressions > 255 are truncated to eight bits. Strings may also be
used, in which case each character of the string defines a byte.
Note that DB can do what .ascii can do, but not vice versa
(.ascii handles only strings).
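A short Python sketch of the bytes DB lays down, with DW (described further on) included for comparison. It models only what this document says: DB truncates to eight bits, and DW stores the high byte at the lower address. The names db and dw are made up for the illustration.

```python
def db(*items) -> bytes:
    """DB: one byte per expression, one byte per string character."""
    out = bytearray()
    for item in items:
        if isinstance(item, str):
            out += item.encode("ascii")
        else:
            out.append(item & 0xFF)   # values > 255 are truncated to 8 bits
    return bytes(out)


def dw(*values: int) -> bytes:
    """DW: two bytes per expression, high byte at the lower address."""
    out = bytearray()
    for value in values:
        out += (value & 0xFFFF).to_bytes(2, "big")
    return bytes(out)


print(db(0x1234, "Hi").hex())   # 344869
print(dw(0xAA55).hex())         # aa55
```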
DS <value>
-----------
Define storage. This statement is used to reserve some number
of bytes by advancing the location counter by <value>. DS state-
ments produce no code, therefore no .HEX or .TEK output. The
statement ORG $+<value> has the same effect.
It should be noted that in order to use base index addressing
with the LDC and LDE instructions, the offset must be defined in
pass one before encountering the instruction (otherwise the
assembler would not know whether to use the short or long forms
of these instructions). For this reason, it is not a bad idea to
define storage at the beginning of a program rather than at the
end.
DW <value>
----------
Define word. Used to define data areas in memory. One or more
numeric expressions (separated by commas) may be defined. The
byte order used is high byte at the lower address, thus the line
"ADR: DW $AA55" would place $AA at ADR and $55 at ADR+1.
EQU <value>
-----------
Sets the most recently defined symbol to <value>. Normally,
this symbol is declared in the same line as the equ statement.
END
---
Used to mark the end of the source code, END also keeps track
of pass 1 and pass 2 end addresses, and generates an error mes-
sage if they change. This error should only be caused by some
other error. If a program assembles with only this error, it is
due to a bug in the assembler and I would appreciate getting my
hands on any source files that cause such errors.
INCLUDE <filename> (commercial version only)
--------------------------------------------
The include pseudo op allows some other file to be included in
the file being assembled. The effect is the same as pulling in
<filename> at the location of the include. If <filename> is not
in upper case, it will be upcased internally prior to opening the
file. No default extension is provided by the assembler,
<filename> may be any legal CP/M file. If <filename> can not be
opened, the assembler will abort.
In the current implementation, include files may be nested
without limit as long as there is stack space. Future implemen-
tations will support at least four levels of nesting.
ORG <value>
------------
Sets the location counter to <value>. The default location
counter value is zero.
* * *
asmS8 compatibility
-------------------
The Zilog Super8 cross assembler 'asmS8' (which is not avail-
able for CP/M) differs from XS8 in several areas. Porting source
between the two is not difficult given a good wordprocessor, as
long as you are aware of the following:
1) Zilog uses a different scheme for representing number
bases. A lead '%' means hexadecimal, and '%(n)' means base
N. XS8 uses a lead % in the more conventional way to
indicate binary, and a lead '$' or trailing 'h' to indi-
cate hex.
2) Modules are not supported under XS8. To convert to XS8,
comment out all .BEGIN and .END statements. This prohib-
its local symbols, and if there are duplicate symbol
names, the assembler will issue error messages. By taking
a listing of the errors, and referring to the commented
out .BEGIN and .END statements, it is not difficult to
change what needs changing.
3) XS8 is not a relocating assembler. This means programs
written as several source files and linked must be con-
catenated to be assembled, as well as all .GLOBAL and
.EXTERN statements removed. If the concatenated file is
too big to handle easily, one solution if you have the
commercial version is to use INCLUDE statements in the
first file to perform the concatenation.
4) The expression handler under the demo version of XS8 is
not very sophisticated. Some of this can be resolved by
using a few more EQU's to do the math. For any work
involving more advanced expressions, purchase of the
commercial version is strongly recommended.
5) XS8 is not a macro assembler. Any macros must be
manually expanded.
6) According to the test listing on page E-11 of the users
guide, it appears that asmS8 will not correctly assemble
LDC or LDE instructions when using the direct addressing
mode. Given that I am rather fond of using memory vari-
ables as well as register variables, XS8 will correctly
assemble these instructions. Be aware of this error if
you are trying to get the output of XS8 to match that of
asmS8. If you are curious, the monitor in the Super8
contest board does not use this instruction in this form,
which may explain why an unnecessarily large number of
registers are tied up by the monitor.
7) In the monitor source there is a string at the label
BRK_MSG which is delimited by a ` (grave). This seems to
add $60 $0A to the end of the string, which is useless, as
the string is delimited by the \r. Use of the grave in
this fashion is not documented in the asmS8 manual. The
grave is not a valid string delimiter under XS8.
8) Many (if not all) of the Zilog pseudo ops begin with a
period. Most under XS8 (such as EQU, DB, DW, ORG, etc) do
not.
* * *
Sample code
-----------
Two source files are provided as examples for using XS8. The
first, T.S8, contains all instructions in all forms and may be
used to validate this assembler (note the LDC error on asmS8
mentioned earlier). The second program (MON.S8) is a replacement
monitor for the Super8 demo board, basically the same as the
monitor that came with it, with the following modifications:
1) It doesn't clobber the user program's working registers
2) The user's stack pointer is maintained
3) A single step function (T) has been added
4) The routine HOST_DELAY has been disabled
5) Breakpoint set routine has been debugged
6) W command also shows user stack
7) Monitor uses external stack, preserving registers
8) EMT is set up for a 250 ns rom.
9) Various routines have been cleaned up.
10) Baud rate is set at 19,200
The HOST_DELAY routine is called in many places to allow the
IBM-PC to catch up with the monitor. As CP/M computers as a rule
are not as grossly inefficient as IBM-PC's in performing i/o,
this annoying and unnecessary delay was disabled. If you are
using a CP/M emulator on a PC, and intend to use HOST.EXE, you
will have to re-enable HOST_DELAY. See the source for details.
If you have need of a new monitor prom and can not program one
yourself, contact me for details on obtaining one.
Another thing that should be mentioned is that HOST.EXE off-
loaded the in line assembler to the host computer. This function
has not yet been incorporated into the monitor rom.
Performance
-----------
The two programs just mentioned provide good bench marks of
this assembler. On a 4 MHz Z-80 system running CP/M 3.0 off an
ST-225 hard disk, the following results were obtained:
                    T.S8            MON.S8
Assembly time:      3.0 seconds     86 seconds
Lines assembled:    500             7744
Lines/minute:       10,000          6,114
MON.S8 assembles at a slower rate for two reasons: 1) It con-
tains a large number of comments, thus increasing disk i/o time.
2) It contains a large number of symbols. The time required to
search the symbol table is a linear function of the number of
elements in the table.
In any case, XS8 is faster than asmS8 under MS-DOS by a good
order of magnitude or better, due in part to the fact the 8088
processor is not a very good machine to run 'C' on. To assemble
and link the four monitor files into a hex file takes 12:02 on a
Leading Edge Model D w/ST-225 hard disk (one of the faster PC-XT
clones). 3:08 of this is just for linking (twice as long as the
entire assembly on the CP/M machine). It would appear from these
numbers that a low-end floppy-based CP/M system running XS8 would
be roughly comparable in performance to an 80286 machine running
asmS8.
In case you think I must be some sort of super hack, I should
point out that assemblers from other companies that are available
for both CP/M and MS-DOS (eg. Avocet's XASM05 for the 6805)
perform in very similar ratios. XS8 is actually optimized for
compactness and simplicity of code rather than for speed.
******************
* Error Messages *
******************
Can't Open File
---------------
More often than not, this is caused by the source file not
being present (remember, it must have a .S8 extension). This
error can also be caused by being unable to open an output file
(.hex, .prn or .tek) due to the directory being full, or the disk
being in r/o mode. This error is fatal, and will immediately
abort the program.
Disk Full
---------
Self explanatory.
Previously defined symbol
-------------------------
Rather self explanatory. Be aware that at the start, there are
50 some odd register names in the symbol table (they do not show
in the table listing, however). When a duplicate symbol is en-
countered, it is ignored, and the first symbol remains valid.
Unrecognized op code
--------------------
Having something in the op code field that is neither an op
code nor a pseudo op will cause this error.
Bad number encountered
----------------------
Self explanatory. Something like add r14,47xyz will cause
this error.
Value expected
--------------
Self explanatory.
Operand syntax error
--------------------
If an error is encountered in parsing the arguments to an op-
code, this error is generated.
Symbol too long
---------------
If a symbol is more than 16 characters in length, this error is
generated.
Missing END statement
---------------------
An END statement is required at the end of a source file.
Offset not defined on pass1
---------------------------
As mentioned earlier, the offset used with LDC & LDE in the
base index addressing mode must be evaluable in pass 1 for the
assembler to know if the short or long form of the instruction is
to be used.
Relative jump out of range
-------------------------
A relative jump can only range -128 to +127 from the address
following the instruction. Trying to go further results in this
error.
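The range check can be written out as a short Python sketch. The name relative_offset is hypothetical, and the default instruction length of two bytes (op code plus displacement) is an assumption that holds for a plain jr; pass a different length for longer instructions.

```python
def relative_offset(instr_addr: int, target: int, instr_len: int = 2) -> int:
    """Return the displacement byte for a relative jump, or raise if the
    target is out of range. The offset is measured from the address
    following the instruction."""
    offset = target - (instr_addr + instr_len)
    if not -128 <= offset <= 127:
        raise ValueError("Relative jump out of range")
    return offset & 0xFF   # stored as a two's complement byte


# 'jr $' jumps to itself: the offset is -2, stored as $FE
print(hex(relative_offset(0x100, 0x100)))   # 0xfe
print(hex(relative_offset(0x100, 0x181)))   # 0x7f, the furthest forward jump
```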
Symbol not found
----------------
Self explanatory. Certain ops like EQU, ORG, and LDC/LDE must
be able to evaluate symbols on pass 1 as well as pass 2.
Too many Symbols! Out of memory!
--------------------------------
It takes one hell of a lot of symbols to do this, but if you
do, this is the error message you get.
Missing "
---------
If an expression contains an improperly terminated string, this
is the error you will get.
End address changed on pass 2
-----------------------------
Pass 1 figures out how much memory each line of code will take,
and defines all the symbols. Pass 2 actually generates the code.
If the location counter at the end changes from pass one to pass
two, something went wrong.
* * *
Royalties, etc.
---------------
If you are reading this, you no doubt obtained it directly or
indirectly through a BBS system. The topic of 'share ware' is
very controversial these days, and I do not wish to muddy the
waters further by asking you to send me money, especially if your
intention is just for hobbyist use.
On the other hand, if you would like to be kept posted on im-
provements, updates, and bug fixes on this product, if you need
some support, or if you are a commercial user, it is strongly
advised that you purchase the commercial version.
The commercial version of this assembler has a few more bells
and whistles, most notably 1) a full blown expression evaluator,
2) support for nested INCLUDE files, 3) support for conditional
assembly and 4) support for threaded language generation. The
current price for this product is $150.00. Also under
development is a multi-tasking ultra high speed FORTH tool kit
for the Super8. For more information, please contact:
Jeffrey D. Wilson
96 E. Broad Street
Bergenfield, NJ 07621
(201) 384-1596