Last changes made on 28-Sep-91                              Version 0.99b

 ####  #      ####   #####  #####
#      #     #    # #      #         The Clarke Assembler
#      #     ######  ####   ####
#      #     #    #      #      #    (C)1991 Lutz Vieweg
 ####  ##### #    # #####  #####

------------------ Manual ------------------------


0 Preface
---------

Please excuse linguistic mistakes in this text. English is not my native
language. If you are not great at English, enjoy the easy words that are
used; if your native language is English, think of what it would be like if
you had to translate this text from German...

All the following text is up to date only as of the 27th of April 1991. The
research goes on and on, there may be mistakes, the hardware may change...
etc.

This manual is written for people who have some experience in programming
assembler on any computer. The HP48sx is not the right processor to learn
assembler.

Of course there are NO (0.00, zero) warranties that there's one single true
word in this text. Your monitor may blow up while you read it, your grandma
may die instantly having taken a glance at it, I cannot take responsibility
for anything that happens with this text.


1.0 The Clarke-CPU Architecture
-------------------------------

The Clarke processor is a child of the "Saturn" processor family. The Saturn
processor family is a rather old one. As far as I know it was first used in
HP's 71, later modified and customized for usage in the HP19B, 28C, 28S, 10B,
20S, 21S and finally 48sx.

1.1 The Saturn-CPU's characteristics
------------------------------------

- CMOS technology (low power consumption)
- 4 bit data-bus (memory organized in nibbles)
- 20 bit address-bus (able to address 1 mega-nibble = 512kB unpaged)
- Micro-code based instruction-set
- 4 not-omni-purpose 64-bit data-registers (called a,b,c,d)
- 5 64-bit "scratch"-registers (called r0,r1,r2,r3,r4)
- 2 20-bit address-registers (called d0,d1)
- one 4-bit "field-pointer" register (called p)
- one 20-bit program counter
- 8-level return-address stack, 20-bit each level
- one 16-bit status-register (called st)
- one 4-bit hardware-status-register (called hst)
- one 16-bit I/O input-register (called in)
- one 12-bit I/O output-register (called out)
- a carry flag
- operating in either HEX-mode or DEC-mode


1.2 Clarke-specific features
----------------------------

The Clarke processor runs up to 2000000 cycles per second. Actually, the
processor-clock runs at a speed of about 1.96 MHz in the HP48sx.

Some new instructions have been implemented in the processor's micro-code,
compared to the 28s' CPU:

  RSI
  MOVE.dd  #$xxx,a      (needs more nibbles than MOVE.dd #$xxx,c !)
  MOVE.fs|a a|c,rx
  MOVE.fs|a rx,a|c
  EXG.fs|a a|c,rx
  ADD.fs|a #$x,a|b|c|d
  SUB.fs|a #$x,a|b|c|d
  LSR.fs|a #1,a|b|c|d
  BCLR     #$x,a|c
  BSET     #$x,a|c
  BBC      #$x,a|c,label
  BBS      #$x,a|c,label
  JMP      (a)
  JMP      a|c
  move.a   pc,a|c
  exg.a    a|c,pc
  BUSCD


1.3 Register usage
------------------

The data-registers are the ones that are used to hold values, do arithmetic
on them or move them elsewhere. There is some kind of hierarchy between the
four registers, because it is not possible to use the same instructions on
each of them, as it is with the 680xx-family's data-registers. From "good"
to "bad" this hierarchy is c,a,b,d. There are a lot of addressing modes
available only to the c or a register...

There are no differences in the way the processor supports each of the five
"scratch"-registers. I set the quotation marks because these registers are
very important for machine-language programmers, even if their name suggests
that they only hold "scratch".

The two address-registers are also equivalent; they are used to hold the
address of any location you want to access in memory. Note that there is no
"absolute" addressing available if you want to move data around in memory.

The p register has no counterpart on other machines. The normal usage of
this register will become clear if you look at the "size extensions"
chapter. But you can also use this register to hold one nibble of any kind
of data. There are some strange addressing modes using this register.
Because of its meaning to the move.dd #expr,a.p|c.p instruction it is
usually reset to zero after it has been changed and used.

The program-counter does not need to be discussed here. It just holds the
address where execution currently takes place.

The stack is much too small!!!! I really don't know how people can wish to
save silicon by sizing the stack to 8*20 bit. You'll have to place your own
software-handled stack into memory when you wish to run complex
machine-language programs.

The status register consists of 16 bits, which are used by the
operating-system to hold important flags.

The hardware-status-register consists of four bits, having the following
meanings:

  Bit 0  XM  Module missing (set by opcode $00, "empty memory")
  Bit 1  SB  Sticky bit (used as some sort of carry for shifting operations)
  Bit 2  SR  Service request (set if an I/O device needs "service")
  Bit 3  MP  Module pulled (really don't know the sense of this bit yet)

The input and output registers are used to transfer data from and to
"devices" that are connected "daisy-chained" to the I/O bus. I guess HP has
decided to keep this chain rather short: in the HP48sx I know of only two
devices accessed via the I/O registers (piezo-beeper and keyboard), and
otherwise HP uses memory-register based I/O in the 48sx, as is done in most
computers...

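As a small illustration of the hardware-status-register layout listed above,
the four bits can be written down as masks. This is only a sketch in Python
(it is not CLASS source); the names HST and describe are made up for the
example, and the bit meanings are simply copied from the list.

```python
from enum import IntFlag


class HST(IntFlag):
    """Bits of the 4-bit hardware-status register, as listed in the manual."""
    XM = 1 << 0  # Module missing
    SB = 1 << 1  # Sticky bit
    SR = 1 << 2  # Service request
    MP = 1 << 3  # Module pulled


def describe(hst_value: int) -> list[str]:
    """Return the names of the bits set in a 4-bit hst value (hypothetical helper)."""
    return [flag.name for flag in HST if hst_value & flag]


print(describe(0b0101))  # bits 0 and 2 set -> ['XM', 'SR']
```

A mask such as HST.XM | HST.SR has the value 5, which is the kind of number
you would write as #expr in an instruction that takes an hst bit mask.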
The carry-flag is the only conventional flag that exists in the Clarke CPU.
It is used as on any other processor I know...


2.0 The CLASS Assembler's Mnemonics
-----------------------------------

The mnemonics used by the CLASS-Assembler do not equal those published by
HP in the IDS-manual. HP's mnemonics were a real horror to every guy who had
programmed machine-language once before. I decided to use mnemonics that are
similar to the ones used by the 680xx-family assemblers. This is not the same
as the BABEL command-set used by some former assemblers, but I did not like
to use that one because I think it is rather inconsistently structured. You
may define macros to make BABEL source-codes processable by CLASS.

A typical CLASS source-line looks like this:

  label   mnemonic.size parameter,parameter,parameter   ; comment

where label is a typical assembler-label which can be of any length up to
32767 bytes. Of course, the label should not be equal to any register- or
other special name.

Mnemonic is just the command that should be translated into machine-code,
with respect to the following size and/or parameters.

The size plays a very important role in the Clarke-processor's machine
language. There are a lot of possible sizes available, but of course not
every size is available for every opcode/addressing mode combination.


2.1 Size extensions:
--------------------

The great variety of possible size-extensions seems confusing and often
useless. But remember, the Saturn-architecture was developed to work with
64-bit BCD encoded floating-point-values, and therefore the size-extensions
really make sense.

.1, .2, .3, .... .16
        A size extension that consists of a decimal number in the range
        from one to sixteen gives just the number of nibbles to be
        processed by the opcode. There are only a few commands that can
        handle one to sixteen nibbles, but this size extension is also used
        by some pseudo-opcodes described later. This size extension type
        will be abbreviated '.dd' in the following.

.a      One of the most used size-extensions is .a, due to its meaning for
        address-handling. This extension is available to many commands. It
        sets the size to be processed to 20 bits (5 nibbles).

.p      The .p extension tells the command to process the nibble pointed to
        by the p register, therefore accessing one nibble only.

.wp     This extension tells the command to access the nibbles 0 to p of
        the specified register(s).

.xs     Accesses the exponent sign: Nibble #2 of the registers.

.x      Accesses the whole exponent: Three nibbles, nibble 0 to nibble 2 of
        the registers.

.s      Accesses the sign: Nibble 15 of the registers.

.m      Accesses the mantissa: 12 nibbles, nibble 3 to nibble 14 of the
        registers.

.b      Accesses one byte: 2 nibbles, nibble 0 to nibble 1 of the
        registers. These two nibbles are used to hold the exponent (without
        the exponent sign).

.w      Accesses one complete register or floating-point value: 16 nibbles,
        nibble 0 to nibble 15 of the registers.

If you access the memory by using one of the address-registers, the
specified size does not affect the location in memory.

  MOVE.s   (d0),c
  MOVE.xs  c,(d0)

both access the same address in memory, but they use different nibbles of
the c register.

In most cases, CLASS will accept the following equivalents:

  .2  = .b
  .3  = .x
  .4  = .as
  .5  = .a
  .16 = .w


2.2 Size extension graphical summary:
-------------------------------------

Assuming register p to contain the value 9

 --------- Nibble of register ------------------
 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00   Size-extension

 .. .. .. .. .. .. .. .. .. .. .. ** ** ** ** **   .a
 .. .. .. .. .. .. ** .. .. .. .. .. .. .. .. ..   .p
 .. .. .. .. .. .. ** ** ** ** ** ** ** ** ** **   .wp
 .. .. .. .. .. .. .. .. .. .. .. .. .. ** .. ..   .xs
 .. .. .. .. .. .. .. .. .. .. .. .. .. ** ** **   .x
 ** .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..   .s
 .. ** ** ** ** ** ** ** ** ** ** ** ** .. .. ..   .m
 .. .. .. .. .. .. .. .. .. .. .. .. .. .. ** **   .b
 ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** **   .w

The following entries are only valid for the MOVE.dd #xxx,a.p|c.p command,
still assuming p to be 9

 .. .. .. .. .. .. ** .. .. .. .. .. .. .. .. ..   .1
 .. .. .. .. .. ** ** .. .. .. .. .. .. .. .. ..   .2
 .. .. .. .. ** ** ** .. .. .. .. .. .. .. .. ..   .3
 .. .. .. ** ** ** ** .. .. .. .. .. .. .. .. ..   .4
 .. .. ** ** ** ** ** .. .. .. .. .. .. .. .. ..   .5
 .. ** ** ** ** ** ** .. .. .. .. .. .. .. .. ..   .6
 ** ** ** ** ** ** ** .. .. .. .. .. .. .. .. ..   .7
 ** ** ** ** ** ** ** .. .. .. .. .. .. .. .. **   .8
 ** ** ** ** ** ** ** .. .. .. .. .. .. .. ** **   .9
 ** ** ** ** ** ** ** .. .. .. .. .. .. ** ** **   .10
 ** ** ** ** ** ** ** .. .. .. .. .. ** ** ** **   .11
 ** ** ** ** ** ** ** .. .. .. .. ** ** ** ** **   .12
 ** ** ** ** ** ** ** .. .. .. ** ** ** ** ** **   .13
 ** ** ** ** ** ** ** .. .. ** ** ** ** ** ** **   .14
 ** ** ** ** ** ** ** .. ** ** ** ** ** ** ** **   .15
 ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** **   .16


2.3 Parameters
--------------

The Clarke-processor's commands use up to three parameters. The following
types of parameters are supported by CLASS:

label    A string that names a label, either a numerical one or one that
         is an offset from the beginning of the object-code.
         Example: JMP label

#expr    An expression that is calculated while assembling.
         See also chapter 2.4
         Example: BSET #expr,st

a|b|c|d  One of the four data-registers a,b,c and d.
         The '|' sign is used as a logical OR in the following.
         Example: MOVE a,c

c.dd     The nibble pointed to by dd (decimal number 0 to 15) in the c
         register

c.p      The nibbles of the c register from the one p points to upwards

a.p      The nibbles of the a register from the one p points to upwards

rx       One of the five scratch-registers r0,r1,r2,r3 and r4

p        The p register

st       The status-register

hst      The hardware-status-register

in       The I/O input-register

out      The I/O output-register


2.4 Expression evaluation
-------------------------

Expressions may consist of labels, absolute values, operators and
parentheses. Expressions are evaluated without any kind of operator
hierarchy; you have to set parentheses where required. This behaviour may
change in future versions of CLASS.

Numerical labels and those that hold an offset from the object's beginning
are treated the same in the current version. This may also change in the
future.


2.4.1 Operators
---------------

Currently, only a few operators are supported:

  +   Addition
  -   Subtraction / negative sign
  *   Multiplication
  /   Division
  |   OR
  &   AND
  \   XOR


2.4.2 Absolute values
---------------------

Decimal numbers begin with a digit from 0 to 9.             1234
Hexadecimal numbers are prefixed with a dollar sign.        $12ab
Binary numbers are prefixed with a per-cent sign.           %01101
ASCII-codes are written between quotation marks.            "xy12"


2.5 Addressing modes
--------------------

The following is a complete (?hope so?)
list of the available addressing modes:

  #expr            Absolute value
  label            Absolute address
  a|b|c|d          Data-register direct
  (a)              Data-register indirect (used in JMP (a) only)
  d0|d1            Address-register direct
  (d0)|(d1)        Address-register indirect
  r0|r1|r2|r3|r4   Scratch-register direct
  p                P-register direct
  st               Status-register direct
  hst              Hardware-status-register direct
  in               I/O input-register direct
  out              I/O output-register direct
  c.dd             Data-register c with special nibble-pointer
  a|c.p            Data-register from nibble pointed to by p register upwards


2.6 Mnemonic summary
--------------------

Many of the mnemonics listed here share the format

  mnemonic.size source,dest

Notice that this is similar to the 680xx family's assembly format, and the
opposite of Intel's.

The following abbreviations will be used:

  fs   Field selector. One of the extensions
       .p, .wp, .xs, .x, .s, .m, .b, .w
  rx   One of the five scratch-registers r0,r1,r2,r3,r4

Notice that commands that use the .a size-extension need fewer nibbles to
be coded in most cases.


2.6.1 Data transfer instructions
--------------------------------

MOVE
----

The most used mnemonic of all is the move-command. There are a lot of
addressing modes available, but they are not very similar to the addressing
modes used by other processors. In fact, very potent ones like indexed
addressing are missing.

Available:

  move.fs|a     b|c      ,a
  move.fs|a     a|c      ,b
  move.fs|a     a|b|d    ,c
  move.fs|a     c        ,d
  move.3        c        ,st        ; only 12 of the 16 st-bits are affected!
  move.3        st       ,c
  move.a|4      a|c      ,d0|d1     ; move.a|4 d0|d1,a|c does not exist!!!
                                    ; use exg.a|4 a|c,d0|d1 if required
  move.fs|a|dd  a|c      ,(d0|d1)   ; .a and .b need fewer nibbles
  move.fs|a|dd  (d0|d1)  ,a|c       ; .a and .b need fewer nibbles
  move.w        a|c      ,rx
  move.w        rx       ,a|c
  move.fs|a     a|c      ,rx        ; these two need more nibbles than the
  move.fs|a     rx       ,a|c       ; previous - they were implemented later
  move.4        in       ,a|c       ; this instruction fails if it is executed
                                    ; on an odd address.
                                    ; A bug in the microcode
  move.3        c        ,out
  move.s        c        ,out       ; out high-nibble affected
  move.1        p        ,c.dd
  move.1        c.dd     ,p
  move.1        #expr    ,p
  move.a        #expr    ,d0|d1
  move.ao       #expr    ,d0|d1     ; read "RELTAB" description
  move.as       #expr    ,d0|d1
  move.b        #expr    ,d0|d1
  move.dd       #expr    ,a.p|c.p   ; this instruction is used so often that I
                                    ; decided to allow 'a' or 'c' as destination
                                    ; take care of the contents of p !
  move.ao       #expr    ,a.p|c.p   ; read "RELTAB" description
  move.a        pc       ,a|c

EXG
---

EXG exchanges the contents of two registers. It makes no difference which
register is given first: exg.w c,r0 = exg.w r0,c and so on...
EXG is a very common instruction on the Clarke processor: it allows you to
use the scratch-registers as quickly accessible data storage, and it is the
only way to move an address out of an address-register.

Available:

  exg.w         c|a      ,rx
  exg.fs|a      c|a      ,rx        ; this one needs more nibbles than the
                                    ; previous - it was implemented later
  exg.a         c|a      ,d0|d1
  exg.fs|a      a|b|c    ,a|b|c
  exg.fs|a      c        ,d
  exg.1         p        ,c.dd
  exg.a         a|c      ,pc

PUSH
----

PUSH "pushes" a 20-bit value from register c onto the stack.

Available:

  push

POP
---

POP transfers the top 20 bits of the stack into the lower 20 bits of
register c.

Available:

  pop


2.6.2 Arithmetic instructions
-----------------------------

ADD
---

ADD is just what it is on every micro-processor: It adds two values and
stores the result in a register. There are powerful increment instructions
similar to the ADDQ instruction in the 680xx family, but I decided to use
the same mnemonic, add, for them.

Available:

  add.a         #expr    ,d0|d1     ; expr must be >=1 and <=16
  add.fs|a      #expr    ,a|b|c|d   ; expr must be >=1 and <=16. In fact,
                                    ; only the .b, .x, .a, .m and .w extension
                                    ; work properly on this instruction.
  add.fs|a      a|b|c    ,a|b|c     ; adding a register to itself is allowed!
  add.fs|a      c|d      ,d|c       ; It's then just a shift left by one bit!
  add.a         p+1      ,c         ; This one is always done in hex-mode
                                    ; Wonderful to combine with .wp size extension

SUB
---

SUB subtracts the first parameter from the second and stores the result in
the register that has to be specified as the second parameter.

Available:

  sub.fs|a      a|b|c    ,a|b|c
  sub.fs|a      c|d      ,d|c
  sub.a         #expr    ,d0|d1     ; expr must be >=1 and <=16
  sub.fs|a      #expr    ,a|b|c|d   ; expr must be >=1 and <=16. In fact,
                                    ; only the .b, .x, .a, .m and .w extension
                                    ; work properly on this instruction.

SUBR
----

SUBR is not available on the 680xx-family. It subtracts the second parameter
from the first and stores the result in the register that has to be
specified as the second parameter.

Available:

  subr.fs|a     b        ,a
  subr.fs|a     c        ,b
  subr.fs|a     a        ,c
  subr.fs|a     c        ,d

INC
---

INC increases a specified register by one.

Available:

  inc.fs|a      a|b|c|d
  inc.1         p
  inc.a         d0|d1               ; identical to add.a #1,d0|d1

DEC
---

DEC decreases a specified register by one.

Available:

  dec.fs|a      a|b|c|d
  dec.1         p
  dec.a         d0|d1               ; identical to sub.a #1,d0|d1

CLR
---

CLR stores zero into a register.

Available:

  clr.3         st                  ; only the lower 12 bits are cleared!
  clr.fs|a      a|b|c|d
  clr.1         #expr    ,hst       ; the bits set in expr are cleared in hst

NEG
---

NEG converts a register's value into its 2's complement.

Available:

  neg.fs|a      a|b|c|d

NOT
---

NOT inverts all the bits in a register's field.

Available:

  not.fs|a      a|b|c|d


2.6.3 Binary-arithmetic instructions
------------------------------------

OR
--

OR bitwise "ors" two registers and stores the result in the second one.

Available:

  or.fs|a       a|b|c    ,a|b|c
  or.fs|a       c|d      ,d|c

AND
---

AND bitwise "ands" two registers and stores the result in the second one.

Available:

  and.fs|a      a|b|c    ,a|b|c
  and.fs|a      c|d      ,d|c

LSL
---

LSL shifts the bits in a register to the left, setting the rightmost bit to
zero after each shift.

Available:

  lsl.fs|a      #4       ,a|b|c|d   ; a one-nibble shift in fact
  lsl.fs|a      #1       ,a|b|c|d   ; is identical to add.fs|a a,a|b,b|c,c|d,d

LSR
---

LSR shifts the bits in a register to the right, setting the leftmost bit to
zero after each shift.

Available:

  lsr.fs|a      #4       ,a|b|c|d   ; a one-nibble shift in fact
  lsr.w         #1       ,a|b|c|d
  lsr.fs|a      #1       ,a|b|c|d   ; needs more nibbles than the
                                    ; previous one - later implemented

ROL
---

ROL rotates the bits in a register to the left, setting the rightmost bit
to the value shifted out.

Available:

  rol.w         #4       ,a|b|c|d   ; that's all - sorry

ROR
---

ROR rotates the bits in a register to the right, setting the leftmost bit
to the value shifted out.

Available:

  ror.w         #4       ,a|b|c|d   ; that's all - sorry

BSET
----

BSET sets a specified bit in a register. It is not possible to use this
instruction on a bit-number higher than 15. Bit numbers always count from
0, the LSB of the register. There's no size-extension.

Available:

  bset          #expr    ,a|c       ; expr must be >=0 and <=15
  bset          #expr    ,st        ; expr must be >=0 and <=15

BCLR
----

BCLR clears a specified bit in a register. It is not possible to use this
instruction on a bit-number higher than 15. Bit numbers always count from
0, the LSB of the register. There's no size-extension.

Available:

  bclr          #expr    ,a|c       ; expr must be >=0 and <=15
  bclr          #expr    ,st        ; expr must be >=0 and <=15


2.6.4 Conditional branches
--------------------------

The instructions that affect the program counter do not differ much from
those found on other processors. But there are two anomalies:

Every conditional branch can also be a conditional return-from-subroutine
by setting the branch-offset to zero.

Due to the lack of flags, the conditional branches include the necessary
test operations.

The following tests are supported:

  eq   =    equal
  ne   !=   not equal
  lt   <    lower than
  le   <=   lower or equal
  gt   >    greater than
  ge   >=   greater or equal

There are four other, special tests:

  cc   carry clear   The branch is taken if the carry flag is clear
  cs   carry set     The branch is taken if the carry flag is set
  bc   bit clear     The branch is taken if the specified bit is 0
  bs   bit set       The branch is taken if the specified bit is 1

The label that defines the address to jump to when the branch is taken must
be within a range of -128 to 127 around the address where the offset is
defined.

Available:

  bcc       label
  bcs       label
  rtcc                            ; return-if-carry-clear ...
  rtcs

  beq.fs|a  a|b|c|d,0,label       ; branch is taken if register is zero
  bne.fs|a  a|b|c|d,0,label       ; branch is taken if register is not zero
  beq.fs|a  a|b|c,a|b|c,label     ; branch is taken if registers are equal
  beq.fs|a  c|d,d|c,label
  rteq.fs|a a|b|c|d,0             ; return if register is zero
  rtne.fs|a a|b|c|d,0

  blt.fs|a  a|b|c,a|b|c,label     ; branch is taken if first register is lower
  blt.fs|a  c|d,d|c,label         ; than second one
  ble.fs|a  a|b|c,a|b|c,label
  ble.fs|a  c|d,d|c,label
  bgt.fs|a  a|b|c,a|b|c,label
  bgt.fs|a  c|d,d|c,label
  bge.fs|a  a|b|c,a|b|c,label
  bge.fs|a  c|d,d|c,label
  rtlt.fs|a a|b|c,a|b|c
  rtlt.fs|a c|d,d|c
  rtgt.fs|a a|b|c,a|b|c
  rtgt.fs|a c|d,d|c
  rtle.fs|a a|b|c,a|b|c
  rtle.fs|a c|d,d|c
  rtge.fs|a a|b|c,a|b|c
  rtge.fs|a c|d,d|c

  beq.1     #expr,hst,label       ; branch is taken if hst AND expr is zero
  rteq.1    #expr,hst
  beq.1     #expr,p,label         ; branch is taken if expr is equal to contents
  rteq.1    #expr,p               ; of p register
  bne.1     #expr,p,label
  rtne.1    #expr,p

  bbc       #expr,st,label        ; branch is taken if specified bit in st is clear
  rtbc      #expr,st
  bbs       #expr,st,label        ; branch is taken if specified bit in st is set
  rtbs      #expr,st
  bbc       #expr,a|c,label
  rtbc      #expr,a|c
  bbs       #expr,a|c,label
  rtbs      #expr,a|c


2.6.5 Unconditional branches and other pc manipulation
------------------------------------------------------

Please notice: Use relative branches if possible; always use the "smallest"
jump that reaches
the desired address.

Available:

  bra.3   label   ; jumps relative (range: -2048 to +2047)
  bra.4   label   ; jumps relative (range: -32768 to +32767)
  bsr.3   label   ; calls subroutine (range: -2048 to +2047)
  bsr.4   label   ; calls subroutine (range: -32768 to +32767)
  jmp     label   ; jumps to absolute 20-bit address
  jsr     label   ; calls subroutine at absolute 20-bit address
  jmp     a|c     ; jumps to the location pointed to by the lower
                  ; 20 bits of the specified register
  jmp     (a)     ; jumps to the address that is stored at the
                  ; memory location pointed to by the lower 20 bits
                  ; of the specified register
  rtnsxm          ; return and set XM-flag in hst. Used to catch
                  ; "jumps into the desert" (opcode $00)
  rtn             ; simply return from subroutine
  rtncc           ; return and clear carry
  rtnsc           ; return and set carry
  rsi             ; return from system interrupt. Not fully explored yet


2.6.6 Miscellaneous instructions
--------------------------------

  sethex          ; sets the processor into HEX-Mode
  setdec          ; sets the processor into DEC-Mode
                  ; Not fully explored yet. I do not know exactly how
                  ; the mode affects the processor's work.
  nop4            ; an artificial 4-nibble-nop, in fact a senseless branch
  uncnfg          ; unconfigures all chips (?) and transfers the lower
                  ; 20 bits of register c into each chip controller's
                  ; data pointer. Not fully explored yet.
  config          ; sends the lower 20 bits of register c to the chip
                  ; which has daisy chain input high and config flag low
                  ; Not fully explored yet
  c=id            ; identifies chip: transfers the ID of the chip which has
                  ; daisy chain input high and config flag low to the
                  ; lower 20 bits of register c
                  ; Not fully explored yet
  shutdn          ; sends bus-shut-down command and stops CPU-Clock
  inton           ; enables some sort of interrupt (from keyboard?)
  intoff          ; disables some sort of interrupt (from keyboard?)
  reset           ; sends system-bus-reset command, resets chips. Not fully...
  buscc           ; sends system-bus command "C". Not fully...
  buscd           ; sends system-bus command "D"? Not fully...
  sreq?
                  ; if any chip on bus needs service, SR-bit in hst is set,
                  ; and the "device-identifier" is latched to the lowest
                  ; nibble of register c


3.0 The CLASS Assembler's pseudo-ops
------------------------------------

There are a lot of so called "pseudo-opcodes" supported - instructions that
are not translated into bits instructing the processor to do something.


3.1 Include files
-----------------

Every serious assembler supports include-files. You may type a line into
your source-code that has the "include" command in the mnemonic field and a
filename as a parameter. This file will be inserted into the source during
assembly.

An include-file (which is, of course, also a normal source-file) may include
other files. There's no nesting limit but your stack's size.

Examples:

        include "DH0:blub.a"
        include trash.a

Notice: Every include-file will increase the number of lines to assemble
(sounds simple..?)


3.2 Constants
-------------

Constants are nibbles created by the assembler that are not executable code.


3.2.1 Nibble-by-nibble constants
--------------------------------

With the "dc", "dcr" and "dcg" pseudo-ops you can place any kind of data
into your object-file. The format is:

        dc.dd   value, value, value, ...
        dcr.dd  value, value, value, ...
        dcg.dd  value, value, value, ...

dd defines the constant's size.

"dc" puts the values from high- to low-nibble into memory, as it is done by
the 680xx family for example.

"dcr" puts the values from low- to high-nibble into memory; this is how the
Clarke-processor does it.

Example:

start   move.a  #data,d0
        move.a  (d0),c
        add.a   #5,d0
        move.a  (d0),a
        rtn

data    dcr.5   $12345
        dc.5    $12345

After calling this routine, register c will hold $12345 while register a
contains $54321.

"dcg" is used to define graphic-data for the HP48sx bit-plane format. Not
only is the order of the nibbles reversed, the bits themselves are also
exchanged. You can shift graphic-data pixel-wise by just shifting the
register that holds the data.
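
The difference between "dc" and "dcr" in the example above can be modelled
in a few lines. This is only a sketch in Python, not part of CLASS; the
function names are made up, and it assumes that a register load takes the
nibble at the lowest address into nibble 0 of the register, which is what
the result of the example implies.

```python
def dc(value: int, size: int) -> list[int]:
    """Nibbles in memory order for 'dc.size value': high nibble first."""
    return [(value >> (4 * i)) & 0xF for i in reversed(range(size))]


def dcr(value: int, size: int) -> list[int]:
    """Nibbles in memory order for 'dcr.size value': low nibble first."""
    return [(value >> (4 * i)) & 0xF for i in range(size)]


def load(memory: list[int], addr: int, size: int) -> int:
    """Model of 'move.size (d0),c': the lowest address fills register nibble 0
    (an assumption taken from the example's result)."""
    return sum(memory[addr + i] << (4 * i) for i in range(size))


# data  dcr.5 $12345
#       dc.5  $12345
memory = dcr(0x12345, 5) + dc(0x12345, 5)

c = load(memory, 0, 5)   # move.a (d0),c
a = load(memory, 5, 5)   # add.a #5,d0 / move.a (d0),a
print(hex(c), hex(a))    # 0x12345 0x54321
```

The same model shows why "dcr" is the natural choice for values the program
will read back with move: only then does the register hold the number as it
was written in the source.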
Notice that the bit-plane of the HP48sx is organized byte-wise.


3.2.2 Text-Constants
--------------------

With the "text" pseudo-op you can place ASCII-codes into your object file.
There's another pseudo-op, "textr", which is used to reverse the nibbles of
each ASCII character.

Format:

        text  "The quick brown fox jumps over the lazy dog"
        textr "The quick brown fox jumps over the lazy dog"


3.3 The EQU instruction
-----------------------

Use the "equ" pseudo-op to define the value of a label to be created.

Format:

label   equ expr

References between label-definitions are resolved completely.

The following example will cause no trouble:

jim     equ bob+jeff-7
bob     equ jeff/2
jeff    equ 6

The following example will cause trouble:

jim     equ bob+jeff-7
bob     equ jeff/2
jeff    equ jim

>ERROR: Unable to solve reference between following labels:
>jim
>bob
>jeff


3.4 Macros
----------

To define a macro, use this syntax:

name    macro para1,para2,para3,para4
        ....
        ....
        endmac

Notice:

- You can use any string to define a parameter. It's a good idea to
  choose a string that never occurs in normal source-code.
- The endmac statement must be the only word in the terminating line.
- You cannot nest macros.
- Macros must be defined BEFORE they are used in the source.
- Use "?cnt?" to place a unique number for each macro in the resulting
  source
- The number of possible parameters is limited to 4
- The total length of source-text in a macro is limited to 32767 characters

Examples:

rpl     macro value
        dcr.5 value
        endmac

absadr  macro lbl,reg
        bsr absadr?cnt?
absadr?cnt? pop
        move.a #lbl-absadr?cnt?,a
        add.a a,c
        move.a c,reg
        endmac

; what follows is an example how the macros
; are called

        rpl $02d9d
        absadr data,d0
        move.3 #$123,c
        move.3 c,(d0)           ; just an example ...
                                ; has no meaning
data    text "-----------"

The "absadr"-macro would look like this after expanding:

        bsr absadr1
absadr1 pop
        move.a #data-absadr1,a
        add.a a,c
        move.a c,d0


3.5 The relocation-table
------------------------

To write programs that are relocated before execution I implemented a
pseudo-op called "RELTAB". Use this pseudo-op to insert a table of the
following format into your code:

  $XXXXX   20-bit offset from the program's base, where to find the first
           address to relocate
  $XXXXX   2nd address...
  ...      3rd, 4th, 5th ... addresses
  ...
  $00000   A terminating zero-address to signal the list's end

Entries in this table are made the following way:

Whenever CLASS faces a valid instruction with the size-extension ".ao", it
calculates the offset from the beginning of the program to the address
where the 20-bit address specified within the instruction can be found.
This offset is held in a variable until the RELTAB instruction is found.
Then all the offsets collected become entries in the relocation table. The
variables are cleared, and the game starts over.

You may have more than one relocation-table in one program. But be sure to
use the right ones for relocation during execution then.

A sample relocation- and re-relocation-routine comes with the CLASS
package. It'll fit your needs on a HP48sx.

Note: A RELTAB command only cares about the addresses that have been
specified in the source above it.

Be sure to avoid a 48sx-garbage-collection between relocation and
re-relocation. If you fail to re-relocate a program after execution, your
system may crash the next time you call the program. Avoid unwanted
interrupts!!!


4.0 Local labels
----------------

The CLASS assembler supports local labels in a very easy way. To explain how
to use them I will simply explain what CLASS does:

Whenever CLASS finds a label that does not end with a "." character, it
assumes this label to be global. It stores the name of such a label in a
special variable.

Whenever CLASS encounters a label with a "."
as FIRST character, it takes the name of the last global label from the
special variable, appends the name of the local label (including the point)
and adds another point at the end of the total name.

This is the whole thing. Now look what you can do with it...

start   text "the example starts here with a global label"
        bra .1          ; a first reference to a local label
        text "just a filler"
later.                  ; a label that has been inserted later.
                        ; it is a global one, but it does not
                        ; affect the last-global-name variable
        text "just a filler"
.1      text "this would be the destination of the branch"
.jimbob text "local label names are normal strings..."
ende                    ; the end of the routine, a global label
        dc.5 start.1.   ; access local label from an outer point...
.1                      ; you can use ".1" as label name again...
        dc.5 later.     ; a reference to the later inserted label
        bra .1          ; this jump goes to ende.1.


5.0 Invocation
--------------

The invocation of the CLASS assembler depends on the computer-type you use.


5.1 Invoking CLASS on Commodore's AMIGA
---------------------------------------

You have to invoke CLASS from a command-line interpreter. Use the following
format:

  CLASS -a SOURCENAME [-o OBJECTNAME] [-i INCLUDE_DIR] [-ml maximal_lines]
        [-mm maximal_macros] [-ul] [-rn] [-mr maximal_reloctab_entries]
        [-s SYMBOL_FILE]

Upper-case parameters have to be replaced by a string, lower-case ones have
to be replaced by a decimal number. Parameters in square brackets are
optional.

  -a   Defines the name of the source file
  -o   Defines the name of the object file
  -i   Defines the name of the include file directory
       (the current path is also searched)
  -s   Defines the name of an optional symbol-file (e.g. for use with CLDIS)
  -ml  Defines the maximal number of source-lines
  -mm  Defines the maximal number of macro-definitions
  -mr  Defines the maximal number of relocation-table-entries (see "RELTAB")
  -ul  Enables listing of unused labels at the end of assembly
  -rn  Enables nibble-swapping at end of assembly.
       Required if you use a "dumb" KERMIT-program that doesn't do the
       swapping.


5.2 Invoking CLASS on Atari's ST
--------------------------------

After having started CLASS.TTP from either the desktop or any CLI you can
use the same set of options discussed under 5.1.

To hold the screen after execution keep a mouse-key pressed.

Please do not use external screen output accelerators like NVDI, TurboST or
QuickST, because pixel-trash may occur then.


6.0 Bugs and future extensions
------------------------------

This is an early release of CLASS. I'm sure there are bugs in it. Please
report any bug you find to me (address: see below). I'm also happy about any
good idea for additional features.


6.1 Known Bugs
--------------

- CLASS is not that fast. So far I have not optimized the code for anything
  other than saving programming-time.

- CLASS is not able to handle expressions that exceed the 32 bits a 68000 is
  able to hold in a register. There are only two instructions where this
  disadvantage can cause problems:

        move.dd #expr,c.p|a.p

  If you really need to use such an instruction with more than 8 nibbles,
  you can replace it manually:

        move.16 #$0123456789abcdef,c.p

  can be replaced by:

        dc.1 $3,16-1
        dc.8 $fedcba98,$76543210

        move.16 #$0123456789abcdef,a.p

  can be replaced by:

        dc.4 $8082
        dc.1 16-1
        dc.8 $fedcba98,$76543210

  Maybe I'll find the motivation to process 64-bit-values later, but I don't
  think it's that necessary.


6.2 New features to come
------------------------

- More speed
- A clearer distinction in the behavior of numeric and offset labels
- Conditional assembly
- More useful pseudo-ops

Since this program is not published the commercial way, I cannot guarantee
updates at all.


7.0 Trademarks
--------------

  Amiga    : Commodore-Amiga Inc., USA
  Atari ST : Atari
  HP48sx   : Hewlett Packard, USA


8.0 Address of the author
-------------------------

Please contact me via EMail.
UseNet : lv@muffel.hotb.sub.org
FidoNet: 2:247/30.20

If it seems VERY urgent, and you cannot use EMail, call
-49-69-5601966 (voice)


Appendices:
-----------

A. HP48sx Processor Performance - explored by L. Vieweg May 1991
----------------------------------------------------------------

A.1.0 The cycle period
----------------------

The HP48sx Clarke-processor clock runs at 1.96 MHz (in my machine).
About 25% of the processor's speed is lost when you enable the
BitPlane DMA. To run your processor at full speed, clear bit 3 at $00100.


A.2.0 Instruction execution times
---------------------------------

These are the execution times of the Clarke processor's most frequently
used instructions. This list may be completed or revised later.

Notice: If fractional cycles are given, calculate the total cycles first,
then round the number (7.5 => 8 and so on...).
The abbreviation "siz" means the total number of nibbles affected by an
instruction (.s=1, .b=2, .x=3, .a=5, .w=16, ...).
Refer to the CLASS-assembler's manual for further information.
Add one cycle for each odd address the pc runs through. This will cause
some nasty effects when programming time-critical routines.
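
The rounding rule and the "siz" values above can be tried out with a small
script. This is only a sketch in Python, not part of CLASS. It uses the
formula 21.5+siz*1.5 that this appendix lists for "move.fs (d1)|(d0),c|a"
on normal memory, and the 1.96 MHz clock from A.1.0. The function names
are made up for this illustration, and the odd-address penalty and the DMA
slowdown are not modelled.

```python
import math

CLOCK_HZ = 1.96e6  # clock rate reported in A.1.0 (the author's machine)

# Nibbles per size suffix, as listed in the notice (.s=1, .b=2, ...).
SIZ = {"s": 1, "b": 2, "x": 3, "a": 5, "w": 16}

def round_cycles(total):
    """Round half up (7.5 => 8). Python's round() would turn 6.5 into 6."""
    return math.floor(total + 0.5)

def move_fs_read(size):
    """Unrounded cycles for move.fs (d1)|(d0),c|a: 21.5+siz*1.5."""
    return 21.5 + SIZ[size] * 1.5

def cycles_to_us(cycles):
    """Convert a cycle count to microseconds at CLOCK_HZ."""
    return cycles / CLOCK_HZ * 1e6

if __name__ == "__main__":
    for size in ("x", "w"):
        cycles = round_cycles(move_fs_read(size))
        print(size, cycles, round(cycles_to_us(cycles), 2))
```

The rule says to total first and round afterwards, so for a sequence of
instructions the unrounded values would be summed before round_cycles()
is applied.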
Instruction Source      Dest          Cycles
-----------------------------------------------------------------
move.dd     #0          ,c.p          2.5+dd*1.5

                                      normal / access $00100-$0013f
move.dd     (d1)|(d0)   ,c|a          21.5+dd*1.5 / 21+dd
move.fs     (d1)|(d0)   ,c|a          21.5+siz*1.5 / 21+siz
move.b      (d1)|(d0)   ,c|a          19 / 17
move.a      (d1)|(d0)   ,c|a          23 / 20

move.a      c|a         ,(d0)|(d1)    19
move.b      c|a         ,(d0)|(d1)    16
move.fs     c|a         ,(d0)|(d1)    20+siz
move.dd     c|a         ,(d0)|(d1)    20+siz

move.a      a           ,c            8
move.fs     a|b|c|d     ,a|b|c|d      4+siz
move.a      a|c         ,d0|d1        9
move.4      a|c         ,d0|d1        8
move.w      a|c         ,rx           20
move.w      rx          ,a|c          20
move.1      p           ,c.dd         8
move.1      c.dd        ,p            8
move.1      #expr       ,p            3
move.a      #expr       ,d0|d1        10
move.as     #expr       ,d0|d1        9
move.b      #expr       ,d0|d1        6

clr.fs      a|b|c|d                   4+siz
clr.a       a|b|c|d                   8
clr.1       #expr       ,hst          4

exg.w       c|a         ,rx           20
exg.a       c|a         ,d0|d1        9
exg.fs      a|b|c|d     ,a|b|c|d      4+siz
exg.a       a|b|c|d     ,a|b|c|d      8
exg.1       p           ,c.dd         8

push        pop                       18 (the two's total)

add.a       #expr       ,d0|d1        8
add.fs      a|b|c|d     ,a|b|c|d      4+siz
add.a       a|b|c|d     ,a|b|c|d      8
add.a       p+1         ,c            9
sub.fs      a|b|c|d     ,a|b|c|d      4+siz
sub.a       a|b|c|d     ,a|b|c|d      8
subr.fs     a|b|c|d     ,a|b|c|d      4+siz
subr.a      a|b|c|d     ,a|b|c|d      8
sub.a       #expr       ,d0|d1        8

inc.fs      a|b|c|d                   4+siz
inc.a       a|b|c|d                   8
dec.fs      a|b|c|d                   4+siz
dec.a       a|b|c|d                   8
inc.1       p                         4

neg.fs      a|b|c|d                   4+siz
neg.a       a|b|c|d                   8
not.fs      a|b|c|d                   4+siz
not.a       a|b|c|d                   8
or.fs       a|b|c|d     ,a|b|c|d      6+siz
or.a        a|b|c|d     ,a|b|c|d      11
and.fs      a|b|c|d     ,a|b|c|d      6+siz
and.a       a|b|c|d     ,a|b|c|d      11

lsl.fs      #4          ,a|b|c|d      5+siz
lsl.a       #4          ,a|b|c|d      9
lsr.fs      #4          ,a|b|c|d      5+siz
lsr.a       #4          ,a|b|c|d      9
lsr.w       #1          ,a|b|c|d      21
rol.w       #4          ,a|b|c|d      22
ror.w       #4          ,a|b|c|d      22

bset        #expr       ,a|c          8
bclr        #expr       ,a|c          8

bsr.3       rtn                       26 (the two's total)
bsr.4       rtn                       29
bra.3                                 14
bra.4                                 17

                                      ; taken/not taken
bcc.2                                 12/4
b??.fs      a|b|c|d     ,a|b|c|d|0    16+siz/8+siz
b??.a       a|b|c|d     ,a|b|c|d|0    21/13
beq.1       #expr       ,hst          16/8
beq.1       #expr       ,p            17/9
bbc|bbs     #expr       ,a|c          20/12


A.3.0 Overall Performance Comment
---------------------------------

The processor's speed isn't very impressive.
There is one point of criticism in particular that I wish to address to
the developers:
Everyone knows that a handheld's processor clock cannot run at high speed,
because the heat problems would cause massive battery drain when the
semiconductors get hot. But if so, why the devil have you created such
cycle-eating micro-code? There are so many nice processors around that do
not need that many cycles for each instruction (6800, 65xx, 8510 etc.,
even the Z80 needs fewer cycles...), why does yours????

The Clarke processor is able to move large amounts of data through memory
quicker than comparable processors because of its 64-bit registers. It is
also fast at simple floating-point operations (I'll explore the
decimal-mode later...). But it is very slow at more complex tasks such as
array accessing, fixed-point operations and parameter passing to
subroutines (the little stack...).
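
As a rough illustration of the data-moving claim, the table in A.2.0 can
be used to estimate the speed of a memory-to-memory copy. The following
Python sketch assumes a hypothetical copy step made of "move.w (d0),a",
"move.w a,(d1)", "add.a #16,d0" and "add.a #16,d1", priced with the
move.fs and add.a entries of the table (siz=16 for .w). Loop counting,
branching, the odd-address penalty and the BitPlane DMA are left out, so
a real loop would be slower. The function names are invented for this
example.

```python
import math

CLOCK_HZ = 1.96e6  # clock rate from A.1.0
SIZ_W = 16         # .w covers 16 nibbles, which is 8 bytes

def copy_step_cycles():
    """Cycles for one assumed 16-nibble copy step, rounded at the end."""
    read = 21.5 + SIZ_W * 1.5   # move.w (d0),a   (move.fs read formula)
    write = 20 + SIZ_W          # move.w a,(d1)   (move.fs write formula)
    advance = 8 + 8             # add.a #16,d0 and add.a #16,d1
    total = read + write + advance
    return math.floor(total + 0.5)  # round half up, as A.2.0 asks

def bytes_per_second():
    """Upper bound on copy throughput for the assumed step."""
    steps_per_second = CLOCK_HZ / copy_step_cycles()
    return steps_per_second * (SIZ_W / 2)  # two nibbles per byte

if __name__ == "__main__":
    print(copy_step_cycles(), bytes_per_second())
```

The result is an upper bound under the stated assumptions, not a
measurement.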