==============================================================================
64 Documentation
by Jarkko Sonninen, Jouko Valta, John West, and Marko M"akel"a
(sonninen@lut.fi, jopi@stekt.oulu.fi, john@ucc.gu.uwa.edu.au,
msmakela@hylk.helsinki.fi)
[Ed's Note: I'm leaving this file as is because of its intention to
serve as a reference guide, and not necessarily to be presented in
article format. The detail and clarity with which the authors have
presented the material is wonderful!!]
#
# $Id: 64doc,v 1.3 93/06/21 13:37:18 jopi Exp $
#
# This file is part of Commodore 64 emulator
# and Program Development System.
#
# See README for copyright notice
#
# This file contains documentation for 6502/6510/8502 instruction set.
#
# Written by
# Jarkko Sonninen (sonninen@lut.fi)
# Jouko Valta (jopi@stekt.oulu.fi)
# John West (john@ucc.gu.uwa.edu.au)
# Marko M"akel"a (msmakela@hylk.helsinki.fi)
#
# $Log: 64doc,v $
# Revision 1.3 93/06/21 13:37:18 jopi
# X64 version 0.2 PL 0
#
# Revision 1.2 93/06/21 13:07:15 jopi
# *** empty log message ***
#
#
#
                 6510 Instructions by Addressing Modes
     ++++++++++ Positive ++++++++++  ---------- Negative ----------
     00      20      40      60      80      a0      c0      e0      mode
+00  BRK     JSR     RTI     RTS     NOP*    LDY     CPY     CPX     Impl/immed
+01  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     (indir,x)
+02   t       t       t       t      NOP*t   LDX     NOP*t   NOP*t    ? /immed
+03  SLO*    RLA*    SRE*    RRA*    SAX*    LAX*    DCP*    ISB*    (indir,x)
+04  NOP*    BIT     NOP*    NOP*    STY     LDY     CPY     CPX     Zeropage
+05  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC      -"-
+06  ASL     ROL     LSR     ROR     STX     LDX     DEC     INC      -"-
+07  SLO*    RLA*    SRE*    RRA*    SAX*    LAX*    DCP*    ISB*     -"-
+08  PHP     PLP     PHA     PLA     DEY     TAY     INY     INX     Implied
+09  ORA     AND     EOR     ADC     NOP*    LDA     CMP     SBC     Immediate
+0a  ASL     ROL     LSR     ROR     TXA     TAX     DEX     NOP     Accu/impl
+0b  ANC**   ANC**   ASR**   ARR**   ANE**   LXA**   SBX**   SBC*    Immediate
+0c  NOP*    BIT     JMP     JMP     STY     LDY     CPY     CPX     Absolute
+0d  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC      -"-
+0e  ASL     ROL     LSR     ROR     STX     LDX     DEC     INC      -"-
+0f  SLO*    RLA*    SRE*    RRA*    SAX*    LAX*    DCP*    ISB*     -"-
+10  BPL     BMI     BVC     BVS     BCC     BCS     BNE     BEQ     Relative
+11  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     (indir),y
+12   t       t       t       t       t       t       t       t       ?
+13  SLO*    RLA*    SRE*    RRA*    SHA**   LAX*    DCP*    ISB*    (indir),y
+14  NOP*    NOP*    NOP*    NOP*    STY     LDY     NOP*    NOP*    Zeropage,x
+15  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC      -"-
+16  ASL     ROL     LSR     ROR     STX  y) LDX  y) DEC     INC      -"-
+17  SLO*    RLA*    SRE*    RRA*    SAX* y) LAX* y) DCP*    ISB*     -"-
+18  CLC     SEC     CLI     SEI     TYA     CLV     CLD     SED     Implied
+19  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     Absolute,y
+1a  NOP*    NOP*    NOP*    NOP*    TXS     TSX     NOP*    NOP*    Implied
+1b  SLO*    RLA*    SRE*    RRA*    SHS**   LAS**   DCP*    ISB*    Absolute,y
+1c  NOP*    NOP*    NOP*    NOP*    SHY**   LDY     NOP*    NOP*    Absolute,x
+1d  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC      -"-
+1e  ASL     ROL     LSR     ROR     SHX**y) LDX  y) DEC     INC      -"-
+1f  SLO*    RLA*    SRE*    RRA*    SHA**y) LAX* y) DCP*    ISB*     -"-
Legend:
t Jams the machine
*t Jams very rarely
* Undocumented command
** Unusual operation
y) indexed using IY instead of IX
6510/8502 Undocumented Commands
-- A brief explanation about what may happen while
using don't care states.
ANE $8B AC = (AC | #$EE) & IX & #byte
same as
AC = ((AC & #$11 & IX) | ( #$EE & IX)) & #byte
In real 6510/8502 the internal parameter #$11 may
occasionally be #$10, #$01 or even #$00. This occurs
probably when the VIC halts the processor right between
the two clock cycles of this instruction.
LXA $AB C=Lehti: AC = IX = ANE
Alternate: AC = IX = (AC & #byte)
TXA and TAX have to be responsible for these.
SHA $93,$9F Store (AC & IX & (ADDR_HI + 1))
SHX $9E Store (IX & (ADDR_HI + 1))
SHY $9C Store (IY & (ADDR_HI + 1))
SHS $9B SHA and TXS, where X is replaced by (AC & IX).
Note: The value to be stored is copied also
to ADDR_HI if page boundary is crossed.
SBX $CB The Carry and Decimal flags are ignored as inputs, but Carry
is set by the subtraction. This is due to the CMP command,
which is executed instead of the real SBC.
Many undocumented commands do not use AND between registers; the CPU just
throws the bytes onto a bus simultaneously and lets the open-collector drivers
perform the AND. For example, the command called 'SAX', which is in the STORE
section (opcodes $80...$9F), stores the result of (AC & IX) this way.
More fortunate is its opposite, 'LAX', which just loads a byte simultaneously
into both AC and IX.
$CB SBX IX <- (AC & IX) - Immediate
The 'SBX' ($CB) may seem to be a very complex operation, even though it is a
combination of the subtraction of accumulator and parameter, as in the 'CMP'
instruction, and the command 'DEX'. As a result, both AC and IX are connected
to the ALU but only the subtraction takes place. Since the comparison logic is
used, the result of the subtraction would normally be ignored, but the 'DEX'
now happily stores the value of (AC & IX) - Immediate to IX.
That is why this instruction does not have any decimal mode, and it does not
affect the V flag. The Carry flag is also ignored as an input to the
subtraction, but it is set according to the result.
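The behaviour described above can be condensed into a short Python sketch.
The function name sbx and the shape of its return value are illustrative
assumptions, not something taken from the authors' test programs. V and D
are left out on purpose, since the text says SBX neither changes V nor
honours decimal mode.

```python
def sbx(ac, ix, operand):
    """Sketch of opcode $CB: IX <- (AC & IX) - operand, flags as set by CMP.

    Returns (new_ix, carry, zero, negative). The incoming Carry and the
    D flag are not parameters because the text says both are ignored.
    """
    diff = (ac & ix) - operand        # plain binary subtraction, no borrow-in
    new_ix = diff & 0xFF              # value that ends up in IX
    carry = diff >= 0                 # C is set when no borrow occurred (as in CMP)
    zero = new_ix == 0
    negative = (new_ix & 0x80) != 0
    return new_ix, carry, zero, negative
```

For instance, sbx(0xFF, 0x0F, 0x01) leaves $0E in IX with Carry set, whatever
the Carry was before.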
Proof:
These test programs show if your machine is compatible with ours
regarding the opcode $CB. The first test, vsbx, shows that SBX does
not affect the V flag. The latter one, sbx, shows the rest of our
theory. The vsbx test tests 33554432 SBX combinations (16777216
different AC, IX and Immediate combinations, and two different V flag
states), and the sbx test doubles that amount (16777216*4 D and C flag
combinations). Both tests have run successfully on a C64 and a Vic20.
They ought to run on C16, +4 and the PET series as well. The tests
stop with BRK, if the opcode $CB does not work as expected. Successful
operation ends in RTS. As the tests are very slow, they print dots on
the screen while running so that you know that the machine has
not jammed. On computers running at 1 MHz, the first test prints
approximately one dot every four seconds and a total of 2048 dots,
whereas the second one prints half that amount, one dot every seven seconds.
If the tests fail on your machine, please let us know your processor's part
number and revision. If possible, save the executable (after it has stopped
with BRK) under another name and send it to us so that we know at which stage
the program stopped.
The following program is a Commodore 64 executable that Marko M"akel"a
developed when trying to find out how the V flag is affected by SBX.
(It was believed that the SBX affects the flag in a weird way, and this
program shows how SBX sets the flag differently from SBC.)
You may find the subroutine at $C150 useful when researching other
undocumented instructions' flags. Run the program in a machine language
monitor, as it makes use of the BRK instruction. The result tables will be
written on pages $C2 and $C3.
Other undocumented instructions usually cause the two preceding opcodes to be
executed. However, 'NOP' seems to disappear completely from the 'SBC' code $EB.
The most difficult to comprehend are the rest of the instructions located on
the '$0B' line.
All the instructions located at the positive (left) side of this line should
rotate either memory or the accumulator, but the addressing mode turns out
to be immediate!
No problem. Just read the operand, let it be ANDed with the accumulator
and finally use accumulator addressing mode for the instructions above them.
The remaining two instructions on the same line, called 'ANE' and 'LXA' ($8B
and $AB respectively), often give quite unpredictable results.
However, the most usual operation is to store ((AC | #$EE) & IX & #$nn) to the
accumulator. Note that this does not work reliably in a real 64!
On the 8502, opcode $8B uses the values $8C, $CC, $EE, and occasionally $0C and
$8E for the OR instead of the $EE, $EF, $FE and $FF used by the 6510. With the
8502 running at 2 MHz, #$EE is always used.
Opcode $AB does not cause this OR to take place on the 8502, while the 6510
always performs it. Note that this behaviour depends on chip revision.
Let's take a closer look at $8B (6510).
AC <- IX & D & (AC | VAL)
where VAL comes from this table:
   IX high   D high   D low     VAL
   even      even     ---       $EE (1)
   even      odd      ---       $EE
   odd       even     ---       $EE
   odd       odd      0         $EE
   odd       odd      not 0     $FE (2)
(1) If the bottom 2 bits of AC are both 1, then the LSB of the result may
be 0. The values of IX and D are different every time I run the test.
This appears to be very rare.
(2) VAL is $FE most of the time. Sometimes it is $EE - it seems to be random,
not related to any of the data. This is much more common than (1).
In decimal mode, VAL is usually $FE.
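As a rough illustration of the formula AC <- IX & D & (AC | VAL), here is a
minimal Python sketch. The names ane and ane_expanded are made up for this
example, and VAL is passed in as a parameter with a default of $EE because the
real value depends on the chip and is not fully predictable. The second
function is the expanded form quoted earlier for $8B, included only to show
that the two expressions agree when VAL is $EE.

```python
def ane(ac, ix, operand, val=0xEE):
    """Most usual result of opcode $8B: (AC | VAL) & IX & operand."""
    return (ac | val) & ix & operand


def ane_expanded(ac, ix, operand):
    """Expanded form: ((AC & $11 & IX) | ($EE & IX)) & operand."""
    return ((ac & 0x11 & ix) | (0xEE & ix)) & operand
```

With VAL = $EE only bits 0 and 4 of the old AC can influence the result,
which is what the $11 mask in the expanded form expresses.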
Two different functions have been discovered for LXA, opcode $AB. One is
AC = IX = ANE (see above) and the other, encountered with 6510 and 8502,
is the less complicated AC = IX = (AC & #byte). However, according to what is
reported, the version altering only the lowest bits of each nybble seems to
be more common.
What happens, is that $AB loads a value into both AC and IX, ANDing
the low bit of each nybble with the corresponding bit of the old AC. However,
there are exceptions. Sometimes the low bit is cleared even when AC contains
a '1', and sometimes other bits are cleared. The exceptions seem random (they
change every time I run the test). Oops - that was in decimal mode. Much
the same with D=0.
What causes the randomness? Probably marginal logic levels -
when too much wired-ANDing goes on, some of the signals get very close to
the threshold. Perhaps we're seeing some of them step over it. The low bit
of each nybble is special, since it has to cope with carry differently
(remember decimal mode). We never see a '0' turn into a '1'.
Since these instructions are unpredictable, they should not be used.
There is still a very strange instruction left, the one named SHA/X/Y, which is
the only one with only indexed addressing modes. Actually, the commands 'SHA',
'SHX' and 'SHY' are generated by the indexing algorithm.
While using indexed addressing, the effective address for page boundary crossing
is calculated as soon as possible so it does not slow down operation.
As a result, in the case of SHA/X/Y, the address and data are processed at the
same time, making an AND between the two take place. Thus, the value to be
stored by SHA, for example, is in fact (AC & IX & (ADDR_HI + 1)).
On page boundary crossing the same value is also copied to the high byte of the
effective address.
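The store value and the page-crossing side effect can be sketched in Python
as follows. This is only a model of the description in this section, not a
verified emulation; the function name sh_store and its arguments are
assumptions made for the example. For SHA pass AC & IX as reg_value, for SHX
pass IX, and for SHY pass IY.

```python
def sh_store(reg_value, base, index):
    """Model of SHA/SHX/SHY as described in the text.

    reg_value -- register contents taking part in the AND
    base      -- 16-bit operand address of the instruction
    index     -- contents of the index register used for addressing
    Returns (target_address, stored_value).
    """
    addr_hi = base >> 8
    value = reg_value & ((addr_hi + 1) & 0xFF)
    target = (base + index) & 0xFFFF
    if (base & 0xFF) + index > 0xFF:
        # Page boundary crossed: the stored value also replaces the
        # high byte of the effective address.
        target = (value << 8) | (target & 0xFF)
    return target, value
```

For example, with a base address of $12F0, an index of $20 and a register
value of $01, the model writes $01 to $0110 rather than to $1310.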
Register selection for load and store
   bit1 bit0     AC  IX  IY
    0    0                x
    0    1        x
    1    0            x
    1    1        x   x
So, AC and IX are selected by bits 0 and 1 respectively, while ~(bit1 | bit0)
enables IY.
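A small Python sketch of this decoding, checked against the zero page
load/store rows of the opcode table ($A4 LDY, $A5 LDA, $A6 LDX, $A7 LAX).
The function name selected_registers is an assumption for illustration, and
the function is only meaningful for opcodes in the load and store columns.

```python
def selected_registers(opcode):
    """Return the set of registers a load/store opcode selects.

    Bit 0 selects AC, bit 1 selects IX, and IY is enabled only when
    both bits are clear.
    """
    registers = set()
    if opcode & 0x01:
        registers.add("AC")
    if opcode & 0x02:
        registers.add("IX")
    if not opcode & 0x03:
        registers.add("IY")
    return registers
```

This also shows why LAX ($A7) and SAX ($87) involve both AC and IX: both of
the low bits are set.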
Indexing is determined by bit4, even in relative addressing mode, which
is one kind of indexing.
Lines containing opcodes xxx000x1 (01 and 03) are treated as absolute after
the effective address has been loaded into CPU.
Zeropage,y and Absolute,y (codes 10x1 x11x) are distinguished by bit5.
Decimal mode in NMOS 6500 series
Most sources claim that the NMOS 6500 series sets the N, V and Z flags
unpredictably in decimal mode. Of course, this is not true. While testing
how the flags are set, I also wanted to see what happens if you use illegal
BCD values.
ADC works in Decimal mode in a quite complicated way. It is amazing how it
can do that all in a single cycle. Here's a pseudo code version of the
instruction:
AC accumulator
AL low nybble of accumulator
AH high nybble of accumulator
C Carry flag
Z Zero flag
V oVerflow flag
N Negative flag
s value to be added to accumulator
AL = (AC & 15) + (s & 15) + C;         ! Calculate the lower nybble.
if (AL > 9)                            ! BCD fixup
    AL += 6;                           ! for lower nybble
AH = (AC >> 4) + (s >> 4) + (AL > 15); ! Calculate the upper nybble.
Z = (((AC + s + C) & 255) == 0);       ! Zero flag is set just
                                       ! like in Binary mode.
! Negative and Overflow flags are set with the same logic as in
! Binary mode, but after fixing the lower nybble.
N = ((AH & 8) != 0);
V = (((AH << 4) ^ AC) & 128) && !((AC ^ s) & 128);
if (AH > 9)                            ! BCD fixup
    AH += 6;                           ! for upper nybble
! Carry is the only flag set after fixing the result.
C = (AH > 15);
AC = ((AH << 4) | (AL & 15)) & 255;
The C flag is set as the quiche eaters expect, but the N and V flags
are set after fixing the lower nybble but before fixing the upper one.
They use the same logic as binary mode ADC. The Z flag is set before
any BCD fixup, so the D flag does not have any influence on it.
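For readers who want to experiment, the rules above can be written as a
runnable Python sketch. It is not the authors' test program. The function
name adc_decimal and the order of the returned tuple are assumptions, and the
Z, N and V expressions are spelled out from the prose rule (Z from the binary
sum, N and V from the result after the lower nybble fixup only).

```python
def adc_decimal(ac, s, c):
    """NMOS decimal-mode ADC sketch.

    ac, s -- accumulator and operand (0..255); c -- carry in (0 or 1).
    Returns (result, carry, zero, overflow, negative).
    """
    al = (ac & 15) + (s & 15) + c          # lower nybble
    if al > 9:
        al += 6                            # BCD fixup for lower nybble
    ah = (ac >> 4) + (s >> 4) + (1 if al > 15 else 0)   # upper nybble
    zero = ((ac + s + c) & 255) == 0       # taken from the binary sum
    negative = (ah & 8) != 0               # before the upper nybble fixup
    overflow = (((ah << 4) ^ ac) & 128) != 0 and ((ac ^ s) & 128) == 0
    if ah > 9:
        ah += 6                            # BCD fixup for upper nybble
    carry = ah > 15                        # the only flag set after the fixup
    result = ((ah << 4) | (al & 15)) & 255
    return result, carry, zero, overflow, negative
```

Adding $01 to $99 shows the odd flag behaviour: the result is $00 with Carry
set, yet Z stays clear and N is set.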
Proof: The following test program tests all 131072 ADC combinations in
Decimal mode, and aborts with BRK if anything breaks this theory.
If everything goes well, it ends in RTS.
All programs in this chapter have been successfully tested on a Vic20
and a Commodore 64. They should run on C16, +4 and on the PET series as
well. If not, please report the problem to Marko M"akel"a. Each test in
this chapter should run in less than a minute at 1 MHz.
SBC is much easier. Just like CMP, its flags are not affected by
the D flag.
Proof:
The only difference in SBC's operation in decimal mode from binary mode
is the result-fixup:
AC accumulator
AL low nybble of accumulator
AH high nybble of accumulator
C Carry flag
Z Zero flag
V oVerflow flag
N Negative flag
s value to be subtracted from accumulator
t binary difference AC - s - !C (temporary)
t = AC - s - !C;                       ! Binary result, taken before C changes.
AL = (AC & 15) - (s & 15) - !C;        ! Calculate the lower nybble.
if (AL & 16)                           ! BCD fixup
    AL -= 6;                           ! for lower nybble
AH = (AC >> 4) - (s >> 4) - (AL < 0);  ! Calculate the upper nybble.
if (AH & 16)                           ! BCD fixup
    AH -= 6;                           ! for upper nybble
! Flags are set just like in Binary mode.
C = (t >= 0);
Z = ((t & 255) == 0);
V = ((AC ^ t) & 128) && ((AC ^ s) & 128);
N = ((t & 128) != 0);
AC = ((AH << 4) | (AL & 15)) & 255;
Again Z flag is set before any BCD fixup. The N and V flags are set
at any time before fixing the high nybble. The C flag may be set in any
phase.
Decimal subtraction is easier than decimal addition, as you have to
make the BCD fixup only when a nybble flows over. In decimal addition,
you had to verify if the nybble was greater than 9. The processor has
an internal "half carry" flag for the lower nybble, and it uses it to
trigger the BCD fixup. When calculating with legal BCD values, the
lower nybble cannot flow over again when fixing it. So the processor
does not handle overflows while performing the fixup. Similarly, the
BCD fixup occurs in the high nybble only if the value flows over,
i.e. when the C flag will be cleared.
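The subtraction rules can be tried out with the following Python sketch. As
before, this is an illustration and not the authors' 6502 test code; the name
sbc_decimal and the tuple layout are assumptions. The flags are computed from
the plain binary difference, and only the stored result goes through the BCD
fixup.

```python
def sbc_decimal(ac, s, c):
    """NMOS decimal-mode SBC sketch.

    ac, s -- accumulator and operand (0..255); c -- carry in (0 or 1).
    Returns (result, carry, zero, overflow, negative).
    """
    borrow = 0 if c else 1
    t = ac - s - borrow                    # binary difference, used for flags
    al = (ac & 15) - (s & 15) - borrow     # lower nybble
    half_borrow = al < 0                   # internal "half carry"
    if half_borrow:
        al -= 6                            # BCD fixup for lower nybble
    ah = (ac >> 4) - (s >> 4) - (1 if half_borrow else 0)
    if ah < 0:
        ah -= 6                            # BCD fixup for upper nybble
    carry = t >= 0                         # set when no borrow occurred
    zero = (t & 255) == 0
    overflow = ((ac ^ t) & (ac ^ s) & 128) != 0
    negative = (t & 128) != 0
    result = ((ah << 4) | (al & 15)) & 255
    return result, carry, zero, overflow, negative
```

Subtracting $01 from $00 with Carry set gives $99 with Carry clear, i.e. a
borrow out of the high nybble.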
Because SBC's flags are not affected by the Decimal mode flag, you
could guess that CMP uses the SBC logic, only setting the C flag
first. But the SBX instruction shows that CMP also temporarily clears
the D flag, although it is totally unnecessary.
The following program, which tests SBC's result and flags,
contains the 6502 version of the pseudo code example above.
Obviously the undocumented instructions RRA (ROR+ADC) and ISB
(INC+SBC) have inherited also the decimal operation from the official
instructions ADC and SBC. The program droradc shows this statement
for ROR, and the dincsbc test shows this for ISB. Finally,
dincsbc-deccmp shows that ISB's and DCP's (DEC+CMP) flags are not
affected by the D flag.
6510 features
o PHP always pushes the Break (B) flag as a `1' to the stack.
Jukka Tapanim"aki claimed in C=lehti issue 3/89, on page 27 that the
processor makes a logical OR between the status register's bit 4
and the bit 8 of the stack register (which is always 1).
o Indirect addressing modes do not handle page boundary crossing at all.
When the parameter's low byte is $FF, the effective address wraps
around and the CPU fetches high byte from $xx00 instead of $xx00+$0100.
E.g. JMP ($01FF) fetches PCL from $01FF and PCH from $0100,
and LDA ($FF),Y fetches the base address from $FF and $00.
o Indexed zero page addressing modes never fix the page address on
crossing the zero page boundary.
E.g. LDX #$01 : LDA ($FF,X) loads the effective address from $00 and $01.
o The processor always fetches the byte following a relative branch
instruction. If the branch is taken, the processor reads then the
opcode from the destination address. If page boundary is crossed, it
first reads a byte from the old page from a location that is bigger
or smaller than the correct address by one page.
o If you cross a page boundary in any other indexed mode,
the processor reads an incorrect location first, a location that is
smaller by one page.
o Read-Modify-Write instructions write unmodified data, then modified
(so INC effectively does LDX loc;STX loc;INX;STX loc)
o -RDY is ignored during writes
(This is why you must wait 3 cycles before doing any DMA -
the maximum number of consecutive writes is 3, which occurs
during interrupts except -RESET.)
o Some undefined opcodes may give really unpredictable results.
o All registers except the Program Counter remain the same after -RESET.
(This is why you must preset D and I flags in the RESET handler.)
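The pointer wrap-around described in the list above is easy to get wrong in
an emulator, so here is a minimal Python sketch of it. The function names and
the use of a 64 KB bytearray as memory are assumptions made for this example.

```python
def jmp_indirect_target(memory, pointer):
    """JMP (pointer): the high byte is fetched from the same page."""
    low = memory[pointer]
    high_address = (pointer & 0xFF00) | ((pointer + 1) & 0x00FF)
    return (memory[high_address] << 8) | low


def zero_page_pointer(memory, zp):
    """Fetch a 16-bit pointer from zero page; $FF wraps around to $00."""
    return memory[zp & 0xFF] | (memory[(zp + 1) & 0xFF] << 8)


def indexed_indirect_address(memory, zp, x):
    """Effective address of (zp,X): the sum zp + X never leaves zero page."""
    return zero_page_pointer(memory, (zp + x) & 0xFF)
```

With this model JMP ($01FF) takes PCL from $01FF and PCH from $0100, and
LDX #$01 : LDA ($FF,X) takes its address from $00 and $01, as stated above.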
Different CPU types
The Rockwell data booklet 29651N52 (technical information about R65C00
microprocessors, dated October 1984), lists the following differences between
NMOS R6502 microprocessor and CMOS R65C00 family:
1. Indexed addressing across page boundary.
NMOS: Extra read of invalid address.
CMOS: Extra read of last instruction byte.
2. Execution of invalid op codes.
NMOS: Some terminate only by reset. Results are undefined.
CMOS: All are NOPs (reserved for future use).
3. Jump indirect, operand = XXFF.
NMOS: Page address does not increment.
CMOS: Page address increments and adds one additional cycle.
4. Read/modify/write instructions at effective address.
NMOS: One read and two write cycles.
CMOS: Two read and one write cycle.
5. Decimal flag.
NMOS: Indeterminate after reset.
CMOS: Initialized to binary mode (D=0) after reset and interrupts.
6. Flags after decimal operation.
NMOS: Invalid N, V and Z flags.
CMOS: Valid flag adds one additional cycle.
7. Interrupt after fetch of BRK instruction.
NMOS: Interrupt vector is loaded, BRK vector is ignored.
CMOS: BRK is executed, then interrupt is executed.
6510 Instruction Timing
The NMOS 6500 series uses a sort of pipelining. It always reads two
bytes for each instruction. If the instruction was only two cycles long,
the opcode for the next instruction can be fetched during the third cycle.
As most instructions are two or three bytes long, this is quite efficient.
But one-byte instructions take two cycles, even though they could be
performed in one.
The following tables show what happens on the bus while executing different
kinds of instructions. The tables having "???" marks at any cycle may be
totally wrong, but the rest should be absolutely accurate.
Interrupts
NMI and IRQ both take 7 cycles. Their timing diagram is much like
BRK's. IRQ will be executed only when the I flag is clear.
The processor will usually wait for the current instruction to
complete before executing the interrupt sequence.
There is one exception to this rule: If a NMI occurs while the
processor is executing a BRK, the two interrupts may take 7 to 14
cycles to execute, and the processor may totally lose the BRK
instruction. Probably the results are similar also with IRQ.
Marko M"akel"a experimented with BRK/NMI, but he still hasn't
analyzed the results.
RESET does not push program counter on stack, and we don't know how
long it lasts. But we know that RESET preserves all registers
(except PC).
Accumulator or implied addressing
BRK
# address R/W description
--- ------- --- -----------------------------------------------
1 PC R fetch opcode, increment PC
2 PC R read next instruction byte (and throw it away),
increment PCR
3 $0100,S W push PCH on stack, decrement S
4 $0100,S W push PCL on stack, decrement S
5 $0100,S W push P on stack (with B flag set), decrement S
6 $FFFE R fetch PCL
7 $FFFF R fetch PCH
RTI
# address R/W description
--- ------- --- -----------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R read next instruction byte (and throw it away),
increment PCR
3 $0100,S R increment S
4 $0100,S R pull P from stack, increment S
5 $0100,S R pull PCL from stack, increment S
6 $0100,S R pull PCH from stack
RTS
# address R/W description
--- ------- --- -----------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R read next instruction byte (and throw it away),
increment PCR
3 $0100,S R increment S
4 $0100,S R pull PCL from stack, increment S
5 $0100,S R pull PCH from stack
6 PCR R increment PCR
PHA, PHP
# address R/W description
--- ------- --- -----------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R read next instruction byte (and throw it away),
increment PCR
3 $0100,S W push register on stack, decrement S
PLA, PLP
# address R/W description
--- ------- --- -----------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R read next instruction byte (and throw it away),
increment PCR
3 $0100,S R increment S
4 $0100,S R pull register from stack
Note: The 3rd cycle does NOT read from PCR.
Maybe it reads from $0100,S.
Other instructions
# address R/W description
--- ------- --- -----------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R read next instruction byte (and throw it away),
increment PCR
Immediate addressing
# address R/W description
--- ------- --- ------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R fetch value, increment PCR
Absolute addressing
JMP
# address R/W description
--- ------- --- -------------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R fetch address's low byte to latch, increment PCR
3 PCR R copy latch to PCL, fetch address's high byte to
latch, increment PCR, copy latch to PCH
JSR
# address R/W description
--- ------- --- -------------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R fetch address's low byte to latch, increment PCR
3 $0100,S R store latch
4 $0100,S W push PCH on stack, decrement S
5 $0100,S W push PCL on stack, decrement S
6 PCR R copy latch to PCL, fetch address's high byte to
latch, increment PCR, copy latch to PCH
Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT,
LAX, NOP)
# address R/W description
--- ------- --- ------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R fetch low byte of address, increment PCR
3 PCR R fetch high byte of address, increment PCR
4 address R read from effective address
Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC,
SLO, SRE, RLA, RRA, ISB, DCP)
# address R/W description
--- ------- --- ------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R fetch low byte of address, increment PCR
3 PCR R fetch high byte of address, increment PCR
4 address R read from effective address
5 address W write the value back to effective address,
and do the operation on it
6 address W write the new value to effective address
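The six bus cycles in the table above can be replayed with a short Python
sketch. The function name rmw_absolute_trace, the trace tuple layout and the
bytearray memory model are all assumptions for illustration. The point it
shows is the double write: the unmodified value is written back before the
new one.

```python
def rmw_absolute_trace(memory, pc, operation):
    """Replay an absolute Read-Modify-Write instruction cycle by cycle.

    operation -- function mapping the old byte to the new byte
    Returns a list of (cycle, address, "R" or "W", value) tuples.
    """
    trace = [(1, pc, "R", memory[pc])]                 # fetch opcode
    low = memory[pc + 1]
    trace.append((2, pc + 1, "R", low))                # address low byte
    high = memory[pc + 2]
    trace.append((3, pc + 2, "R", high))               # address high byte
    address = (high << 8) | low
    old = memory[address]
    trace.append((4, address, "R", old))               # read operand
    memory[address] = old
    trace.append((5, address, "W", old))               # write it back unchanged
    new = operation(old) & 0xFF
    memory[address] = new
    trace.append((6, address, "W", new))               # write the new value
    return trace
```

Running it on INC $D020 (bytes $EE $20 $D0) with $0F stored at $D020 yields a
write of $0F in cycle 5 followed by a write of $10 in cycle 6.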
Write instructions (STA, STX, STY, SAX)
# address R/W description
--- ------- --- ------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R fetch low byte of address, increment PCR
3 PCR R fetch high byte of address, increment PCR
4 address W write register to effective address
Zero page addressing
Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT,
LAX, NOP)
# address R/W description
--- ------- --- ------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R fetch address, increment PCR
3 address R read from effective address
Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC,
SLO, SRE, RLA, RRA, ISB, DCP)
# address R/W description
--- ------- --- ------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R fetch address, increment PCR
3 address R read from effective address
4 address W write the value back to effective address,
and do the operation on it
5 address W write the new value to effective address
Write instructions (STA, STX, STY, SAX)
# address R/W description
--- ------- --- ------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R fetch address, increment PCR
3 address W write register to effective address
Zero page indexed addressing
Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT,
LAX, NOP)
# address R/W description
--- --------- --- ------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R fetch address, increment PCR
3 address R read from address, add index register to it
4 address+I* R read from effective address
Notes: I denotes either index register (X or Y).
* The high byte of the effective address is always zero,
i.e. page boundary crossings are not handled.
Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC,
SLO, SRE, RLA, RRA, ISB, DCP)
# address R/W description
--- --------- --- ---------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R fetch address, increment PCR
3 address R read from address, add index register X to it
4 address+X* R read from effective address
5 address+X* W write the value back to effective address,
and do the operation on it
6 address+X* W write the new value to effective address
Note: * The high byte of the effective address is always zero,
i.e. page boundary crossings are not handled.
Write instructions (STA, STX, STY, SAX)
# address R/W description
--- --------- --- -------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R fetch address, increment PCR
3 address R read from address, add index register to it
4 address+I* W write to effective address
Notes: I denotes either index register (X or Y).
* The high byte of the effective address is always zero,
i.e. page boundary crossings are not handled.
Absolute indexed addressing
Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT,
LAX, LAS, SHS, NOP)
# address R/W description
--- --------- --- ------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R fetch low byte of address, increment PCR
3 PCR R fetch high byte of address,
add index register to low address byte,
increment PCR
4 address+I* R read from effective address,
fix the high byte of effective address
5+ address+I R re-read from effective address
Notes: I denotes either index register (X or Y).
* The high byte of the effective address may be invalid
at this time, i.e. it may be smaller by $100.
+ This cycle will be executed only if the effective address
was invalid during cycle #4, i.e. page boundary was crossed.
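To make the extra read concrete, the following Python sketch lists the
addresses an absolute indexed read instruction puts on the bus after the
three fetch cycles. The function name absolute_indexed_read_accesses is an
assumption for this example.

```python
def absolute_indexed_read_accesses(base, index):
    """Return the operand addresses read by an absolute indexed read.

    The first access uses the un-fixed high byte. A second access is
    made only when adding the index carried into the high byte.
    """
    low_sum = (base & 0xFF) + index
    accesses = [(base & 0xFF00) | (low_sum & 0xFF)]
    if low_sum > 0xFF:
        # Page boundary crossed: the first read was $100 too low,
        # so the processor reads again from the fixed address.
        accesses.append((base + index) & 0xFFFF)
    return accesses
```

The total cycle count is 3 plus the number of accesses, so LDA $10F0,X with
X = $20 reads $1010 first, then $1110, and takes five cycles.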
Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC,
SLO, SRE, RLA, RRA, ISB, DCP)
# address R/W description
--- --------- --- ------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R fetch low byte of address, increment PCR
3 PCR R fetch high byte of address,
add index register X to low address byte,
increment PCR
4 address+X* R read from effective address,
fix the high byte of effective address
5 address+X R re-read from effective address
6 address+X W write the value back to effective address,
and do the operation on it
7 address+X W write the new value to effective address
Notes: * The high byte of the effective address may be invalid
at this time, i.e. it may be smaller by $100.
Write instructions (STA, STX, STY, SHA, SHX, SHY)
# address R/W description
--- --------- --- ------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R fetch low byte of address, increment PCR
3 PCR R fetch high byte of address,
add index register to low address byte,
increment PCR
4 address+I* R read from effective address,
fix the high byte of effective address
5 address+I W write to effective address
Notes: I denotes either index register (X or Y).
* The high byte of the effective address may be invalid
at this time, i.e. it may be smaller by $100. Because
the processor cannot undo a write to an invalid address,
it always reads from the address first.
Relative addressing (BCC, BCS, BNE, BEQ, BPL, BMI, BVC, BVS)
# address R/W description
--- --------- --- ---------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R fetch operand, increment PCR
3 PCR R Fetch opcode of next instruction,
If branch is taken, add operand to PCL.
Otherwise increment PCR.
4+ PCR* R Fetch opcode of next instruction.
Fix PCH. If it did not change, increment PCR.
5! PCR R Fetch opcode of next instruction,
increment PCR.
Notes: * The high byte of Program Counter (PCH) may be invalid
at this time, i.e. it may be smaller or bigger by $100.
+ If branch is taken, this cycle will be executed.
! If branch occurs to different page, this cycle will be
executed.
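In terms of cycles charged to the branch itself, the table amounts to the
familiar 2, 3 or 4 cycle rule, because the last opcode fetch shown is already
the first cycle of the next instruction. A minimal Python sketch, with the
function name branch_cycles and its arguments chosen for this example:

```python
def branch_cycles(pc_after_operand, operand, taken):
    """Cycles used by a relative branch.

    pc_after_operand -- address of the instruction following the branch
    operand          -- raw offset byte (0..255, two's complement)
    taken            -- whether the branch condition is true
    """
    if not taken:
        return 2
    offset = operand - 0x100 if operand & 0x80 else operand
    target = (pc_after_operand + offset) & 0xFFFF
    crossed = (target & 0xFF00) != (pc_after_operand & 0xFF00)
    return 4 if crossed else 3          # one more cycle to fix PCH
```

A branch at $0FFE that jumps backwards over the page boundary therefore costs
four cycles, while the same branch left untaken costs two.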
Indexed indirect addressing
Read instructions (LDA, ORA, EOR, AND, ADC, CMP, SBC, LAX)
# address R/W description
--- ----------- --- ------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R fetch pointer address, add X to it,
increment PCR
3 ??? R internal operation
4 pointer+X R fetch effective address low
5 pointer+X+1 R fetch effective address high
6 address R read from effective address
Note: The effective address is always fetched from zero page,
i.e. the zero page boundary crossing is not handled.
Read-Modify-Write instructions (SLO, SRE, RLA, RRA, ISB, DCP)
# address R/W description
--- ----------- --- ------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R fetch pointer address, add X to it,
increment PCR
3 ??? R internal operation
4 pointer+X R fetch effective address low
5 pointer+X+1 R fetch effective address high
6 address R read from effective address
7 address W write the value back to effective address,
and do the operation on it
8 address W write the new value to effective address
Note: The effective address is always fetched from zero page,
i.e. the zero page boundary crossing is not handled.
Write instructions (STA, SAX)
# address R/W description
--- ----------- --- ------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R fetch pointer address, add X to it,
increment PCR
3 ??? R internal operation
4 pointer+X R fetch effective address low
5 pointer+X+1 R fetch effective address high
6 address W write to effective address
Note: The effective address is always fetched from zero page,
i.e. the zero page boundary crossing is not handled.
Indirect indexed addressing
Read instructions (LDA, EOR, AND, ORA, ADC, SBC, CMP)
# address R/W description
--- ----------- --- ------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R fetch pointer address, increment PCR
3 pointer R fetch effective address low
4 pointer+1 R fetch effective address high,
add Y to low byte of effective address
5 address+Y* R read from effective address,
fix high byte of effective address
6+ address+Y R read from effective address
Notes: The effective address is always fetched from zero page,
i.e. the zero page boundary crossing is not handled.
* The high byte of the effective address may be invalid
at this time, i.e. it may be smaller by $100.
+ This cycle will be executed only if the effective address
was invalid during cycle #5, i.e. page boundary was crossed.
Read-Modify-Write instructions (SLO, SRE, RLA, RRA, ISB, DCP)
# address R/W description
--- ----------- --- ------------------------------------------
1 PCR R fetch opcode, increment PCR
2 PCR R fetch pointer address, increment PCR
3 pointer R fetch effective address low
4 pointer+1 R fetch effective address high,
add Y to low byte of effective address
5 address+Y* R read from effective address,
fix high byte of effective address
6 address+Y R read from effective address
7 address+Y W write the value back to effective address,
and do the operation on it
8 address+Y W write the new value to effective address
Notes: The effective address is always fetched from zero page,
i.e. the zero page boundary crossing is not handled.
* The high byte of the effective address may be invalid
at this time, i.e. it may be smaller by $100.
Write instructions (STA, SHA)
  #   address      R/W  description
 ---  -----------  ---  ------------------------------------------
  1   PCR           R   fetch opcode, increment PCR
  2   PCR           R   fetch pointer address, increment PCR
  3   pointer       R   fetch effective address low
  4   pointer+1     R   fetch effective address high,
                        add Y to low byte of effective address
  5   address+Y*    R   read from effective address,
                        fix high byte of effective address
  6   address+Y     W   write to effective address
Notes: The effective address is always fetched from zero page,
i.e. the zero page boundary crossing is not handled.
* The high byte of the effective address may be invalid
at this time, i.e. it may be smaller by $100.
Absolute indirect addressing (JMP)
  #   address      R/W  description
 ---  -----------  ---  ------------------------------------------
  1   PCR           R   fetch opcode, increment PCR
  2   PCR           R   fetch pointer address low, increment PCR
  3   PCR           R   fetch pointer address high, increment PCR
  4   pointer       R   fetch low address to latch
  5   pointer+1*    R   fetch PCH, copy latch to PCL
Note: * The PCH will always be fetched from the same page
as PCL, i.e. page boundary crossing is not handled.
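A small Python sketch of the behaviour described in the note follows. The
function name jmp_indirect_target and the memory contents are chosen for
the illustration only.

```python
def jmp_indirect_target(memory, pointer):
    """Return the address that JMP (pointer) transfers control to."""
    pcl = memory[pointer]
    # Only the low byte of the pointer is incremented, so the high
    # byte of the target is fetched from the same page as the low byte.
    pch_location = (pointer & 0xFF00) | ((pointer + 1) & 0x00FF)
    pch = memory[pch_location]
    return pcl | (pch << 8)


memory = bytearray(0x10000)
memory[0x02FF] = 0x00  # PCL
memory[0x0200] = 0xC0  # PCH is taken from here ...
memory[0x0300] = 0x40  # ... and not from here

print(hex(jmp_indirect_target(memory, 0x02FF)))  # prints 0xc000
```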
MEMORY MANAGEMENT
        normal                                                  ultimax
        1111    101x    011x    001x    1110    0100    1100    xx01
                1000            00x0
10000
------------------------------------------------------------------------
 F000
        Kernal  RAM     Kernal  RAM     Kernal  Kernal  Kernal  module
 E000
------------------------------------------------------------------------
 D000   I/O     I/O**   I/O     RAM     I/O     I/O     I/O     I/O
------------------------------------------------------------------------
 C000   RAM     RAM     RAM     RAM     RAM     RAM     RAM     -
------------------------------------------------------------------------
 B000
        BASIC   RAM     RAM     RAM     BASIC   module  module  -
 A000
------------------------------------------------------------------------
 9000
        RAM     RAM     RAM     RAM     module  RAM     module  module
 8000
------------------------------------------------------------------------
 7000
 6000
        RAM     RAM     RAM     RAM     RAM     RAM     RAM     -
 5000
 4000
------------------------------------------------------------------------
 3000
 2000
        RAM     RAM     RAM     RAM     RAM     RAM     RAM     RAM
 1000
 0000
------------------------------------------------------------------------
**) Chargen not accessible by the CPU
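The table can also be expressed as a lookup, sketched here in Python. The
names REGIONS, COLUMNS and memory_map are hypothetical. The bit patterns are
treated as opaque four-character strings exactly as they appear in the table
header, and the sketch assumes that the patterns 1000 and 00x0 in the second
header row belong to the same columns as 101x and 001x respectively.

```python
REGIONS = ["E000-FFFF", "D000-DFFF", "C000-CFFF", "A000-BFFF",
           "8000-9FFF", "4000-7FFF", "0000-3FFF"]

# One entry per table column: (bit patterns, contents from top to bottom).
COLUMNS = [
    (["1111"],         ["Kernal", "I/O",   "RAM", "BASIC",  "RAM",    "RAM", "RAM"]),
    (["101x", "1000"], ["RAM",    "I/O**", "RAM", "RAM",    "RAM",    "RAM", "RAM"]),
    (["011x"],         ["Kernal", "I/O",   "RAM", "RAM",    "RAM",    "RAM", "RAM"]),
    (["001x", "00x0"], ["RAM",    "RAM",   "RAM", "RAM",    "RAM",    "RAM", "RAM"]),
    (["1110"],         ["Kernal", "I/O",   "RAM", "BASIC",  "module", "RAM", "RAM"]),
    (["0100"],         ["Kernal", "I/O",   "RAM", "module", "RAM",    "RAM", "RAM"]),
    (["1100"],         ["Kernal", "I/O",   "RAM", "module", "module", "RAM", "RAM"]),
    (["xx01"],         ["module", "I/O",   "-",   "-",      "module", "-",   "RAM"]),
]


def matches(pattern, bits):
    """True if every pattern character is 'x' or equals the given bit."""
    return all(p in ("x", b) for p, b in zip(pattern, bits))


def memory_map(bits):
    """Return a dict mapping each address range to what the CPU sees there."""
    for patterns, contents in COLUMNS:
        if any(matches(pattern, bits) for pattern in patterns):
            return dict(zip(REGIONS, contents))
    raise ValueError("no column matches " + bits)


print(memory_map("1111")["A000-BFFF"])  # prints BASIC
```

Under that assumption every one of the sixteen possible bit combinations
falls into exactly one column's contents.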
AUTOSTART CODE
If memory locations $8004 to $8008 contain 'CBM80' (C3 C2 CD 38 30),
the RESET routine jumps to ($8000) and the default NMI handler jumps to
($8002).
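As an illustration, the check can be sketched in Python on a 64 KB memory
image. The function name autostart_vectors and the vector values in the
example are hypothetical; the sketch assumes the usual low byte first order
for the two vectors.

```python
CBM80 = bytes([0xC3, 0xC2, 0xCD, 0x38, 0x30])


def autostart_vectors(memory):
    """Return (reset_target, nmi_target), or None if no signature is present."""
    if bytes(memory[0x8004:0x8009]) != CBM80:
        return None
    reset_target = memory[0x8000] | (memory[0x8001] << 8)
    nmi_target = memory[0x8002] | (memory[0x8003] << 8)
    return reset_target, nmi_target


memory = bytearray(0x10000)
print(autostart_vectors(memory))  # prints None

memory[0x8000:0x8004] = bytes([0x09, 0x80, 0x5E, 0xFE])  # example vectors
memory[0x8004:0x8009] = CBM80
print([hex(v) for v in autostart_vectors(memory)])  # prints ['0x8009', '0xfe5e']
```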
HOW REAL PROGRAMMERS ACKNOWLEDGE INTERRUPTS
With RMW instructions:
; beginning of combined raster/timer interrupt routine
LSR $D019 ; clear VIC interrupts, read raster interrupt flag to C
BCS raster ; jump if VIC caused an interrupt
... ; timer interrupt routine
Operational diagram of LSR $D019:
  #   data  address  R/W  description
 ---  ----  -------  ---  ---------------------------------
  1    4E   PCR       R   fetch opcode
  2    19   PCR+1     R   fetch address low
  3    D0   PCR+2     R   fetch address high
  4    xx   $D019     R   read memory
  5    xx   $D019     W   write the value back, shift right
  6   xx/2  $D019     W   write the new value back
The 5th cycle acknowledges the interrupt by writing the same
value back. If only raster interrupts are used, the 6th cycle
has no effect on the VIC.
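The effect of the two write cycles can be modelled with a short Python
sketch. The class InterruptLatch is a deliberately simplified stand-in for
$D019: it assumes only that writing a 1 bit clears the corresponding
interrupt flag, and it ignores the unused and summary bits of the real
register.

```python
class InterruptLatch:
    """Simplified write-one-to-clear interrupt flag register."""

    def __init__(self, flags):
        self.flags = flags & 0xFF

    def read(self):
        return self.flags

    def write(self, value):
        # Every 1 bit in the written value clears that flag.
        self.flags &= ~value & 0xFF


def lsr_on_latch(latch):
    """Perform the three memory cycles of LSR and return the carry."""
    value = latch.read()      # cycle 4: read memory
    latch.write(value)        # cycle 5: write the unmodified value back
    carry = value & 1
    latch.write(value >> 1)   # cycle 6: write the shifted value
    return carry


latch = InterruptLatch(0x01)  # raster interrupt flag set
carry = lsr_on_latch(latch)
print(carry, latch.read())    # prints 1 0
```

In this model the flags are already clear after cycle 5, so the write in
cycle 6 changes nothing.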
With indexed addressing:
; acknowledge interrupts to both CIAs
LDX #$10
LDA $DCFD,X
Operational diagram of LDA $DCFD,X:
  #   data  address  R/W  description
 ---  ----  -------  ---  ---------------------------------
  1    BD   PCR       R   fetch opcode
  2    FD   PCR+1     R   fetch address low
  3    DC   PCR+2     R   fetch address high, add X to address low
  4    xx   $DC0D     R   read from address, fix high byte of address
  5    yy   $DD0D     R   read from right address
; acknowledge interrupts to CIA 2
LDX #$10
STA $DDFD,X
Operational diagram of STA $DDFD,X:
  #   data  address  R/W  description
 ---  ----  -------  ---  ---------------------------------
  1    9D   PCR       R   fetch opcode
  2    FD   PCR+1     R   fetch address low
  3    DD   PCR+2     R   fetch address high, add X to address low
  4    xx   $DD0D     R   read from address, fix high byte of address
  5    ac   $DE0D     W   write to right address
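The data accesses of both indexed examples can be reproduced with a short
Python sketch. The function name absolute_x_accesses is made up for the
illustration, and only the cycles after the operand fetch are modelled.

```python
def absolute_x_accesses(base, x, write=False):
    """Return the (R/W, address) pairs that follow the operand fetch."""
    effective = (base + x) & 0xFFFF
    # X is first added to the low byte only; the high byte is fixed later.
    unfixed = (base & 0xFF00) | ((base + x) & 0x00FF)
    if write:
        # A store always reads from the unfixed address before writing.
        return [("R", unfixed), ("W", effective)]
    if unfixed == effective:
        return [("R", effective)]
    return [("R", unfixed), ("R", effective)]


for kind, address in absolute_x_accesses(0xDCFD, 0x10):
    print(kind, hex(address))
for kind, address in absolute_x_accesses(0xDDFD, 0x10, write=True):
    print(kind, hex(address))
```

The load reads $DC0D and then $DD0D; the store reads $DD0D and then writes
to $DE0D.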
With branch instructions:
; acknowledge interrupts to CIA 2
LDA #$00 ; clear N flag
JMP $DD0A
DD0A BPL $DC9D ; branch
DC9D BRK ; return
You need the following preparations to initialize the CIA registers:
LDA #$91 ; argument of BPL
STA $DD0B
LDA #$10 ; BPL
STA $DD0A
STA $DD08 ; load the ToD values from the latches
LDA $DD0B ; jam the ToD display
LDA #$7F
STA $DC0D ; assure that $DC0D is $00
Operational diagram of BPL $DC9D:
  #   data  address  R/W  description
 ---  ----  -------  ---  ---------------------------------
  1    10   $DD0A     R   fetch opcode
  2    91   $DD0B     R   fetch argument
  3    xx   $DD0C     R   fetch opcode, add argument to PCL
  4    yy   $DD9D     R   fetch opcode, fix PCH
( 5    00   $DC9D     R   fetch opcode )
; acknowledge interrupts to CIA 1
LDA #$00 ; clear N flag
JMP $DCFA
DCFA BPL $DD0D
DD0D BRK
; Again you need to set the ToD registers of CIA 1 and the
; Interrupt Control Register of CIA 2 first.
Operational diagram of BPL $DD0D:
  #   data  address  R/W  description
 ---  ----  -------  ---  ---------------------------------
  1    10   $DCFA     R   fetch opcode
  2    11   $DCFB     R   fetch argument
  3    xx   $DCFC     R   fetch opcode, add argument to PCL
  4    yy   $DC0D     R   fetch opcode, fix PCH
( 5    00   $DD0D     R   fetch opcode )
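The fetch addresses of a taken branch can be computed with the following
Python sketch. The function name taken_branch_fetches is hypothetical; the
last entry of the result corresponds to the fetch shown in parentheses in
the diagrams and is present only when the page has to be fixed.

```python
def taken_branch_fetches(opcode_address, argument):
    """Return the addresses fetched while a taken branch executes."""
    next_pc = (opcode_address + 2) & 0xFFFF
    # The argument is a signed 8-bit displacement.
    displacement = argument - 0x100 if argument & 0x80 else argument
    target = (next_pc + displacement) & 0xFFFF
    # The argument is first added to PCL only; PCH is fixed afterwards.
    unfixed = (next_pc & 0xFF00) | (target & 0x00FF)
    fetches = [opcode_address, (opcode_address + 1) & 0xFFFF, next_pc, unfixed]
    if unfixed != target:
        fetches.append(target)
    return fetches


print([hex(a) for a in taken_branch_fetches(0xDD0A, 0x91)])
print([hex(a) for a in taken_branch_fetches(0xDCFA, 0x11)])
```

The first call passes through $DD9D and the second through $DC0D before
reaching the branch target.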
; acknowledge interrupts to CIA 2 automagically
; preparations
LDA #$7F
STA $DD0D ; disable all of CIA 2's interrupt sources
LDA $DD0E
AND #$BE ; ensure that $DD0C remains constant
STA $DD0E ; and stop the timer
LDA #$FD
STA $DD0C ; parameter of BPL
LDA #$10
STA $DD0B ; BPL
LDA #$40
STA $DD0A ; RTI/parameter of LSR
LDA #$46
STA $DD09 ; LSR
STA $DD08 ; load the ToD values from the latches
LDA $DD0B ; jam the ToD display
LDA #$09
STA $0318
LDA #$DD
STA $0319 ; change NMI vector to $DD09
LDA #$FF ; Try changing this instruction's operand
STA $DD05 ; (see comment below).
LDA #$FF
STA $DD04 ; set interrupt frequency to 1/65536 cycles
LDA $DD0E
AND #$80
ORA #$11
LDX #$81
STX $DD0D ; enable timer interrupt
STA $DD0E ; start timer
LDA #$00 ; To see that the interrupts really occur,
STA $D011 ; use something like this and see how
LOOP DEC $D020 ; changing the byte loaded to $DD05 from
BNE LOOP ; #$FF to #$0F changes the image.
When an NMI occurs, the processor jumps to Kernal code, which jumps to
($0318), which points to the following routine:
DD09 LSR $40 ; clear N flag
BPL $DD0A ; Note: $DD0A contains RTI.
Operational diagram of BPL $DD0A:
  #   data  address  R/W  description
 ---  ----  -------  ---  ---------------------------------
  1    10   $DD0B     R   fetch opcode
  2    FD   $DD0C     R   fetch argument
  3    xx   $DD0D     R   fetch opcode, add argument to PCL
  4    40   $DD0A     R   fetch opcode, (fix PCH)
With RTI:
; the fastest possible interrupt handler in the 6500 family
; preparations
SEI
LDA $01 ; disable ROM and enable I/O
AND #$FD
ORA #$05
STA $01
LDA #$7F
STA $DD0D ; disable all of CIA 2's interrupt sources
LDA $DD0E
AND #$BE ; ensure that $DD0C remains constant
STA $DD0E ; and stop the timer
LDA #$40
STA $DD0C ; store RTI to $DD0C
LDA #$0C
STA $FFFA
LDA #$DD
STA $FFFB ; change NMI vector to $DD0C
LDA #$FF ; Try changing this instruction's operand
STA $DD05 ; (see comment below).
LDA #$FF
STA $DD04 ; set interrupt frequency to 1/65536 cycles
LDA $DD0E
AND #$80
ORA #$11
LDX #$81
STX $DD0D ; enable timer interrupt
STA $DD0E ; start timer
LDA #$00 ; To see that the interrupts really occur,
STA $D011 ; use something like this and see how
LOOP DEC $D020 ; changing the byte loaded to $DD05 from
BNE LOOP ; #$FF to #$0F changes the image.
When an NMI occurs, the processor reads the vector at $FFFA from RAM (the
Kernal ROM is switched out) and jumps directly to the following routine:
DD0C RTI
How on earth can this clear the interrupts? Remember, the processor
always fetches two successive bytes for each instruction.
A little more practical version of this is redirecting the NMI (or IRQ)
to your own routine, whose last instruction is JMP $DD0C or JMP $DC0C.
To make the code more confusing, change the 0 in the address to a
hexadecimal digit different from the one you used when writing the RTI.
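Changing that digit works because the sixteen CIA registers repeat every
$10 bytes within each chip's page. A Python sketch of the decoding, with
the hypothetical function name cia_register:

```python
def cia_register(address):
    """Return (chip number, register number), or None outside the CIA pages."""
    if 0xDC00 <= address <= 0xDCFF:
        chip = 1
    elif 0xDD00 <= address <= 0xDDFF:
        chip = 2
    else:
        return None
    # Only the lowest four address bits select the register.
    return chip, address & 0x0F


print(cia_register(0xDD0C))  # prints (2, 12)
print(cia_register(0xDD7C))  # prints (2, 12)
print(cia_register(0xDC9D))  # prints (1, 13)
```

JMP $DD7C therefore reaches the same RTI as JMP $DD0C, and the byte fetched
after it is again register $D of CIA 2.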
Or you can combine the latter two methods:
DD09 LSR $xx ; xx is any appropriate BCD value 00-59.
BPL $DCFC
DCFC RTI
This example acknowledges interrupts to both CIAs.
If you want to confuse the examiners of your code, you can use any of
these techniques. Although these examples use no undefined opcodes, they do
not run correctly on CMOS processors. However, the RTI example should run on
65C02 and 65C816, and the latter branch instruction example might work as
well.
The RMW instruction method has been used in some demos; the others were
developed by Marko M"akel"a. His favourite is the automagical RTI method,
although it does not have any practical applications, except for some
time-dependent data decryption routines for very complicated copy protections.
MAKING USE OF THE I/O REGISTERS
If you are making a resident program and want to make it as invisible to the
system as possible, probably the best method is keeping most of your code
under the I/O area (in the RAM at $D000-$DFFF). You need only a short routine
in the normally visible RAM that pushes the current value of the processor's
I/O register $01 on the stack, switches I/O and ROMs out and jumps to this
area. Returning from the $D000-$DFFF area is easy even without any routine in
the normally visible RAM area. Just write an RTS to an I/O register and return
through it.
But what if your program needs to use I/O? And how can you write the RTS
to an I/O register while the I/O area is switched off? You need a swap area
for your program in normally visible memory. The first thing your routine at
$D000-$DFFF does is to copy the I/O routines (or the whole program) to
normally visible memory, swapping the bytes. For instance, if your I/O
routines are initially at $D200-$D3FF, exchange the bytes at $D200-$D3FF
with the contents of $C000-$C1FF. Now you can call the I/O routines from
your routine at $D000-$DFFF, and the I/O routines can switch the I/O area
temporarily on to access the I/O circuitry. Right before exiting, your
program at $D000-$DFFF swaps the old contents of that I/O routine area back
in, e.g. exchanges the memory areas $D200-$D3FF and $C000-$C1FF again.
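The exchange itself is a plain byte-by-byte swap. The Python sketch below
shows it on a bytearray that stands in for the RAM with the I/O area
switched off; the function name swap_regions and the fill values are made up
for the illustration.

```python
def swap_regions(memory, first, second, length):
    """Exchange `length` bytes between two non-overlapping memory areas."""
    for offset in range(length):
        a = first + offset
        b = second + offset
        memory[a], memory[b] = memory[b], memory[a]


ram = bytearray(0x10000)
ram[0xD200:0xD400] = bytes([0xAA]) * 0x200  # stands for the I/O routines
ram[0xC000:0xC200] = bytes([0x55]) * 0x200  # stands for the user's data

swap_regions(ram, 0xD200, 0xC000, 0x200)    # on entry: routines now at $C000
print(hex(ram[0xC000]), hex(ram[0xD200]))   # prints 0xaa 0x55

swap_regions(ram, 0xD200, 0xC000, 0x200)    # on exit: everything restored
print(hex(ram[0xC000]), hex(ram[0xD200]))   # prints 0x55 0xaa
```

Swapping rather than copying is what preserves the original contents of
$C000-$C1FF for the time the routines occupy that area.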
What I/O registers can you use for the RTS? There are two alternatives:
8-bit VIC sprite registers or CIA serial port register. The CIA register is
usually better, as changing the VIC registers might change the screen layout.
However, the SP register also has some drawbacks: If the machine's CNT1 and
CNT2 lines are connected to a frequency source, you must stop Timer A of
whichever CIA you use for the SP register method. Normally the 1st CIA's
Timer A is the main hardware interrupt source. And if you use the Kernal's
RS232, you cannot stop the 2nd CIA's Timer A either. Also, if you don't want
to lose any CIA interrupts, remember that the RTS at the SP register also
causes the Interrupt Control Register to be read.
Also keep in mind that the user could press RESTORE while the Kernal ROM
and I/O areas are disabled. You could write your own NMI handler (using
the NMI vector at $FFFA), but a fast loader that uses very tight timing
would still stop working if the user pressed RESTORE at the wrong time. So,
to make a robust program, you have to disable NMI interrupts. But how is this
possible? They are Non-Maskable after all. The NMI interrupt is
edge-sensitive: the processor jumps to the NMI handler only when the -NMI
line drops from +5V to ground. Just cause an NMI with CIA2's timer, but don't
read the Interrupt Control register. The line then stays low, so pressing
RESTORE cannot produce another falling edge. If you need to read $DD0D in
your program, you must add an NMI handler just in case the user presses
RESTORE. And don't forget to raise the -NMI line upon exiting the program.
This can be done automatically by the latter two of the three following
examples.
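Before the examples, the edge-sensitive behaviour can be modelled with a few
lines of Python. This is only a sketch: the function name count_nmis and the
source names "cia2" and "restore" are made up, and each list element simply
names the sources that pull the -NMI line low at that moment.

```python
def count_nmis(samples):
    """Count falling edges of -NMI; each sample is the set of active sources."""
    count = 0
    line_low = False
    for active_sources in samples:
        now_low = len(active_sources) > 0
        if now_low and not line_low:
            count += 1  # the line just dropped: the processor takes an NMI
        line_low = now_low
    return count


# RESTORE pressed twice with nothing else holding the line: two NMIs.
print(count_nmis([set(), {"restore"}, set(), {"restore"}, set()]))  # prints 2

# CIA 2 holds the line low (its register is never read): only the first
# edge is seen, and the RESTORE presses go unnoticed.
print(count_nmis([set(), {"cia2"}, {"cia2", "restore"}, {"cia2"},
                  {"cia2", "restore"}]))  # prints 1
```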
; Returning via VIC register $15 (the sprite enable register)
Initialization: ; This is executed when I/O is switched on
LDA #$60
STA $D015 ; Write RTS to VIC register $15.
Exiting: ; NOTE: This procedure must start at VIC register
; $12. You have multiple alternatives, as the VIC
; appears in memory at $D000+$40*n, where $0<=n<=$F.
PLA ; Pull the saved 6510 I/O register state from stack
STA $01 ; Restore original memory bank configuration
; Now the processor fetches the RTS command from the
; VIC register $15.
; Returning via CIA 2's SP register (assuming that CNT2 is stable)
Initialization: ; This is executed when I/O is switched on
LDA $DD0E ; CIA 2's Control Register A
AND #$BF ; Set Serial Port to input
STA $DD0E ; (make the SP register act as a memory location)
LDA #$60
STA $DD0C ; Write RTS to CIA 2 register $C.
Exiting: ; NOTE: This procedure must start at CIA 2 register
; $9. As the CIA 2 appears in memory at $DD00+$10*n,
; where 0<=n<=$F, you have sixteen alternatives.
PLA
STA $01 ; Restore original memory bank configuration
; Now the processor fetches the RTS command from
; the CIA 2 register $C.
; Returning via CIA 2's SP register, stopping the Timer A
; and forcing SP2 and CNT2 to output
Initialization: ; This is executed when I/O is switched on
LDA $DD0E ; CIA 2's Control Register A
AND #$FE ; Stop Timer A
ORA #$40 ; Set Serial Port to output
STA $DD0E ; (make the SP register act as a memory location)
LDA #$60
STA $DD0C ; Write RTS to CIA 2 register $C.
Exiting: ; NOTE: This procedure must start at CIA 2 register
; $9. As the CIA 2 appears in memory at $DD00+$10*n,
; where 0<=n<=$F, you have sixteen alternatives.
PLA
STA $01 ; Restore original memory bank configuration
; Now the processor fetches the RTS command from
; the CIA 2 register $C.
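The possible start addresses of the exit procedure follow from its length:
PLA takes one byte and STA $01 takes two, so it has to begin three bytes
before the register that holds the RTS. A Python sketch with the
hypothetical function name exit_procedure_addresses:

```python
def exit_procedure_addresses(base, mirror_size, rts_register, mirrors=16):
    """Return every address at which the PLA / STA $01 sequence may start."""
    procedure_length = 3  # PLA (1 byte) + STA $01 (2 bytes)
    start_register = rts_register - procedure_length
    return [base + mirror_size * n + start_register for n in range(mirrors)]


cia2 = exit_procedure_addresses(0xDD00, 0x10, 0x0C)
vic = exit_procedure_addresses(0xD000, 0x40, 0x15)
print(hex(cia2[0]), hex(cia2[-1]))  # prints 0xdd09 0xddf9
print(hex(vic[0]), hex(vic[-1]))    # prints 0xd012 0xd3d2
```

The base addresses and mirror sizes are the ones given in the comments of
the examples above.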
For instance, if you want to make a highly compatible fast loader, make
the ILOAD vector ($0330) point to the beginning of the stack area. Remember
that the BASIC interpreter uses the first bytes of the stack while converting
numbers to text. A good address is $0120. Robust programs practically never
use so much stack that it could corrupt this routine. Usually only crunched
programs (demos and the like) use the whole stack in the decompression phase.
They also make use of the $D000-$DFFF area.
This stack routine will jump to your routine at $D000-$DFFF, as described
above. For performance's sake, copy the whole byte transfer loop to the swap
area, e.g. $C000-$C1FF, and call that subroutine after doing the preliminary
work. But what about files that load over $C000-$C1FF? Wouldn't that destroy
the transfer loop and jam the machine? Not necessarily. If you copy those
bytes to your swap area at $D000-$DFFF, they will be loaded properly, as
your program restores the original $C000-$C1FF area.
If you want to make your program user-friendly, put a vector initialization
routine in the stack area as well, so that the user can restore the fast
loader by issuing a SYS command, rather than loading it again each time after
pressing STOP & RESTORE or RESET.
NOTES
See the MCS 6500 Microcomputer Family Programming Manual for more information.
There is also a table showing the functional description and timing of the
complete 6510 instruction set in C=Hacking magazine, issue 1/92 (available
via FTP at ccosun.caltech.edu:/pub/rknop/hacking.mag/ and
nic.funet.fi:/pub/cbm/c=hacking/).
References:
  C64 Memory Maps       C64 Programmer's Reference Guide pp. 262-267
  6510 Block Diagram    C64 Programmer's Reference Guide p. 404
  Instruction Set       C64 Programmer's Reference Guide pp. 416-417
  C=Hacking             Volume 1, issue #1, 1/92
  C=Lehti magazine      4/87
=============================================================================