Overview
Assembly language is very simplistic; it views your computer as a combination of memory addresses and a CPU. The memory addresses contain values that can be moved into and out of the CPU registers for logical operations, arithmetic manipulation, or simply for relocation--for a value cannot be moved from one memory location to the next, but rather must be moved from a memory location to a CPU register, then from the CPU register to the destination memory location. The CPU is hard-coded with internal instructions called opcodes which are used for the manipulation and relocation of data values.
Many complex operations are used frequently and sometimes-- for example, in the case of hardware access-- rely on system-specific configuration information stored in the machine BIOS; the DOS and BIOS interrupt services are supplied to ease the burden on the programmer. The use of these services is much like using the standard C library ("stdio.h", et al); the same functionality can be duplicated without the interrupt services, though the coding will be lengthy and difficult.
One thing the high-level coder must keep in mind is that all data is considered to be a memory location of some form or another. Variables, structures, pointers, arrays...in assembly language, these are all just memory locations with a specific content. For example, the memory location
0110:0100 54 68 69 73
would contain the data 54 68 69 73 in hexadecimal, or "This" in ASCII. A pointer to this variable would look like
0110:0299 00 01 10 01
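containing the address 01 10 [:] 01 00 in reverse ("little endian", used by Intel processors) notation. Thus, both the direct load and the two-step indirect load below
would put the value "This" into the CPU register EAX, while the final line loads only the number 0100:
mov eax, [0100] ; Direct Addressing: eax = the four bytes stored at 0100
mov bx, [0299] ; Load the pointer stored at 0299 (bx = 0100)...
mov eax, [bx] ; ...Indirect Addressing: eax = the four bytes stored at the address in bx
mov eax, 0100 ; Immediate: place the value 0100 itself into eax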
Note that the value at 0100 is a string, and hence an array, made up of four one-byte characters taking up four bytes of memory. Each element of
the array could be accessed as an offset from the base of the array, or as a specific memory location, as in the following code (which uses the one-byte register al, since each element is a single byte):
mov al, [0100] ; al = "T"
mov al, [0101] ; al = "h"
mov al, [0100 + 2] ; al = "i"
mov al, [0102] ; al = "i"
mov bx, [0299] ; bx = address of "T" (the pointer stored at 0299)
add bx, 2 ; bx = address of "i"
mov al, [0103] ; al = "s"
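For readers coming from C, the memory picture above can be sketched as a plain byte array. This is only an illustration under assumptions: the function names are made up for the example, the array stands in for one segment, and the pointer at 0299 is read as a 16-bit offset only (its segment half is ignored).

```c
#include <stddef.h>
#include <stdint.h>

/* Read one byte, like "mov al, [offset]". */
uint8_t load_byte(const uint8_t *mem, size_t offset)
{
    return mem[offset];
}

/* Read two bytes as a little-endian word, like "mov bx, [offset]". */
uint16_t load_word_le(const uint8_t *mem, size_t offset)
{
    return (uint16_t)(mem[offset] | (mem[offset + 1] << 8));
}

/* Read four bytes as a little-endian dword, like "mov eax, [offset]". */
uint32_t load_dword_le(const uint8_t *mem, size_t offset)
{
    return (uint32_t)mem[offset]
         | ((uint32_t)mem[offset + 1] << 8)
         | ((uint32_t)mem[offset + 2] << 16)
         | ((uint32_t)mem[offset + 3] << 24);
}
```

With the bytes 54 68 69 73 stored at 0100, a four-byte load yields 73696854 in hexadecimal: the "T" ends up in the lowest byte of the register because the lowest address holds the least significant byte. Loading the word at 0299 gives 0100, and adding 2 to it gives the address of the "i".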
Needless to say, you won't be dealing with specific memory locations when writing assembly programs. Rather, you will use symbolic names
for variables and memory locations which the assembler will later turn into memory addresses. In order to use symbolic names in your code, you must
define bytes in memory and label portions of your code, as follows:
.DATA ;start of Data Segment (DS)
MyVariable db 'This' ;MyVariable = 'This' (MyVariable)
MyVariable2 db 'This' ;MyVariable2 = a separate copy of 'This' (initializers must be constants, not other variables)
ptrMyVariable dd offset MyVariable ;ptrMyVariable = address of MyVariable (&MyVariable)
.CODE ;Start of Code Segment (CS)
CodeStart: ;CodeStart = address of the next line of code
mov eax, dword ptr MyVariable ;eax contains 'This' (MyVariable)
mov eax, offset MyVariable ;eax contains the address of MyVariable (&MyVariable)
mov ebx, eax ;ebx contains the address of MyVariable (&MyVariable)
mov ebx, [eax] ;ebx contains 'This' (MyVariable)
mov eax, offset ptrMyVariable ;eax contains the address of ptrMyVariable (&ptrMyVariable)
mov eax, ptrMyVariable ;eax contains the address of MyVariable (&MyVariable)
mov eax, offset CodeStart ;eax contains the address of the code following label CodeStart
Note that when dealing with symbolic names, it is not always clear whether an address or a value
is being placed in the register; therefore brackets are used to enforce indirect addressing.
The keyword offset is often used--and with some assemblers, required--to clarify that
the address of a variable, rather than its contents, is being referred to.
In assembly language, there are inherently no IFs, ELSEs, or FORs. Instead you must make do with the basic compare and jump commands. The basic form of a conditional statement in assembly is
cmp op1, op2
jne loc1
loc2:
misc code
jmp CarryOn
loc1:
more misc code
CarryOn:
rest of program
This has the effect of saying "if (op1 == op2) then loc2() else loc1()".
There are a number of comparison and conditional jump operators, such as test, jnz,
jbe, etc., but these all boil down to a simple compare-and-jump-if-equal or compare-and-jump-if-not-equal
condition, with points for style added from there. There is also a loop opcode which will decrement the value in
ecx and jump to the given label until ecx==0:
xor eax, eax ;set eax == 0
mov ecx, 100 ;loop 100 times
loop_1:
add eax, 1
loop loop_1
This will loop 100 times and exit with eax==100. With these conditional flow statements you should be able to
emulate standard high-level flow control statements, albeit somewhat crudely. As examples, a "switch...case" statement would be emulated
with multiple "cmp ax, valueX...je caseX"; "for 0 to x" would be emulated with "mov ecx, x...loop_1: misc code...loop loop_1", and the
standard "if(!x)..." would be emulated with a "mov eax, x; cmp eax, 0; jne code-label", where code-label marks the first instruction after the body of the if.
That should be enough of the philosophy of assembly language. There is a lot more to it, namely a number of different CPU registers and opcodes, various types of memory locations, and of course the extensive DOS and BIOS interrupt services. Finally, there is the art of structuring your source code.
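Before moving on, the compare-and-jump and loop patterns described above can be sketched in C with goto, which is about as close as C gets to a jump instruction. The function names and return codes are invented for the illustration; only the control flow mirrors the assembly.

```c
/* "if (op1 == op2) loc2 else loc1", written the way the cmp/jne pair does it.
   Returns 2 when the loc2 branch ran and 1 when the loc1 branch ran. */
int compare_and_branch(int op1, int op2)
{
    int taken;
    if (op1 != op2) goto loc1;   /* cmp op1, op2 / jne loc1 */
    taken = 2;                   /* loc2: misc code */
    goto carry_on;               /* jmp CarryOn */
loc1:
    taken = 1;                   /* more misc code */
carry_on:
    return taken;                /* rest of program */
}

/* xor eax, eax / mov ecx, count / loop_1: add eax, 1 / loop loop_1
   Call with count >= 1: like the real opcode, a count of 0 would wrap around
   and run the body a very large number of times. */
unsigned count_with_loop(unsigned count)
{
    unsigned eax = 0;
    unsigned ecx = count;
loop_1:
    eax += 1;
    if (--ecx != 0) goto loop_1; /* loop: decrement ecx, jump while non-zero */
    return eax;
}
```

Note that the body of a loop-style construct always runs at least once, because the counter is tested after the body rather than before it.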
In C, a basic "hello world" program would look as follows:
//----------------------------- Definitions Section
#include <stdio.h> //header with the declaration of the printf() function
//---------------------------- Data Section
char strHello[] = "Hello, eh?\n";
//----------------------------- Code Section
int main(void)
{
printf("%s", strHello);
return 0;
}
Very simple, very easy. In assembly, things look a bit different:
;----------------------------- Definitions Section
.model small ;small memory model: builds an EXE (use "tiny" for a COM file)
.stack 100h ;reserve a small stack for the EXE
;----------------------------- Data Section
.DATA
strHello db 'Hello, eh?',0dh,0ah,'$' ;define string, CR/LF, mark end of string with '$'
;----------------------------- Code Section
.CODE
start:
mov ax, @data ;Point DS at the data segment
mov ds, ax
mov dx, offset strHello ;Put address of string in DX
mov ah, 09h ;Put function# of Int21 service "Display String" in ah
int 21h ;Call Interrupt Service 21
exit:
mov ah, 4ch ;Put function# of Int21 service "Terminate to DOS" in ah
int 21h
END start ;end of file; execution begins at start
The assembly language program is a little less clear. First, in the definitions section, options must be
configured for the memory model of the program (tiny, small, compact, medium, large, huge, etc) and other considerations such as
target CPU. In the data section, strings must be declared according to how they will be used: often a dollar sign ('$') or
a null terminator ('\0') will be appended to the string to mark its end. The code section must have a start: label
to mark the program entry point and an END statement to mark the end of the file. This example uses two functions in
the DOS Interrupt 21h service: Function 09h (Display String) and Function 4Ch (Terminate to DOS). Note how the function number
must be loaded into ah, the high byte of register AX (bits 8-15 of EAX), before calling the interrupt. Some interrupts, like higher-level
procedures, require that parameters be passed in specific registers: in this example, Int_21h:Func_09h requires that the address
of the string be passed in dx, the lower two bytes of the edx register.
A more complex example, one that tests to see if the DOS version is 3.0 or above:
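.model small
.386 ;allow 32-bit registers such as eax (placed after .model so segments stay 16-bit)
.stack 100h
.DATA
strLowerVersion db 'Error! DOS version is lower than 3.0!',0dh,0ah,'$'
strHigherVersion db 'DOS version is 3.0 or higher.',0dh,0ah,'$'
.CODE
start:
mov ax, @data ;point DS at the data segment
mov ds, ax
xor eax, eax ;clear eax
mov ah, 30h ;GetDosVersion function
int 21h
cmp al, 3 ;al = major version
jae is_3 ;jump if major version >= 3
mov dx, offset strLowerVersion
jmp exit
is_3:
mov dx, offset strHigherVersion
exit:
mov ah, 09h ;DisplayString function
int 21h
mov ah, 4ch ;TerminateToDOS function
int 21h
END start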
The first real work this program does is to clear eax with the xor instruction--a shortcut for the more
conventional mov eax, 0. In the xor operation, each bit is compared with the bit in
the same position in the other operand, and the first bit is replaced with "0" if the bits are
equal, and "1" if the bits are unequal; thus when a register is xor'ed with itself all bits are equal, and all of the bits in the
register are set to 0.
Next comes a call to Int 21h Function 30h, which returns the DOS major version (the "3" in "3.11") in al and minor version (the "11" in "3.11") in ah. The cmp instruction compares al with "3", and is followed by a conditional jump, Jae [Jump-if-Above-or-Equal], which branches to code label "is_3". (The conditional jumps, or Jcc, are a "j" followed by a mnemonic flag ID, such as Jz [Jump-if-Zero], Jnz [Jump-if-Not-Zero], Je [Jump-if-Equal], Jne [Jump-if-Not-Equal], Ja [Jump-if-Above], Jb [Jump-if-Below], Jc [Jump-if-Carry], etc.) The two branches of the program ("is_3" and the default fall-through path) simply load pointers to different strings into dx, after which the DOS Display String function is called (Int 21h, Funct 09h). At the end, of course, the program is terminated so that the user can regain control of the machine (this can by all means be left out if the program is to be run on someone else's machine...).
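How cmp and the Jcc family cooperate can be sketched in C. This is a simplified model under assumptions: it tracks only the Zero and Carry flags (the two that the unsigned jumps look at), and every name in it is invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* The two flags that the unsigned conditional jumps examine. */
struct flags {
    bool zero;   /* ZF: set when the operands are equal */
    bool carry;  /* CF: set when a borrow was needed (a < b, unsigned) */
};

/* Model of "cmp a, b" on 8-bit operands: subtract, keep only the flags. */
struct flags cmp8(uint8_t a, uint8_t b)
{
    struct flags f;
    f.zero = (uint8_t)(a - b) == 0;
    f.carry = a < b;
    return f;
}

/* Each function answers "would this jump be taken?" */
bool je(struct flags f)  { return f.zero; }
bool jb(struct flags f)  { return f.carry; }
bool jae(struct flags f) { return !f.carry; }
bool ja(struct flags f)  { return !f.carry && !f.zero; }

/* Picks the message the version-check program would print. */
const char *version_message(uint8_t dos_major)
{
    return jae(cmp8(dos_major, 3)) ? "DOS version is 3.0 or higher."
                                   : "Error! DOS version is lower than 3.0!";
}
```

The point of the model is that cmp itself decides nothing; it only records the outcome of a subtraction in the flags, and the jump that follows chooses which flags to consult.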
All versions of DOS and Windows come with debug.exe--an extraordinary 20K file that allows you to write code directly into memory, edit disks and partition tables, debug software, and assemble COM programs. Debug has absolutely no user interface, like adb (or, to speak out of context, vi ;) in the Unix world, yet it has a very simple set of mnemonic commands:
?   HELP       display debug commands
a   ASSEMBLE   assemble 8086/87/88 mnemonics to binary
c   COMPARE    compare two portions of memory
d   DUMP       display the contents of an area of memory
e   ENTER      enter data in memory at specified address
f   FILL       fill range of memory with specified values
g   GO         run executable
h   HEX        perform hex math
i   INPUT      input 1 byte from specified port
l   LOAD       load contents of file or disk sector
m   MOVE       copy contents of a block of memory
n   NAME       specify file to L or W
o   OUTPUT     output one byte to port
p   PROCEED    execute a loop, reps, int or subroutine
q   QUIT       exit debug
r   REGISTER   display/alter registers
s   SEARCH     search memory for pattern of bytes
t   TRACE      execute one instruction, then display registers/flags/cs:IP
u   UNASSEMBLE disassemble binary to 8086/87/88 mnemonics
w   WRITE      write file to disk
xa  XALLOC     allocate expanded memory
xd  XDEALLOC   deallocate expanded memory
xm  XMAP       map expanded memory pages
xs  XSTATUS    display status of expanded memory
"Live" Coding
When running debug, you can assemble code directly into memory using the a command. It is best to start assembling at offset 100h, though by no means necessary; however doing so will get you into the habit by the time COM file coding comes around, a few paragraphs down.
Start debug and type a100 at the "-" prompt, then enter the following:
mov dl, 48
mov ah, 2
int 21
mov dl, 65
int 21
mov dl, 79
int 21
mov dl, 21
int 21
mov dl, 0d
int 21
mov dl, 0a
int 21
int 20
Be sure to hit enter at the end to complete your code entry. Note that Int 21h, Func 02h is
the Display Character function, with the character to be displayed loaded into dl. Therefore, 02h
(you may note that debug assumes hexadecimal numbers) is loaded into ah and kept there while
successive values are loaded into dl--the hex values for certain ascii characters which, when you run the
program by typing "g" at the "-" prompt, will spell
Hey!
A long way to go for nothing, to be sure. The Int 20h at the end is an older Terminate To DOS function which
is useful in "debug programming" because it requires no parameters (ergo one less line to type).
It is considered easier to define your data at startup in strings, rather than outputting it one character at a time.
To do so, you must initiate your debug code with a jmp past the data area, then you must refer to the
string explicitly (not by a symbolic name) when preparing the Display String service. The following code,
which you can enter as before by typing "a100" at the "-" prompt and running with the "g" command, should become
clearer as you type it (i.e., watch the address of the db statement):
jmp 0114
db 'Hey! ASM rocks!',0d,0a,'$'
mov dx, 102
mov ah, 9
int 21
int 20
Once again, using 0D and 0A as a CR/LF to avoid screwing up the console display. 102 is the address of the
string (the jmp statement takes two bytes), and 114 is the start of the code. At this point one might
be tempted to ask, how do you know the offset of the code to jump to when defining the data? Trial and error, naturally;
of course, if one wanted to waste a few meager bytes of RAM, one could start with a jmp 200, enter a few strings which
will hopefully be less than 100h (unless one is coding a VB app through debug ;), finish the assembling with a carriage return, and
then re-enter the assembler mode with the a200 (assemble at 200h) command.
But enough bare-metal programming; these are not the days of mainframes after all. We are a civilized people, and we have a compiler! Known affectionately as...debug.
Using debug as a compiler
Debug can be used more or less as a standard assembler by preparing .asm files
beforehand and invoking them through standard dos redirection, i.e.
debug.exe < hello.asm
A text file can be prepared for assembly as follows:
-----hello.asm-----
a100
jmp 0114
db 'Hey! ASM rocks!',0d,0a,'$'
mov dx, 102
mov ah, 9
int 21
int 20

rcx
1d
n hello.com
w
q
This is essentially a script of commands you would enter in debug; it can be called from
a batch file with the line debug < hello.asm. The first
line stands for "Assemble at 100h", or "start coding a com file". At this point
each line of asm code you enter will be assembled into the final file, starting
with address 100h. Note that the string ends with a 0D + 0A CR/LF combo, as well as the requisite '$';
the 102 loaded into dx is the address of the first byte of the string, following which there is a call to
Function 09h of Interrupt 21h (Display '$'-terminated string) and a call to Interrupt 20h (Terminate) to
return control to DOS.
The blank line at the end of the code is important as it signifies that the code input is over. Next you must edit the register CX (using the "rcx" command) to reflect the byte count of the file; the "rcx" goes on one line and the number of bytes to write--1D--goes on another. The file must be named of course, which is taken care of in the "n hello.com" line, then written with the "w" line (remember, each line in a dos text file has a CR/LF, which is the equivalent of pressing ENTER when writing scripts like this). Finally, debug is exited with the "q" command followed by a blank line (the last is very important, for without it debug will lock up and receive no further input, which as you remember is coming from a file).
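The three magic numbers in the script (102, 114, and 1D) are plain byte counting, which can be sketched in C. The instruction sizes are assumptions taken from the encodings debug produces for this particular program (a two-byte short jmp, a three-byte mov dx, and two bytes each for mov ah, int 21, and int 20); the function and macro names are made up for the example.

```c
#include <stddef.h>
#include <string.h>

#define COM_LOAD_OFFSET 0x100u /* a100: COM code starts at offset 100h */
#define SHORT_JMP_BYTES 2u     /* the leading jmp, encoded as EB xx */
#define CODE_BYTES      9u     /* mov dx (3) + mov ah (2) + int 21 (2) + int 20 (2) */

/* Bytes emitted by: db 'text',0d,0a,'$' */
size_t db_string_bytes(const char *text)
{
    return strlen(text) + 3;   /* the text plus CR, LF and '$' */
}

/* Address of the first byte of the string (the value loaded into dx). */
unsigned string_offset(void)
{
    return COM_LOAD_OFFSET + SHORT_JMP_BYTES;
}

/* Address of the first instruction after the string (the jmp target). */
unsigned code_offset(const char *text)
{
    return string_offset() + (unsigned)db_string_bytes(text);
}

/* Number of bytes to write, i.e. the value entered after "rcx". */
unsigned file_length(const char *text)
{
    return SHORT_JMP_BYTES + (unsigned)db_string_bytes(text) + CODE_BYTES;
}
```

For the string 'Hey! ASM rocks!' (15 characters, 18 bytes with CR/LF and '$'), this gives 102h for the string, 114h for the code, and a file length of 1Dh, matching the script.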
Note that this method is a little tricky as you do not know the address of each
line while you are writing the code. Thus, the starting JMP statement is usually
a guess (e.g. JMP 1FF) that is fixed later; ditto for any jumps or data references in the
code. A good practice would be to write the asm file as follows:
hello.asm
a100
jmp FFFF
...
mov dx, FFFF
...
int 20
...
Then, after running the .asm file through debug, unassemble the resulting com
file in debug and fix the jumps:
debug hello.com
-u
0C93:0100 EB12 JMP FFFF
0C93:0102 48 DEC AX
0C93:0103 65 DB 65
0C93:0104 7921 JNS 0127
0C93:0106 204153 AND [BX+DI+53],AL
0C93:0109 4D DEC BP
0C93:010A 20726F AND [BP+SI+6F],DH
0C93:010D 63 DB 63
0C93:010E 6B DB 6B
0C93:010F 7321 JNB 0132
0C93:0111 0D0A24 OR AX,240A
0C93:0114 BA0201 MOV DX,FFFF
0C93:0117 B409 MOV AH,09
0C93:0119 CD21 INT 21
0C93:011B CD20 INT 20
0C93:011D 46 INC SI
0C93:011E EBBB JMP 00DB
-a100
0C93:0100 JMP 114
-a114
0C93:0114 mov DX,102
-w
-q
Notice how the data between 102 and 114 is unassembled as code; this is
because debug is a "dumb" (i.e., not following the flow of execution)
disassembler. However, with practice--and good habits like placing all
data at the start of your code, therefore enabling you to simply count the number of
bytes (or characters) following the DB in order to determine where the first JMP
should point to--you will be able to interpret such crudely disassembled code
with ease. Or, to go the easier route, pad each of your data declarations with two or three nop's,
which should stand out enough to make differentiating the strings a trivial problem.
Alas, the days of relying on debug are over. Now the closest you can get is NASM (just kidding ;), as many of the commercial assemblers are trying to make things easier for the programmer by allowing such luxuries as multiple segments, symbolic names, decimal integers, and code labels (all of which, thanks to this past section, you will now be able to really appreciate).
It is often useful to have a base program to work from...something more useful than the banal "hello"-style programs that mimic no known functionality in the real world. What follows are two templates, one for a .COM and one for an .EXE file, which can be used as building blocks upon which to build your own assembly-language programs. Each template, if compiled as-is, will produce a program which will check the command-line parameters for "-h" or "-v" (case insensitive); on "h" it will display a help screen, on "v" it will display the DOS version currently running, and on neither it will give an error message and the help screen. Quick, easy, and ready to be mutated into your own command-line option utilities...
DOS COM File Template
.286
.model tiny ; COM file: use EXE2BIN or link with TLINK /t
; Set up some PSP definitions for later use
NUM_ARGS equ 80h ; 80h = length in bytes of the Command Line text
ARGS equ 81h ; 81-FFh = Arguments
.CODE
org 100h ; load image at 100h
start:
jmp CodeStart ; Jump over data declarations
; Replace from here with your own data
szNoArgs db 'Incorrect number of arguments',0Dh,0Ah, '$'
szHelp db 'Command-Line Arguments:',0Dh,0Ah
db '-h : Display this Help screen',0Dh,0Ah
db '-v : Display DOS Version',0Dh,0Ah,'$'
szDOSVer db 'DOS Version X.X ',0Dh,0Ah,'$'
;===========================================
;start of program, equivalent to main() in C
CodeStart:
; Replace from here with your own command-line parser
mov si, ARGS + 2 ;Get third byte of CmdLine into al
lodsb
cmp al, 48h ;is it "H" ?
je callHelp
cmp al, 68h ;is it "h" ?
je callHelp
cmp al, 56h ;is it "V" ?
je callGetDosVer
cmp al, 76h ;is it "v" ?
je callGetDosVer
; Handler routines...Replace these with your own handlers
callNoArgs:
mov dx, offset szNoArgs
mov ah, 09h
int 21h ;note follow-through to display Help
callHelp:
mov dx, offset szHelp
mov ah, 09h
int 21h
jmp exit
callGetDosVer:
mov ah, 33h ;Get DOS Version Number
mov al, 06h ;Int 21h, Func 33-06
int 21h
add bl, 30h ;convert hex to ASCII decimal number
mov szDOSVer + 12, bl
add bh, 30h ;convert hex to ASCII decimal number
cmp bh, 39h ;is Minor Version # less than 10?
jle OutputDosVer ;yes, keep it for output
mov bl, bh
sub bl, 0Ah ;Find "ones" column by subtracting 10 from version byte
mov szDOSVer + 15, bl
mov bh, 31h ;set "tens" column to "1"
OutputDosVer:
mov szDOSVer + 14, bh
mov dx, offset szDOSVer
mov ah, 09 ;OutPut String
int 21h ;Int21h Func 09
exit:
mov ah,4ch ;Terminate to DOS
int 21h ;Int 21h Func 4C
;End of main() routine
end start
Notes: The file begins with ".286" to specify the minimum processor needed, followed by a ".model"
directive with a "tiny" specification (tiny == COM file). A few equates--like #define's in C--
provide handy names for locations in the Program Segment Prefix of the COM file. After that, the ".CODE" directive marks
the start of the executable code, followed by an "org 100h" directive (like a100 in debug, this places the code at offset
100h, just after the PSP, which is where DOS loads a COM image) and a "start:" label to mark the program entry point.
The program immediately jumps over the data declarations and starts loading the command-line parameters
with the mov si, ARGS + 2 instruction. This is merely a setup for the lodsb (LOaD String Byte) instruction that
follows; lodsb loads the byte at ds:si into al and then increments si (assuming the direction flag is clear). The byte loaded is the third byte of the command-line arguments (the first is a space,
the second is a "-"), and it is compared with the hexadecimal values of the ASCII characters h, H, v, and V (now you know what those
weird ASCII tables in the back of your old DOS manuals were for, eh?).
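The command-line check can be sketched in C by treating the PSP as a 256-byte array. This is a sketch, not a translation of the template: unlike the template it also checks the length byte at 80h and the leading space and dash, and it folds case with tolower() instead of comparing against both letters. The function and macro names are invented for the example.

```c
#include <ctype.h>
#include <stdint.h>

#define PSP_TAIL_LENGTH 0x80  /* byte count of the command-line text */
#define PSP_TAIL_TEXT   0x81  /* first byte of the command-line text */

/* Returns the lowercased option letter, or 0 when the text is not " -x". */
int option_letter(const uint8_t *psp)
{
    if (psp[PSP_TAIL_LENGTH] < 3)
        return 0;                               /* too short to hold " -x" */
    if (psp[PSP_TAIL_TEXT] != ' ' || psp[PSP_TAIL_TEXT + 1] != '-')
        return 0;                               /* not in the expected form */
    return tolower(psp[PSP_TAIL_TEXT + 2]);     /* third byte, as with ARGS + 2 */
}
```

A caller would then compare the result against 'h' and 'v' only, which halves the number of compares the template needs.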
The rest of the program is fairly straightforward, merely a few Display String calls, except for the GetDosVersion area. The problem here is converting the hexadecimal major and minor versions ( version [0-f].[0-f] ) into decimal for output. The major version is no problem; DOS is only on 7.0 and so adding 30h to the major version number will bump it nicely into the ASCII decimal-digits areas. The same goes for the minor version...until v.10-15 comes up. This is bypassed by adding a "tens" digit which is always set to 1 (to make things a little simpler; this was tested on a Win95 machine...DOS 7.10 ;), and subtracting 10 (0Ah) from the "ones" digit to bring it back in the realm of the 30's (the flaw here is that only the first nibble, or half-byte, of the minor version number is being treated...but it is sufficient for the exercise). The converted values are then written directly to their place in the string using offsets which represent the number of bytes from the start of the string that the value is to be written at.
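For comparison, here is a C sketch of the same digit patching done with division instead of the subtract-10 shortcut. It is a different technique from the template's, not a copy of it: it always writes two minor-version digits (so 7.10, 6.22, 5.00), and it assumes a major version below 10 and a minor version below 100. The offsets 12, 14, and 15 are the ones the template uses; the function names are hypothetical.

```c
#include <stdint.h>

/* Split a value 0..99 into two ASCII digits. */
void two_digits(uint8_t value, char *tens, char *ones)
{
    *tens = (char)('0' + value / 10);   /* adding '0' (30h) reaches the ASCII digits */
    *ones = (char)('0' + value % 10);
}

/* Patch the version digits into a writable copy of 'DOS Version X.X '. */
void patch_version(char *text, uint8_t major, uint8_t minor)
{
    char tens, ones;
    two_digits(minor, &tens, &ones);
    text[12] = (char)('0' + major);     /* first X */
    text[14] = tens;                    /* second X */
    text[15] = ones;                    /* the space after it */
}
```

In assembly the division would be a div by 10 (or the aam instruction), with the quotient supplying the "tens" digit and the remainder the "ones" digit.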
DOS EXE File Template
.286
.model small
.stack 200h
PSP segment at 00h ; Define PSP as a segment for easy access
org 2ch ; 2Ch = Environment Field (not used)
ENVIRON_PTR dw ?
org 80h ; 80h = Command Line Arguments Field
NUM_ARGS db ? ; Byte 1 = length in bytes of the Command Line text
ARGS db 127 dup(?) ; Bytes 2-128 = Command Line Args
PSP ends
.DATA ; Replace from here with your own data
szNoArgs db 'Incorrect number of arguments',0Dh,0Ah, '$'
szHelp db 'Command-Line Arguments:',0Dh,0Ah
db '-h : Display this Help screen',0Dh,0Ah
db '-v : Display DOS Version',0Dh,0Ah,'$'
szDOSVer db 'DOS Version X.X ',0Dh,0Ah,'$'
.CODE
;===========================================
;start of program, equivalent to main() in C
start:
assume es:PSP
mov ax,@data ;set segment registers
mov ds, ax
; Replace from here with your own command-line parser
mov al, es:ARGS + 2 ;Get third byte of CmdLine
cmp al, 48h ;is it "H" ?
je callHelp
cmp al, 68h ;is it "h" ?
je callHelp
cmp al, 56h ;is it "V" ?
je callGetDosVer
cmp al, 76h ;is it "v" ?
je callGetDosVer
callNoArgs:
call NoArgs ;note follow-through to display Help
callHelp:
call Help
jmp exit
callGetDosVer:
call GetDosVer
exit:
mov ah,4ch ;Terminate to DOS
int 21h ;Int 21h Func 4C
;===========================================
;End of main() routine
; Procedures: Replace these with your own routines
;-------------------------------------------
;GetDosVer: Gets DOS version, prepares it for output, displays
; output to screen and returns
GetDosVer proc
mov ah, 33h ;Get DOS Version Number
mov al, 06h ;Int 21h, Func 33-06
int 21h
add bl, 30h ;convert hex to ASCII decimal number
mov szDOSVer + 12, bl
add bh, 30h ;convert hex to ASCII decimal number
cmp bh, 39h ;is Minor Version # less than 10?
jle OutputDosVer ;yes, keep it for output
mov bl, bh
sub bl, 0Ah ;Find "ones" column by subtracting 10 from version byte
mov szDOSVer + 15, bl
mov bh, 31h ;set "tens" column to "1"
OutputDosVer:
mov szDOSVer + 14, bh
mov dx, offset szDOSVer
mov ah, 09 ;OutPut String
int 21h ;Int21h Func 09
ret
GetDosVer endp
;-------------------------------------------
;Help: Prints the message in szHelp and returns
Help proc
mov dx, offset szHelp
mov ah, 09h
int 21h
ret
Help endp
;-------------------------------------------
;NoArgs : Prints the error message in szNoArgs and returns
NoArgs proc
assume DS: DGROUP
mov dx, offset szNoArgs
mov ah, 09h
int 21h
ret
NoArgs endp
end start
Notes: EXE files allow a bit more flexibility. This example has 3 segments: the PSP (defined for convenience; it will not
be present in the final executable), the data segment, and the code segment. The PSP segment is organized to allow certain parameters to be
readily accessible; the code segment (beginning at the "start:" label) demonstrates the advantage of this definition with the line
mov al, es:ARGS + 2. This is followed by the asm equivalent of a switch...case statement (very inelegant, to be sure) to
handle the command-line options. This template also demonstrates the use of procedures to break up the code and make it more modular (since
procedures can be stored in separate .asm files and included into the main asm file using the include statement): the procedures contain code
very similar to that in the COM file template.
The first thing you will need to become familiar with in order to learn Assembly Language (ASM) is the idea of a register. A register is a specific piece of memory located within the CPU itself (it is not counted as part of your RAM), usually from 8 to 32 bits in size, that is used to store information for CPU processing. Some registers can only hold certain information--such as the memory address of the line of code currently being executed--while other registers are "scratch-pad" registers that can be used for the dynamic storage or manipulation of data.
The General Purpose registers are EAX, EBX, ECX, and EDX. These are 32-bit registers that have evolved from 8-bit and 16-bit registers, so their lowest 16 bits (0-15) can be accessed as the AX, BX, CX, and DX registers, each of which can further be divided into 8-bit H (high) and L (low) registers, such that AX can be divided into AH and AL. Remember that 8 bits is one byte, so that AH and AL are each 1 byte, AX is two bytes (or 1 word), and EAX is four bytes (2 words or 1 double word, or dword). These registers are used for manipulating data, such as variable compares (MOV EAX, 003F; MOV EBX, [EBP-04]; CMP EAX, EBX) or mathematical operations (MOV AX,2; MOV BX,4; ADD AX,BX), as well as for moving a value from one memory location to another, since the value must pass through a register on the way. Note that CX is often used as a Count register; in fact the LOOP instruction in assembly will decrement CX (or ECX in 32-bit code) with each "looping" and will end the loop when CX=0. Also, AH is used to determine what service of an interrupt function (discussed below) is to be used when the INT (interrupt) call is generated.
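The way AX, AH, and AL overlay EAX can be sketched in C with shifts and masks. The function names are made up for the illustration; the bit positions are the ones given above (AL is bits 0-7, AH is bits 8-15, AX is bits 0-15).

```c
#include <stdint.h>

/* AX: the low 16 bits of EAX. */
uint16_t reg_ax(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }

/* AH: bits 8-15 of EAX (the high byte of AX). */
uint8_t reg_ah(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }

/* AL: bits 0-7 of EAX (the low byte of AX). */
uint8_t reg_al(uint32_t eax) { return (uint8_t)(eax & 0xFFu); }

/* "mov ah, value": replaces bits 8-15 and leaves every other bit alone. */
uint32_t set_ah(uint32_t eax, uint8_t value)
{
    return (eax & 0xFFFF00FFu) | ((uint32_t)value << 8);
}
```

This is why loading a function number into AH before an INT call leaves whatever was already in AL untouched: the two names refer to different bytes of the same register.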
The Segment Registers are CS, DS, ES, FS, GS, and SS. These registers are all 16 bits (0-15) and contain the first half of a segment:offset memory address (in real mode the CPU itself turns segment:offset into a physical address; in protected mode the translation goes through tables maintained by the OS kernel, for the OS will move code and data all over hell and back in the course of "paging" and "memory management"; thus you will always deal with segment:offset addresses when accessing memory, as physical memory locations are in a constant state of flux that is only handled or understood by the OS). CS is the Code Segment, or the segment containing the executable code of the program currently being executed (note that "currently being executed" is determined by what program has its code in the CS:IP of the CPU; this can be changed and managed by the OS, though it still seems to be an example of the serpent biting its own tail...); DS is the Data Segment of the program currently being executed (i.e., string tables, constants, etc); SS is the Stack Segment of the program currently being executed (the stack being the dynamic data area of the program, where variables and "scratch" information are kept); and ES, FS and GS are all extra data segments that may or may not be used, depending on the memory model of the currently executing program.
The Offset Registers are EIP, ESP, EBP, ESI, and EDI. These are 32-bit registers whose lower halves can be accessed as the 16-bit registers IP, SP, BP, SI, and DI. EIP is the Instruction Pointer, and contains the offset of the line of code to be executed next (such that CS:IP forms the complete address); ESP is the Stack Pointer, and contains the address of the "top" of the stack, or where the next item pushed onto the stack will go to (with the complete address being SS:SP); EBP is the Base Pointer, and contains a memory address in the stack from which data in the stack can be referenced (to quote Fravia, when examining a program's code, "function parameters have positive offsets from BP [eg, BP+04], local variables have negative offsets from BP [eg, BP-04]"; the complete address referenced by this register would be SS:BP); ESI is the Source Index, and contains the source of data in a "block move" (the complete address being DS:SI); and EDI is the Destination Index, and contains the destination of data in a "block move" (the complete address being ES:DI).
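For real mode (the mode DOS programs run in), combining a segment with an offset is simple arithmetic that can be sketched in C: the segment is multiplied by 16 and the offset is added. This formula is background knowledge about the x86 rather than something stated above, it does not apply to protected mode, and the function names are invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Real-mode address calculation: segment * 16 + offset. */
uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Two different segment:offset pairs can name the same byte of memory. */
bool same_location(uint16_t seg_a, uint16_t off_a,
                   uint16_t seg_b, uint16_t off_b)
{
    return real_mode_address(seg_a, off_a) == real_mode_address(seg_b, off_b);
}
```

So the address 0110:0100 used in the overview works out to 01200 in hexadecimal, and 0120:0000 refers to exactly the same byte.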
The Control Registers are CR0, CR1, CR2, and CR3 (CR1 is reserved). These are 32-bit registers that are responsible for things like processor mode (real or protected or V86), paging, FPU emulation, etc. They are accessible only to Ring-0 (kernel) programs; if you attempt to write to a CR from a Ring-3 (application) program you will cause a GPF.
The Debug Registers are DR0, DR1, DR2, DR3, DR4, DR5, DR6, and DR7; they are 32 bits in size (bits 0-31). DR0 through DR3 contain breakpoint addresses, while the rest set what happens when a breakpoint is activated (i.e., they determine how a debug exception is generated). Apparently you can access these from DOS or using the DOS Protected-Mode Interface in Windows, if you are intent on writing a debugger.
The Test Registers are TR6 and TR7. These are 32-bit registers that are used only to test the memory-paging system in an OS. If you are writing your own operating system, these will be handy for determining whether or not your memory management is up to snuff.
The protected mode Memory Management Registers are TR for the TSS (Task State Segment), LDTR for the LDT (Local Descriptor Table), IDTR for the IDT (Interrupt Descriptor Table), and GDTR for the GDT (Global Descriptor Table). IDTR and GDTR are 48-bit registers, with 16 bits setting the IDT or GDT limit and 32 bits setting the IDT or GDT base address. TR and LDTR each hold a 16-bit TSS or LDT selector that is visible to software, along with a hidden part, loaded automatically by the CPU from the descriptor the selector names, that caches the TSS or LDT base address and limit. These registers are used in conjunction with CR0 by the kernel to manage tasks, interrupts, and memory allocation.
The Flag Register is EFLAGS. This is a 32-bit register whose lower 16 bits (0-15) contain the Carry, Parity, Auxiliary, Zero, Sign, Trap, Interrupt Enable, Direction, Overflow, I/O Privilege Level, and Nested Task flags; bits 16-31 contain the Resume and Virtual 8086 mode flags, as well as a number of reserved flags. Each flag bit can be either on ("1") or off ("0"), indicating that the flag is set or is not set. If a comparison finds its two operands equal (so that subtracting one from the other gives 0), for example, the Zero flag would be set ("1"), and a subsequent JZ (Jump if Zero) instruction would take the jump.
Note that these are the registers common to the i386 processor and above; every subsequent processor will have additional specialized registers that allow expanded processor functions. Additional information about registers, and about the PC in general, can be obtained from Addison Wesley's The Indispensable PC Hardware Book, just released in its 3rd edition in mid-1997 and worth every penny of its $42.95 US price.
Soft-Ice Interlude: When you break into Soft-Ice, the state of the registers for the current process (identified at the bottom-left of the screen) will be displayed at the top of the screen--if it is missing, you can toggle the register display on by typing "WR".
The general-purpose EAX, EBX, ECX, and EDX registers will contain data or pointers to data that is in use by the current process--for example, during a login sequence EAX might point to a memory location that contains your user name (which you would be able to view by typing "d eax"), while ECX might contain a character count which increases as the program parses your username.
The segment registers CS, DS, ES, FS, GS, and SS will contain the valid segments for the current process. The CS register will contain the Code Segment (location in RAM of the program's executable code), the DS register will contain the Data Segment (location in RAM of the program's data), and the SS register will contain the Stack Segment (location in RAM of the program's stack space). The ES, FS, and GS registers contain "extra segments" that the program may haver reserved for data.
The offset registers IP, SP, BP, SI, and DI complement the segment registers to provide complete memory addresses (segment:offset). IP, or instruction pointer, will contain the offset of the line of code about to be executed by the CPU, such that CS:IP is the complete memory address for that line of code. SP, or stack pointer, contains the offset of the top of the program's stack, so that the piece of data most recently pushed onto the stack is stored at SS:SP. BP, or base pointer, is a pointer used to reference data placed on the stack. SI (source index) and DI (destination index) will contain the offsets of string data that is being manipulated (moved, compared, etc).
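How a segment:offset pair becomes a single address depends on the processor mode. The sketch below assumes real mode only, where the segment value is multiplied by 16 and added to the offset; in protected mode the segment register instead holds a selector that is looked up in a descriptor table, which this sketch does not model. The function name is illustrative.

```python
def real_mode_address(segment, offset):
    """Physical address of segment:offset in real mode (assumption: 16-bit values)."""
    return (segment << 4) + offset

# The 0110:0100 address used earlier in this tutorial.
print(hex(real_mode_address(0x0110, 0x0100)))
# Different segment:offset pairs can name the same byte.
print(real_mode_address(0x0120, 0x0000) == real_mode_address(0x0110, 0x0100))
```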
The Flags register will appear as a string of letters reading o d i s z a p c; these letters represent the current state of the Overflow, Direction, Interrupt, Sign, Zero, Auxiliary, Parity, and Carry flags. A capital letter indicates that flag has been toggled "on", while a lowercase letter indicates that the flag has been toggled "off".
"Memory" is perhaps the single most nebulous term used in the PC industry. Commonly it is used to refer simply to a system's RAM, but when you get into the world of assembly language, everything changes. Suddenly you have to worry about real-mode vs. protected mode memory, physical vs. logical memory, flat vs segmented memory address space, global vs. local memory allocation, and memory space vs I/O space. For this reason, it would be best to define a few terms that may come up later.
Executable File Memory Model
How does memory come into play in the context of an application? Very basic executable files are .COM files with a single segment for data, code, and stack, that is mapped directly into system memory for execution. More complicated, and less obsolete, executable files (.EXE files) are made of multiple segments in memory--some for code, some for data, some for stack--and contain internal structures in addition to code and application data that allow the system to map the file correctly across multiple segments (see the applicable file format documentation for more info).
An executable file is mapped into memory at its simplest with a text segment, a data segment, a stack segment, and a heap. The text segment contains the executable code for the application (note that this may span multiple segments). The data segment contains initialized data, or variables that are explicitly assigned a value in the source code (e.g., int serial_number = 159900;), and uninitialized data (sometimes called BSS data), or variables that are allocated space in the source code but are not explicitly assigned a value (e.g., char username[20];). The stack segment contains local variables (e.g., x and key_check in checkkey(serial_number, username){ int x; DWORD key_check; ... }), parameters passed to functions, and return addresses pushed for call returns. Finally, the heap is an area of memory from which run-time allocation is made (e.g., malloc( 1024 );).
The Program "in vivo"
The x86 CPU is hardcoded with certain instructions to manipulate and compare data. What follows is a brief overview of the most common instructions; this is not a complete or authoritative reference, but more of a quick reference guide for beginners. First, a few brief definitions to save clarification later:
immediate value: an integer such as 09h (e.g. mov edx, 09h)
memory value: a value stored at a memory location, such as DS:Variable1 (e.g. mov edx, DS:Variable1)
register value: a value stored in a register, such as EAX (e.g. mov edx, eax)
relative offset: a value to be calculated from the end of the current instruction, or a code label (e.g. jnz Exit or jmp 015h)
Program Flow
CALL dest Call Procedure: Turn execution over to the procedure specified in dest. When calling near procedures, CALL will push the address of the next instruction onto the stack as a return address; when calling far procedures, CALL will push CS followed by the address of the next instruction as a return address. The dest value can be a relative offset, a register value, or a memory value.
INT dest Generates a call to an interrupt handler. The dest value must be an integer (e.g., Int 21h). INT3 and INTO are interrupt calls that take no parameters but call the handlers for interrupts 3 and 4, respectively.
IRET Interrupt Return: Return from interrupt handler to standard execution.
JCC dest Jump if Condition is Met: These instructions check the flags and jump to the dest location if the condition is met; otherwise execution continues as normal. The dest value must be a relative offset. The various conditional jumps are:
JA short/near Jump if Above: CF=0 and ZF=0
JAE short/near Jump if Above/Equal: CF=0
JB short/near Jump if Below: CF=1
JBE short/near Jump if Below/Equal: CF=1 or ZF=1
JC short/near Jump if Carry: CF=1
JCXZ short Jump if CX=0: CX=0
JE short/near Jump if Equal: ZF=1
JECXZ short Jump if ECX=0: ECX=0
JG short/near Jump if Greater: ZF=0 and SF=OF
JGE short/near Jump if Greater/Equal: SF=OF
JL short/near Jump if Less: SF <> OF
JLE short/near Jump if Less/Equal: ZF=1 or SF <> OF
JNA short/near Jump if Not Above: CF=1 or ZF=1
JNAE short/near Jump if Not Above/Equal: CF=1
JNB short/near Jump if Not Below: CF=0
JNBE short/near Jump if Not Below/Equal: CF=0 and ZF=0
JNC short/near Jump if Not Carry: CF=0
JNE short/near Jump if Not Equal: ZF=0
JNG short/near Jump if Not Greater: ZF=1 or SF <> OF
JNGE short/near Jump if Not Greater/Equal: SF <> OF
JNL short/near Jump if Not Less: SF=OF
JNLE short/near Jump if Not Less/Equal: ZF=0 and SF=OF
JNO short/near Jump if Not Overflow: OF=0
JNP short/near Jump if Not Parity: PF=0
JNS short/near Jump if Not Sign: SF=0
JNZ short/near Jump if Not Zero: ZF=0
JO short/near Jump if Overflow: OF=1
JP short/near Jump if Parity: PF=1
JPE short/near Jump if Parity Even: PF=1
JPO short/near Jump if Parity Odd: PF=0
JS short/near Jump if Sign: SF=1
JZ short/near Jump if Zero: ZF=1
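The "Above/Below" jumps treat the compared values as unsigned, while the "Greater/Less" jumps treat them as signed. The following Python sketch models an 8-bit CMP followed by a handful of these jumps to show the difference. The helper names cmp_flags and jump_taken are illustrative, not part of any assembler, and only a few of the conditions are included.

```python
def cmp_flags(dest, src, bits=8):
    """Model CMP dest, src: compute dest - src and return (CF, ZF, SF, OF)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    dest &= mask
    src &= mask
    result = (dest - src) & mask
    cf = dest < src                                   # unsigned borrow
    zf = result == 0
    sf = bool(result & sign)
    # Signed overflow: operands differ in sign and the result's sign differs from dest's.
    of = bool((dest ^ src) & (dest ^ result) & sign)
    return cf, zf, sf, of

def jump_taken(mnemonic, dest, src, bits=8):
    """Would the given conditional jump be taken right after CMP dest, src?"""
    cf, zf, sf, of = cmp_flags(dest, src, bits)
    conditions = {
        "JA": not cf and not zf,
        "JB": cf,
        "JE": zf,
        "JG": not zf and sf == of,
        "JL": sf != of,
        "JLE": zf or sf != of,
    }
    return conditions[mnemonic]

# 0FFh is 255 unsigned but -1 signed, so it is "above" 1 and also "less" than 1.
print(jump_taken("JA", 0xFF, 0x01), jump_taken("JL", 0xFF, 0x01))
```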
JMP dest Jump: Transfers control to the location specified by dest. The dest value can be either a relative offset, or a register or memory value that contains such an offset.
LOOP/LOOPE/LOOPNE/LOOPNZ/LOOPZ dest Loop with CX Counter: Decrements ECX by 1, then jumps to the location indicated by dest; when ECX=0, the jump is bypassed and program execution continues. The variations of LOOP (-E,-NE,-NZ,-Z) will execute the jump only if the CX register is != 0 and if their conditions are met (ZF=1, ZF=0, ZF=0, and ZF=1, in order). The dest value must be a relative offset.
NOP No Operation: A one-byte instruction that does nothing.
REP/REPE/REPZ/REPNE/REPNZ ins Repeat Following String Instruction: Repeats ins, decrementing CX each time, until CX=0. REPE and REPZ also stop as soon as ZF=0 (that is, they repeat only while ZF=1), and REPNE and REPNZ also stop as soon as ZF=1 (they repeat only while ZF=0). The ins value must be a string operation such as CMPS, INS, LODS, MOVS, OUTS, SCAS, or STOS.
RET/RETF/RETN dest Return From Procedure: Transfers control to a return address located on the stack. The optional dest parameter indicates the number of stack bytes to release (POP) after the return address is popped off the stack. This would be the case if the procedure was CALLed with a number of parameters pushed onto the stack before the call, and if the procedure itself is responsible for cleaning up the stack when it returns. RETF is Return-Far and RETN is Return-Near; the difference is that Far returns pop both CS and IP from the stack to form the return address, while near returns pop only the IP register.
CMP dest, src Compare: Compares dest with src and discards the result so that only the flags (Overflow, Sign, Zero, Aux, Parity, Carry) are affected. The dest value may be a register or memory value, while src may be a register, memory, or immediate value.
CMPS/CMPSB/CMPSW/CMPSD Compare String Byte/Word/Dword: Compares the bytes, words, or dwords at ES:EDI with the ones at DS:ESI. The result is discarded and only the flags (Overflow, Sign, Zero, Aux, Parity, Carry) are affected. ESI and EDI are incremented (or decremented, if the direction flag is set) after the compare, and the REP instructions can be combined with these for string processing.
TEST dest, src Logical Compare: Performs a bitwise AND of dest and src; the result is discarded and only the flags are affected (Zero, Sign and Parity reflect the result, while Overflow and Carry are cleared). The dest value may be a register or memory value, while the src value may be a register or immediate value.
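As a rough illustration of how a repeat prefix and a string compare work together, the Python sketch below models REPE CMPSB with the direction flag clear. It assumes DS and ES refer to the same flat block of memory (a plain bytearray here) and that ZF was set before the instruction; the function name is illustrative.

```python
def repe_cmpsb(memory, si, di, cx):
    """Model REPE CMPSB with DF=0; returns (si, di, cx, zf) after the instruction."""
    zf = True   # assumption: ZF was set on entry (the flags are untouched if CX is 0)
    while cx != 0:
        zf = memory[si] == memory[di]   # compare one byte from each string
        si += 1
        di += 1
        cx -= 1
        if not zf:                      # REPE stops at the first mismatch
            break
    return si, di, cx, zf

memory = bytearray(b"ABCDABXD")
# Compare the 4 bytes at offset 0 with the 4 bytes at offset 4.
print(repe_cmpsb(memory, 0, 4, 4))
```

A JNZ after the compare would be taken when the strings differ, and the leftover count in CX shows how far the compare got.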
ADC dest, src Add With Carry: Adds dest, src, and the carry flag (CF); stores the result in dest. Can be used to add a register/memory/immediate value to a register (ADC edx, eax/Variable1/09h), or a register/immediate value to a memory location (ADC Variable2, eax/09h); the two operands cannot both be memory values.
ADD dest, src Add: Adds dest and src, stores the result in dest. Can be used to add a register/memory/immediate value to a register (ADD edx, eax/Variable1/09h), or a register/immediate value to a memory location (ADD Variable2, eax/09h); the two operands cannot both be memory values.
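The usual reason to reach for ADC is adding numbers wider than a register: ADD the low halves, then ADC the high halves so the carry ripples upward. The Python sketch below models that pattern for a 64-bit sum built from 32-bit pieces; the function names are illustrative.

```python
MASK32 = 0xFFFFFFFF

def add32(dest, src, carry_in=0):
    """Model ADD (carry_in=0) or ADC (carry_in=CF) on 32-bit values; returns (result, CF)."""
    total = (dest & MASK32) + (src & MASK32) + carry_in
    return total & MASK32, total >> 32

def add64(a, b):
    """Add two 64-bit values as 32-bit code would: ADD the low dwords, ADC the high dwords."""
    low, cf = add32(a & MASK32, b & MASK32)        # ADD eax, ...
    high, cf = add32(a >> 32, b >> 32, cf)         # ADC edx, ...
    return (high << 32) | low, cf

print(hex(add64(0x00000001FFFFFFFF, 1)[0]))
```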
DEC dest Decrement by 1: Decrements memory or register value by 1
DIV src Unsigned Divide: Divides AX by src if src is size byte, DX:AX by src if src is size word, and EDX:EAX by src if src is size dword. Quotient will be stored in AL, AX, or EAX depending on the above conditions, and the remainder will likewise be stored in AH, DX, or EDX. The src value may be a register or memory value.
IDIV src Signed Divide: Divides AX by src if src is size byte, DX:AX by src if src is size word, and EDX:EAX by src if src is size dword. Quotient will be stored in AL, AX, or EAX depending on the above conditions, and the remainder will likewise be stored in AH, DX, or EDX. The src value may be a register or memory value.
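The word-sized form of DIV can be modelled in a few lines of Python. The sketch below is illustrative (the name div16 is an assumption) and represents the CPU's divide error, interrupt 0, as a Python exception; that error is raised for a zero divisor and also when the quotient is too large for AX.

```python
def div16(dx, ax, src):
    """Model DIV with a 16-bit src: DX:AX / src -> (quotient for AX, remainder for DX)."""
    if src == 0:
        raise ZeroDivisionError("divide error (interrupt 0)")
    dividend = (dx << 16) | ax
    quotient, remainder = divmod(dividend, src)
    if quotient > 0xFFFF:
        # The quotient must fit in AX, otherwise the CPU raises the same divide error.
        raise OverflowError("divide error (interrupt 0): quotient does not fit in AX")
    return quotient, remainder

print(div16(0, 100, 7))
```

This is why DX is normally zeroed before dividing a 16-bit value in AX: a stray high word makes the dividend much larger than intended. IDIV follows the same register conventions on signed values, truncating the quotient toward zero.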
IMUL dest, src Signed Multiply: If only src is specified, the action is similar to MUL: AL is multiplied by src if src is size byte, AX is multiplied by src if src is size word, and EAX is multiplied by src if src is size dword--the result being stored in AX, DX:AX, and EDX:EAX respectively, and src may be a register or memory value. If both operands are specified, dest is multiplied by src, the result being stored in dest; dest must be a register, and src may be a register, memory, or immediate value.
INC dest Increment by 1: Increments memory or register value by 1
MUL src Unsigned Multiply: Multiplies AL by src if src is of size byte, AX by src if src is of size word, and EAX by src if src is of size dword. The result is stored in AX, DX:AX, and EDX:EAX respectively. The src value may be a register or memory value.
RCL/ROL dest, src Rotate Left: The bits in dest are rotated to the left the number of times indicated in src. ROL rotates only the 8, 16, or 32 bits of dest: on each step the top bit is returned to the bottom (and copied into the carry flag CF) and the second-topmost bit is moved to the top. RCL rotates through the carry flag, treating CF and dest together as a 9, 17, or 33-bit value: the topmost bit is moved into CF, the old CF is moved into the bottom-most bit, and the second-topmost bit is moved to the top. The dest value can be a memory or register value, while src must be CL or an immediate value.
RCR/ROR dest, src Rotate Right: The bits in dest are rotated to the right the number of times indicated in src. ROR rotates only the 8, 16, or 32 bits of dest: on each step the bottom bit is moved to the top (and copied into the carry flag CF) and the second-bottom-most bit is moved to the bottom. RCR rotates through the carry flag, treating CF and dest together as a 9, 17, or 33-bit value: the bottom-most bit is moved into CF, the old CF is moved into the top-most bit, and the second-bottom-most bit is moved to the bottom. The dest value can be a memory or register value, while src must be CL or an immediate value.
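The difference between a plain rotate and a rotate through carry is easiest to see on a single byte. The Python sketch below models ROL and RCL for 8-bit operands one step at a time; the function names are illustrative, and the rotate counts are kept small enough that the count masking done by real hardware does not come into play.

```python
def rol8(value, count, cf=0):
    """ROL on a byte: the top bit wraps around to the bottom and is copied into CF."""
    for _ in range(count):
        cf = (value >> 7) & 1
        value = ((value << 1) & 0xFF) | cf
    return value, cf

def rcl8(value, count, cf=0):
    """RCL on a byte: rotate the 9-bit quantity CF:value to the left."""
    for _ in range(count):
        top = (value >> 7) & 1
        value = ((value << 1) & 0xFF) | cf   # the old CF enters at the bottom
        cf = top                             # the old top bit becomes the new CF
    return value, cf

print(format(rol8(0b10000001, 1)[0], "08b"), format(rcl8(0b10000001, 1)[0], "08b"))
```

Rotating a byte 8 times with ROL, or 9 times with RCL, brings every bit back to where it started.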
SAL/SHL dest, src Shift Left: Shifts the bits of dest to the left by the count in src, so that by shifting once the lowest bit becomes the second lowest and a 0 is moved into the lowest bit (e.g., shifting the binary value 10111100 once to the left would make it 01111000). dest can be a register or memory value, while src must be an immediate value or the CL register. Shifting an integer once to the left has the effect of multiplying it by two.
SAR/SHR dest, src Shift Right: Shifts the bits of dest to the right by the count in src, so that by shifting once the highest bit becomes the second-highest. SHR moves a 0 into the highest bit (e.g., shifting the binary value 10111100 once to the right would make it 01011110), while SAR preserves the sign by keeping a copy of the original highest bit in place (10111100 would become 11011110). dest can be a register or memory value, while src must be an immediate value or the CL register. Shifting an integer once to the right has the effect of dividing it by two (as an unsigned value for SHR, as a signed value for SAR).
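The shift examples can be checked with a short Python model of the byte-sized shifts. The function names are illustrative, and the count is assumed to be between 0 and 7.

```python
def shl8(value, count=1):
    """SHL/SAL on a byte: zeros enter from the bottom, bits shifted past the top are lost."""
    return (value << count) & 0xFF

def shr8(value, count=1):
    """SHR on a byte: zeros enter from the top (unsigned divide by two per step)."""
    return (value & 0xFF) >> count

def sar8(value, count=1):
    """SAR on a byte: the sign bit is replicated (signed divide by two per step)."""
    value &= 0xFF
    signed = value - 0x100 if value & 0x80 else value
    return (signed >> count) & 0xFF

for op in (shl8, shr8, sar8):
    print(op.__name__, format(op(0b10111100), "08b"))
```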
SUB dest, src Integer Subtraction: Subtracts src from dest and stores the result in dest; dest may be a register or memory value, while src may be a register, memory, or immediate value.
Reference Table
operation   src   dest   result
AND         1     1      1
            1     0      0
            0     1      0
            0     0      0
OR          1     1      1
            1     0      1
            0     1      1
            0     0      0
XOR         1     1      0
            1     0      1
            0     1      1
            0     0      0
NOT         0     N/A    1
            1     N/A    0
AND dest, src Logical AND: Compares each bit of dest and src, and overwrites the bit in dest with 1 if both dest and src bits are 1; otherwise it overwrites the bit in dest with 0. For example, binary 01001001 ANDed with 11100010 would result in 01000000. The dest value may be a register or memory value, while the src value may be a register, memory, or immediate value.
NEG src Two's Complement Negation: The operand is subtracted from 0 and replaced with the result. For example, the binary value 01001001 NEGed would produce 10110111, while the hex value 13h would produce EDh.
NOT src One's Complement Negation: Toggles each bit of the operand so that a 1 becomes a 0 and a 0 becomes a 1. For example, binary value 01001001 NOTed would produce 10110110.
OR dest, src Logical Inclusive OR: Compares each bit of dest and src, then overwrites the bit in dest with 0 if both dest and src bits are 0; otherwise it overwrites the dest bit with 1. For example, binary 01001001 ORed with 11100010 would result in 11101011. The dest value may be a register or memory value, while the src value may be a register, memory, or immediate value.
XOR dest, src Logical Exclusive OR: Compares each bit of dest and src, then overwrites the bit in dest with 1 if the dest and src bits are different; otherwise, if the bits are the same, it overwrites the dest bit with 0. For example, binary 01001001 XORed with 11100010 would result in 10101011, while 01001001 XORed with itself would result in 00000000. The dest value may be a register or memory value, while the src value may be a register, memory, or immediate value.
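The worked examples for the logical instructions can be reproduced with Python's bitwise operators, masking to 8 bits to mimic a byte-sized operand. This is only a check of the arithmetic, not a model of the flags these instructions set; the helper name bits is illustrative.

```python
a, b = 0b01001001, 0b11100010

def bits(value):
    """Format a byte as an 8-digit binary string."""
    return format(value & 0xFF, "08b")

print("AND", bits(a & b))
print("OR ", bits(a | b))
print("XOR", bits(a ^ b))    # a bit is 1 only where the two inputs differ
print("NOT", bits(~a))       # one's complement
print("NEG", bits(-a))       # two's complement: NOT, then add 1
```

XORing a register with itself (xor eax, eax) is the common idiom for zeroing it.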
IN dest, src Input from Port: Transfers a byte, word or dword from the port specified in src to the register specified in dest. The dest operator may be AL, AX, or EAX, while src may be any valid 1-byte port number, or DX with a 2-byte port number stored therein.
INS/INSB/INSW/INSD dest, src Input from Port to String: Inputs data from the input port specified in src into the location specified by ES:DI; note that dest is ignored and src must be DX. After the transfer, DI is incremented/decremented by the number of bytes transferred, in the direction specified by the direction flag ( 0=inc, 1=dec). The B,W,and D versions of this instruction take no operands, and move data of the specified size (Byte, Word, Dword) from the port specified in DX to the location specified in ES:DI.
LEA dest, src Load Effective Address: Calculates the effective address (offset) of src and stores it in dest; dest must be a register value, while src must be a memory value or symbolic name.
LODS/LODSB/LODSW/LODSD src Load String Data: Loads the byte, word, or dword addressed by DS:SI into the AL, AX, or EAX register; the src operand is ignored. After the transfer, SI is incremented/decremented by the number of bytes transferred, in the direction specified by the direction flag ( 0=inc, 1=dec). The B, W, and D versions of this instruction take no operands and move the specified amount of data (Byte, Word, Dword) into AL, AX, or EAX respectively.
MOV dest, src Move Data: Copies value of the src into dest. If dest is a register value, then src may be a register, memory, or immediate value; if dest is a memory value, then src may be a register or immediate value.
MOVS/MOVSB/MOVSW/MOVSD dest, src Move Data from String to String: Copies the byte, word, or dword at DS:ESI to ES:EDI, regardless of operands. After the move, SI and DI are incremented/decremented by the number of bytes transferred, in the direction specified by the direction flag ( 0=inc, 1=dec). The B, W, and D versions of this instruction take no operands and copy the specified amount of data from DS:ESI to ES:EDI.
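The direction flag matters most when the source and destination overlap. The Python sketch below models REP MOVSB in a single flat bytearray (an assumption: DS and ES are taken to cover the same memory) and shows a forward copy corrupting an overlapping move that a backward copy handles correctly. The function name is illustrative.

```python
def rep_movsb(memory, si, di, cx, df=0):
    """Model REP MOVSB: copy cx bytes from memory[si] to memory[di]; returns (si, di, cx)."""
    step = -1 if df else 1          # DF=0 counts up, DF=1 counts down
    while cx != 0:
        memory[di] = memory[si]
        si += step
        di += step
        cx -= 1
    return si, di, cx

# Moving "ABCD" two bytes to the right, so the two ranges overlap.
forward = bytearray(b"ABCD..")
rep_movsb(forward, 0, 2, 4, df=0)   # overwrites C and D before they are read
backward = bytearray(b"ABCD..")
rep_movsb(backward, 3, 5, 4, df=1)  # starts at the last byte and works down
print(forward, backward)
```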
MOVSX dest, src Move with Sign-Extend: Reads the byte or word at src and copies it to the word or dword at dest with a sign-extend. The dest value may only be a register value, while src may be a register or memory value.
MOVZX dest, src Move with Zero-Extend: Reads the byte or word at src and copies it to the word or dword at dest with zero-extend. The dest value may only be a register value, while src may be a register or memory value.
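Sign extension and zero extension differ only in what fills the new upper bits. A minimal Python sketch (function names and the explicit bit-width parameters are illustrative):

```python
def movzx(value, src_bits):
    """Zero-extend: keep the low src_bits, upper bits of the destination become 0."""
    return value & ((1 << src_bits) - 1)

def movsx(value, src_bits, dest_bits):
    """Sign-extend: copy the source's top bit into every new upper bit."""
    mask = (1 << src_bits) - 1
    value &= mask
    if value & (1 << (src_bits - 1)):             # source is negative
        value |= ((1 << dest_bits) - 1) & ~mask   # fill the upper bits with 1s
    return value

# 0F0h is 240 unsigned but -16 as a signed byte.
print(hex(movzx(0xF0, 8)), hex(movsx(0xF0, 8, 32)))
```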
OUT dest, src Output to Port: Transfers data from src to the port specified in dest. The src value may be AL, AX, or EAX; the dest value may be any one-byte port number, or DX with a two-byte port number stored therein.
OUTS/OUTSB/OUTSW/OUTSD dest, src Output String to Port: Transfers data from DS:ESI to the port specified in the DX register, regardless of operands. After the transfer, SI is incremented/decremented by the number of bytes transferred, in the direction specified by the direction flag ( 0=inc, 1=dec). The B, W, and D versions of this instruction transfer the specified amount of data (byte, word or dword) from DS:ESI to the port specified in DX.
POP dest POP Word/Dword from Stack: Moves the value on the top of the stack to dest; the stack pointer SP is incremented by 2 (word) or 4(dword) so that the POPed data is off the stack. The dest value may be a register or memory value.
POPA/POPAD/POPAW Pop All General Registers: Reverses a previous PUSHA by popping the top of the stack into the general registers. POPA and POPAW are equivalent to POP DI, SI, BP, BX, DX, CX, AX; POPAD is equivalent to POP EDI, ESI, EBP, EBX, EDX, ECX, EAX. In both cases the saved SP/ESP value is skipped rather than restored.
POPF/POPFD/POPFW Pop to Flags Register: Pops the top word (POPF/POPFW) or dword (POPFD) from the stack into the flags register.
PUSH src Push Word/Dword to Stack: Decrements the stack pointer SP by two (word) or four (dword) to add space on the stack, then copies the value in src to that newly-made space at the top of the stack. The src value may be a register, memory, or immediate value.
PUSHA/PUSHAD/PUSHAW Push All General Registers: Save the 16-bit (PUSHA/PUSHAW) or 32-bit (PUSHAD) registers to the top of the stack; PUSHA and PUSHAW are equivalent to PUSH AX, CX, DX, BX, SP, BP, SI, DI and PUSHAD is equivalent to PUSH EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI.
PUSHF/PUSHFD/PUSHFW Push Flags Register: Saves the FLAGS (PUSHF/PUSHFW) or EFLAGS (PUSHFD) to the top of the stack.
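The stack instructions all come down to two steps: adjust SP, and copy a value to or from the location SP points at. The Python sketch below models a 16-bit stack in a small byte array to show that the stack grows downward and that values are stored low byte first. The class name Stack16 and the 256-byte size are illustrative assumptions.

```python
class Stack16:
    """A 16-bit stack in a flat byte array; sp is an offset into that array."""

    def __init__(self, size=0x100):
        self.memory = bytearray(size)
        self.sp = size                      # empty stack: SP sits just past the end

    def push(self, word):
        self.sp -= 2                        # PUSH decrements SP first...
        self.memory[self.sp:self.sp + 2] = word.to_bytes(2, "little")  # ...then stores

    def pop(self):
        word = int.from_bytes(self.memory[self.sp:self.sp + 2], "little")
        self.sp += 2                        # POP reads first, then increments SP
        return word

stack = Stack16()
stack.push(0x1234)
stack.push(0xABCD)
print(hex(stack.sp), hex(stack.pop()), hex(stack.pop()))
```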
SCAS/SCASB/SCASW/SCASD src Compare String Data: Compares byte, word, or dword at ES:DI with AL, AX, or EAX, regardless of operand; the result is discarded and only the flags are affected. After the compare, DI is incremented/decremented by the number of bytes compared, in the direction specified by the direction flag ( 0=inc, 1=dec). The B, W, and D versions of this instruction compare values of the indicated size with AL, AX, or EAX.
STOS/STOSB/STOSW/STOSD src Store String Data: Transfers the contents of AL, AX, or EAX to the memory location specified in ES:EDI, regardless of operands. After the transfer, DI is incremented/decremented by the number of bytes transferred, in the direction specified by the direction flag ( 0=inc, 1=dec). The B, W, and D versions of this instruction transfer values of the indicated size to ES:EDI.
XCHG dest, src Exchange Memory or Register with Register: Moves the original value of src into dest and the original value of dest into src. If dest is a register value, then src can be a register or memory value; if dest is a memory value, then src must be a register value.
CLC Clear Carry Flag: Set CF=0
CLD Clear Direction Flag: Set DF=0
CLI Clear Interrupt Flag: Set IF=0
LAHF Load AH from Flags: Sets bits 7, 6, 4, 2, and 0 of AH with the value of flags SF ZF AF PF CF.
SAHF Store AH into Flags: Sets flags SF ZF AF PF CF with bits 7, 6, 4, 2, and 0 from AH.
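LAHF and SAHF simply copy the low byte of the flags register to and from AH. The Python sketch below packs and unpacks that byte; the function names mirror the mnemonics but are otherwise illustrative. Bit 1 of the flags register is reserved and always reads as 1, and bits 3 and 5 always read as 0.

```python
def lahf(sf, zf, af, pf, cf):
    """Build the byte LAHF places in AH: SF ZF 0 AF 0 PF 1 CF (bit 7 down to bit 0)."""
    return (sf << 7) | (zf << 6) | (af << 4) | (pf << 2) | (1 << 1) | cf

def sahf(ah):
    """Extract the five flags SAHF would load from AH."""
    return {"SF": (ah >> 7) & 1, "ZF": (ah >> 6) & 1, "AF": (ah >> 4) & 1,
            "PF": (ah >> 2) & 1, "CF": ah & 1}

ah = lahf(sf=0, zf=1, af=0, pf=1, cf=0)
print(hex(ah), sahf(ah))
```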
STC Set Carry Flag: Sets CF=1
STD Set Direction Flag: Sets DF=1
STI Set Interrupt Enable Flag: Sets IF=1
Interrupt services are handlers for software interrupts--handlers which are routines provided by the ROM-BIOS or the Operating System (assumed to be DOS in this case). When an INT instruction is encountered, the CPU pushes the flags register and then the return address (CS:IP), then looks up the interrupt number in the Interrupt Vector Table (IVT to its friends) and calls the handler associated with that interrupt vector. Execution continues in the handler until the CPU encounters an IRET statement, whereupon it returns to the stored CS:IP and restores the flags.
Interrupts are called by moving required values into registers--notice that the stack is not used-- and then calling the interrupt number via the INT function. Some interrupts have a number of functions, which are identified in the INT statement by a value placed in AH, and some functions also have sub-functions, which are identified by a value placed in AL.
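The IVT lookup itself is simple in real mode: the table starts at physical address 0, and each vector is four bytes, the handler's offset word followed by its segment word. The Python sketch below reads one entry out of a byte array standing in for the first 1K of memory. The function name and the handler address stored for interrupt 21h are made-up values for illustration.

```python
def ivt_entry(memory, int_number):
    """Return (segment, offset) of the handler for int_number from a real-mode IVT image."""
    base = int_number * 4                                      # 4 bytes per vector
    offset = int.from_bytes(memory[base:base + 2], "little")   # offset word comes first
    segment = int.from_bytes(memory[base + 2:base + 4], "little")
    return segment, offset

memory = bytearray(1024)                                       # 256 vectors * 4 bytes
# Hypothetical handler address 0019:409E stored for interrupt 21h.
memory[0x21 * 4:0x21 * 4 + 4] = bytes([0x9E, 0x40, 0x19, 0x00])
segment, offset = ivt_entry(memory, 0x21)
print(f"{segment:04X}:{offset:04X}")
```

Hooking an interrupt amounts to overwriting those four bytes with the address of a new handler, usually after saving the old address so it can be chained to.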
The following is a listing and brief description of the more common BIOS and DOS interrupt services. It is not intended to fully explain each interrupt but rather to provide a quick reference as to the interrupt number, name, parameters, and return values.
BIOS Services
Video Services
Int5h PrintScreen Entry: N/A Exit: N/A Notes: Sends ASCII contents of video buffer to printer
Int10h-00h Set Video Mode Entry: ah=00 al=mode Exit: N/A Notes: modes 0-3 are 16-color text, modes 4-6 are 4-color graphics, mode 7 is mono text, modes 8-18 are card-dependent, mode 19 is 256-color graphics
Int10h-01h Set Cursor Size Entry: ah=01 ch=start scanline cl=end scanline Exit: N/A Notes: Cursor appears between start and end scanlines, each scanline is one pixel high
Int10h-02h Set Cursor Position Entry: ah=02 bh=video page# dh=cursor row dl=cursor col Exit: N/A Notes: Primary page = 0
Int10h-03h Read Cursor Position & Size Entry: ah=03 bh=video page# Exit: bh=video page# ch=start scanline cl=end scanline dh=cursor row dl=cursor col Notes:
Int10h-05h Select Active Display Page Entry: ah=05 al=display page Exit: N/A Notes: text mode only; page range is usually 0-7
Int10h-06h Scroll Window Up Entry: ah=06 al=lines to scroll bh=display attr for blank lines ch=row for upper left corner of window cl=col for upper left corner dh=row for lower right corner dl=col for lower right corner Exit: N/A Notes: Selectively scrolls portion of screen; attr= 1-byte hex value, top nibble=background, bot nibble=foreground, colors 0-F (black, blue, green, cyan, red, magenta, brown, lt gray, dk gray, lt blue, lt green, lt cyan, lt red, lt magenta, yellow, white), such that 0Fh is white fore, black back
Int10h-07h Scroll Window Down Entry: ah=07 al=lines to scroll bh=display attr for blank lines ch=row upper left cl=col upper left dh=row lower right dl=col lower right Exit: N/A Notes: As above
Int10h-08h Read Char and Attribute Entry: ah=08 bh=video page# Exit: ah=attr byte al=ASCII code Notes: Attr is as above
Int10h-09h Write Char and Attribute Entry: ah=09 al=ASCII code bh=video page# bl=attr byte cx=# of characters to display Exit: N/A Notes: Attr as above
Int10h-0Ah Write Char Entry: ah=0A al=ASCII code bh=video page# bl=color cx=# of chars to write Exit: N/A Notes:
Int10h-0Ch Write Pixel Dot Entry: ah=0C al=pixel value cx=pixel col dx=pixel row Exit: N/A Notes:
Int10h-0Dh Read Pixel Dot Entry: ah=0D cx=pixel col dx=pixel row Exit: al=pixel value cx=pixel col dx=pixel row Notes:
Int10h-0Eh TTY Char Output Entry: ah=0E al=ASCII code bh=video page# bl=char color Exit: N/A Notes: Translates ASCII bell, backspace, CR and LF chars
Int10h-0Fh Get Current Video State Entry: ah=0F Exit: ah=screen width al=display mode bh=active display page Notes:
Int10h-13h Write String Entry: ah=13h al=mode bh=video page# bl=char attr cx=length of string dh=cursor row dl=cursor col es=seg bp=offset Exit: N/A Notes: ES:BP=addr of string
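The attribute byte used by several of the video services above (the "attr" in the scroll, read, and write calls) packs two 4-bit color numbers into one byte. The Python sketch below builds and splits such a byte using the standard 16-color text palette; the helper names and the spelled-out color names are illustrative.

```python
COLORS = ["black", "blue", "green", "cyan", "red", "magenta", "brown", "light gray",
          "dark gray", "light blue", "light green", "light cyan", "light red",
          "light magenta", "yellow", "white"]

def make_attr(foreground, background):
    """Pack color names into an attribute byte: top nibble background, bottom nibble foreground."""
    return (COLORS.index(background) << 4) | COLORS.index(foreground)

def split_attr(attr):
    """Return (foreground, background) color names for an attribute byte."""
    return COLORS[attr & 0x0F], COLORS[(attr >> 4) & 0x0F]

print(hex(make_attr("white", "black")), split_attr(0x1E))
```

One caveat: on many adapters the top bit of the attribute is treated as a blink bit by default, in which case only background colors 0-7 are available.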
System Services
Int11h Get Equipment Status Entry: N/A Exit: ax=status Notes: bitstruct returned
Int12h Get Memory Size Entry: N/A Exit: ax=memory blocks Notes: ax=# of contiguous 1K blocks
Int18h Boot Process Failure Entry: N/A Exit: N/A Notes: Calls ROM Basic if available
Int19h Warm Boot Entry: N/A Exit: N/A Notes: Reboot!
Int1Bh Control-Break Handler Entry: N/A Exit: N/A Notes: Called when Ctrl-Brk pressed (can be hooked)
Disk Services
Int13h-00h Reset Disk Drives Entry: ah=00 dl=drive# Exit: N/A Notes: reset disk controller: for error handling
Int13h-01h Get Floppy Disk Status Entry: ah=01 dl=drive# Exit: ah=status byte Notes: bitstruct returned
Int13h-02h Read Disk Sectors Entry: ah=02 al=#sectors es=segment bx=offset ch=track cl=sector dh=head/side# dl=drive# Exit: return code Notes: es:bx=addr of buffer
Int13h-03h Write Disk Sectors Entry: ah=03 al=#sectors es=seg bx=offset ch=track cl=sector dh=head/side# dl=drive# Exit: return code Notes: es:bx=addr of string
Int13h-05h Format Disk Track Entry: ah=05 es=seg bx=offset ch=track dh=head/side# dl=drive# Exit: return code Notes: es:bx=addr of track address field. Repeat int for formatting entire disk
Peripheral Services
Int14h-00h Initialize Communications Port Entry: ah=00 al=parameter dx=COM# Exit: ah=line status al=modem status Notes: COM#=0 for COM1, 3 for COM4
Int14h-01h Transmit Character Entry: ah=01 al=ASCII char dx=COM# Exit: N/A Notes:
Int14h-02h Receive Character Entry: ah=02 dx=COM# Exit: ah=return code al=char Notes:
Int14h-03h Get COM Port Status Entry: ah=03 dx=COM# Exit: ah=line status al=modem status Notes:
Int15h-84h Joystick Support Entry: ah=84 dx=code Exit: al=switch settings/ax= a(x) bx=a(y) cx=b(x) dx=b(y) Notes: code=00 (read switches) or 01 (get position)
Int16h-00h Read Keyboard Character Entry: ah=00 Exit: ah=scan code al=ASCII value Notes:
Int16h-02h Read Keyboard Shift Status Entry: ah=02 Exit: al=code Notes: code=bitstruct
Int17h-00h Print Character Entry: ah=00 al=char dx=printer Exit: ah=printer status Notes: printer=0 (LPT1) to 2 (LPT3)
Int17h-01h Initialize Printer Entry: ah=01 dx=printer Exit: ah=printer status Notes:
Int17h-02h Get Printer Status Entry: ah=02 dx=printer Exit: ah=printer status Notes:
DOS Services
Input/Output Services
Int21h-01h Character Input With Echo Entry: ah=1 Exit: al=char Notes: Echo to screen
Int21h-02h Output Character Entry: ah=2, dl=char Exit: N/A Notes:
Int21h-03h Auxiliary Input Entry: ah=3 Exit: al=char Notes: Reads std aux (COM1)
Int21h-04h Auxiliary Output Entry: ah=4 dl=char Exit: N/A Notes: Sends std aux (COM1)
Int21h-05h Printer Output Entry: ah=5 dl=char Exit: N/A Notes: Sends std prt (LPT1)
Int21h-06h Direct Console I/O Entry: ah=6 dl=char Exit: al=char Notes: set dl=0FFh for input
Int21h-08h Char Input, No Echo Entry: ah=8 Exit: al=char Notes: Catches Ctrl-combos
Int21h-09h Output Character String Entry: ah=9 ds=seg dx=offset Exit: N/A Notes: ds:dx = addr of string
Int21h-0Ah Buffered Input Entry: ah=0A ds=seg dx=offset Exit: N/A Notes: ds:dx=addr of buffer
Int21h-44h-02h Character Device Read Entry: ah=44 al=02 bx=dev handle cx=bytes-to-read ds=seg dx=offset Exit: ax=bytes read Notes: ds:dx=addr of buffer
Int21h-44h-03h Character Device Write Entry: ah=44 al=03 bx=dev handle cx=bytes-to-write ds=seg dx=offset Exit: ax=bytes written Notes: ds:dx=addr of buffer
Int21h-44h-04h Block Device Read Entry: ah=44 al=04 bl=drive# cx=bytes-to-read ds=seg dx=offset Exit: ax=bytes read Notes: ds:dx=addr of buffer
Int21h-44h-05h Block Device Write Entry: ah=44 al=05 bl=drive# cx=bytes-to-write ds=seg dx=offset Exit: ax=bytes written Notes: ds:dx=addr of buffer
Disk Services
Int21h-0Dh Reset Disk Entry: ah=0D Exit: N/A Notes: flushes DOS disk buffers
Int21h-0Eh Set Default Drive Entry: ah=0E dl=drive# Exit: al=# Logical Drives Notes:
Int21h-0Fh Open File (FCB) Entry: ah=0F ds=seg dx=offset Exit: al=status Notes: ds:dx=addr of FCB
Int21h-10h Close File (FCB) Entry: ah=10 ds=seg dx=offset Exit: al=status Notes: ds:dx=addr of FCB
Int21h-11h Search First FileName Match (FCB) Entry: ah=11 ds=seg dx=offset Exit: al=status Notes: ds:dx=addr of FCB; returns match in DTA
Int21h-12h Search Next FileName Match (FCB) Entry: ah=12 ds=seg dx=offset Exit: al=status Notes: ds:dx=addr of FCB, match returns in DTA
Int21h-13h Delete File (FCB) Entry: ah=13 ds=seg dx=offset Exit: al=status Notes: ds:dx=addr of FCB
Int21h-14h Sequential Read (FCB) Entry: ah=14 ds=seg dx=offset Exit: al=status Notes: ds:dx=addr of FCB, returns 1 block in DTA
Int21h-15h Sequential Write (FCB) Entry: ah=15 ds=seg dx=offset Exit: al=status Notes: ds:dx=addr of FCB, block written from DTA
Int21h-16h Create File (FCB) Entry: ah=16 ds=seg dx=offset Exit: al=status Notes: ds:dx=addr of FCB
Int21h-17h Rename File (FCB) Entry: ah=17 ds=seg dx=offset Exit: al=status Notes: ds:dx=addr of modified FCB
Int21h-19h Get Current Drive Entry: ah=19 Exit: al=drive# Notes: 0=A, 1=B, 2=C, etc
Int21h-1Ah Set Disk Transfer Area Entry: ah=1A ds=seg dx=offset Exit: N/A Notes: ds:dx=addr of DTA
Int21h-1Bh Get FAT Info For Default Drive Entry: ah=1B Exit: al=sectors/cluster ds=seg bx=offset cx=bytes/sector dx=clusters/disk Notes: ds:bx points to FAT ID byte
Int21h-1Ch Get FAT Info For Drive Entry: ah=1C dl=drive# Exit: al=sectors/cluster ds=seg bx=offset cx=bytes/sector dx=clusters/disk Notes: ds:bx points to FAT ID byte
Int21h-1Fh Get Default Disk Parameter Block Entry: ah=1F Exit: al=status ds=seg bx=offset Notes: ds:bx=addr of disk parameter block
Int21h-21h Random Read (FCB) Entry: ah=21 ds=seg dx=offset Exit: al=status Notes: ds:dx=addr of FCB; reads one record to DTA
Int21h-22h Random Write (FCB) Entry: ah=22 ds=seg dx=offset Exit: al=status Notes: ds:dx=addr of FCB, info written from DTA
Int21h-23h Get File Size(FCB) Entry: ah=23 ds=seg dx=offset Exit: al=status Notes: ds:dx=addr of FCB
Int21h-24h Set Random Record (FCB) Entry: ah=24 ds=seg dx=offset Exit: al=status Notes: ds:dx=addr of FCB
Int21h-27h Read Random Records (FCB) Entry: ah=27 cx=#records to read ds=seg dx=offset Exit: al=status cx=#records read Notes: ds:dx=addr of FCB, info returned in DTA
Int21h-28h Write Random Records (FCB) Entry: ah=28 cx=#records to write ds=seg dx=offset Exit: al=status cx=#records written Notes: ds:dx=addr of FCB, info written from DTA
Int21h-29h Parse FileName (FCB) Entry: ah=29 al=parsing ds=seg si=offset es=seg di=offset Exit: al=status ds=seg si=offset Notes: ds:si=addr of string to parse, es:di=addr of FCB, al=bit-flag for parsing; ds:si returns addr of first char after parsed string
Int21h-2Fh Get Disk Transfer Area Entry: ah=2F Exit: es=seg bx=offset Notes: es:bx=addr of DTA
Int21h-32h Get Disk Parameter Block Entry: ah=32 dl=drive# Exit: al=error ds=seg bx=offset Notes: ds:bx=addr of disk parameter block
Int21h-33h-05h Get Boot Drive Entry: ah=33 al=05 Exit: dl=drive# Notes:
Int21h-36h Get Disk Free Space Entry: ah=36 dl=drive# Exit: ax=sectors/cluster bx=avail clusters cx=bytes/sector dx=clusters/drive Notes:
Int21h-39h Create Subdir Entry: ah=39 ds=seg dx=offset Exit: ax=error Notes: ds:dx=addr of path name
Int21h-3Ah Remove Subdir Entry: ah=3A ds=seg dx=offset Exit: ax=error Notes: ds:dx=addr of path name
Int21h-3Bh Set Dir Entry: ah=3B ds=seg dx=offset Exit: ax=error Notes: ds:dx=addr of path name
Int21h-3Ch Create File Entry: ah=3C cx=file attr ds=seg dx=offset Exit: ax=file handle (or error code if carry set) Notes: ds:dx=addr of path name, attr "0"=normal
Int21h-3Dh Open File Entry: ah=3D al=open code ds=seg dx=offset Exit: ax=file handle Notes: ds:dx=addr of path name, open code=bit flag
Int21h-3Eh Close File Entry: ah=3E bx=file handle Exit: ax=error Notes:
Int21h-3Fh Read File Entry: ah=3F bx=file handle cx=bytes-to-read ds=seg dx=offset Exit: ax=bytes read (or error code if carry set) Notes: ds:dx=addr of buffer
Int21h-40h Write File Entry: ah=40 bx=file handle cx=bytes-to-write ds=seg dx=offset Exit: ax=bytes written (or error code if carry set) Notes: ds:dx=addr of buffer
Int21h-41h Delete File Entry: ah=41 ds=seg dx=offset Exit: ax=error Notes: ds:dx=addr of pathname
Int21h-42h Move File Pointer Entry: ah=42 al=move code bx=file handle cx=distance hiword dx=distance loword Exit: ax=location loword dx=location hiword Notes: Movement code is 0 (rel to beginning of file), 1 (rel to curr loc), or 2 (rel to end of file)
Int21h-43h Get/Set File Attributes Entry: ah=43 al=code cx=desired attr ds=seg dx=offset Exit: ax=error cx=curr attr Notes: ds:dx=addr of path name, code = 0 (get attr) or 1 (set attr)
Int21h-47h Get Dir Path Entry: ah=47 dl=drive# ds=seg si=offset Exit: ax=error Notes: ds:si=addr of buffer
Int21h-4Eh Search First Filename Match Entry: ah=4E cx=File attr ds=seg dx=offset Exit: ax=error Notes: ds:dx=addr of filename, info returned in DTA
Int21h-4Fh Search Next Filename Match Entry: ah=4F Exit: ax=error Notes: returns info in DTA
Int21h-56h Rename File Entry: ah=56 ds=seg dx=offset es=seg di=offset Exit: ax=error Notes: ds:dx=addr of old filename es:di=addr of new filename
Int21h-57h Get/Set File Date & Time Entry: ah=57 al=code bx=file handle cx=new time dx=new date Exit: ax=error cx=file time dx=file date Notes: code is 0 (Get) or 1 (Set)
Int21h-5Ah Create Temporary File Entry: ah=5A cx=file attr ds=seg dx=offset Exit: ax=error ds=seg dx=offset Notes: ds:dx=addr of path name, ds:dx returns with complete path/filename
Int21h-5Bh Create File Entry: ah=5B cx=file attr ds=seg dx=offset Exit: ax=error Notes: ds:dx=addr of path name
Int21h-68h Flush Buffer Entry: ah=68 bx=file handle Exit: ax=error Notes: write file buffer to disk
Int25h Absolute Disk Read Entry: al=drive# ds=seg bx=offset cx=sectors to read dx=logical starting sector Exit: ax=return code Notes: ds:bx=addr of buffer
Int26h Absolute Disk Write Entry: al=drive# ds=seg bx=offset cx=sectors to write dx=logical starting sector Exit: ax=error Notes: ds:bx=addr of buffer
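As a rough illustration of how the handle-based file services fit together, here is a minimal sketch that opens a file (3Dh), reads up to 512 bytes (3Fh), echoes them to standard output (40h with handle 1), and closes the file (3Eh). It assumes NASM syntax and a 16-bit .COM program; the file name README.TXT, the labels, and the buffer size are made up for the example, and error handling is limited to checking the carry flag.

```nasm
; Sketch only: NASM syntax, DOS .COM (nasm -f bin show.asm -o show.com)
; README.TXT, BUFSIZE and all labels are hypothetical.
BUFSIZE equ     512

        org     100h

start:
        mov     ax, 3D00h       ; ah=3D open file, al=00 open code (read-only)
        mov     dx, fname       ; ds:dx=addr of path name (ds=cs in a .COM)
        int     21h
        jc      fail            ; carry set: ax holds an error code
        mov     [handle], ax    ; carry clear: ax holds the file handle

        mov     ah, 3Fh         ; read file
        mov     bx, [handle]
        mov     cx, BUFSIZE     ; bytes-to-read
        mov     dx, buffer      ; ds:dx=addr of buffer
        int     21h
        jc      close           ; read failed: just close the file
        mov     cx, ax          ; ax=number of bytes actually read

        mov     ah, 40h         ; write file
        mov     bx, 1           ; handle 1 is standard output
        mov     dx, buffer
        int     21h

close:
        mov     ah, 3Eh         ; close file
        mov     bx, [handle]
        int     21h

        mov     ax, 4C00h       ; terminate, return code 0
        int     21h

fail:
        mov     ax, 4C01h       ; terminate, return code 1
        int     21h

fname   db      'README.TXT', 0 ; path names are null-terminated
handle  dw      0
buffer  times BUFSIZE db 0
```

Note that every one of these services reports failure through the carry flag; ax means "handle" or "byte count" only when carry is clear.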
System Services
Int20h Terminate Program Entry: N/A Exit: N/A Notes:
Int21h-00h Terminate Program Entry: ah=0 cs=seg of PSP Exit: N/A Notes:
Int21h-25h Set Interrupt Vector Entry: ah=25 al=int# ds=seg dx=offset Exit: N/A Notes: ds:dx=addr of new int handler
Int21h-26h Create PSP Entry: ah=26 dx=seg of new PSP Exit: N/A Notes: copies current PSP (256 bytes) to new location
Int21h-30h Get DOS Version Entry: ah=30 al=0 Exit: al=Major version ah=Minor version bh=OEM serial# bl=hi-order 8 bits of serial# cx=lo-order 16 bits of serial# Notes:
Int21h-31h Terminate & Stay Resident Entry: ah=31 al=return code dx=memory paragraphs to reserve Exit: N/A Notes:
Int21h-33h-06h Get DOS Version Entry: ah=33 al=06 Exit: bl=Major version bh=Minor version dl=Revision# dh=DOS Memory flags Notes: dh: bit-3-set=DOS in ROM, bit-4-set=DOS in HMA
Int21h-34h Get InDOS Flag Entry: ah=34 Exit: es=seg bx=offset Notes: es:bx=addr of InDOS flag; used to determine if DOS is currently executing an Int21 service
Int21h-35h Get Interrupt Vector Entry: ah=35 al=int# Exit: es=seg bx=offset Notes: es:bx=addr of int handler
Int21h-38h Get/Set Country Info Entry: ah=38 al=country# bx=country# ds=seg dx=offset Exit: ax=error bx=country# Notes: ds:dx=addr of country info block (34 bytes); using country#=0 specifies the currently installed country
Int21h-44h-00h Get Device Info Entry: ah=44 al=00 bx=device handle Exit: dx=device info Notes: dx returns a bit flag structure
Int21h-44h-01h Set Device Info Entry: ah=44 al=01 bx=device handle dh=0 dl=device info Exit: N/A Notes: dl is a bit-flag structure
Int21h-44h-06h Get Input Status Entry: ah=44 al=06 bx=device handle Exit: al=status Notes: al=FF if ready, 00 if not ready
Int21h-44h-07h Get Output Status Entry: ah=44 al=07 bx=device handle Exit: al=status Notes: al=FF if ready, 00 if not ready
Int21h-48h Allocate Memory Entry: ah=48 bx=paragraphs to allocate Exit: ax=seg of allocated block (error code if carry set) bx=max paragraphs avail (if error) Notes:
Int21h-49h Free Allocated Memory Entry: ah=49 es=seg of memory block Exit: ax=error Notes: free blocks allocated with above service
Int21h-4Ah Change Memory-Block Alloc Entry: ah=4A bx=total paragraphs to allocate es=seg of memory block Exit: ax=error bx=max paragraphs avail (if error) Notes:
Int21h-4Bh-00h Load Program Entry: ah=4B al=00 es=seg bx=offset ds=seg dx=offset Exit: ax=error Notes: es:bx=addr of parameter block, ds:dx=addr of path name
Int21h-4Bh-03h Load Overlay Entry: ah=4B al=03 es=seg bx=offset ds=seg dx=offset Exit: ax=error Notes: es:bx=addr of parameter block, ds:dx=addr of path name
Int21h-4Bh-05h Set Execution State Entry: ah=4B al=05 ds=seg dx=offset Exit: N/A Notes: ds:dx=addr of parameter block; this prepares DOS to transfer control to a new program/overlay
Int21h-4Ch Process Terminate Entry: ah=4C al=return code Exit: N/A Notes: Terminate current process
Int21h-4Dh Get Return Code of SubProcess Entry: ah=4D Exit: ax=return code Notes:
Int21h-50h Set PSP Address Entry: ah=50 bx=seg of new PSP Exit: N/A Notes: redefine PSP for currently-running program
Int21h-51h Get PSP Address Entry: ah=51 Exit: bx=seg addr of PSP Notes:
Int21h-59h Get Ext Error Info Entry: ah=59 bx=0 Exit: ax=ext error code(1-90) bh=error class(1-13) bl=suggested remedy(1-7) ch=locus(1-5) Notes:
Int21h-5Dh-0Ah Set Ext Error Values Entry: ah=5D al=0A ds=seg si=offset Exit: N/A Notes: ds:si=addr of ext error table to be returned at next system error
Int21h-65h-20h Convert Character Entry: ah=65 al=20 dl=char Exit: ax=error dl=char Notes: converts the character in dl to its uppercase equivalent
Int21h-65h-21h Convert String Entry: ah=65 al=21 cx=string length ds=seg dx=offset Exit: ax=error Notes: converts the string pointed to by ds:dx to its uppercase equivalent
Int21h-65h-22h Convert ASCIIZ String Entry: ah=65 al=22 ds=seg dx=offset Exit: ax=error Notes: converts null-terminated string at ds:dx to uppercase
Int21h-66h Get/Set Global Code Page Entry: Exit: Notes:
Int21h-67h Change Handle Count Entry: ah=67 bx=# of handles Exit: ax=error Notes: Change the # of handles available to DOS
Int27h Terminate & Stay Resident Entry: dx=pointer to last byte of program Exit: N/A Notes: only for com files
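The memory services above have one catch worth showing: a .COM program is normally handed all available memory when it starts, so an allocation request (48h) will fail until the program first shrinks its own block (4Ah). The following is a minimal sketch of that sequence, again assuming NASM syntax and a .COM program; the sizes chosen (64 KB kept, 1 KB allocated) and the labels are arbitrary.

```nasm
; Sketch only: NASM syntax, DOS .COM (nasm -f bin alloc.asm -o alloc.com)
; Block sizes and labels are arbitrary choices for illustration.
        org     100h

start:
        ; At startup es holds the PSP segment, which is also the start
        ; of the memory block DOS gave this program.
        mov     ah, 4Ah         ; change memory-block allocation
        mov     bx, 1000h       ; keep 1000h paragraphs (64 KB) for ourselves
        int     21h
        jc      fail            ; carry set: ax=error, bx=max paragraphs avail

        mov     ah, 48h         ; allocate memory
        mov     bx, 40h         ; 40h paragraphs = 1 KB
        int     21h
        jc      fail
        mov     es, ax          ; ax=segment of the new block

        ; ... the block is now addressable as es:0000 through es:03FF ...

        mov     ah, 49h         ; free allocated memory, es=seg of block
        int     21h
        jc      fail

        mov     ax, 4C00h       ; process terminate, return code 0
        int     21h

fail:
        mov     ax, 4C01h       ; process terminate, return code 1
        int     21h
```

Keeping a full 64 KB is the lazy choice: it leaves the .COM stack (at the top of the segment) inside the block without having to move sp first.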
Miscellaneous Services
Int21h-2Ah Get System Date Entry: ah=2A Exit: al=Day of week(0-6), cx=Year(1980-2099), dh=Month(1-12), dl=Day(1-31) Notes:
Int21h-2Bh Set System Date Entry: ah=2B cx=Year dh=Month dl=Day Exit: al=status Notes:
Int21h-2Ch Get System Time Entry: ah=2C Exit: ch=Hour(0-23), cl=Minute(0-59), dh=Second(0-59), dl=Hundredths(0-99) Notes:
Int21h-2Dh Set System Time Entry: ah=2D ch=Hour cl=Minute dh=Second dl=Hundredths Exit: al=status Notes:
Int21h-5Eh-00h Get Machine Name Entry: ah=5E al=0 ds=seg dx=offset Exit: ax=error ch=IsNamed cl=NetBIOS# Notes: ds:dx=addr of buffer
Int21h-5Fh-02h Get Redirection List Entry Entry: ah=5F al=02 bx=redirection list index es=seg di=offset ds=seg si=offset Exit: ax=error bh=device status bl=device type cx=parameter val Notes: es:di=addr of network name buffer ds:si=addr of local name buffer
Int21h-5Fh-03h Redirect Device Entry: ah=5F al=03 bl=device type cx=caller value es=seg di=offset ds=seg si=offset Exit: ax=error Notes: es:di=addr of network path ds:si=addr of device name
Int21h-5Fh-04h Cancel Redirection Entry: ah=5F al=04 ds=seg si=offset Exit: ax=error Notes: ds:si=addr of device name/path
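To show how the register results of a service like Get System Time are typically consumed, here is a small sketch that prints the current time as HH:MM:SS. It assumes NASM syntax and a .COM program, and it uses Int21h-02h (display the character in dl) for output; the print2 and putc routines are helpers invented for this example.

```nasm
; Sketch only: NASM syntax, DOS .COM (nasm -f bin now.asm -o now.com)
; print2 and putc are hypothetical helper routines.
        org     100h

start:
        mov     ah, 2Ch         ; get system time
        int     21h             ; ch=hour cl=minute dh=second dl=hundredths
        push    dx              ; save seconds; dl is reused for output below

        mov     al, ch          ; hours
        call    print2
        mov     dl, ':'
        call    putc
        mov     al, cl          ; minutes
        call    print2
        mov     dl, ':'
        call    putc
        pop     dx
        mov     al, dh          ; seconds
        call    print2

        mov     ax, 4C00h       ; process terminate, return code 0
        int     21h

; print2: display al (0-99) as two decimal digits; clobbers ax and dx
print2:
        aam                     ; ah = al / 10, al = al mod 10
        add     ax, 3030h       ; turn both digits into ASCII characters
        push    ax              ; putc destroys ax, so keep the digits
        mov     dl, ah          ; tens digit first
        call    putc
        pop     ax
        mov     dl, al          ; then the units digit
        call    putc
        ret

; putc: display the character in dl; clobbers ax
putc:
        mov     ah, 02h         ; Int21h-02h display character
        int     21h
        ret
```

The same print2 routine works for the month and day returned by Get System Date (2Ah); the year in cx needs a four-digit conversion instead.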
Information regarding assembly language is available everywhere on the Internet; browse around and you will come across quite a few programming gems. Another good resource is Miller Freeman, the United News & Media-owned company that produces MSJ, Windows Developer's Journal, Dr. Dobb's Journal and Sourcebook, and C/C++ Users Journal.
The following are texts on assembly language that I have reviewed and found more or less worthy (at least in some respects), grouped by publisher.
Applied PC Interfacing, Graphics, and Interrupts by William Buchanan: Mixes C, Pascal (yuck), and assembler--an excellent resource for those getting into device drivers, low-level utilities, or even IC programming. Written for advanced students, this book has very technical information presented in an approachable manner--just the thing for polishing your well-honed DOS asm skills.
Systems Programming For Windows 95 by Walter Oney: This book is by-lined as the "C/C++ programmer's guide to VxDs, I/O devices, and operating system extensions"--it covers a bit of low-level Windows 95 programming, including quite a bit (surprisingly) in assembly language. Reading it is kind of tough; as is typical with Microsoft publications, subjects are treated only very generally, with the examples being of a useless "hello world" nature. This is a lot of theory with very little practice, and the writing (again typical of Microsoft) is unclear at best. Still, it provides you with information you couldn't have gotten elsewhere (especially regarding VxDs), provided you have the wit and guile to draw the facts from Oney's onerous prose. Current price: US$40.
Assembly Language For The PC by Peter Norton and John Socha: My second book ever on assembly language, and the best introduction to the language that you can ask for. Most of Norton's books these days are at the casual user level; this one effectively teaches assembly language by guiding you through the creation of a hex editor, so that not only do you learn how to open/display/edit files and how to create a rudimentary GUI, but you also end up writing a tool that you can customize to meet your own twisted needs! Current price is US$40, but you can often find it for around US$10.
Using Assembly Language by Allen Wyatt: I'm not sure whether or not I can recommend this book. It was my first text on assembly language, and I found it inadequate...though it has a pretty decent reference section in terms of opcodes and DOS/BIOS services, and it covers interfacing assembly with other development platforms such as Clipper, C/C++, Pascal, and FoxPro. It also has a good intro on how to choose your assembler and linker (though it only examines MASM and TASM); its shortcoming seems to be in actual assembly language programming. Current price is US$30.
Windows Assembly Language And Systems Programming by Barry Kauler: This book was originally published by Prentice Hall; it has now (finally) come out in a new edition which covers 32-bit Windows programming. This is by far the best source I have found on Windows assembly language (though there are not many); it is concise and clear, with useful examples (some of which are here) and coverage of topics such as OOP assembly, VxDs, direct hardware access (in Windows!), and Ring0/Ring3 programming. Current price: US$45.
Developing Utilities In Assembly Language by Deborah Cooper: This book is small and inexpensive, as well as an excellent introduction to developing programs in assembly language. The utilities it teaches are rather mundane (DIRNAME, FILEFIND, TRAPBOOT, TRAPDEL, SAFE [disables FORMAT.EXE], CAPSLOCK, and ICU [displays cursor location]), but the techniques (such as key-trapping and TSR programming) are widely applicable, as is the very sensible way in which the programs are developed. Currently US$16.
Master Class Assembly Language by Many People: Excellent text, though most of the sample source code is on disk--so it's a tough read without a computer next to you. Beginners should by no means be turned away: the summation (about 35 pages) of assembly is one of the best I have seen, and may help clarify other texts or tutorials. From there it jumps straight into systems programming and covers topics such as disassembly (!), anti-virus programs, 486/Pentium optimization, generic code optimization, device drivers, data compression, and protected mode programming. All of this is written very succinctly, a very "no BS" approach that is refreshing in computer books these days...it very much blows away PC Underground (or whatever that book was by the guys who did PC Intern). Current Price: US$50 (ouch!)