Some people say that assembly language is hard to learn. People who say that
have probably never tried to learn it. Every programmer should know assembler
because it allows you to control the computer at the lowest level. You know that
a computer is nothing but a microprocessor system. A microprocessor is a
programmable device that can be used to control other electronic devices.
You'll find them inside almost every electronic device such as microwave ovens,
VCRs, cameras etc. In a PC the microprocessor is the CPU (Central Processing
Unit). The CPU controls everything that's connected to your PC: keyboard,
graphics card, sound card, RAM, the monitor (screen) etc. The way it controls
them is by sending them digital electronic signals which can either be one or
zero (1 or 0). The commands that are used to program the CPU are also ones and
zeroes, called machine code. Writing machine code is very hard for human
beings since it is binary (1 or 0). It's a lot easier if the commands have
names instead of codes and that's where assembler comes in. Assembly language
uses mnemonics (like MOV for move) instead of codes, and a program called an
assembler (I'll also call it a compiler, it's what makes the code executable)
translates the mnemonics into machine code.
All code, no matter what language it was written in, ends up as machine
language before the CPU can run it. Assembly language is basically the same as
machine language. The only difference is that instead of typing "10001" as a
command you type "mov" or "jmp" or whatever. Because every mnemonic matches one
machine code command, it is pretty easy to turn a program's machine code back
into assembler and look at it using a debugger.
So a program is a bunch of commands for the CPU.
BASIC is one of the easiest languages to learn since its commands are close to
plain English. Here is an example of that:
Print "enter your name"
Input a$
Print "hello",a$
This program asks the user to type his/her name. Then it prints "hello" on the
screen, followed by the name the user just typed in. As you see the code is very
easy to understand. The only thing a "non computer user" would wonder about is
"a$", which is a variable used for strings. A variable is a reserved memory space
which can contain numbers or letters (chars). In BASIC the $ tells the computer
that the variable contains letters. When one or more letters are stored in a
variable, the variable is called a string.
Let's see what this means for the CPU:
- It outputs "enter your name".
- Then it checks the keyboard to see if any key has been pressed. If a key
has been pressed that letter is stored in a predefined place in memory (a$).
It will keep checking the keyboard until return has been pressed, which indicates
the end of the string. When return has been pressed it puts a "0" at the end
of the string (a$) in memory.
- Then it outputs "hello" and reads the letters (a$) stored in memory. It will
read the memory until it reaches a zero, which indicates the end of the string.
Then it will output the letters to the screen after hello.
As you can see it's a lot of things to do. To make things easier the BIOS and
DOS have their own small programs written to do basic things like printing to the
screen and reading the keyboard. These programs are called "interrupts". The
interrupts are very handy when programming in assembler. Further down in this file
I have included a small interrupt list. A more complete list is available
on the net on the randall server. The url is at the end of this lesson.
Go there and study. On that server is a complete assembler book.
BYTE, BIT AND WORD
^ means "raised to the power of" (the x^y button on your calculator), ie 2^3 = 2*2*2 = 8
There are binary numbers, decimal numbers and hexadecimal numbers.
The decimal number system uses the base of 10. The decimal number 123 works
like this:
1*10^2+2*10^1+3*10^0=123
The above is pretty easy to understand.
The binary system uses the base of 2. It uses ones and zeroes (1 and 0). Here's
how a 4 bit binary number works. You read binary numbers right to left.
0000 = 0*2^0+0*2^1+0*2^2+0*2^3 = 0
0001 = 1*2^0 = 1*1 = 1
0010 = 1*2^1 = 1*2 = 2
0100 = 1*2^2 = 1*4 = 4
1000 = 1*2^3 = 1*8 = 8
0011 = 1*2^0+1*2^1 = 1*1+1*2 = 3
1010 = 1*2^1+1*2^3 = 1*2+1*8 = 10
If you are unfamiliar with the binary system read this a couple of times and
it will all become clear to you. Every 1 or 0 in a binary number is called a
bit. The above numbers are 4 bit.
4 bits are called a nibble and 8 bits are called a byte. 16 bits are called
a word. It is very painful for us humans to calculate in binary, but for a computer
it's very easy since digital signals are either 2.5-5V (1) or 0-2.5V (0). We therefore
use the hexadecimal system, where every hex digit stands for exactly 4 bits.
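If you'd rather let the computer check the binary table for you, here is a small sketch in Python (not part of the original lesson, and the function name binary_to_decimal is just something I made up for illustration). It adds up each bit times its power of two, reading right to left, exactly like the table does.

```python
def binary_to_decimal(bits: str) -> int:
    """Add up bit * 2^position, where position 0 is the rightmost bit."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position
    return total


# The same 4 bit numbers as in the table above.
for bits in ("0001", "0010", "0100", "1000", "0011", "1010"):
    print(bits, "=", binary_to_decimal(bits))
```

Python can also do this directly with int("1010", 2), the loop is only there to show the arithmetic.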
HEXADECIMAL SYSTEM
Hexadecimal numbers have 16 as a base, so there are 16 digits:
0 1 2 3 4 5 6 7 8 9 A B C D E F
A to F stand for the decimal values 10 to 15.
1 = 1*16^0 = 1*1 = 1
11 = 1*16^1+1*16^0 = 16+1 = 17
1A = 1*16^1+10*16^0 = 16+10 = 26
Ok I think you understand this by now. If you don't, then pick up your old math
books and read them.
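CONVERTING NUMBERS
Decimal to hexadecimal:
Divide the decimal number by 16 and note the remainder (the "rest"). If your
calculator gives you a fraction, ie 4.6875, multiply the fractional part by 16
to get the remainder: 0.6875*16 = 11. Keep dividing the whole-number part until
it reaches 0. The remainders, read from last to first, are the hex digits.
What is 1200 in hex?
1200/16 = 75, which has the rest 0
75/16 = 4.6875 0.6875*16 = 11 (B in hex), ok the rest is B
4/16 = 0.25 0.25*16 = 4, the rest is 4
1200 decimal is 4B0 in hex. Let's check that (B is 11 in decimal):
4*16^2+11*16^1+0*16^0 = 4*256+11*16+0 = 1024+176 = 1200.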
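The divide-by-16 method can be sketched in a few lines of Python. This is my own illustration, not something from the lesson, and the names HEX_DIGITS and decimal_to_hex are made up. divmod gives the whole-number part and the rest in one go, so there is no need for the "multiply the fraction by 16" trick.

```python
HEX_DIGITS = "0123456789ABCDEF"


def decimal_to_hex(number: int) -> str:
    """Convert a non-negative integer to hex by repeated division by 16."""
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        number, rest = divmod(number, 16)  # whole part and remainder
        digits.append(HEX_DIGITS[rest])
    # The rests come out lowest digit first, so read them backwards.
    return "".join(reversed(digits))


print(decimal_to_hex(1200))  # 4B0
```

The rests come out in the order 0, B, 4, which is why the function reverses them at the end.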
Like I said in lesson #1 the Intel processor stores numbers backwards, low byte
first. Since DOS uses 16 bit numbers, which is the same as 2 bytes or 4 hex
digits, first add a zero before the first digit, ie 04B0. Then swap the two
bytes, so when searching memory or a file for 4B0 you should look for B0 04.
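To see the "backwards" storage with your own eyes, here is a small Python sketch (my addition, not from the lesson). It uses the standard struct module, where "<H" means a 16 bit unsigned number stored low byte first, the way Intel processors do it.

```python
import struct

value = 0x04B0  # 1200 decimal, written as a full 16 bit word

# "<H" = little-endian (low byte first) unsigned 16 bit number.
stored = struct.pack("<H", value)

# Show the bytes in the order they sit in memory or in a file.
print(" ".join(f"{byte:02X}" for byte in stored))  # B0 04
```

So B0 04 is the byte pattern to type into a hex editor or debugger search when you are hunting for the word 04B0.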
Alright let's move on with assembler and have a look at the Intel processor.
The Intel Processor:
The Intel 8086 is the CPU that the PC family is built on. The first IBM PC used
the 8088, a close variant of the 8086 that runs the same instructions. Today the
80486 and the Pentium are used a lot. The instructions are the same for all of
these processors because they are from the same "family". Newer processors have
more instructions and more features. For each new processor new commands are
added and so on. The important thing here is that they are all compatible with
the 8086 instructions. So that's the processor we will learn how to program.
When you have the basic knowledge of assembler you can go on and learn the
specific features of the newer CPUs. Ok let's have a look.
A processor has registers that are used for different things. One register is
used for calculations and another one is used for copying strings etc etc.
This is a map over the Intel 8086:
MSB LSB
15 0
|----------------------------------------|
| A X | AX is a 16 bit register, AL and AH
| AH | AL | are 8 bit registers. Same goes for
|----------------------------------------| BX BH BL, CX CH CL, DX DH DL.
| B X |
| BH | BL | These are the general-purpose
|----------------------------------------| registers, I call them common registers.
| C X |
| CH | CL |
|----------------------------------------|
| D X |
| DH | DL |
|----------------------------------------|
| SI | SI and DI are string registers
|________________________________________| used for copying and comparing
| DI | strings ie password protections
|________________________________________|
| SP |
|________________________________________| SP and BP are stack registers
| BP | I'll get to the stack later
|________________________________________|
| CS | CS, DS, ES and SS are SEGMENT
|________________________________________| registers.
| DS |
|________________________________________|
| ES |
|________________________________________|
| SS |
|________________________________________|
|----------------------------------------|
| IP | Instruction Pointer
|________________________________________|
| |
| O D I T S Z A P C |Flags
------------------------------------------
Alright, that's what the 8086 CPU looks like. As you can see there are a lot of
registers that you are unfamiliar with, but don't worry, I'll cover them here.
SI and DI are string registers and are mainly used for string operations like
copying strings, comparing strings and moving strings. You do remember what a
string is, don't you?
SP and BP are stack pointers. The stack is like a big pile of data. You can
put data on it and you can get data from it. The SP (stack pointer) points
to the top of the stack. You can imagine the stack like this. If you are stacking
magazines on the floor, the first magazine you put there will be at the bottom
of the pile, and the last magazine will be on top of the pile and the first one
you'll get when you pick a magazine from the stack. Now you have put 100 magazines
on the floor and created a stack. The stack pointer points to the last magazine.
If you wanna get a magazine that's in the middle you just change the stack pointer
(SP) so it points to the magazine in the middle. Get it? You use SP to get the
data you want from the stack. I will not cover BP's function because that's a little
outside the range of this tutorial.
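The magazine pile can be played with in Python. This is only a toy model I added, not real 8086 code, and the class name TinyStack is made up. It follows the 8086 rule that PUSH first lowers SP by 2 and then stores a 16 bit word, and POP reads the word and raises SP by 2 again.

```python
class TinyStack:
    """Toy model of an 8086 style stack that grows downwards in memory."""

    def __init__(self, size: int = 0x100):
        self.memory = bytearray(size)
        self.sp = size  # empty stack: SP points just past the last byte

    def push(self, word: int) -> None:
        self.sp -= 2  # make room first, then store the word low byte first
        self.memory[self.sp:self.sp + 2] = word.to_bytes(2, "little")

    def pop(self) -> int:
        word = int.from_bytes(self.memory[self.sp:self.sp + 2], "little")
        self.sp += 2  # the slot is free again
        return word


stack = TinyStack()
stack.push(0x1111)       # first magazine on the floor
stack.push(0x2222)       # second magazine on top of it
print(hex(stack.sp))     # 0xfc
print(hex(stack.pop()))  # 0x2222, last in is first out
print(hex(stack.pop()))  # 0x1111
```

The model has no checks for pushing too much or popping from an empty stack, a real program has to keep track of that itself.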
Are you getting confused, dizzy or even ill? Just pick up your favourite Fender
and plug it into your favourite amplifier and play some guitar for a while and
it will all become clear to you. But beware that only good rock and roll or blues
will do, anything else will get you even more confused. If this doesn't work
then read this file over and over until you get it. There are a lot of interesting
sites for newbies on the net that cover assembler language. Sniff around there and
learn as much as you can.
Segment registers (this will be a little hard to understand at first, be patient)
The 8086 memory is divided into 64K blocks. The segment keeps the
address of each 64K block. To move around in a block the instruction pointer
(IP) is used. The address that IP points to is called the "offset".
segment Offset (IP)
\ /
013F:0012
So the segment keeps track of which 64K block you're in and the IP keeps track
of where in that block you are (offset). The segment address depends on
the amount of memory that's free. This means that your program will be given different
segment addresses if you change your memory (like loading another program). The
offset address is always the same.
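The lesson doesn't spell out how a segment and an offset are turned into one real memory address, so here is a small Python sketch of the 8086 rule as I know it: the segment is multiplied by 16 (shifted one hex digit to the left) and the offset is added. The function name physical_address is made up for this example.

```python
def physical_address(segment: int, offset: int) -> int:
    """8086 real mode: physical address = segment * 16 + offset (20 bits)."""
    return ((segment << 4) + offset) & 0xFFFFF


# The address from the picture above.
print(hex(physical_address(0x013F, 0x0012)))  # 0x1402

# A different segment:offset pair can point at the very same byte.
print(hex(physical_address(0x0140, 0x0002)))  # 0x1402
```

This is also why the segment part changes from machine to machine while the offset stays put: DOS loads the program wherever there is free memory, but the distances inside the program don't move.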
The segment registers are CS, DS, ES and SS. Here's an explanation of them:
CS Code segment, this is the segment that holds the actual code
DS Data segment, this is the segment that holds variables and strings etc
ES Extra segment, works the same as DS and is used when ie comparing two strings
SS Stack segment, this is the segment that holds the stack
IP is the instruction pointer and it points to the next instruction to be
executed.
Flags
Each flag can be either one or zero (on or off).
C = Carry flag, is 1 when the result of a calculation is too big to fit in the
register (ie greater than a 16 bit number).
P = Parity flag
A = Auxiliary carry flag
Z = Zero flag, this one becomes one or zero depending on the result of different
instructions. It is 1 when the result is zero, ie when comparing two strings
using cmpsb the zero flag is 1 if the bytes are equal and 0 if they are
not equal.
S = Sign flag (+ or -)
O = Overflow flag
I = Interrupt enable
D = Direction flag
T = Trap
You don't have to remember all these flags. The most important flag is the zero
flag. Ie a game compares the number of lives with 0 and jumps to the "game over"
code when the zero flag says they are equal. You might change that so it never
jumps to the "game over" code.
Ok that's it. Now we will move over and study different mnemonics.
Mnemonics (OP Codes)
I'll not cover all the instruction mnemonics here. If you want a complete list
I suggest that you visit Intel's homepage and look for instruction lists.
MOV     Move, ie MOV AX,0011 loads AX with 0011
MOVSB   Moves a byte of a string. The source address is put in DS:SI and the
        destination address is put in ES:DI.
MOVSW   Same as above but moves a whole word instead of a byte
DEC     Decreases a register by 1, ie if AX is 0004 then DEC AX will decrease
        AX so AX will be 0003. (common for decreasing the life variable in
        games)
INC     Increases a register by 1
CALL    Calls a subroutine (a small part of code in a program)
CLD     Clears the D flag
CMP     Compare, ie CMP AL,61 checks if AL is 61 (the letter 'a' in ascii)
CMPSB   Compares a byte in two strings. The source address is put in DS:SI and
        the destination address is put in ES:DI.
CMPSW   Same as above but compares a word instead of a byte.
DIV     Divide, ie DIV BX divides DX:AX by BX (the answer ends up in AX and
        the rest in DX)
MUL     Multiply
INT     Interrupt, executes an interrupt, ie int 21 executes interrupt 21
PUSH    Pushes a value to the stack
POP     Returns a value from the stack
NOP     No operation, doesn't do anything
REP     Repeats the following string instruction until CX=0
REPE    Repeat while equal, ie REPE CMPSB compares each byte in a string as
        long as they are equal.
REPZ    Same as REPE, repeat while the zero flag is 1 (on)
RET     Return from subroutine
SCASB   Scans a string for a specific byte that is put in AL. The address of
        the string to be scanned is put in ES:DI
STOSB   Stores the byte in AL at ES:DI, DI is automatically increased.
STOSW   The same as above but it stores a word instead of a byte.
TEST    A logical comparison (an AND that only sets the flags). You can see it
        as a cmp for now.
XOR     Exclusive OR. See below:
        1001
        1100
result  0101
        Conclusion: XOR gives 1 only when exactly one of the two bits is 1.
        XOR 1,1 will give 0. XOR 0,0 will give 0. XOR 1,0 will give 1.
        This instruction is often used to zero a register, ie XOR AX,AX.
        Let's say AX=1001:
XOR     1001
        1001
result  0000
JZ Jump if zero
JNZ Jump if not zero
JE Jump if equal
JMP Jump anyway
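Going back to XOR in the list above for a second: if the bit patterns look like magic, you can try them in Python, where ^ is the XOR operator (so here it does NOT mean "raised to the power of"). This little sketch is my own addition and just replays the two examples from the list.

```python
a = 0b1001
b = 0b1100

# A bit becomes 1 only when exactly one of the two input bits is 1.
print(format(a ^ b, "04b"))  # 0101

# XOR a value with itself and every bit cancels out, like XOR AX,AX.
print(format(a ^ a, "04b"))  # 0000
```

The second line is the reason XOR AX,AX shows up all over disassembled code: it is a short way to write "AX = 0".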
OK you don't have to understand all of the above instructions. You will see
how they are used in the programs that we will be studying (for cheating and perhaps
cracking). This is not the whole instruction set that Intel CPUs have. Search
the web if you bump into an instruction that's not mentioned here. Also search
the web and read all the assembly tutorials you can find, because there are a lot
of ways to learn assembler and I only give you the basics so we can debug and cheat
in games.
The Interrupts
Let's study some interrupts. As I said an interrupt is a small program. We will
use DOS interrupt 21 services here. Interrupt 21 means that we want to use
some of the services (small programs) that are available in interrupt 21.
The address of the code that handles each interrupt is stored in the interrupt
vector table, and the interrupt 21 code then picks the service we asked for.
Let's say we wanna type something to the screen. The way to choose a service
from Int 21 is to put the number of the service in AH.
Function AH# IN DATA
Output to screen 02 DL=char
This is what an interrupt listing at the randall server would tell us when we
are looking under the INT 21 list. The above means that if we wanna print something
to the screen we must put 02 in register AH. The character that we wanna print
has to be put in register DL. So to print the char 'A' on the screen we will
write the following in our program (the h after a number tells the assembler
that the number is hex):
MOV AH,02h ; Put the service number in AH
MOV DL,41h ; ASCII number for 'A' in hex
INT 21h ; Execute interrupt
Here is a small program that uses Int 21 services to print 'hello world' on
screen.
--------------------
DOSSEG ; USE THE STANDARD DOS SEGMENT ORDER
.MODEL SMALL ; DEFINES HOW MUCH MEMORY OUR PROGRAM NEEDS
.STACK 100h ; RESERVES 100h BYTES FOR THE STACK
.DATA ; UNDER THIS WE WILL PUT OUR VARIABLES
HELLOMESSAGE DB 'HELLO WORLD',13,10,'$' ; FUNCTION 9 STOPS PRINTING AT '$'
.CODE ; THIS IS WHERE OUR PROGRAM STARTS
START:
MOV AX,@DATA ; PUT THE ADDRESS OF THE DATA SEGMENT IN AX
MOV DS,AX ; AND THEN MOVE IT TO DS
MOV AH,9 ; DOS OUTPUT STRING FUNCTION
MOV DX,OFFSET HELLOMESSAGE ; PUT THE OFFSET OF HELLOMESSAGE IN DX
INT 21h ; PRINT THE MESSAGE ON THE SCREEN
MOV AH,4Ch ; INT 21 FUNCTION TO QUIT AND EXIT TO DOS
INT 21h ; AND EXECUTE IT
END START ; END THE PROGRAM, EXECUTION BEGINS AT START
Ok type this in a text editor like DOS "edit" and save it as an .asm file.
Then compile it with TASM or any other ASM compiler. The compiler creates an
.obj file. You have to link the .obj file to get the executable file.
If you saved the file as hello.asm, this is what you'll do:
TASM.exe Hello.asm
Tlink.exe Hello.obj
And then you'll have the executable file hello.exe.
You can see the difference between BASIC language and assembly language right
here. In BASIC we wrote 3 lines of code. In assembler it takes more lines, but
we can control everything and get it exactly as we want it.
Here's a small interrupt list over service 21:
Function AH# IN DATA
Input from keyboard
with echo to screen 01 AL contains the char after the int has been
executed
Output to screen 02 DL=char
Show string 09 DS:DX shall point to the string. DS should
point to the segment address and DX to the
offset address. The string must end with '$'
Create file 3C CX=attributes, DS:DX pointer to ascii string
(name of the file). Returns AX=filehandle or
return code
Open file 3D AL=access and sharing modes, DS:DX > ascii
string (name of the file). Returns AX=filehandle
Read file 3F BX=filehandle, CX=# of bytes to read, DS:DX
> buffer for the data. Returns AX=read bytes
or return code
Write to file 40 BX=filehandle, CX=# of bytes to write, DS:DX
> data to write. Returns AX=bytes written or
return code
Filehandle:
0 = Input device, normally the keyboard
1 = Output device, usually the screen
2 = Error messages, usually the screen
3 = Serial, COM1
4 = Parallel, LPT1
Let's write another program with the information above.
----------------------
dosseg
.model small
.stack 200h
.data
password db 'fender','$' ; The strings and data we will use
login db 100 dup (0)
rig db 'correct','$'
.code
start:
mov ax,@data
mov ds,ax
mov ah,3fh ; Read function from service 21
mov bx,0 ; File handle 0 = keyboard
mov cx,3 ; Just check the 3 first letters
mov dx,offset login ; Offset address in DX
int 21h
and ax,ax ; See if anything was typed
jz fin ; No? So end
mov cx,ax ; Move number of read bytes to CX
mov ax,seg login ; Put segment address in AX
mov es,ax ; So we can put it in ES
mov ax,seg password ; Same here but another string's segment
mov ds,ax ; And we store it in DS
mov di,offset login ; Do the same for the offset address of
mov si,offset password ; both strings
cld ; Clear D flag so SI and DI count upwards
repe cmpsb ; Compare DS:SI with ES:DI, CX bytes at most
je equals ; Equal? Then go to equals
fin: ; End program function
mov ah,4ch
int 21h
equals: ; Code for strings equal
mov dx,offset rig
mov ah,9
int 21h
jmp fin ; We got here by a jump, not a call, so jump back
end start
This is a simple password checking program. @DATA is a predefined variable
that holds the address of the DATA field we reserved in the beginning of the
program. You may be wondering why we first read @DATA into AX and then into
DS. Can't we just read it into DS by MOV DS,@DATA? No we can't. You can't put
a value in DS or ES directly, you have to read it into a register and then
move it into DS or ES.
Now I suggest you either buy a book on this subject or search the net for
some more information about assembly language. Remember that
experimenting is important.
Alright Brothers and Sisters I'll leave the assembler tutorial here and move
on to Softice.
SOFTICE 2.62 DOS Version
It's extremely important that you don't use SOFTICE 2.8. Alright, have you
installed it? Then load it from your config.sys. When DOS has loaded press ctrl-D
and you're in. Great isn't it? What is it?
Softice is a debugger that can be used to see what a program is doing.
For example if you load one of the programs that we wrote in the assembly tutorial
you will see the code as it executes. That's why we use a debugger to make
cheats and cracks. The only thing that is difficult is to FIND the code we
are seeking. If we wanna change the # of lives in a game we have to find
exactly where the lives are stored. Once we find that place we can change that value.
This is what we'll do at the end of this lesson.
OK here is what you should see in Softice.
The registers at the top. The Data window and the Code window.
At the bottom you enter your commands. Read the whole Softice manual before
proceeding.
Read it? Great!
What we will do now is to change the first program that we wrote. The one that
printed 'hello world' on the screen. If you have read the manual you should
know all about breakpoints and how you change code in memory etc.
Here's what we'll do.
1 Load the program into Softice
LDR Hello.exe
2 Have a look at the code that follows, you should recognize it.
3 Put a breakpoint on interrupt 21, show string (our program uses AH=9)
BPINT 21 AH=9
4 Let the program execute (it will stop when the instruction int 21 with AH=9
comes up)
5 Let's see what's in DS:DX
D DS:DX and look at the data window, there is our 'hello world' string.
6 Now let's change that.
E
Now you're in the data window and you are able to change the string to
whatever you want. Try and see. When you're done changing the string just
hit enter and you'll see that the program prints your changed string instead
of the 'hello world' that it was supposed to print. Great huh!
Try to debug other programs and change their output, you'll learn a lot by doing
that. Cracking is also something that is both fun and educational.
I promised a cheat for a real game, didn't I?
AQUANOID
Aquanoid is a breakout game which is available free on the net (happy puppy,
download.com etc). We will change the number of lives that the programmer
gave us to something that we think is more suitable :).
Ldr the aqua.exe
Now let's think!
Every time we miss the ball the number of lives decreases. We just need to find
where the # of lives is stored. How? Beats me.
THE END
No just kidding, the summer is affecting me in some strange way. Where were we,
yeah right, we are looking for a place in memory that changes every time we miss
the ball. And it should only change by decreasing or increasing by 1.
Here's what to do.
Snap_save the whole DS block. Snap takes a snapshot of that memory block so you
can compare it later and see what has been changed. Play the game until the
ball is going off screen (when you just missed it). Switch to Softice and do a
snap_save of the DS block:
Snap s ds:0 ds:ffff
Now switch back (ctrl-d) and lose one life. When you've lost a life switch
back to Softice and do a snap_compare:
Snap C
A row of numbers will now display all the changes. Since you had 6 lives from
the start you will have 5 after you died. The lives can change in two ways. The
counter can either start from 0 or 1 and increase until it reaches 5 or 6, or it
could start from 6 and decrease down to 1, or from 5 to 0, which is most likely.
Look for these changes. If you find a place that changes from 6 to 5 write
the address down. Remember that after the address comes the old value and after that
the new value, so this is what you are looking for:
xxxx:xxxx 05 04
xxxx:xxxx 06 05
Now there are a lot of values that change every second, that's why you wanna snap_save
the DS block just before you die (in the game that is). Hey look, there's a
change from 06 to 05. Write that address down and keep looking. Keep scrolling
through the list. Yeah I know it takes 1-3 minutes but that's alright I think.
This is the fun part, where you try to locate the code by setting up the different
traps that you can think of. In this case we used the command snap_save. In another
case we might use a breakpoint.
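If you wonder what the snap_save and snap_compare trick boils down to, here is a rough Python sketch of the idea. It is my own illustration, not how Softice works inside, and the names snap_compare and lives_candidates plus the sample bytes are made up. It compares two snapshots byte by byte and keeps only the places where the value dropped by exactly one.

```python
def snap_compare(before: bytes, after: bytes):
    """Return (offset, old, new) for every byte that differs."""
    return [
        (offset, old, new)
        for offset, (old, new) in enumerate(zip(before, after))
        if old != new
    ]


def lives_candidates(changes):
    """Keep only the changes where the value went down by exactly 1."""
    return [change for change in changes if change[1] - change[2] == 1]


# Two made-up 6 byte snapshots, taken before and after losing a life.
before = bytes([0x00, 0x10, 0x7F, 0x00, 0x06, 0x33])
after = bytes([0x00, 0x12, 0x7F, 0x00, 0x05, 0x30])

for offset, old, new in lives_candidates(snap_compare(before, after)):
    print(f"xxxx:{offset:04X} {old:02X} {new:02X}")  # xxxx:0004 06 05
```

Three bytes changed in the made-up data, but only one of them went from 06 to 05. That filtering is exactly what you do by eye when you scroll through the Snap C list.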
As you can see there was only one change from 06 to 05. Remember when I told you
that the segment address a program is given depends on the amount of available
memory and what programs are loaded (memory config.). Therefore my segment
addresses will not be the same as your segments.
So the address is:
CS:0004
Let's view that address while we keep losing lives.
D cs:0004
As you can see it decreases every time we lose a life. Let's change the value
at that address.
E cs:0004 change it to whatever you fancy, ie 11h = 17 lives.
Keep on playing the game and notice how many lives you have. Nothing
happens the first 11 times you lose, but then it ticks down. Let's find the
decreaser.
BPM CS:0004 W
The W means Write only. The program will stop when something is written at
cs:0004. Lose another life and BANG!! you're in Softice again facing the code
that decreases the number of lives. The instruction looks like this:
CS:39A9 FF8C44AC DEC WORD PTR [SI+AC44],+00
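Replace that with NOPs. The instruction is four bytes long (FF 8C 44 AC) and a
NOP is only one byte, so it takes four NOPs to cover all of it.
A CS:39A9
CS:39A9 NOP
CS:39AA NOP
CS:39AB NOP
CS:39AC NOP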
Play some and you'll see that the number of lives is always the same and is
not decreasing anymore. Great HUH.
OK let's make this cheat a permanent one. See the letters and numbers before the
instruction (mnemonic)? The instruction we changed was:
FF8C44AC DEC WORD PTR [SI+AC44],+00.
We replaced it with NOP. Let's use our hex editor to search the file for the
bytes FF 8C 44 AC. Two hits, let's just change the first. Replace all four bytes
FF 8C 44 AC with 90 90 90 90 (90 is NOP in machine code), so no leftover bytes
get executed as some other instruction. Save the file and wow it works, we made
a cheat.
Feels good doesn't it? Well that's about it for now. See you in lesson #3.
All opinions on my work are welcome. Please feel free to ask questions at
my E-mail indian_trail@hotmail.com
AQUANOID is available from
Search for aquanoid
Here you'll also find more shareware games and appz to crack and reverse engineer.
Randall server
Here is the number one assembly resource on the net. Study as much as you can
Ok I'll see you in lesson #3, where we will crack/cheat Pinball Fantasies. It is
to be released at the end of September '97. See you then
Indian_Trail