Assembly and Cracking From the Ground Up
An Introduction by Greythorne the Technomancer

An Introduction to Hexadecimal and Binary Numbers


In general, everyday math is done in base 10 arithmetic (decimal), and it is fairly common knowledge that machine designers prefer base 2 (binary).

Binary is the method of choice because on and off are easy states to represent in electronics, and the model fits nicely into 1's and 0's.

Somewhere along the line, someone decided that counting in binary was a little tedious for humans and made it look a little more like the decimal we are used to: fewer digits to contend with in everyday arithmetic, but still a power of 2 so that values can easily be converted to and from binary.

So here comes hexadecimal (base 16), which is used on our PCs, and octal (base 8), which has traditionally been a common way to display data on Unix-based platforms (generally mainframes in the past, but publicly useful now that desktop machines are as powerful as many of yesterday's mainframes).

What does that mean to the cracker or the assembly programmer? EVERYTHING. Without an understanding of how hex works, and the conversion between it and binary, it is impossible to debug (reverse-engineer) any real program.



Every number system follows a simple rule: whatever the base (let's call it 'B'), the digits run from a starting point (zero in any system more recent than the ancient Roman numerals, which didn't have one) up to one less than the base (B-1).

What this means is simple: for base 10 we have ten digits, 0-9. In binary we have 0 and 1, and for base 16 we have 16 digits. For simplicity's sake, we use our alphabet to cover the extra digits that our own number system doesn't have.

In base 8 our digits are simpler... 0-7, but for base 16, our digits are:

0 1 2 3 4 5 6 7 8 9 A B C D E F


(digits having values of 0-15 when converted to decimal)

and it continues just like in our system:

... E F 10 11 12 ... 18 19 1A 1B 1C 1D 1E 1F 20 21 22 ...

so 10 in hexadecimal is really 16 to us, and 20(hex) is really 32,
30h is 48, 40h is 64 and so on.

This isn't that hard to grasp when you think about the fact that binary
works the same way...

0 1 10 11 100 101 110 111 1000...

So 10 (in any base) is always equal to B, the base itself. That is to say, 10 (binary) is really 2 in decimal, 10 (octal) is really 8 in decimal, and so on.
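
If you want to check that rule on a machine, here is a minimal sketch in Python (my choice of language for illustration, nothing this text depends on). It leans on the built-in int(), which takes a string of digits and the base to read them in:

```python
# "10" read in base B always comes out as B itself.
for base in (2, 8, 10, 16):
    value = int("10", base)
    print(f"10 in base {base} = {value} in decimal")

# The hexadecimal examples from above: 10h, 20h, 30h, 40h.
hex_examples = {text: int(text, 16) for text in ("10", "20", "30", "40")}
print(hex_examples)
```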



Okay I can count them on my fingers (if I had 16 of them...), but is there a faster way to figure out what A9 (hex) is in our base 10?

Glad you asked :)

For computers, repeated addition may be a handy way to get the result, considering the speed of program loops, but for people it stops being worth the time after only a few digits.

So we find another way...

We already know that 16 is a power of 2, which isn't hard to deal with when we use binary to get the job done. As a matter of fact, once you get used to converting to binary first, it becomes much easier to convert from any one base to any other, since no funny numeric rules are required. So with that in mind, we do all base conversions in two steps: convert the first base to binary, and then convert binary to the new base.
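
To answer the A9 question with the two-step idea, here is a rough Python sketch. The names HEX_TO_BITS, hex_to_binary and binary_to_decimal are made up for this illustration. Step 1 swaps each hex digit for its four binary digits (the same pairs as in the conversion chart further down). Step 2 is my own assumption about how to finish the job, since this text never spells out binary-to-decimal: it simply adds up the place value of every 1 bit.

```python
# Hex digit -> its four binary digits (same pairs as a hex/binary chart).
HEX_TO_BITS = {format(n, "X"): format(n, "04b") for n in range(16)}

def hex_to_binary(hex_text):
    """Step 1: replace every hex digit with its four binary digits."""
    return "".join(HEX_TO_BITS[digit] for digit in hex_text.upper())

def binary_to_decimal(bits):
    """Step 2: add up the place value of every 1 bit, right to left."""
    total = 0
    place = 1                      # 1, 2, 4, 8, ... doubles with each step left
    for bit in reversed(bits):
        if bit == "1":
            total += place
        place *= 2
    return total

bits = hex_to_binary("A9")
print(bits)                        # 10101001
print(binary_to_decimal(bits))     # 169
```

So A9 (hex) is 169 in our base 10, which agrees with working it out by hand as 10 x 16 + 9.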



Converting From Decimal To Binary

This is what I tend to refer to as 'remainder mathematics'.

Basically, instead of repeated addition (in other words, counting up to the value one step at a time), we use repeated division to speed up the process.

Our friends DIV and MOD

In computers, data is stored as whole numbers: either as a long set of digits in some base together with the location of the decimal point (known more generally as the radix point for non-decimal bases), or as the parts of a fraction and an offset (numerator, denominator, and an added amount stored in three separate locations). There are of course other methods for other kinds of mathematics, such as imaginary numbers, but that is not the point of this lesson.

In keeping with this whole number situation, division is done the way we learned how to do it by hand - one step at a time and recording the remainder at each step.

Any division operation gives 2 results that together make up the answer: the main value of the result (the quotient) and the remainder. DIV is what we call the main part of the answer, and MOD is the value of the remainder.

DIV we are familiar with, MOD on the other hand has some interesting qualities that we use in computer programs - specifically in randomization and menu scrolling, but that is not for this section.

Basically, in shorthand we say 45 MOD 4 instead of 'divide 45 by 4 and get the remainder', and we write it as 45 % 4, since the percent sign is used as the MOD symbol in high level computer languages such as C.

so in this case, 45 / 4 = 11 with a remainder of 1

DIV=11 and MOD=1

so we say 45 % 4 = 1 (45 mod 4 equals 1)
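
Here is the same 45-and-4 example as a small Python sketch (Python is just my pick for illustration). Python writes DIV as // and MOD as %, and the built-in divmod() hands back both at once. I only use positive numbers here, since languages disagree on how to round when negatives are involved.

```python
number, divisor = 45, 4

div = number // divisor    # DIV: the whole-number part of the answer
mod = number % divisor     # MOD: the remainder

print(div, mod)                    # 11 1
print(divmod(number, divisor))     # (11, 1)

# The two parts always rebuild the original number.
print(div * divisor + mod)         # 45
```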



Well, so now we can go about the business of converting
base 10 to base 2 (it isn't that bad, don't you worry)

45 in binary is 101101, which is a palindrome and so not
a good example for this exercise, since left vs. right is
important when converting bases. So I choose to work on 47.

First I will let you know that 47 is 101111 in binary,
and now I show you how to deduce that mathematically,
start to finish just this once for easy understanding.

Basically, we repeatedly divide our number by the base, 2,
and keep the mod value (remainder) as the next binary digit.

47 / 2 gives DIV=23 MOD=1, bin string =      1
23 / 2 gives DIV=11 MOD=1, bin string =     11
11 / 2 gives DIV=5  MOD=1, bin string =    111
 5 / 2 gives DIV=2  MOD=1, bin string =   1111
 2 / 2 gives DIV=1  MOD=0, bin string =  01111
 1 / 2 gives DIV=0  MOD=1, bin string = 101111

Notice that it builds from right to left, exactly the
opposite of the way we read English. This is a feature of
our Arabic numeral system, in which place value increases from right
to left. In other words (okay, in English...) the ones are on the
right, the 10's are to their left, the 100's place is to the left
of the 10's, and so on.

This might seem silly at first, but machines require that
level of instruction to do what we have known so long
that sometimes we forget the basest level of what it
is we are subconsciously doing when we see a number such
as 2041. We take it in as a whole, but innately know that
the place value of the 2 makes it of much more value
than the 4 or the 1. Computers require the steps.

Writing a computer algorithm to do this is now rather simple.

We can write it in pseudocode (a good idea to write a little
bit of English and draw a picture before any programming)

Looking at our above example, notice that the way we
can tell if we are done, is if DIV=0.

So well, here is our set of steps:

1. Get value from user or program, call it DIV
2. Divide DIV by 2, leave result in DIV and remainder in MOD
3. Store MOD in string as next digit (to the left of the digits already stored)
4. Repeat steps 2 and 3 until DIV=0 (the pass that leaves DIV=0 still stores its MOD)
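
Those steps can be sketched in Python like this. Treat it as one possible reading of the pseudocode, not the only way to write it. The function name to_binary is made up for the example, and the check for negative numbers is my own addition, since the steps above only deal with ordinary counting numbers.

```python
def to_binary(value):
    """Convert a non-negative whole number to a string of binary digits."""
    if value < 0:
        raise ValueError("this sketch only handles zero and positive numbers")
    div = value                      # step 1: the starting value is called DIV
    digits = ""
    while True:
        div, mod = divmod(div, 2)    # step 2: result in DIV, remainder in MOD
        digits = str(mod) + digits   # step 3: new digit goes on the left
        if div == 0:                 # step 4: done once DIV reaches 0
            break
    return digits

print(to_binary(47))   # 101111
print(to_binary(45))   # 101101
print(to_binary(0))    # 0
```

The test for DIV=0 sits at the bottom of the loop on purpose: the pass that brings DIV down to 0 still has a MOD digit that must be stored, and a starting value of 0 still gets its single digit.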



Converting From Binary To Hexadecimal

This part is very easy.

2^4 = 16 (that is to say, 2 to the 4th power is 16)
That means in a string of binary numerals, every group of 4 in a row
makes up one digit in base 16.

Again, we go from right to left...
(Using our example of 47)

101111

Separate into groups of four

10 | 1111

(note that if you want to get octal, separate by threes
instead of fours since 2^3 = 8)

Easily enough, we can make a
conversion chart:

0000 = 0 ........ 1000 = 8
0001 = 1 ........ 1001 = 9
0010 = 2 ........ 1010 = A
0011 = 3 ........ 1011 = B
0100 = 4 ........ 1100 = C
0101 = 5 ........ 1101 = D
0110 = 6 ........ 1110 = E
0111 = 7 ........ 1111 = F

Looking again at our number (10 | 1111),
10 is padded with leading zeros to 0010, which is a 2 in hex, and 1111 converts to F

so simply, 101111 is 2F in hexadecimal

or more thoroughly..

47 (dec) = 101111 (bin) = 2F (hex)
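
The chart lookup is easy to hand to a machine. Below is a minimal Python sketch, assuming the binary number arrives as a string of 0's and 1's. CHART and binary_to_hex are names I made up for the example. It pads on the left with zeros, cuts the string into groups of four, and looks each group up in the chart:

```python
# The conversion chart: 4-bit group -> hex digit.
CHART = {format(n, "04b"): format(n, "X") for n in range(16)}

def binary_to_hex(bits):
    """Convert a string of binary digits to a string of hex digits."""
    # Pad on the left with zeros so the length is a multiple of four.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    # Cut into groups of four and look each one up in the chart.
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(CHART[group] for group in groups)

print(binary_to_hex("101111"))   # 2F
```

Padding first means the groups can be cut from the left without changing the answer, since leading zeros add nothing to the value.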

While we are at it...

Separating the binary digits in threes from right to left...

101 | 111

111 = 7
101 = 5

so in octal, the result is 57

47 (dec) = 101111 (bin) = 2F (hex) = 57 (oct)

In programming terms, it is easily expressed in this way...

Starting with the end of the binary string (on the right):

For every three digits, moving to the left, (four digits to get hex)
we make one digit in the final base.
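
As a rough Python sketch of that right-to-left walk (regroup and DIGITS are made-up names, and the built-in int(group, 2) is a shortcut I use to get the value of each small group instead of a lookup chart):

```python
DIGITS = "0123456789ABCDEF"

def regroup(bits, group_size):
    """group_size=3 gives octal, group_size=4 gives hexadecimal."""
    result = ""
    end = len(bits)                    # start at the right end of the string
    while end > 0:
        start = max(0, end - group_size)
        group = bits[start:end]        # the leftmost group may be short
        result = DIGITS[int(group, 2)] + result
        end = start                    # move one group to the left
    return result

print(regroup("101111", 3))   # 57
print(regroup("101111", 4))   # 2F
```

Because the walk starts on the right, no zero padding is needed: a short group on the far left simply has a smaller value.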

Now that you know how these bases interrelate, it will be
much easier for you to read assembly code, and hopefully
begin to understand what you are reading in the near future.



+gthorne'97