Assembly and Cracking From the Ground Up
An Introduction by Greythorne the Technomancer
An Introduction to Hexadecimal and Binary Numbers
In general, the math of the day is base 10 arithmetic (decimal), and it is rather commonly known that machine designers have an affinity for base 2 (binary). Binary is the chosen method because on and off are easy terms in electronics, and the model fits nicely into 1's and 0's. Somewhere along the line, someone got the idea that counting in binary was a little tedious for man and decided to make it look a little more like the decimal we are used to: fewer digits to contend with in everyday arithmetic, but still a power of 2 so that the binary model can be easily converted back and forth. So here comes hexadecimal (base 16), which is used in our pc's, and octal (base 8), which is traditionally common for data display on unix based platforms (generally mainframes in the past, but publicly useful now since desktop machines are as powerful as many of yesterday's mainframes).

What does that mean to the cracker or the assembly programmer? EVERYTHING. Without an understanding of how hex works, and the conversion between it and binary, it is impossible to debug (reverse-engineer) any real program.

For any number system, the digits always follow a simple rule: whatever the base (let's call it 'B'), the digits number from a starting point (zero in any system more recent than the ancient roman numerals, which didn't have one) to one less than the base (B-1). What this means is simple: for base 10 we have ten digits, 0-9. In binary we have 0 and 1, and for base 16 we have 16 digits. For simplicity's sake, we use our alphabet to cover for the extra digits we don't have in our number system. In base 8 our digits are simpler, 0-7, but for base 16 our digits are:

0 1 2 3 4 5 6 7 8 9 A B C D E F

(digits having values of 0-15 when converted to decimal), and counting continues just like in our system:

... E F 10 11 12 ... 18 19 1A 1B 1C 1D 1E 1F 20 21 22 ...

So 10 in hexadecimal is really 16 to us, 20 (hex) is really 32, 30h is 48, 40h is 64 and so on.
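As a quick check of the counting above, here is a minimal sketch in Python (a language chosen for this illustration, not one the text itself uses). The names HEX_DIGITS and digits_for_base are made up for the example; int() and format() are Python built-ins.

```python
# Digit symbols in order; a symbol's position is its decimal value.
# HEX_DIGITS is an illustrative name, not something from the text.
HEX_DIGITS = "0123456789ABCDEF"


def digits_for_base(base):
    """Return the digit symbols for a base from 2 to 16: 0 up to base - 1."""
    return HEX_DIGITS[:base]


print(digits_for_base(2))   # 01
print(digits_for_base(8))   # 01234567
print(digits_for_base(16))  # 0123456789ABCDEF

# Count from E (14) up to 22h (34) using Python's built-in hex formatting.
print(" ".join(format(n, "X") for n in range(14, 35)))

# int() reads a string of digits in a stated base.
for text in ["10", "20", "30", "40"]:
    print(text + "h =", int(text, 16))  # 16, 32, 48, 64
```

The second print statement reproduces the E F 10 11 ... 1F 20 21 22 run from the text, which is a handy way to convince yourself that nothing special happens after 9.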
This isn't that hard to grasp when you think about the fact that binary works the same way:

0 1 10 11 100 101 110 111 1000 ...

So 10 (in any base) is always equal to B, the base itself. That is to say, 10 (binary) is really 2 in decimal, 10 (octal) is really 8 in decimal, and so on.

Okay, I can count them on my fingers (if I had 16 of them...), but is there a faster way to figure out what A9 (hex) is in our base 10? Glad you asked :)

For computers, continuous addition may be handy to get the result considering the speed of program loops, but for people it stops being worth the time after only a few groups of digits. So we find another way...

We already know that base 16 is a power of 2, which isn't that hard to deal with when we use binary to get our job done. As a matter of fact, once you get used to converting bases to binary first, it becomes much easier to convert any one base to any other, since no funny numeric rules are required. So with that, we do all base conversions in two steps: convert the first base to binary, and then convert binary to the new base.

Converting From Decimal To Binary

This is what I tend to refer to as 'remainder mathematics'. Basically, instead of repeated addition (in other words, counting up to a hex digit), we use repeated division to speed up the process.

Our friends DIV and MOD

In computers, data is stored in whole numbers, either as a long set of digits in any base together with the location of the decimal point (known more generally as the radix point for non-decimal bases), or as parts of a fraction and an offset (numerator, denominator, and an added amount stored in three separate locations). There are of course other methods with other mathematics, such as imaginary numbers, but that is not the point of this lesson. In keeping with this whole number situation, division is done the way we learned how to do it by hand: one step at a time, recording the remainder at each step.
For any division operation, we have two results that make up the answer: the main value of the result, and the remainder. DIV is what we call the main part of the answer, and MOD is the value of the remainder. DIV we are familiar with. MOD, on the other hand, has some interesting qualities that we use in computer programs, specifically in randomization and menu scrolling, but that is not for this section.

Basically, in shorthand we say 45 MOD 4 instead of 'divide 45 by 4 and get the remainder', and we write it as 45 % 4, since the percent sign is used as the MOD symbol in high level computer languages such as C. So in this case:

45 / 4 = 11 with a remainder of 1
DIV=11 and MOD=1
so we say 45 % 4 = 1 (45 mod 4 equals 1)

Well, so now we can go about the business of converting base 10 to base 2 (it isn't that bad, don't you worry).

45 in binary is 101101, which is a palindrome. That makes it a poor example for this exercise, since left vs. right is important when converting bases, so I choose to work on 47 instead. First I will let you know that 47 is 101111 in binary, and now I show you how to deduce that mathematically, start to finish, just this once for easy understanding.

Basically, we repeatedly divide our number by the base of 2, and keep the MOD value (remainder) as the next binary digit. Each DIV result becomes the number we divide in the next step, and we stop when DIV reaches 0.

47 / 2 gives DIV=23 MOD=1, bin string = 1
23 / 2 gives DIV=11 MOD=1, bin string = 11
11 / 2 gives DIV=5 MOD=1, bin string = 111
5 / 2 gives DIV=2 MOD=1, bin string = 1111
2 / 2 gives DIV=1 MOD=0, bin string = 01111
1 / 2 gives DIV=0 MOD=1, bin string = 101111

Notice that it builds from right to left, exactly the opposite of the way we read English. This is a feature of our Arabic numeral system, which increases in value from right to left. In other words (okay, in English...), the ones are on the right, and the 10's are on the left, the 100's place is to the left of the 10's, and so on.
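The repeated division above can be sketched in a few lines of Python (chosen here only for illustration). The function name to_binary is hypothetical; divmod(), // and % are Python built-ins that give the DIV and MOD values, with % being the same symbol the text mentions for C.

```python
def to_binary(number):
    """Convert a non-negative whole number to a string of binary digits."""
    if number == 0:
        return "0"
    bin_string = ""
    while number > 0:
        # divmod returns (DIV, MOD): the whole-number result and the remainder.
        number, remainder = divmod(number, 2)
        # Each remainder is the next digit, added on the LEFT of the string.
        bin_string = str(remainder) + bin_string
    return bin_string


print(45 // 4, 45 % 4)  # 11 1  (DIV=11, MOD=1)
print(to_binary(47))    # 101111
print(to_binary(45))    # 101101 (the palindrome)
```

Adding each new digit to the left of the string is what makes the result build from right to left, just as in the worked steps for 47.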
This might seem silly at first, but machines require that

Converting From Binary To Hexadecimal

This part is very easy. 2^4 = 16 (that is to say, 2 to the 4th power is 16). That means that in a string of binary numerals, every 4 in a row make up one digit in base 16. Again, we go from right to left...

(Using our example of 47)

101111

Separate into groups of four:

10 | 1111

(Note that if you want to get octal, separate by threes instead of fours, since 2^3 = 8.)

Simply enough, we can make a conversion chart:

0000 = 0 ........ 1000 = 8
0001 = 1 ........ 1001 = 9
0010 = 2 ........ 1010 = A
0011 = 3 ........ 1011 = B
0100 = 4 ........ 1100 = C
0101 = 5 ........ 1101 = D
0110 = 6 ........ 1110 = E
0111 = 7 ........ 1111 = F

Looking again at our number (10 | 1111), the leftmost group is short, so we pad it with leading zeros: 10 becomes 0010, which is a 2 in hex, and 1111 converts to F. So simply, 101111 is 2F in hexadecimal, or more thoroughly:

47 (dec) = 101111 (bin) = 2F (hex)

While we are at it... Separating the binary digits in threes from right to left:

101 | 111

111 = 7
101 = 5

so in octal, the result is 57.

47 (dec) = 101111 (bin) = 2F (hex) = 57 (oct)

In programming terms, it is easily expressed in this way: starting at the end of the binary string (on the right) and moving to the left, every three digits (four digits to get hex) make one digit in the final base.

Now that you know how these bases interrelate, it will be much easier for you to read assembly code, and hopefully begin to understand what you are reading in the near future.
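The right-to-left grouping described above can be sketched in Python (again chosen only for illustration). The names HEX_DIGITS and regroup are hypothetical, and the built-in int(group, 2) stands in for looking a group up in the conversion chart.

```python
# Digit symbols in order; a symbol's position is its decimal value.
HEX_DIGITS = "0123456789ABCDEF"


def regroup(bin_string, group_size):
    """Convert a binary string to base 2**group_size (4 = hex, 3 = octal)."""
    digits = ""
    end = len(bin_string)
    while end > 0:
        # Take the next group, working from the right end toward the left.
        start = max(0, end - group_size)
        group = bin_string[start:end]  # the leftmost group may be short
        value = int(group, 2)          # "10" and "0010" both give 2
        digits = HEX_DIGITS[value] + digits
        end = start
    return digits


print(regroup("101111", 4))  # 2F
print(regroup("101111", 3))  # 57
```

Because a short leftmost group has the same value with or without leading zeros, the sketch does not need to pad it explicitly the way we do when reading the chart by hand.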
+gthorne'97