The Idea Of IDA (A Small Primer For IDA Newbies)
By Gij

Hi there, Gij here. I'm guessing most of you are reading this because you've heard about IDA and are thinking "why is it better than Wdasm?", or because you already have IDA but found it too complicated to use. This is written to help you with your first steps in using IDA. Hope it helps...

The Disassembling Challenge
---------------------------

Most software today is written in high-level languages: C, C++, VB, Java, Delphi, etc. These are (generally speaking) compiled languages: a compiler turns the high-level code into the low-level form the computer understands, machine code, which we read as assembler.

We need to make a distinction between a decompiler and a disassembler. A decompiler takes a binary file generated by a compiler and tries to reverse it into the high-level language the file was compiled from. So, for example, a C++ decompiler would take an .exe made by Visual C++ or any other compiler and turn out a C++ source file. A good decompiler is very hard to make.

A disassembler, on the other hand, takes a binary file that could have been written in almost any language and disassembles it into an assembler source file. A good disassembler needs to distinguish fairly accurately between data and code. A good decompiler needs to do that AND understand which construct in the original high-level language generated that code.

So what is IDA? Neither. It's a hybrid between the two types of programs. IDA stands for "Interactive Disassembler", and that is what it is, but it also has some of the characteristics of a decompiler, namely its FLIRT feature.

Those of you who have programmed in C, or have ever tried to debug a Windows program, know that every program uses some functions supplied by the compiler or by the Win32 API. "printf()" is one example: all C programs that call printf and are compiled by the same compiler have the same piece of code inside them.
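
To make that concrete, here is a toy sketch in Python of how identical library code could be recognized by its bytes. The byte patterns, the names in SIGNATURES and the identify() helper are all made up for illustration. This is not FLIRT's real signature format, just the general idea of matching a known byte pattern.

```python
# Toy signature matcher: NOT FLIRT's real format, just the general idea.
# Every pattern and name below is invented for illustration.

# None is a wildcard for bytes that differ between programs (e.g. addresses).
SIGNATURES = {
    "_printf": [0x55, 0x8B, 0xEC, 0x68, None, None, 0xE8],
    "_fseek":  [0x55, 0x8B, 0xEC, 0x83, 0xEC, 0x08, 0x56],
}


def matches(code, pattern):
    """True if the first bytes of `code` fit `pattern` (None matches any byte)."""
    if len(code) < len(pattern):
        return False
    return all(p is None or p == b for p, b in zip(pattern, code))


def identify(code):
    """Return the name of the first signature that fits, or None."""
    for name, pattern in SIGNATURES.items():
        if matches(code, pattern):
            return name
    return None


# Two "programs" holding the same library routine, differing only in the
# wildcard bytes, plus one blob that matches nothing:
prog_a = bytes([0x55, 0x8B, 0xEC, 0x68, 0x10, 0x20, 0xE8, 0x00])
prog_b = bytes([0x55, 0x8B, 0xEC, 0x68, 0x44, 0x7F, 0xE8, 0x99])
unknown = bytes([0x90, 0x90, 0xC3])

print(identify(prog_a))   # _printf
print(identify(prog_b))   # _printf
print(identify(unknown))  # None
```

The wildcard entries are the interesting design choice: without them, the same routine would stop matching as soon as an embedded address changed from one program to the next.
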
During compilation, the code for the "printf()" function is linked in from the compiler's included libraries. This opens up an opportunity for a disassembler to recognize the code pattern of a particular function and pin a useful name on it. It also saves us from the embarrassment of tracing through a function for 20 minutes only to discover it's some compiler's variant of "fseek()", and it helps us understand what the program is doing simply by showing us which functions it calls.

This is exactly what FLIRT does. The FLIRT libraries come with many signatures of functions from various compilers, not only for C but also for Pascal, Delphi and others. In a way you get the best of both worlds: IDA lets you reverse any binary file from any language, while taking advantage of the shared nature of compiler libraries. You can get more information on FLIRT at the IDA home page.

IDA vs. Wdasm
-------------

Those of you who have been using Wdasm until now will need a slight change of attitude when moving over to IDA. Wdasm takes in a file and gives out a disassembly. That's it. IDA is INTERACTIVE. This means the disassembly you get is very much editable: you can change code to be marked as data and the other way around, add comments, see cross-references (very useful, we'll get to them later), and probably a whole lot more. That's probably why most people consider it more complex, or heavier, than Wdasm. In IDA you have to do more work, but you can accomplish much more. It's possible to completely reverse an application inside IDA, generate a source file and have it compile to a byte-identical exe (I'm not saying it's easy, though). IDA by itself adds comments to some API calls and INTs, and with some of the tools available you can add your own comments to its databases.

The Use of IDA
--------------

There are two versions of IDA you can use, idaw.exe and idax.exe. idaw.exe is a Windows exe, while idax.exe is a DOS file.
I've heard it said that it's better to use idaw.exe because, as a Windows program, it has less trouble with memory. I personally use idax.exe because I've found that on international versions of Windows idaw.exe tends to have problems when typing in text.

One important thing to remember (and thank you to whatever kind soul on #cracking helped me understand this when I was starting out with IDA): once you feed IDA an exe, dll, or whatever file, it saves a disassembly database with the extension .IDB on exit. When you wish to continue work on the disassembly, you do not reload the binary file; you tell IDA to load the .IDB file. Once you've generated the .IDB file for a program you no longer need the original EXE. All changes to the disassembly are made and stored in the .IDB file.

OK, let's get down to some real-life use of IDA.

Actual Use Of IDA
-----------------

You can load a program into IDA in 2 ways:

1) On the command line: "idax.exe c:\target\program.exe"
2) Through the file dialog which appears when you start IDA without a filename argument.

Once you load the program, IDA shows you a panel of options to choose from. The defaults are chosen for you according to the file format you loaded. The other options are either advanced (meaning, if you need them, you know how to use them) or self-explanatory.

IDA then goes through 2 phases:

1) Actual disassembly, including separation of the program into code and data areas. This is not foolproof. IDA does make mistakes sometimes, so don't expect to run a program through IDA and get a 100% accurate disassembly, though IDA comes as close as anything I've ever seen. In this phase it also marks functions and analyses their stack arguments, assigns label names to jump and call destinations, separates the file into segments if needed, and probably more.
2) If IDA recognizes the file as compiled by a supported compiler, it loads a signature file and tries to assign names to as many functions as possible.

After IDA has finished its work, you can see some of the results. Besides the main screen, where the disassembly is shown, there is also the status window, which shows what IDA is doing, and other screens you can activate to see information about the disassembly. To switch between windows, use the "Windows" menu, or F6 and Shift-F6 to go back and forth.

The disassembly window can be divided into 3 parts:

1) segment:offset (to the left).
2) Code/data (taking up most of the width of the screen).
3) Comments. These are actually part of the disassembly, but sometimes contain important information like cross-references and API call info.

All in all, the disassembly screen should look familiar if you've ever written asm code.

An IDA disassembly is divided into 3 types of areas/entries:

1) Code.
2) Data.
3) Unexplored.

You can distinguish between code and data just as you would in a normal asm file: data is preceded by some sort of data specifier (db, dw, dd, dq, dt...). You can spot unexplored areas by looking at the left part of the screen, where they are marked by a "greyed out" segment:offset. Unexplored areas are areas to which no reference is made, either as code (they are not jumped to, called, or incorporated into the code flow of the program in any other way) or as data (they are not read or written by code). These places usually (always?) contain 0's and are usually not relevant to the code, unless IDA missed marking as code some area which references them.

This illustrates another point: with any change you make to the disassembly, IDA checks whether it can deduce anything else about the rest of the program from that change.
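
One way to picture this kind of re-analysis is a worklist that keeps following references out of whatever was just marked. The sketch below is a toy model in Python, not IDA's actual algorithm: the address map, the jumps and reads tables and the mark_code() helper are all assumptions made up for illustration.

```python
# Toy model of re-analysis after a manual change. Not IDA's real algorithm;
# the addresses and reference tables are invented for illustration.

def mark_code(kinds, jumps, reads, start):
    """Mark `start` as code, then follow references into unexplored areas.

    kinds: address -> "code" | "data" | "unexplored"
    jumps: address -> addresses that the code there jumps to or calls
    reads: address -> addresses that the code there reads or writes
    """
    worklist = [start]
    while worklist:
        addr = worklist.pop()
        if kinds.get(addr) == "code":
            continue  # already handled
        kinds[addr] = "code"
        # Jump and call targets in unexplored space become code too.
        for target in jumps.get(addr, ()):
            if kinds.get(target) == "unexplored":
                worklist.append(target)
        # Unexplored addresses touched by the new code become data.
        for target in reads.get(addr, ()):
            if kinds.get(target) == "unexplored":
                kinds[target] = "data"
    return kinds


kinds = {0x100: "data", 0x120: "unexplored", 0x140: "unexplored",
         0x200: "unexplored", 0x300: "unexplored"}
jumps = {0x100: [0x120], 0x120: [0x140]}
reads = {0x120: [0x200]}

# The user re-marks the "data" at 0x100 as code; the rest follows from that.
mark_code(kinds, jumps, reads, 0x100)
for addr in sorted(kinds):
    print(hex(addr), kinds[addr])
```

In this made-up map, 0x300 stays unexplored because nothing reachable from 0x100 refers to it.
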
So, for example, if you mark a data area as code, IDA will look at it, and if it sees a jump to some unexplored space it will mark that space as code too. The same goes for data: if the new code you've marked reads or writes a previously unexplored address, IDA will mark that address as data.

To change the marking of an area, select it using the mouse, then press one of 3 keys:

1) 'c' marks the area as code.
2) 'd' marks the area as data. If you do not select an area, but place the cursor on a line and press 'd', you cycle through the available data specifiers. If you select an area, you can press 'a' to declare it a string, which makes IDA give the string a name automatically and show it as "123" instead of "db 31h,32h,33h", for example. You can also press '*' to make it an array of data, which pops up a dialog where you can set various options for the array (this is the same as marking an area and pressing 'd').
3) 'u' marks the area as unexplored.

Names
-----

Names are the most important part of a disassembly. It's the difference between 'loc_0_200' and 'Show_Splash_Screen' that makes it possible to understand a program bit by bit. To change the name of a label, or create a new one, position the cursor on the desired line and press 'n'. A dialog will appear which lets you enter or modify the label name.

Cross-references And Information Screens
----------------------------------------

I've already mentioned cross-references a few times in this article, so it's only fair that I explain what they are to those who do not know. A reference is created when a piece of code uses another area in some way: a call (one piece of code calls another), a jump, or a read or write of a data location. IDA keeps track of these references and maintains a table of cross-references for every label in the disassembly.
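
Such a table can be pictured as a map from each target address to the list of places that refer to it. Here is a minimal sketch in Python, assuming a made-up list of (source, target, kind) references. The names, addresses and kind labels are illustrative only and say nothing about how IDA actually stores its database.

```python
from collections import defaultdict

# Hypothetical references: (from_address, to_address, kind).
# Kinds are invented labels: "jump"/"call" are code refs, "read"/"write" data refs.
REFERENCES = [
    (0x01EA, 0x0200, "jump"),
    (0x0205, 0x020A, "jump"),
    (0x0207, 0x02F2, "jump"),
    (0x0200, 0x014C, "read"),
    (0x0310, 0x014C, "write"),
]


def build_xrefs(references):
    """Group references by target, so each label knows who refers to it."""
    table = defaultdict(list)
    for source, target, kind in references:
        table[target].append((source, kind))
    return table


def describe(table, target):
    """List the cross-references to `target`, one line per referencing place."""
    lines = []
    for source, kind in table.get(target, []):
        family = "CODE" if kind in ("jump", "call") else "DATA"
        lines.append(f"{family} XREF: {source:04X} ({kind})")
    return lines


xrefs = build_xrefs(REFERENCES)
for line in describe(xrefs, 0x014C):
    print(line)
```

Keying the table by target rather than by source is what makes the question "who uses this location?" cheap to answer, and that is the question you ask most often while reversing.
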
For example, in an IDA listing:

seg000:0200
seg000:0200 loc_0_200:                    ; CODE XREF: _main+1EAjump tablesj
seg000:0200                 cmp     word_1B90_14C, 0
seg000:0205                 jz      loc_0_20A
seg000:0207                 jmp     loc_0_2F2

This means that the label loc_0_200 is referenced as CODE (by a jump or call, thus "CODE XREF") from another location in the program. To see the list of places that reference this location, position the cursor on the line of the label and select "Cross references" from the "View" menu. A window should appear with a list of locations; you can jump to one by selecting it in the window and pressing ENTER. IDA also keeps tabs on your little "field trips" inside the code, so after you've traced 12 functions deep you can press ESC to get back to the place you were before your present location.

Data cross-references are much the same, so there is no need to explain them separately.

Cross-references are very important because they give us a little info about a piece of code or data by telling us it is related to another piece of code. So potentially, if you've understood one piece of code, you can use it as an "anchor" into other locations in the program.

I should take a minute to tell you about the other items in the "View" menu, even though they are fairly self-explanatory:

1) Disassembly: shows you the disassembly screen for the current file.
2) Functions: shows you a list of locations marked as function entry points.
3) Names: shows you a complete list of FLIRT-recognized functions and locations marked as strings.
4) Signatures: shows you the FLIRT signature files currently loaded and applied to the file. You can manually add other signature files here.
5) Segments: shows you the segments that were created.
6) Segment registers: seems to show you the changes in, and the value of, each segment register across any range of locations in the file.
7) Selectors: this is pmode related and won't show anything normally; perhaps you need to add them manually.
   Probably used when disassembling extenders, or something else that's fairly dodgy.
8) Cross references: you know.
9) Structures: you can define structures and assign them to data areas; this shows the currently defined structs (see how advanced IDA is?).
10) Enumerations: I think enums are used to give names to numbers, or maybe array elements. I've never had to use them, and normally neither would you.

Comments
--------

There are two types of comments, repeatable and regular:

1) Repeatable: made by pressing ';'. When you add this sort of comment to a function, it appears next to every call to that function anywhere in the file.
2) Regular: this sort of comment appears only once, at the location where you entered it.

Marking Positions
-----------------

Sometimes, after disassembling for a while, you want to mark a location. It could be the beginning of the data strings section, an important function, your lucky number as an offset, whichever. IDA has a feature that lets you mark positions and give each mark a name. To plant a mark, go to the location you wish to mark, then press Alt-M or select "Mark Location" from the "Navigate" menu. This pops up a window; press ENTER on an empty line (if there are no marked locations, they will all be blank) and a dialog will ask what name you wish to give the mark. This is NOT like generating a label for a location: only the marked positions table will show that name.

To go to a marked position, select "Jump To/Marked Position" from the "Navigate" menu. A list of marked positions will show up; choose the one you want to jump to and press ENTER.

Getting Around
--------------

Sometimes you need to get around in IDA, and you just can't be bothered to do it by pressing PgUp a 3-digit number of times.
That's when you can use the "Navigate" menu. The submenus you need are:

1) Jump to: lets you go to a specific address, function, string, segment, entry point, cross-reference, marked position, etc.
2) Search for: lets you search for text, specific operands to an instruction, the next code/data/unexplored area, etc.

Exporting
---------

Those of us who write articles (yeah, me too) use IDA to rip code out and paste it into our text as examples. To do this you need to export the code you want to a file, which is done through the "File/Produce Output File" menu item. There are 3 options we care about right now:

1) Produce ASM file: puts out an .ASM file that you should be able to feed to TASM or MASM.
2) Produce LST file: the same, but also puts segment:offset pairs into the file. Use this for articles.
3) Produce DIF file: IDA can also serve as a patcher (look at "EDIT/Patch Program"). If you modify the program and use this option you get a file similar to the output of the DOS program "fc".

Final Notes
-----------

There is more to write about IDA. You should know that it has a scripting language, which lets you automate some rather mundane disassembling procedures. If you would like to learn more about IDA scripting, check the help file for the syntax and a list of functions; you can also get some very nice scripts at mammon's place.

For real-life examples of cracking with IDA I direct you to the normal place (you know), plenty of articles out there.

I appreciate input on my articles, comments and criticism alike. You can reach me on EFNET's #cracking4newbies, or by email at gij iname.com. I can't promise a reply, but I do read all my mail.

GREETZ
------

All the guys (and girl) on #c4n: never has so much newbiness been in the hands of so few people.

Gij.

yep.