Reverse Code Engineering RCE CD +sandman 2000

Text File | 2000-05-25 | 64KB | 1,401 lines

Mammon_'s Tales to Fravia's Grandson
...An IDA Primer...

Contents
--------
*Introduction
*Configuring IDA
*Loading a program
*Viewing Imports
*Viewing Exports
*Viewing Strings/Resources
*Searching for Strings/Code
*Commenting Code
*Working with IDC scripts
*Producing an Output File
*Advanced Techniques


Introduction
------------
Ok, this is a long document for "the basics", mostly due to the
Configuration section. New users may want to skip that section, or simply
apply the changes suggested therein without reading the explanations. Also,
some parts of the "Advanced Techniques" may get lengthy as well.

Why is IDA so useful? Because it can do anything. IDA will change the way
you think about disassemblers; it will change the way you think about
cracking. W32Dasm? A toy. Soft-Ice? Unnecessary. When you have a
disassembler that lets you follow the flow of execution by tapping the
keyboard, backtrace just as easily, name variables/addresses/functions,
view the entire program as opcodes or assembly, change code to data and
back again according to your whim, and even run limited C programs to
perform operations on the code from searching and parsing to translating
and patching...why go somewhere else?

IDA is a reverse engineer's tool. Like many such tools, it is incredibly
useful for crackers...yet it is not designed for them. It is huge, it is
complex, it requires a lot of studying and tuning to get it to perform.
What follows is an attempt to demonstrate how to get the most out of IDA
when getting it "straight out of the box": configuration changes are
suggested, macros are provided, and a basic tour of using the program in
the manner of W32Dasm is attempted as well. By the end of this document you
should know IDA's capabilities and potential well; you should also know how
to track down API calls, string references, and specific opcodes.

As a tool for engineers, IDA requires that you know what you are doing. The
more you know, the more you will get out of it.
At the very least I would recommend reading the PE file format reference at
     http://www.microsoft.com/win32dev/base/pefile.htm
Cristina Cifuentes' doctoral thesis (selectively, of course) at
     http://www.cs.uq.edu.au/groups/csm/dcc.html#thesis
and of course the IDA home page itself at
     http://www.unibest.ru/~ig/index.html
...That should be enough to get you familiar enough with disassembling and
the PE file format to use IDA to its greatest potential.

What are all these IDA files?
Yes, IDA is huge, and some of the files may be useless to you. Here is a
quick overview:
     *.CFG    -- IDA Configuration Settings
     IDA.KEY  -- Registration File
     IDA2.EXE -- OS/2 Executable
     IDAX.EXE -- DOS4/GW Executable
     IDAW.EXE -- Win32 Executable
     IDA.INT  -- Auto-generated comments
     *.LDO    -- File loader for OS/2 Executable (ex PE.LDO = PE File Loader)
     *.LDX    -- File loader for DOS4/GW Executable
     *.LDW    -- File loader for Windows Executable
     *.DLL    -- Disassembler for OS/2 Executable (ex PC.DLL = PC Disassembler)
     *.D32    -- Disassembler for DOS4/GW Executable
     *.W32    -- Disassembler for Windows Executable
     /IDC     -- IDC macro scripts and include files
     /IDS     -- IDS files for commenting/naming imports
     /Sig     -- FLIRT/Compiler signature files (for recognizing target's
                 compiler)


Configuring IDA
---------------
In the \IDA37? directory, locate the file Ida.cfg and open it in any text
editor. The file is divided into two main sections, First Pass and Second
Pass, each of which has different configuration options: the first pass
contains the file extension to processor type associations, the memory and
screen configuration, OS/2 options, and hotkey definitions; the second pass
contains general program parameters, code analysis configuration, format
options for the code displayed, ASCII string display options, displayable
characters, macro definitions, and processor options.
The areas of the configuration file that you will most likely want to
change are:
     *Screen Configuration
     *Format Options (Text Representation)
     *ASCII Display Options
     *Processor Options
Some additional areas that you may want to configure are:
     *Hotkey Definitions
     *Code Analysis Options
     *Displayable Characters

1. Screen Configuration
Out of the box, the IDA screen configuration section looks like this:
====================================================================
// Screen configuration (first pass)
// ---------------------------------

#ifdef __MSDOS__
SCREEN_MODE     = 0     // Screen mode to use
                        //   0 - don't change screen mode
                        //   DOS: AL for INT 10
#else
SCREEN_MODE     = 0     // Screen mode to use
                        //   high byte - cols, low byte - rows
                        //   i.e. 0x5020 is 80cols, 32rows
#endif
SCREEN_PALETTE  = 0     // Screen palette:
                        //   0 - automatic
                        //   1 - B & W
                        //   2 - Monochrome
                        //   3 - Color
====================================================================
The MS-DOS SCREEN_MODE and the SCREEN_PALETTE need not change. If you are
using Windows, the second ("#else") SCREEN_MODE will determine your screen
size. Note that the col/row numbers are in hexadecimal, thus 0x5020 is
80x32 in decimal. I have found that 0x5530 (85 columns by 48 rows) works
best on an 800x600 resolution screen.

2. Text Representation
Initially, the Text Representation section is given as follows:
====================================================================
// Text representation
//-------------------------------------------------------------------------

OPCODE_BYTES             = 0    // don't display bytes of instruction/data
INDENTION                = 16   // Indention of instructions
COMMENTS_INDENTION       = 40   // Indention for on-line comments
MAX_TAIL                 = 16   // Tail depth
MAX_XREF_LENGTH          = 80   // Maximal length of line with cross-references
MAX_DATALINE_LENGTH      = 70   // Data directives (db,dw, etc):
                                //   max length of argument string
SHOW_AUTOCOMMENTS        = NO   // Don't show silly comments
SHOW_BAD_INSTRUCTIONS    = NO   // Don't bother about instruction lengthes
SHOW_BORDERS             = YES  // Borders between data/code
SHOW_EMPTYLINES          = YES  // Generate empty line to make
                                //   text more readable
SHOW_LINEPREFIXES        = YES  // Show line prefixes (1000:0000)
SHOW_SEGMENTS            = YES  // Show segments in addresses
USE_SEGMENT_NAMES        = YES  // Show segment names instead of numbers
SHOW_REPEATABLE_COMMENTS = YES  // Of course, use repeatable comments
                                //   Disabling this increases IDA speed.
SHOW_VOIDS               = NO   // Don't display <void> marks
SHOW_XREFS               = 2    // Show 2 cross-references
SHOW_XREF_VALUES         = YES  // If not, xrefs are displayed
                                //   as "..."
SHOW_SEGXREFS            = YES  // Show segment part of addresses
                                //   in cross-references
SHOW_SOURCE_LINNUM       = YES  // Show source line numbers
                                //   (used in .obj files and java)
SHOW_ASSUMES             = YES  // Generate 'assume' directives
SHOW_ORIGINS             = YES  // Generate 'org' directives
USE_TABULATION           = YES  // Use '\t' in output file
====================================================================
Of course this section is modified to suit taste, and can be configured
through the Options->Text Representation menu item (though changes made
within IDA are saved only for the current project).
I usually use the following changes:
====================================================================
OPCODE_BYTES             = 6    // I want the hex codes!
INDENTION                = 0    // Save some space
COMMENTS_INDENTION       = 30   // Save some space
MAX_DATALINE_LENGTH      = 100  // These can get long
SHOW_BAD_INSTRUCTIONS    = YES  // bother about instruction lengths
SHOW_BORDERS             = NO   // why border?
SHOW_EMPTYLINES          = NO   // These lines waste space
SHOW_XREFS               = 15   // Show a ton of cross-references
SHOW_ORIGINS             = NO   // Hide 'org' directives
====================================================================

3. ASCII Strings & Names
Here are the default settings that come with IDA:
====================================================================
// ASCII strings & names
//-------------------------------------------------------------------------

ASCII_GENNAMES   = YES          // Generate names when making
                                //   an ASCII string
ASCII_TYPE_AUTO  = YES          // Should IDA mark generated ascii names
                                //   as 'autogenerated'?
                                //   Autogenerated names will be deleted
                                //   when the ascii string is deleted
                                //   Also, they are displayed with the
                                //   same color as dummy names.
ASCII_LINEBREAK  = '\n'         // This char forces IDA
                                //   to start a new line
ASCII_PREFIX     = "a"          // This prefix is used when a new
                                //   name is generated

#define ASCII_STYLE_C       0x00000000  // Character-terminated ASCII string
#define ASCII_STYLE_PASCAL  0x00000001  // Pascal-style ASCII string (length byte)
#define ASCII_STYLE_LEN2    0x00000002  // Pascal-style, length has 2 bytes
#define ASCII_STYLE_UNICODE 0x00000003  // Unicode string

ASCII_STYLE      = ASCII_STYLE_C        // Default is C-style
ASCII_SERIAL     = NO           // Serial names are disabled
ASCII_SERNUM     = 0            // Number to start serial names
ASCII_ZEROES     = 0            // Number of leading zeroes in
                                //   serial names

// type of generated names: (dummy names)
#define NM_REL_OFF      0
#define NM_PTR_OFF      1
#define NM_NAM_OFF      2
#define NM_REL_EA       3
#define NM_PTR_EA       4
#define NM_NAM_EA       5
#define NM_EA           6
#define NM_EA4          7
#define NM_EA8          8
#define NM_SHORT        9
#define NM_SERIAL       10

DUMMY_NAMES_TYPE = NM_REL_OFF
MAX_NAMES_LENGTH = 15           // Maximal length of new names
                                //   (you may specify values up to 120)

// Types of names that should be included into the list of names
// (this list usually appears by pressing Ctrl-L)
//      normal  1
//      public  2
//      auto    4
//      weak    8
LIST_NAMES       = 0x07         // default: include normal, public, weak

...and a ton of demangling info...
====================================================================
What's the big deal? It's only strings... Well, to tell the truth, a string
is just a collection of bytes virtually indistinguishable--to the untrained
eye--from opcode bytes. IDA will pick up a lot of strings, but it has to
have a default string type... hence the ASCII_STYLE definition. This
defaults to ASCII_STYLE_C, but you may want to change it to
ASCII_STYLE_UNICODE if you will be dealing primarily with Windows 95/NT
programs.
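To make the string-type point concrete, here is a rough Python sketch (not
IDA code) of how one buffer reads under each ASCII_STYLE_* value. The
constants are copied from the config excerpt; the decode() helper and the
sample bytes are made up for illustration. Two details are assumptions: the
2-byte length is read little-endian, and "Unicode" is treated as UTF-16LE
ended by a 16-bit zero.

```python
# Sketch only: values mirror the #defines quoted from Ida.cfg.
ASCII_STYLE_C = 0x00000000        # character-terminated
ASCII_STYLE_PASCAL = 0x00000001   # one length byte, then the text
ASCII_STYLE_LEN2 = 0x00000002     # two length bytes, then the text
ASCII_STYLE_UNICODE = 0x00000003  # 16-bit characters


def decode(data, style):
    """Hypothetical helper: read the string that starts at data[0]."""
    if style == ASCII_STYLE_C:
        end = data.find(b"\x00")
        if end < 0:
            end = len(data)
        return data[:end].decode("ascii")
    if style == ASCII_STYLE_PASCAL:
        length = data[0]
        return data[1:1 + length].decode("ascii")
    if style == ASCII_STYLE_LEN2:
        # Assumption: the length word is little-endian, as on x86.
        length = int.from_bytes(data[:2], "little")
        return data[2:2 + length].decode("ascii")
    if style == ASCII_STYLE_UNICODE:
        # Assumption: UTF-16LE text terminated by a 16-bit zero.
        units = []
        for i in range(0, len(data) - 1, 2):
            unit = data[i:i + 2]
            if unit == b"\x00\x00":
                break
            units.append(unit)
        return b"".join(units).decode("utf-16-le")
    raise ValueError("unknown string style")


wide = "Open".encode("utf-16-le") + b"\x00\x00"
print(decode(wide, ASCII_STYLE_UNICODE))  # the whole string
print(decode(wide, ASCII_STYLE_C))        # stops at the first zero byte
```

Reading the wide string as C-style stops after the first character, which
is why a wrong default style leaves a Win32 target full of one-letter
"strings".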
[Note: You can change string types dynamically in IDA using the
Options->ASCII Strings Style menu item, in case your target has multiple
string types...notice also that from within IDA you can define different
"end characters" from 1 to 2 bytes...this is very handy for special
"internal" data types that some targets use.]

Now, what about those weird name types? Here they are, translated:
     // normal 1: this shows internal functions, etc
     // public 2: this includes exports, entry points
     // auto   4: this shows the irritating IDA names
     // weak   8: this is useless ;)

     #define NM_REL_OFF  0 = loc_0_1234       segbase relative to prog base
                                              & offset from segbase
     #define NM_PTR_OFF  1 = loc_1000_1234    segment base address
                                              & offset from the segment base
     #define NM_NAM_OFF  2 = loc_dseg_1234    (*) segment name
                                              & offset from the segment base
     #define NM_REL_EA   3 = loc_0_11234      segment relative to base address
                                              & full address
     #define NM_PTR_EA   4 = loc_1000_11234   segment base address
                                              & full address
     #define NM_NAM_EA   5 = loc_dseg_11234   segment name & full address
     #define NM_EA       6 = loc_12           full address (no leading zeroes)
     #define NM_EA4      7 = loc_0012         full address (at least 4 digits)
     #define NM_EA8      8 = loc_00000012     full address (at least 8 digits)
     #define NM_SHORT    9 = dseg_1234        the same as (*) without data
                                              type specifier
     #define NM_SERIAL  10 = loc_1            enumerated names (1,2,3...)

The first part determines what names are shown in the "Names" window; in
general, the fewer the better. If you want the Names window to show only
the exports of the program, choose 0x02. The next section determines how
internal addresses are referred to in the disassembled listing; if you like
Sourcer's method of defining "location1, location2, etc" you should try
defaulting to NM_SERIAL; if you like the location to show just the segment
name and offset, use NM_SHORT. You can experiment with this using the
Options->Name Representation menu item in IDA.
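Since LIST_NAMES is just a sum of the four bit values, a short Python
sketch (not IDA code) can show what a given mask selects. The bit values
and the three address-only name formats come from the listing above; the
two helper functions are invented for illustration, and the uppercase hex
digits are an assumption based on names like loc_401FDB.

```python
# Sketch only: bit values and formats are taken from the Ida.cfg excerpt.
NAME_TYPE_BITS = [(1, "normal"), (2, "public"), (4, "auto"), (8, "weak")]


def list_names(mask):
    """Hypothetical helper: name types selected by a LIST_NAMES mask."""
    return [label for bit, label in NAME_TYPE_BITS if mask & bit]


def dummy_name(ea, style):
    """Hypothetical helper: format ea for the address-only dummy styles."""
    formats = {
        "NM_EA": "loc_%X",      # no leading zeroes
        "NM_EA4": "loc_%04X",   # at least 4 digits
        "NM_EA8": "loc_%08X",   # at least 8 digits
    }
    return formats[style] % ea


print(list_names(0x07))   # the shipped default
print(list_names(0x02))   # exports only
for style in ("NM_EA", "NM_EA4", "NM_EA8"):
    print(style, dummy_name(0x12, style))
```

Note that 0x07 works out to normal + public + auto (1 + 2 + 4), which is
not what the "normal, public, weak" comment beside the default says; weak
would need the 8 bit.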
I tend to set the following parameters:
     ASCII_TYPE_AUTO  = NO
     ASCII_PREFIX     = "str->"
     MAX_NAMES_LENGTH = 15
     LIST_NAMES       = 0x03
     DUMMY_NAMES_TYPE = NM_SHORT
**Note: to use my "str->" prefix you will have to change the following line
     NameChars = "$?@"     // asm specific character
to
     NameChars = "$?@->"   // asm specific character
...see #7 below. This setup will fill the Names window with strings,
exports, and imports.

4. Processor Specific Parameters
The PC-specific parameters for IDA are given as follows:
====================================================================
#ifdef __PC__                   // INTEL 80x86 PROCESSORS

USE_FPP = YES                   // Floating Point Processor
                                //   instructions are enabled
WINDIR  = "c:\\windows"         // Default directory to look up for
                                //   DLL files
OS2DIR  = "c:\\os2"             // OS/2 main directory (is used to
                                //   look up DLLs)

// IBM PC specific analyser options

PC_ANALYSE_PUSH    = YES        // Convert immediate operand of "push" to offset
                                // In sequence
                                //      push    seg
                                //      push    num
                                // IDA will try to convert <num> to offset.

PC_ANALYSE_NOP     = YES        // Convert db 90h after "jmp" to "nop"
                                // Sequence
                                //      jmp     short label
                                //      db      90h
                                // will be converted to
                                //      jmp     short label
                                //      nop

PC_ANALYSE_MOVOFF  = YES        // Convert immediate operand of "mov reg,..." to offset
                                // In sequence
                                //      mov     reg, num
                                //      mov     segreg, immseg
                                // where
                                //      reg    - any general register
                                //      num    - a number
                                //      segreg - any segment register
                                //      immseg - any form of operand
                                //               representing a segment paragraph
                                // <num> will be converted to an offset

PC_ANALYSE_MOVOFF2 = YES        // Convert immediate operand of "mov memory,..." to offset
                                // In sequence
                                //      mov     x1, num
                                //      mov     x2, seg
                                // where
                                //      x1,x2  - any references to memory
                                // <num> will be converted to an offset

// translation used to build an ASCII string name by its contents
// (now it is tuned for 866 codepage)
// the order and number of the string constants is important!

... a bunch of XLat stuff...
#endif // __PC__
====================================================================
As you can see, there are a few useful disassembly options here, most of
which are already set. In fact, the only thing you should have to change is
the following line:
     WINDIR = "c:\\windows\\system"
This will correctly locate the WinAPI DLLs--it is very important to set
this!

5. Keyboard HotKey Definitions
This section is mostly a matter of personal taste, but I thought that I
would draw attention to it. Here are the default keyboard shortcuts (you
may want to print this out):

"LoadFile" = 0                  // Load additional file into database
"LoadIdsFile" = 0               // Load IDS file
"LoadDbgFile" = 0               // Load DBG file
"LoadSigFile" = 0               // Load SIG file
"Execute" = "F2"                // Execute IDC file
"ExecuteLine" = "Shift-F2"      // Execute IDC line
"Shell" = "Alt-Z"
"About" = 0
"SaveBase" = "Ctrl-W"
"SaveBaseAs" = 0
"Abort" = 0                     // Abort IDA, don't save changes
"Quit" = "Alt-X"                // Quit to DOS, save changes

"ProduceMap" = "Shift-F10"      // Produce MAP file
"ProduceAsm" = "Alt-F10"
"ProduceLst" = 0
"ProduceExe" = "Ctrl-F10"
"ProduceDiff" = 0               // Generate difference file
"DumpDatabase" = 0              // Dump database to IDC file

"EditFile" = 0                  // Small text editor

"JumpAsk" = 'G'
"JumpName" = "Ctrl-L"
"JumpSegment" = "Ctrl-S"
"JumpSegmentRegister" = "Ctrl-G"
"JumpQ" = "Ctrl-Q"
"JumpPosition" = "Ctrl-M"
"JumpXref" = "Ctrl-X"
"JumpOpXref" = "X"
"JumpFunction" = "Ctrl-P"
"JumpEntryPoint" = "Ctrl-E"

"JumpEnter" = "Enter"           // jump to address under cursor
"Return" = "Esc"
"UndoReturn" = "Ctrl-Enter"     // undo the last Esc
"EmptyStack" = 0                // make the jumps stack empty

"SetDirection" = "Tab"
"MarkPosition" = "Alt-M"

"JumpVoid" = "Ctrl-V"
"JumpCode" = "Ctrl-C"
"JumpData" = "Ctrl-D"
"JumpUnknown" = "Ctrl-U"
"JumpExplored" = "Ctrl-A"
"AskNextImmediate" = "Alt-I"
"JumpImmediate" = "Ctrl-I"
"AskNextText" = "Alt-T"
"JumpText" = "Ctrl-T"
"AskBinaryText" = "Alt-B"
"JumpBinaryText" = "Ctrl-B"
"JumpNotFunction" = "Alt-U"

"MakeJumpTable" = "Alt-J"
"MakeAlignment" = 'L'
"MakeCode" = 'C'
"MakeData" = 'D'
"MakeAscii" = 'A'
"MakeArray" = '*'
"MakeUnknown" = 'U'
"MakeVariable" = 0

"SetAssembler" = 0
"SetNameType" = 0
"SetDemangledNames" = 0
"SetColors" = 0

"MakeName" = 'N'
"MakeAnyName" = "Ctrl-N"
"ManualOperand" = "Alt-F1"

"MakeFunction" = 'P'
"EditFunction" = "Alt-P"
"DelFunction" = 0
"FunctionEnd" = 'E'
"OpenStackVariables" = "Ctrl-K" // open stack variables window
"ChangeStackPointer" = "Alt-K"  // change value of SP

"MakeComment" = ':'
"MakeRptCmt" = ';'
"MakePredefinedComment" = "Shift-F1"
"MakeExtraLineA" = "Ins"
"MakeExtraLineB" = "Shift-Ins"

"OpNumber" = '#'
"OpHex" = 'Q'
"OpDecimal" = 'H'
"OpOctal" = 0
"OpBinary" = 'B'
"OpChar" = 'R'
"OpSegment" = 'S'
"OpOffset" = 'O'
"OpOffsetCs" = "Ctrl-O"
"OpAnyOffset" = "Alt-R"
"OpUserOffset" = "Ctrl-R"
"OpStructOffset" = 'T'
"OpStackVariable" = 'K'
"OpEnum" = 'M'
"ChangeSign" = '-'

"CreateSegment" = 0
"EditSegment" = "Alt-S"
"KillSegment" = 0
"MoveSegment" = 0
"SegmentTranslation" = 0

"SetSegmentRegister" = "Alt-G"
"SetSegmentRegisterDefault" = 0

"ShowRegisters" = "Space"

"OpenSegmentRegisters" = 0      // open various windows:
"OpenSegments" = 0
"OpenSelectors" = 0
"OpenNames" = 0
"OpenXrefs" = 0
"OpenFunctions" = 0             // open functions window
"OpenStructures" = 0            // open structures window
"OpenEnums" = 0                 // open enums window
"OpenSignatures" = 0            // open signatures window

"PatchByte" = 0
"PatchWord" = 0
"Assemble" = 0

"TextLook" = 0                  // set text representation
"SetAsciiStyle" = "Alt-A"       // set ascii strings style
"SetAsciiOptions" = 0           // set ascii strings options
"SetCrossRefsStyle" = 0         // set cross-references style
"SetDirectives" = 0             // setup assembler directives
"ToggleDump" = "F4"             // show dump or normal view
"SetAuto" = 0                   // background analysis

"ViewFile" = 0
"Calculate" = '?'
"ShowFlags" = 'F'

"WindowOpen" = "F3"
"WindowMove" = "Ctrl-F5"
"WindowZoom" = "F5"
"WindowPrev" = "Shift-F6"
"WindowNext" = "F6"
"WindowClose" = "Alt-F3"
"WindowTile" = "F7"
"WindowCascade" = "F8"

"SetProcessor" = 0

"AddStruct" = "Ins"             // add struct type
"DelStruct" = "Del"             // del struct type
"ExpandStruct" = "Ctrl-E"       // expand struct type
"ShrinkStruct" = "Ctrl-S"       // shrink struct type
"MoveStruct" = 0                // move struct type
"DeclareStructVar" = "Alt-Q"    // declare struct variable

"AddEnum" = "Ins"               // add enum
"DelEnum" = "Del"               // del enum
"EditEnum" = "Ctrl-E"           // edit enum
"AddConst" = "Ctrl-N"           // add new enum member
"EditConst" = 'N'               // edit enum member
"DelConst" = 'U'                // delete enum member

Quite a few, eh? Basically, anything in IDA can have a hotkey. Note all of
the 0's in the above list: these options have no hotkeys by default. It is
generally good to set frequently-used operations (ASCII text
representation, View Names, Search, etc) up as hotkeys, and to change
hotkeys which make no sense into better mnemonics.

6. Analysis Parameters
IDA by default has the following Analysis Parameters set:
====================================================================
// Analysis parameters
//-------------------------------------------------------------------------

ENABLE_ANALYSIS = YES           // Background analysis is enabled
SHOW_INDICATOR  = YES           // Show background analysis indicator

#define AF_FIXUP    0x0001      // Create offsets and segments using fixup info
#define AF_MARKCODE 0x0002      // Mark typical code sequences as code
#define AF_UNK      0x0004      // Delete instructions with no xrefs
#define AF_CODE     0x0008      // Trace execution flow
#define AF_PROC     0x0010      // Create functions if call is present
#define AF_USED     0x0020      // Analyse and create all xrefs
#define AF_FLIRT    0x0040      // Use flirt signatures
#define AF_PROCPTR  0x0080      // Create function if data xref data->code32 exists
#define AF_JFUNC    0x0100      // Rename jump functions as j_...
#define AF_NULLSUB  0x0200      // Rename empty functions as nullsub_...
#define AF_LVAR     0x0400      // Create stack variables
#define AF_TRACE    0x0800      // Trace stack pointer
#define AF_ASCII    0x1000      // Create ascii string if data xref exists
#define AF_IMMOFF   0x2000      // Convert 32bit instruction operand to offset
#define AF_DREFOFF  0x4000      // Create offset if data xref to seg32 exists
#define AF_FINAL    0x8000      // Final pass of analysis
                                // See also ANALYSIS2, bit AF2_DODATA

ANALYSIS        = 0xFFFF        // This value is combination of the defined
                                // above bits.

#define AF2_JUMPTBL 0x0001      // Locate and create jump tables
#define AF2_DODATA  0x0002      // Coagulate data segs in the final pass

ANALYSIS2       = 0x0001
====================================================================
Generally, you will not need to change any of these parameters. In case you
feel like playing with them, though, here is the IDA help file description
of each:

Create offsets and segments using fixup info
    IDA will use relocation information to make the disassembly nicer. In
    particular, it will convert all data items with relocation information
    to words or dwords like this:
          dd offset label
          dw seg seg000
    If an instruction has relocation information attached to it, IDA will
    convert its immediate operand to an offset or segment:
          mov eax, offset label
    You can display the relocation information attached to the current
    item by using the show internal flags command.

Mark typical code sequences as code
    IDA knows some typical code sequences for each processor. For example,
    it knows about the typical sequence
          push bp
          mov bp, sp
    If this option is enabled, IDA will search for all typical sequences
    and convert them to instructions even if there are no references to
    them. The search is performed at loading time.

Delete instructions with no xrefs
    This option allows IDA to undefine unreferenced instructions. For
    example, if you undefine an instruction at the start of a function,
    IDA will trace execution flow and delete all instructions that lose
    references to them.

Trace execution flow
    This option allows IDA to trace execution flow and convert all
    referenced bytes to instructions.

Create functions if call is present
    This option allows IDA to create a function (proc) if a call
    instruction is present. For example, the presence of:
          call loc_1234
    leads to creation of a function at label loc_1234

Analyse and create all xrefs
    Without this option IDA will not thoroughly analyse the program. If
    this option is disabled, IDA will simply trace execution flow, nothing
    more (no xrefs, no additional checks, etc)

Use flirt signatures
    Allows usage of FLIRT technology

Create function if data xref data->code32 exists
    If IDA encounters a data reference from a DATA segment to a 32bit CODE
    segment, it will check for the presence of a meaningful
    (disassemblable) instruction at the target. If there is an
    instruction, it will mark it as an instruction and will create a
    function there.

Rename jump functions as j_...
    This option allows IDA to rename simple functions containing only a
          jmp somewhere
    instruction to "j_somewhere".

Rename empty functions as nullsub_...
    This option allows IDA to rename empty functions containing only a
    "return" instruction as "nullsub_..." (... is replaced by a serial
    number: 0,1,2,3...)

Create stack variables
    This option allows IDA to automatically create stack variables and
    function parameters.

Trace stack pointer
    This option allows IDA to trace the value of the SP register.

Create ascii string if data xref exists
    If IDA encounters a data reference to an undefined item, it checks for
    the presence of an ASCII string at the target. If the length of the
    ASCII string is big enough (more than 4 chars in 16bit or data
    segments; more than 16 chars otherwise), IDA will automatically create
    an ASCII string.

Convert 32bit instruction operand to offset
    This option works only in 32bit segments.
    If an instruction has an immediate operand and the operand can be
    represented as a meaningful offset expression, IDA will convert it to
    an offset. However, the value of the immediate operand must be higher
    than 0x10000.

Create offset if data xref to seg32 exists
    If IDA encounters a data reference to a 32bit segment and the target
    contains a 32bit value which can be represented as an offset
    expression, IDA will convert it to an offset

Make final analysis pass
    This option allows IDA to coagulate all unexplored bytes by converting
    them to data or instructions.

Locate and create jump tables
    This option allows IDA to try to guess the address and size of jump
    tables. Please note that disabling this option will not disable the
    recognition of C-style typical switch constructs.

Coagulate data in the final pass
    This option is meaningful only if "Make final analysis pass" is
    enabled. It allows IDA to convert unexplored bytes to data arrays in
    the data segments. If this option is disabled, IDA will coagulate only
    code segments.

7. Character Translations and Allowed Character Lists
The default character rules supplied with IDA are as follows:
====================================================================
// Character translations and allowed character lists
//-------------------------------------------------------------------------

// translation when ASCII string name is built using its contents

XlatAsciiName =
        /*00..0F*/ "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F"
        /*10..1F*/ "\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F"
        /*20..3F*/ " !\"# %&'()*+,-_/" "0123456789:;<=>?"
        /*40..5F*/ "@ABCDEFGHIJKLMNO" "PQRSTUVWXYZ[\\]^_"
        /*60..7F*/ "`abcdefghijklmno" "pqrstuvwxyz{|}~"
        /*80..9F*/ "ABVGDEJZIIKLMNOP" "RSTUFXCCSS I AUQ"
        /*A0..BF*/ "abvgdejziiklmnop" "ªªªªªªª++ªª+++++"
        /*C0..DF*/ "+--+-+ªª++--ª-+-" "---++++++++ª_ªª_"
        /*E0..FF*/ "rstufxccss i auq" "=▌==()~~▌++vn▌ªá";

// the following characters are allowed in ASCII strings, i.e.
// in order to find end of a string IDA looks for a character
// which doesn't belong to this array:

AsciiStringChars =
        "\r\n\a\v\b\t\x1B"
        " !\"#$%&'()*+,-./0123456789:;<=>?"
        "@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_"
        "`abcdefghijklmnopqrstuvwxyz{|}~"
        "▌nTGSastOdFne8-++µ▌(÷=v· +_óúÑPâ"
        "ßf=·±-¬▌+¼¼++í½+ªªªªªªª++ªª+++++"
        "+--+-+ªª++--ª-+----++++++++ª_ªª_"
        "a_GpSs▌tFTOd8fen";

// the following characters are allowed in user-defined names:

NameChars =
        "$?@"           // asm specific character
        "_0123456789"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz";

// the following characters are allowed in mangled names.
// they will be substituted with the SubstChar during output if names
// are output in a mangled form.

MangleChars =
        "$:?([.)]"      // watcom
        "@$%?"          // microsoft
        "@$%";          // borland

SubstChar = '_'
====================================================================
Of these, two areas are of interest. The first is the "NameChars" section,
which dictates which characters may be used for naming an address. For
maximum flexibility (and to help make IDC scripts that automatically
generate names run better), you may want to increase the characters in this
section to include the full range, i.e. "$?@" becomes:
        "$?@!#%^&*-+=~|\}{[]:;><,./"
although this is strictly up to the user. The MangleChars section is also
important for those working from code compiled with mangling set on; if the
compiler of the target uses different mangling characters than the ones
listed (rare), you can include them here--you can also change the character
with which the mangled characters are replaced by changing the SubstChar
value.
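To see why the "str->" prefix needs the extra NameChars, here is a small
Python sketch (not IDA code). The character sets are copied from the
default config above; the two helper functions are invented for
illustration and only model the behavior as the config comments describe
it, so treat them as an assumption about what IDA does internally.

```python
import string

# Copied from the default Ida.cfg excerpt.
NAME_CHARS = ("$?@" + "_0123456789"
              + string.ascii_uppercase + string.ascii_lowercase)
MANGLE_CHARS = "$:?([.)]" + "@$%?" + "@$%"   # watcom, microsoft, borland
SUBST_CHAR = "_"


def rejected_chars(name, allowed=NAME_CHARS):
    """Hypothetical check: characters in name that NameChars lacks."""
    return [c for c in name if c not in allowed]


def substitute_mangle_chars(name, mangle=MANGLE_CHARS, subst=SUBST_CHAR):
    """Hypothetical model of SubstChar: swap out each mangling character."""
    return "".join(subst if c in mangle else c for c in name)


print(rejected_chars("str->Open"))                     # default NameChars
print(rejected_chars("str->Open", NAME_CHARS + "->"))  # after the change
print(substitute_mangle_chars("?Open@@YAXXZ"))
```

With the default set, '-' and '>' are the two characters that get
rejected, which is exactly what the NameChars = "$?@->" change adds.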


Loading a program
-----------------
For all of the examples in this primer, I will be using notepad.exe as a
target; I will also be assuming that the configuration changes mentioned
above have been made.

To begin, launch IDAW.EXE, type "c:\windows\notepad.exe" at the "Select
File" dialog box, and press OK. Immediately IDA will bring up a dialog box
prompting you for loading options. Make sure that Portable Executable is
checked (for Win32 files), that "Create Segments", "Load Resources", and
"Make Imports Section" are checked, and that "Rename DLL Entries" is
unchecked. Also ensure that the "DLL directory" is set to the location of
kernel32.dll et al., usually C:\windows\system. Press OK, and wait for the
green "Ready" notice to appear in the upper left of the IDA menu bar.

A few notes about the IDA user interface may be helpful at this point. IDA
uses a text-mode windowing technique common in console-mode applications;
each window has a top border with a green square (close), a title, and a
green arrow (restore/maximize), a right border with a vertical scroll bar,
and a bottom border with a horizontal scroll bar and a green corner
(resize); the windows may be moved by dragging on the title bar, or resized
by dragging on the green corner. F6 switches between windows (like
Alt-Tab), F7 tiles all windows (except the Messages window, which is like a
desktop), and F8 cascades all windows. Note that the disassembled listing
is referred to as the Code Window or Text Window; you can open multiple
views of the same program by selecting the View->Disassembly menu item, or
by pressing F3.

As with any Windows DOS box, clicking on the small MS-DOS icon (for the
system menu) gives you an Edit submenu with Mark and Copy options; to copy
text out of IDA and into a Windows editor, select Edit->Mark, highlight the
text you want to copy, then select Edit->Copy, then go to the Windows
editor and press Ctrl-V (or Edit->Paste) to insert the text selected from
IDA.


Viewing Imports
---------------
All of the program's imports will appear as names in the program, and may
be viewed in the Names window by selecting the View->Names menu item;
however, as this contains all of the names in the program it may be a bit
confusing. Double-clicking on the name of an import will bring you to its
entry in the .idata segment (see below).

Another way to view the imports is to select the View->Segments menu item,
which will bring up the Segments window. Double-click on the .idata
segment; this will jump the disassembled listing to the start of the .idata
segment, which will contain all of the program's imports in pink text. To
the right of each import, at the end of the line, will be a list of
addresses in the program which call that import. Double-clicking on one of
these addresses will jump the disassembled listing to that address.

Example: View the .idata segment of Notepad.exe as mentioned above. The
imports are sorted by module; scroll down to the Kernel32.dll imports and
find the one for "lstrcmpA". You should see a line like this:

|00407300 ?? ?? ?? ??    extrn lstrcmpA:dword   ; DATA XREF: sub_401FAC+15
|00407300                                       ; sub_4045AF+3E^r
|00407300                                       ; .text:004046B9^r
|00407300                                       ; .text:004046DD^r

Each of the locations after a ";" is an address in the file that calls
lstrcmpA; these are known as cross-references, or X-refs for short.
Double-click on the first one; note how it brings you to

|00401FC1 FF 15 00 73 40 00     call    ds:lstrcmpA
|00401FC7 85 C0                 test    eax, eax
|00401FC9 75 10                 jnz     short loc_401FDB

Press Esc to go back to the lstrcmpA entry, then double-click on each of
the remaining X-refs to scope out the caller code. Note how you can scout
out each caller routine by double-clicking on call/jmp locations within the
code, and by double-clicking on X-refs to see who initiated the caller
routine; Esc, as always, returns you back the way you came, one step at a
time.

A final method of viewing imports is to write an IDC script.
IDC is the IDA macro language; it stands for IDA-C much in the way that QCC
stands for Quake-C. All IDA scripts must include the file IDC.IDC, which
contains a number of internal IDA functions and constants. The IDC language
is a lot like C, and is described in the file IDC.TXT--here is a brief
excerpt summarizing the language:
====================================================================
IDC supports the following statements:
     if (expression) statement
     if (expression) statement else statement
     for ( expr1; expr2; expr3 ) statement
     while (expression) statement
     do statement while (expression);
     break;
     continue;
     return <expr>;
     return;                 the same as 'return 0;'
     { statements... }
     expression;             (expression-statement)
     ;                       (empty statement)

In expressions you may use almost all C operations except:
     ++,--
     complex assigment operations as '+='
     ,  (comma operation)

Here is how a function is declared :
     static func(arg1,arg2,arg3) { ... }

Here is how a variable is declared :
     auto var;
====================================================================
That said and done, here is a script for listing the file's imports by API
module to the IDA Messages window (the blue one with all of the yellow
writing on it):
====================================================================
//Imports.idc : Outputs list of imported functions to the Message Window
#include <idc.idc>

//Return the start of the first segment whose name begins with ".idata",
//or -1 if there is no such segment
static GetImportSeg()
{
    auto ea, name;
    ea = FirstSeg();
    while ( ea != -1 ) {
        name = SegName(ea);
        if ( substr( name, 0, 6 ) == ".idata" ) break;
        ea = NextSeg(ea);
    }
    return ea;
}

static main()
{
    auto BytePtr, EndImports;
    BytePtr = SegStart( GetImportSeg() );
    EndImports = SegEnd( BytePtr );
    Message(" \n" + "Parsing Import Table...\n");
    while ( BytePtr < EndImports ) {
        //A non-blank anterior line marks the start of a new API module
        if (LineA(BytePtr, 1) != "")
            Message("\n" + "____" + LineA(BytePtr,1) + "____" + "\n");
        //Skip addresses that carry no name
        if (Name(BytePtr) != "")
            Message(Name(BytePtr) + "\n");
        BytePtr = NextAddr(BytePtr);
    }
    Message("\n" + "Import Table Parsing Complete\n");
}
==================================================================== The coding is pretty straight forward if you know C: the script finds the .idata segment, prints each non-blank anterior comment line (i.e., the line that tells what API module the following imports belong to), then prints the Name of each defined/named address in the .idata section. The script is executed by pressing F2 and selecting "imports.idc", assuming that you have saved the script as imports.idc in the \IDA37?\IDC directory. Viewing Exports --------------- Viewing exported functions s a little easier. Perhaps the quickest way is to select the Options-Name Representation menu item, and mark the "type of names" dialog so it includes only publics, as follows: Types of names included in the list of names: [ ] Normal [X] Public [ ] Autogenerated [ ] Weak Press Ok and then select the View->Names menu item; the Names window will now only contain the exported functions of the program. As with any of the Names/Segments/etc windows, double clicking on any line will bring that function up in the "code window". [Note: if you have modified the IDA.cfg file as mentioned above, you can also browse the imports in this manner by checking only "Normal" in the dialog box illustrated above, then ignoring everything with a "str->" prefix; the remainder will be imports.] If the program has an .edata segment, you can also view the exports there much in the same manner as in the .idata method given in the previous section. Note that Notepad has only one export ("start", the program entry point) and also has no .edata segment. The IDC method works for exports as well. 
The following IDC script searches for entry points into the program and displays them in the message window:

====================================================================
//exports.idc : display program entry points to the message window
#include <idc.idc>

static main() {
   auto x, ord, ea;
   Message("\n Program Entry Points: \n \n");
   //Valid indexes run from 0 to GetEntryPointQty()-1
   for ( x=0; x < GetEntryPointQty(); x = x+1){
      ord = GetEntryOrdinal( x );
      ea = GetEntryPoint( ord );
      Message( Name( ea ) + ": Ordinal " + ltoa( ord,16 ) + " at offset " + ltoa( ea, 16) + "\n");
   }
   Message("\n" + "Export Parsing Complete\n");
}
====================================================================

Once again, this script may be run by pressing F2 and selecting "exports.idc".

Viewing Strings/Resources
-------------------------
The strings can be previewed by selecting "Normal" as the "Type of names to be shown in the list of names" in the Options->Name Representation dialog box, and then looking for everything beginning with the prefix "str->" (or "a", if using IDA straight out of the box).

In PE files, strings are commonly kept in a string table in the .rsrc segment. However, IDA does not by default parse the .rsrc segment for strings. Thus, an IDC script can be written to parse the .rsrc section for us, creating strings where any standard ASCII character is found so that the strings may be browsed either in the .rsrc segment, or in the names window:

====================================================================
//RSRC_Strings.IDC
//define all std ASCII characters in the .rsrc segment as strings
#include <idc.idc>            //This file contains all of the
                              //function protos we will be using
static main(){
   auto seg, ea, seg_end;     //auto is the standard variable type
   seg = FirstSeg();          //Get Addr of first segment into seg
   while (seg != BADADDR) {
      Message( "Analyzing " + SegName(seg) + "...\n" );
      //Is this the .RSRC segment? If so...
      if ( SegName(seg) == ".rsrc"){
         Message(" RSRC found!\n");
         ea = seg;
         seg_end = SegEnd(seg);   //Fixed bound, so the scan stops at the segment end
         while ( ea < seg_end ) {
            //Change every Std ASCII character (0x20..0x7E) into a string
            if ( Byte(ea) >= 0x20 && Byte(ea) < 0x7F){
               MakeStr( ea, -1 );
               MakeRptCmt(ea, Name(ea));
               ea = ea + ItemSize( ea );
            }
            else ea = ea + 1;
         }
      }
      seg = NextSeg(seg);     //Goto Next Segment
   }
   Message("Done!\n");
}
====================================================================

The IDC script is functional, though not perfect (plenty of random bytes get defined as strings, but it is a quick up-and-running script). Notice that IDC.IDC contains a lot of function prototypes for use in IDC scripts; by including it, you are able to call all of the FirstSeg(), NextSeg(), etc functions. These functions are poorly documented, but the commented prototypes should give you enough to go on.

The IDC script can be placed in the \IDC directory and run by pressing F2 and choosing the rsrc_strings.idc script. Note that this script assumes that you have the default string type set as "Unicode"; as such it will parse any Unicode resource names or values in the .rsrc segment. For a full-fledged resource parsing IDC script a lot more work is in order; I have started such a project with a script known as reslib.idc (too large to include here) which is publicly available.
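The "random bytes defined as strings" problem comes from treating every single printable byte as the start of a string. One common way to cut the noise is to accept only runs of printable characters above some minimum length. The following is a minimal sketch of that idea in Python, run on a raw byte buffer outside IDA; the function name and the default minimum of 4 are assumptions made for illustration, and it handles plain ASCII only (not the Unicode strings found in real .rsrc data):

```python
def find_ascii_strings(data: bytes, min_len: int = 4):
    """Return (offset, text) for each run of printable ASCII
    (0x20..0x7E) that is at least min_len bytes long."""
    results = []
    start = None                      # offset where the current run began
    for i, b in enumerate(data):
        if 0x20 <= b < 0x7F:
            if start is None:
                start = i
        else:
            # A non-printable byte ends the run; keep it only if long enough
            if start is not None and i - start >= min_len:
                results.append((start, data[start:i].decode("ascii")))
            start = None
    # Handle a run that extends to the end of the buffer
    if start is not None and len(data) - start >= min_len:
        results.append((start, data[start:].decode("ascii")))
    return results


if __name__ == "__main__":
    sample = b"\x00\x01Hi\x00Cannot open\x00ab\xff"
    print(find_ascii_strings(sample))
```

With the sample buffer, the two-character runs "Hi" and "ab" are rejected and only "Cannot open" (at offset 5) is reported. The same length test could be applied in RSRC_Strings.IDC before calling MakeStr.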
After running RSRC_Strings.IDC we can create and run a second script which will print out all of the strings (that is, every location name that begins with "str->") in the disassembled listing:

====================================================================
//ss.idc : display all strings in the program
#include <idc.idc>

static main() {
   auto ea;
   ea = FirstSeg();
   Message("\n" + "Strings in Application: \n \n");
   while( ea != BADADDR) {
      if( substr( Name(ea), 0, 5) == "str->") {
         Message( substr(Name(ea), 5, -1) + " at address " + ltoa( ea, 16) + "\n" );
      }
      ea = NextAddr(ea);
   }
   Message("\n" + "String Listing Complete\n");
}
====================================================================

Running this after the previous IDC script will reveal the flaw in the first one: a lot of garbage ASCII bytes are listed as strings--more, in fact, than there are actual strings. For this reason it is important to refine your scripts so they print out only the string table and resource names in the .rsrc section (as I have done with the reslib.idc script), rather than blindly naming locations.

Searching for Strings/Code
--------------------------
Once you have defined strings, you can search for them using the Navigate->Search For->Text... menu item. For instance, entering the string "Cannot" at this dialog box will bring up the "YouCannotQuitWindows" string in the Code window. The shortcut for FindText is Alt-T, and for FindNextText is Ctrl-T. A "Pattern is not found" message will appear at the bottom of the message window when there are no more occurrences of the text.

What if your string has not been defined? If it is not Unicode, then you can search for it using Navigate->Search For->Text In Core... (Alt-B), by entering the string in quotes at the dialog box, as follows:

+-[_]--------------- Binary search --------------------+
|                                                      |
|  Enter search (down) string:                         |
|  String  "FindReplace"                               |
|                                                      |
|  [X] Case-sensitive              (*) Hex             |
|                                  ( ) Decimal         |
|                                  ( ) Octal           |
|                                                      |
|      OK         Cancel       F1 for Help             |
+------------------------------------------------------+

This will find occurrences of "FindReplace" in the file. You can also search for the text using the hexadecimal equivalents of the ASCII characters:

+-[_]--------------- Binary search --------------------+
|                                                      |
|  Enter search (down) string:                         |
|  String  46 69 6E 64                                 |
|                                                      |
|  [X] Case-sensitive              (*) Hex             |
|                                  ( ) Decimal         |
|                                  ( ) Octal           |
|                                                      |
|      OK         Cancel       F1 for Help             |
+------------------------------------------------------+

This will search for "Find" in the disassembled listing. In this way you can search for Unicode strings as well:

+-[_]--------------- Binary search --------------------+
|                                                      |
|  Enter search (down) string:                         |
|  String  43 00 61 00 6E 00 6E                        |
|                                                      |
|  [X] Case-sensitive              (*) Hex             |
|                                  ( ) Decimal         |
|                                  ( ) Octal           |
|                                                      |
|      OK         Cancel       F1 for Help             |
+------------------------------------------------------+

This will search for the Unicode string "Cannot" (the bytes shown spell out its first four characters, "Cann"). Note that simply searching for the quoted string "Cannot" will fail due to the 00 bytes that Unicode inserts between characters. Thus, to search for Unicode strings by their text, they must be defined first.

Searching for code can be done in the same way, using the Text In Core method.
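The hexadecimal strings typed into the Text In Core dialog can be produced mechanically rather than by looking up each character. Here is a minimal sketch in Python, run outside IDA; the function name to_search_bytes and its wide parameter are made up for this illustration, and "Unicode" is assumed to mean the 16-bit little-endian encoding used in PE resources:

```python
def to_search_bytes(text: str, wide: bool = False) -> str:
    """Return the space-separated hex bytes for text, suitable for
    pasting into a binary search dialog.

    wide=False encodes one byte per character (ASCII).
    wide=True  encodes two bytes per character (UTF-16 little-endian),
               which yields the interleaved 00 bytes.
    """
    encoded = text.encode("utf-16-le" if wide else "ascii")
    return " ".join(f"{b:02X}" for b in encoded)


if __name__ == "__main__":
    print(to_search_bytes("Find"))              # ASCII form
    print(to_search_bytes("Cann", wide=True))   # Unicode form
```

For "Find" this gives 46 69 6E 64, matching the second dialog. For "Cann" in wide mode it gives 43 00 61 00 6E 00 6E 00, which is the third dialog's string plus the trailing 00 of the last character.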
To search for an instruction, enter its opcode bytes; for example, the following will search for "test eax, eax":

+-[_]--------------- Binary search --------------------+
|                                                      |
|  Enter search (down) string:                         |
|  String  85 C0                                       |
|                                                      |
|  [X] Case-sensitive              (*) Hex             |
|                                  ( ) Decimal         |
|                                  ( ) Octal           |
|                                                      |
|      OK         Cancel       F1 for Help             |
+------------------------------------------------------+

You can use the standard Text search for opcodes as well, though you will get a lot of hits (you can search for the text "test", but not for "test eax, eax").

There is, of course, a final option to make searching for strings much easier--you can write an IDC script to front-end for the "Search for Text In Core" function. The following IDC script will do just that, allowing you to enter a text string to search for, then converting the string to hexadecimal and feeding it to the "Text In Core" function:

====================================================================
// textsearch.idc : search for undefined strings
#include <idc.idc>

static main() {
   auto ea, x, y, searchstr, temp_c, binstr, array_id, alphabet, bin_c, cont;
   ea = FirstSeg();
   binstr = "";                       //Start with an empty hex string

   // ---- Create Array Of ASCII Characters ------------------------
   // ---- Note that the index of each char = its decimal value ----
   // ---- Only '0' (48) through 'z' (122) are covered -------------
   array_id = CreateArray("AsciiTable");
   alphabet = "0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz";
   y = 48;
   for (x = 0; x < strlen(alphabet); x = x + 1 ) {
      SetArrayString( array_id, y, substr(alphabet, x, x+1));
      y = y +1;
   }

   // ---- Prompt User For Search String ----------------------------
   searchstr = AskStr("", "Enter a search string:\n");

   // ---- Cycle through array looking for match --------------------
   for (x = 0; x < strlen(searchstr); x = x + 1 ) {
      temp_c = substr(searchstr, x, x + 1 );
      //GetNextIndex returns -1 past the last element, so stop there
      for( y = GetFirstIndex(AR_STR, array_id); y != -1;
           y = GetNextIndex(AR_STR, array_id, y) ) {
         if (temp_c == GetArrayElement(AR_STR, array_id, y)) {
            bin_c = y;
            break;
         } //End "If Match"
      } //End Array Loop
      binstr = form("%s %X", binstr, bin_c);      //Standard Version
      //binstr = form("%s %X 00", binstr, bin_c); //Unicode Version
   } //End Search String Loop
   Message("Search string is " + binstr + "\n");  //Debug Control

   // -------- "Search" and "Search Again" Loop... --------------------
   cont = 1;
   while (cont==1) {
      ea = FindBinary(ea, 1, binstr);             //Search From ea
      if( ea == -1) {                             //If No Hits
         Warning("No more occurrences");          //MessageBox
         cont = 0;
         break;                                   //Leave
      }
      Jump(ea);                                   //Position Cursor At Hit
      cont = AskYN( 1, "Find next occurrence?" ); //Search Again?
   }

   // --------- Cleanup and Exit
   Message("\n" + "Search Complete\n");
   DeleteArray(array_id);
}
====================================================================

Location Names
--------------
In IDA, location names are your greatest asset. Naming locations whose purpose you know or suspect allows you to quickly browse the code for references to that location. For example, do the following:

 1. Go to the lstrcmpA import listing
 2. Double-click on the first X-ref; this should put you at 00401FC1
 3. Scroll up to the start of the function (401FAC) and use the N command to name it "StringCmpFunc"
 4. Rename 401FDB to "StringCmpFailed" (because of the JNZ at 401FC9)
 5. Name 402033 "Good String Name" (for the JMP at 401FD9)

Instantly the function is more readable. Now, go to the X-refs at 401FAC and double-click on the first one; this will put you at 00402816 (yes, we are back-tracing! Great, isn't it?). Here you are in a great huge routine, and the "StringCmpFunc" stands out from the rest in bright yellow. The rest of the internal functions (sub_???????) can be named in the same way.

Now some elementary searching and browsing: You'll notice that you can see all of the names you created with the N command in the Names window.
Using Alt-T (search text), you can look for occurrences of StringCmpFunc in the disassembled listing, which will show you all of the locations that reference this function.

Ok, comments: you can comment code using the ";" key. Go back to the "StringCmpFailed" location (look it up in the Names window), hit the ";" key and type in the text "Bad String Entered!". This is what is known as a "repeatable comment". Why repeatable? Because every address that refers to this location will now have that comment suffixed to it--go back up to 401FC9 to verify. Cool, eh? You will never go back to W32Dasm...

Producing an Output File
------------------------
Producing an output file is relatively simple. If you want a full listing of the names, comments, addresses, in short everything in the Code Window, use File->Produce Output File->Produce LST File. If you just want the ASM source code, with no addresses, use File->Produce Output File->Produce ASM File. If you want to produce a tiny file that will make all of the changes that you just made to an executable (in case you want someone else to be able to duplicate your .idb [idb: IDA database, containing all of your changes to the exe and the disassembled listing]), use File->Produce Output File->Produce IDC File--this will create an IDC script that, when run, will leave the disassembled listing identical to yours.

Advanced Techniques
-------------------
1.IDS files and Comment Databases

Custom IDS files are very useful; you will need to download the IDS utilities from
   http://www.unibest.ru/~ig/idsutil.zip
Basically, you create an IDT file from a .DLL by running the DLL2IDT utility. From there you can comment the IDT file and compress it into an IDS file using ZIPIDS, and finally move it to the appropriate subdirectory (based on OS) of \IDS.
An IDT file looks like this:

   ALIGNMENT 4
   ;DECLARATION
   ;
   0 Name=ADVAPI32.dll
   ;
   1 Name=AbortSystemShutdownA
   2 Name=AbortSystemShutdownW
   3 Name=AccessCheck
   4 Name=AccessCheckAndAuditAlarmA
   5 Name=AccessCheckAndAuditAlarmW
   6 Name=AddAccessAllowedAce
   7 Name=AddAccessDeniedAce
   8 Name=AddAce
   9 Name=AddAuditAccessAce
   10 Name=AdjustTokenGroups
   11 Name=AdjustTokenPrivileges
   ...

With this file, you can provide comments for various functions by adding "Comment=" lines to each, for example:

   154 Name=RegCreateKeyA Comment=Create a Key in the System Registry

Note that an IDT line has the following structure:

   Ordinal Name=name Args=args Drops=drops Pascal=pascal Typeinfo=type Comment=comment RptCmt=ord#

The keywords are defined as follows:

   Name     : name of entry point [string]
   Args     : number of bytes occupied by entry point arguments [number]
   Drops    : number of bytes purged from the stack upon return [number]
   Pascal   : the same as Args=Drops= [number]
   Typeinfo : entry point function prototype (type of input/output arguments) [string]
   Comment  : a comment for this entry point [string]
   RptCmt   : use the comment from the specified entry point [number]

Wouldn't it be nice to have all of the API prototypes entered as comments into the IDS files? Well, it can be done, though no-one in their right mind would attempt it by hand. One of the most basic programming tools, grep.exe, will allow you to search an entire directory for lines in any file containing a specific search pattern. If you were to grep an entire directory for WINAPI or STDCALL, you would then have as output a file with every 1-line API prototype in it.
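The IDT line structure described above is regular enough to be split into fields mechanically, which is handy when checking a hand-commented file before compressing it. What follows is only a sketch in Python, outside IDA: the function name parse_idt_line is hypothetical, the keyword spellings are taken from the structure line above, and it assumes that no comment text itself contains one of the "Keyword=" markers.

```python
import re

# Keywords as spelled in the IDT line structure shown above
KEYWORDS = ("Name", "Args", "Drops", "Pascal", "Typeinfo", "Comment", "RptCmt")
_FIELD = re.compile(r"\b(%s)=" % "|".join(KEYWORDS))


def parse_idt_line(line: str):
    """Split one IDT entry into a dict, or return None for comment
    lines (starting with ';'), blank lines and non-entry lines."""
    line = line.strip()
    if not line or line.startswith(";"):
        return None
    ordinal, _, rest = line.partition(" ")
    if not ordinal.isdigit():
        return None                   # e.g. the "ALIGNMENT 4" line
    entry = {"Ordinal": int(ordinal)}
    matches = list(_FIELD.finditer(rest))
    for i, m in enumerate(matches):
        # A value runs up to the next keyword, or to the end of the line
        end = matches[i + 1].start() if i + 1 < len(matches) else len(rest)
        entry[m.group(1)] = rest[m.end():end].strip()
    return entry


if __name__ == "__main__":
    print(parse_idt_line(
        "154 Name=RegCreateKeyA Comment=Create a Key in the System Registry"))
```

All values are kept as strings (only the ordinal is converted to a number), since the sketch is meant for inspection rather than for rebuilding an IDS file.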
The following perl script will take an .idt file and a grep output file, and output an .idt file commented with the API prototypes to stdout or a specified filename:

====================================================================
#!/usr/bin/perl

# At least the idt file and the grep file must be given
if ($#ARGV < 1) {
   print "Usage: h2idt [idtfile] [grepfile] [outfile]\n";
   print "Output defaults to stdout\n";
   exit (1);
}
$idtfile = $ARGV[0];
$grepfile = $ARGV[1];
if ($#ARGV == 2) {
   $outfile = ">" . $ARGV[2];
} else {
   $outfile = ">-";
}

open(IDTFILE, $idtfile) || die "Can't open file: $!\n";
open(GREPFILE, $grepfile) || die "Can't open file: $!\n";
open(OUTFILE, "$outfile") || die "Can't create file: $!\n";

# Read every prototype line from the grep output into an array
$i = 0;
foreach (<GREPFILE>){
   s/[\r\n]+$//;                  #strip the line ending
   $greparray[$i] = $_;
   $i++;
}

print OUTFILE ";DECLARATION \n";
print OUTFILE ";ALIGNMENT 2 \n\n";
print OUTFILE "; Module Name and Description \n";

foreach (<IDTFILE>) {
   if ( /^0/ ){                   #ordinal 0 is the module name line
      s/\\//;
      print OUTFILE $_;
      print OUTFILE ";---------------------------------------\n";
   }
   elsif ( /Name=/ ){
      if (/\n/){
         chop;                    #get rid of LF
      }
      if (/\r/){
         chop;                    #get rid of CR
      }
      $outstr = $_;
      ($junk, $searchstr, $junk) = split(' ', $_, 3);
      $searchstr =~ s/Name=//;
      $comment='';
      foreach(@greparray) {
         if (/\s$searchstr\(/) {
            $comment = $_;
         }
      }
      $outstr =~ s/\\//;
      if ($comment ne '') {       #string comparison, not numeric
         $comment =~ s/^[^a-zA-Z]+//;
         $comment =~ s/\n//;
         $comment =~ s/;//;
         $comment =~ s/STDCALL\s//;
         print OUTFILE $outstr, " Comment=", $comment, "\n";
      } else {
         print OUTFILE $outstr, "\n";
      }
   }
}
print OUTFILE ";------------------EOF------------------";
close(OUTFILE);
close(GREPFILE);
close(IDTFILE);
exit (0);
====================================================================

As usual with Perl/unix files, strip the above of CR/LF's before you run it in perl (you can use Editeur, or nedit for this, depending on your OS).

So, how do you do this from NT?
Well, assuming you have the NT resource kit, the process for extracting an IDT file from an existing IDS file, grepping for prototypes (I use LCC as the protos are all 1-line), creating the commented IDT file and compressing it into an IDS file, is as follows:

   c:\ntreskit\posix\grep STDCALL c:\lcc\include\* > grep.out
   c:\ida\Utility\IDSUtil\WIN32\zipids -u c:\ida\Ids\Win\kernel32.ids
   c:\ntreskit\perl\perl.exe h2idt kernel32.idt grep.out out.idt
   c:\ida\Utility\IDSUtil\WIN32\zipids out.idt
   ren out.ids kernel32.ids

You will get --in the IDT file-- output similar to the following:

===========================================================================
;DECLARATION
;ALIGNMENT 2

; Module Name and Description
0 Name=KERNEL32.dll
;---------------------------------------
50 Name=AddAtomA Pascal=2 Comment=ATOM AddAtomA(LPCSTR);
102 Name=AddAtomW Pascal=2 Comment=ATOM AddAtomW(LPCWSTR);
103 Name=AllocConsole Pascal=0 Comment=BOOL AllocConsole(VOID);
104 Name=AllocLSCallbac Comment=BOOL AllocConsole(VOID);
===========================================================================

Note that everything after the "Comment=" will appear in the comment margin of IDA.

In addition to the IDS files, you can also maintain a database of comments that will be inserted into the code upon disassembly.
The IDA comment database is stored in the IDA.INT file, and it can be modified with the LoadINT utility available at
   http://www.unibest.ru/~ig/ldint37.zip
The Readme file best documents how to edit this database, but to show you a brief example of the comments supplied with IDA, here is an excerpt from the PC section of the INT:

   // MMX instructions
   NN_emms:      "Empty MMX state"
   NN_movd:      "Move 32 bits"
   NN_movq:      "Move 64 bits"
   NN_packsswb:  "Pack with Signed Saturation (Word->Byte)"
   NN_packssdw:  "Pack with Signed Saturation (Dword->Word)"
   NN_packuswb:  "Pack with Unsigned Saturation (Word->Byte)"
   NN_paddb:     "Packed Add Byte"
   NN_paddw:     "Packed Add Word"
   NN_paddd:     "Packed Add Dword"

These comments will appear (if "auto comments" is turned on) whenever the opcode is encountered in the disassembly; note that you can browse through the .cmt files included with LoadINT to see what the existing comments are. The most interesting will be int.cmt, pc.cmt, portin.cmt, portout.cmt, and vxd.cmt. It is tempting --but rather daunting-- to port Ralf Brown's Interrupt List comments to an INT database...

2.IDC Scripts

I have used IDC scripts for a number of monotonous tasks. Basically, you can use an IDC script to parse VCL resources, to parse VB forms (if you take the time...), to encrypt or decrypt sections of code, to print out a call trace, to perform searches for the user (e.g. a front-end to the RegEx feature), etc. Here are a few quick additional IDC scripts to demonstrate their usefulness:

====================================================================
//copy.idc: Outputs selected text to an .asm file
//Usage: Select text with mouse or cursor, hit F2 and type copy.idc, enter a filename when prompted
//       and the selected text will be written to that file.
//Future Plans: Make this output to the Windows clipboard. I may have to patch IDA for this....
//
// code by mammon_ All rights reversed, use as you see fit.....
//------------------------------------------------------------------------------------------------------
#include <idc.idc>

static main(){
   auto filename, start_loc, end_loc;
   start_loc = SelStart();          //First address of the selection
   end_loc = SelEnd();              //End address of the selection
   filename = AskFile( "asm", "Output file name?");
   WriteTxt( filename, start_loc, end_loc);
   return 0;
}
====================================================================
//------------------------------------------------------------------------------------------------------
//Header.idc : Imports #defines from a .h file, adds as enums
//Note: This script prompts the user for a header file (*.h), then parses the
//      file looking for #define statements: these are then converted to members
//      of enum "Defines".
//Bugs: Only the first instance of any value will be preserved; all others will be
//      discarded with an error as you can have only one instance of any value (or
//      any name) in a single enumeration. A prompt has been added for the user to
//      name the enumerations for the header file, so that any duplicate enum values
//      can be added to a different file and enumerated under a different "enum name."
//
// code by mammon_ All rights reversed, use as you see fit.....
//------------------------------------------------------------------------------------------------------
#include <idc.idc>

//Advance BytePtr past any spaces in the file and return the new offset
static strip_spaces( BytePtr, hHeaderFile){
   auto tempc;
   fseek( hHeaderFile, BytePtr, 0);
   tempc = fgetc(hHeaderFile);
   while ( tempc == 0x20) {
      BytePtr = BytePtr + 1;
      fseek( hHeaderFile, BytePtr, 0);
      tempc = fgetc(hHeaderFile);
   }
   return BytePtr;
}

//Return StrName cut off after its first space (or whole if it has none)
static FindStringEnd( StrName ){
   auto x, tempc;
   for ( x = 1; x < strlen(StrName); x = x + 1) {
      tempc = substr( StrName, x-1, x);
      if ( tempc == " ") {
         return substr( StrName, 0, x);
      }
   }
   return substr( StrName, 0, strlen(StrName));
}

//Build an enum member name: prefix "def", drop underscores and the final character
static FixString( StrName ){
   auto x, tempc, newname;
   newname="def";                   //set newname to type character
   for ( x = 1; x < strlen(StrName); x = x + 1) {
      tempc = substr( StrName, x-1, x);
      if ( tempc != "_") {
         newname = newname + tempc;
      }
   }
   return newname;
}

static main(){
   auto HeaderFile, hHeaderFile, fLength, BytePtr, first_str, second_str, third_str, define_val;
   auto enum_id, tempc1, x, y, errcode, define_name, FilePtr, define_str, enum_name;
   FilePtr = 0;
   Message("\nStart Conversion\n");
   HeaderFile = AskFile( "*.h", "Choose a header file to parse:");
   enum_name = AskStr("Defines", "Enter a name for the enumerations (alpha only, eg 'VMMDefines'):");
   hHeaderFile = fopen( HeaderFile, "r");
   fLength = filelength(hHeaderFile);
   if( fLength == -1) Message( "Bad File Length!\n");
   enum_id = AddEnum( GetEnumQty() + 1, enum_name, FF_0NUMH);
   if ( enum_id == -1) {
      enum_id = GetEnum( enum_name );
      if(enum_id == -1) Message("Enum #Defines not created/not found\n");
   }
   SetEnumCmt( enum_id, "#define from " + HeaderFile, 1);
   while(FilePtr < fLength ){
      FilePtr = strip_spaces( FilePtr, hHeaderFile );
      BytePtr = FilePtr;
      errcode = fseek( hHeaderFile, BytePtr, 0 );
      if ( errcode != 0) break;
      first_str = readstr( hHeaderFile );
      if ( first_str == -1 ) {
         Message( "End of file!\n" );
         break;
      }
      else if ( substr(first_str, 0, 7) == "#define" || substr( first_str, 0, 7) == "#DEFINE" ) {
         FilePtr = FilePtr + strlen( first_str );
         BytePtr = BytePtr + 7;
         BytePtr = strip_spaces( BytePtr, hHeaderFile );
         errcode = fseek( hHeaderFile, BytePtr, 0 );
         if ( errcode != 0 ) break;
         second_str = readstr( hHeaderFile );
         if ( second_str == -1 ) {
            Message( "End of file after #define!\n" );
            break;
         }
         else {
            define_name = FindStringEnd( second_str );
            define_name = FixString( define_name );
            BytePtr = strip_spaces( BytePtr + strstr( second_str, " " ), hHeaderFile );
            errcode = fseek( hHeaderFile, BytePtr, 0);
            if ( errcode != 0 ) break;
            third_str = readstr( hHeaderFile);
            if ( third_str == -1) {
               Message( "End of file before value!\n");
               break;
            }
            tempc1 = substr(third_str, 0, 2);     //Taken only after the end-of-file test
            if ( tempc1 == "0x" || tempc1 == "0X") {
               //Only hexadecimal values are imported
               define_str = FindStringEnd( third_str );
               define_val = xtol( define_str );
               errcode = AddConst( enum_id, define_name, define_val);
               if ( errcode == 1 ) Message( "Name " + define_name + " bad or already used in program!\n");
               if ( errcode == 2 ) Message( "Value " + define_str + " already used in program!\n");
               if ( errcode == 3 ) Message( "Bad enumID!\n");
            }
         }
      }
      else FilePtr = FilePtr + strlen( first_str);
   }
   fclose( hHeaderFile );
   Message("\nConversion finished!\n");
}
====================================================================
//------------------------------------------------------------------------------------------------------
//funcalls.idc : Display the calls made by a function
#include <idc.idc>

static main(){
   auto ea,x,f_end;
   ea = ChooseFunction("Select a function to parse:");
   f_end = FindFuncEnd(ea);
   Message("\n*** Code References from " + GetFunctionName(ea) + " : " + atoa(ea) + "\n");
   for ( ea ; ea < f_end; ea = NextAddr(ea) ) {
      //Walk every code reference leaving this address
      x = Rfirst0(ea);
      while ( x != BADADDR) {
         Message(atoa(ea) + " refers to " + Name(x) + " : " + atoa(x) + "\n");
         x = Rnext0(ea,x);
      }
   }
   Message("End of output. \n");
}
===================================================================

And, finally, I have referred to a reslib.idc file throughout this work. It can be found at
   http://www.eccentrica.org/Mammon/Reslib.idc
with its "caller file" at
   http://www.eccentrica.org/Mammon/Res.idc

3.Map files

Map files may be generated by IDA using the File->Produce Output File->Produce Map File menu item. All of the user-created and auto-generated names (if selected) will be included as symbols in the .MAP files, which then can be converted into Soft-Ice symbol files using NMSYM.EXE. Note that there are a few tricks to this; I recommend using Gij's MaptoMap utility for the conversion.

4.ASM files

The ASM files may be used to produce compilable source code. This is not, strictly speaking, the province of the cracker, but a bit of good practice can be found by taking various small .COM files (such as debug, edit, or the various Crack-me's) and re-compiling them.