Beyond Compare v1.7c - packed with ASPack v1.08.03 (244k).
In this first phase we are going to use ASPack to compress Beyond Compare v1.7c, an arbitrary target which I selected purely because it is written in Delphi. ASPack has only 2 real configurable options to speak of: the name of one of the new sections is configurable (defaults to .adata), and you can decide whether to compress the resources, i.e. the .rsrc section. These are of course fairly trivial options as far as we are concerned.
Compressing beyond32.exe reduces the file size from 663k to 253k (a saving of 410k), see how bloated Delphi programs really are :). The first thing we'll do is PEDUMP each of the files (use the /I switch); these will be invaluable references when we come to the reconstruction. Now let's loosely document the tasks that most PE file packers perform:-
i) Implement a compression &/or an encryption algorithm (most are variations on known algorithms).
ii) Addition of loading/decrypting/unpacking code.
iii) Required fix-ups to the PE header to make the file run.
The fundamental weakness of every packer is that no matter how strong the algorithms are, at some stage if the code/resource is used it will be in-memory in decoded form (I have affectionately termed this "Stone's theorem"), the question is how easy that location is to find. ASPack is a pretty standard PE packer in all these respects, it adds a very predictable piece of loading & unpacking code at the entry point, unpacks the sections and eventually resumes execution at the original program entry point.
The first stage for us therefore will be to find the program entry point of our compressed beyond32.exe. You can use any number of tools to do this, IDA is best in this instance because it accurately disassembles the start-up code.
:004AC000 PUSHA <-- Save all registers.
:004AC001 CALL $+5 <-- Anti-W32Dasm & SoftICE loader trick.
:004AC006 POP EBP
.................. <-- Gets delta-offset next.
You should now bpx for this code, I replaced the POP EBP with an INT 3 to make
this easier. Now trace through slowly, I'm not going to show you too much code,
instead I'll explain what happens. At the very start the program retrieves the
Image Base and then passes kernel32.dll to GetModuleHandleA (this means that
the program can next get addresses of exports from kernel32.dll with
GetProcAddress), addresses are retrieved for VirtualAlloc() & VirtualFree(), these
will form the crucial part of the unpacking process.
Before going any further I feel I must explain some key terms. You're going to run into the acronym RVA (relative virtual address) a lot when unpacking; an RVA is simply the offset of a particular item relative to the address where the file was memory mapped. Most .EXE files are loaded by default at 0x400000, although this can be overridden by your compiler. Let's say we found in SoftICE the .rsrc section starting at virtual address 0x48B000, the RVA would be:-
0x48B000 - 0x400000 = 0x8B000.
This simple equation is very important. We move back to the important code.
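The same subtraction can be written as a tiny helper. This is only a sketch in Python (not one of the tools used in this essay); the names IMAGE_BASE, va_to_rva and rva_to_va are my own, and the default image base of 0x400000 is assumed.

```python
# Assumed default image base for this target (a compiler can override it).
IMAGE_BASE = 0x400000

def va_to_rva(va, image_base=IMAGE_BASE):
    """Convert a virtual address seen in the debugger into an RVA."""
    return va - image_base

def rva_to_va(rva, image_base=IMAGE_BASE):
    """Convert an RVA from the PE header back into a virtual address."""
    return rva + image_base

print(hex(va_to_rva(0x48B000)))  # 0x8b000
```

If your target loads somewhere other than 0x400000, pass the real base as the second argument.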
001B:004AC31A ADD EDI,[EBP+004450AC] <-- Original virtual address of section.
001B:004AC320 MOV ESI,[EBP+004450B5] <-- VirtualAlloc().
001B:004AC326 SAR ECX,02 <-- Number of bytes (/4 as it's going to move DWORD's).
001B:004AC329 REPZ MOVSD
001B:004AC32B MOV ECX,EAX
001B:004AC32D AND ECX,03
001B:004AC330 REPZ MOVSB <-- Move any additional bytes required.
This is the main "moving" routine for the compressed sections, however you have got to be a little careful. First to be unpacked is the CODE section; this name is a trademark of Borland compilation, and unsurprisingly the CODE section contains the main program code and is easily the largest section. At the code shown above note EDI & ECX's values, they are important (read ECX before the SAR while it still holds the byte count, and convert EDI into an RVA).
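To make the DWORD/byte split explicit, here is a rough Python model of what the routine above does with a byte count. It is an illustration only, not ASPack's code; the function name rep_movs_copy and the toy buffer are made up for the example.

```python
def rep_movs_copy(memory, src, dst, count):
    """Copy `count` bytes inside `memory` the way the routine does:
    count >> 2 DWORD moves (REPZ MOVSD), then count & 3 byte moves (REPZ MOVSB)."""
    for _ in range(count >> 2):      # SAR ECX,02 / REPZ MOVSD
        memory[dst:dst + 4] = memory[src:src + 4]
        src += 4
        dst += 4
    for _ in range(count & 3):       # AND ECX,03 / REPZ MOVSB
        memory[dst] = memory[src]
        src += 1
        dst += 1
    return dst                       # where EDI ends up

# Toy "address space": 32 bytes of data followed by 32 empty bytes.
mem = bytearray(range(32)) + bytearray(32)
end = rep_movs_copy(mem, src=0, dst=32, count=11)   # 2 DWORDs + 3 bytes
```

With count=11 the loop moves two DWORDs and then three single bytes, which is why the original byte count has to be kept (in EAX) for the second pass.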
The CODE section is unpacked starting at EDI=0x401000 & ECX=0x79600 thus RVA=0x1000. In fact unpacked isn't strictly true, this is where the decompressed section is copied from the VirtualAlloc() area. If you take a look at the original PE header with ProcDump you'll see how this checks out. A further 3 sections (virtually identical code) will be moved from specifically allocated VirtualAlloc() areas.
Section   Value of EDI   Value of ECX   RVA
-------   ------------   ------------   ---------
CODE      0x401000       0x79600        0x1000
DATA      0x47B000       0x1000         0x7B000
.idata    0x47D000       0x2600         0x7D000
.rsrc*    0x48C6EC       0x1EB14        0x8C6EC
With ASPack we've got 2 issues, firstly a lot of people asked me about the import section and at what stage you should dump it to file, this is because ASPack firstly uncompresses the import section and later loops through the import names writing addresses of imported modules to another part of the section (the addresses are retrieved using GetProcAddress()).
This is the advantage of spelunking before we attempt a real target, as it's a simple process to compare the 2 dumps you can obtain with the original .idata section. It turns out (for this target), perhaps contrary to what you might think, that the first dump, i.e. the one taken before the GetProcAddress() loop, is the original one. You should know that the theory behind the PE structure is that the OS loader doesn't have to work very hard creating a process from a disk file; essentially how the program looks in your HEX editor will be very much how it looks in memory.
Extending this to the compiler used to build the program, the above discovery suggests that this program as seen physically on disk doesn't have this information pre-compiled; you can confirm this by looking with QuickView if you so wish. Look how the Import Table isn't filled with import names or ordinals, and contrast this with something built with a MS compiler (e.g. Paint Shop Pro). This will be important when we manually unpack an ASPack target, because other targets may depend on these being present.
The last section (.rsrc) also doesn't fall in-line, this is ASPack's "way" (compulsory really), certain resources e.g. the icon, version information cannot be compressed, obvious really otherwise the author would have to patch parts of the OS which wouldn't be an option. This means you've got to find the start of the .rsrc section manually in SoftICE and compute its raw size, this isn't hard with the data window.
These anomalies mean we've got to be careful on 2 counts, firstly we've got to make sure we dump all the uncompressed sections at the correct time and the correct size, secondly we'll have to fix the .rsrc section manually, however this is nothing more than a small time consuming exercise and PE dumping will make this easy :).
i) Identify compressed sections.
ii) Unpack them to disk files (noting RVA's & size), also taking care with the last section.
iii) Attach a PE header to the sections (the compressed file's header will be used).
iv) Fix the PE header (raw sizes, raw offsets etc.).
v) Fix the .rsrc section.
Let's start. The compressed sections we identified above are CODE, DATA, .idata & .rsrc; the first 3 can be dumped at the REPZ MOVSD VirtualAlloc() stage, and as this is a Borland compiled program we know the .idata section can be too. This should be routine finger movements using your favourite SoftICE dumper and should result in the following 4 files.
code.dat (497,152 bytes).
data.dat (4,096 bytes).
idata.dat (9,728 bytes).
rsrc.dat (135,168 bytes), virtual address 0x48B000-0x4AC000.
With these 4 dumps start reconstructing the file (I can't recommend UltraEdit enough for this process).
0h     --------------------------------------------------
       PE Header from packed Beyond32.exe.
400h   --------------------------------------------------
       CODE section dump file.
79A00h --------------------------------------------------
       DATA section dump file (BSS has no size).
7AA00h --------------------------------------------------
       .idata section dump.
7D000h --------------------------------------------------
       .tls has no size.
       .rdata from packed Beyond32.exe.
7D200h --------------------------------------------------
       .reloc section appears to be stripped.
       .rsrc section dump file.
9E200h --------------------------------------------------
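The offsets in this map are just a running total of the piece sizes, which you can check with a few lines of Python. This is a sketch; the dump sizes come from the file list above, while the 200h size for .rdata is my inference from the gap between 7D000h and 7D200h.

```python
HEADER_SIZE = 0x400  # PE header block taken from the packed file

# (name, raw size in bytes) in file order.
pieces = [
    ("CODE",   497152),   # code.dat
    ("DATA",   4096),     # data.dat
    (".idata", 9728),     # idata.dat
    (".rdata", 0x200),    # assumed: copied from the packed file
    (".rsrc",  135168),   # rsrc.dat
]

def layout(pieces, start=HEADER_SIZE):
    """Return ({name: raw offset}, end offset) for pieces laid end to end."""
    offsets = {}
    pos = start
    for name, size in pieces:
        offsets[name] = pos
        pos += size
    return offsets, pos

offsets, total = layout(pieces)
for name, off in offsets.items():
    print(f"{name:8s} {off:X}h")
print(f"end      {total:X}h ({total:,} bytes)")
```

The raw offsets it prints are the ones you will later need in the section table of the rebuilt header.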
Well this is interesting, 9E200h=647,680 bytes which is 31,232 bytes (7A00h) shorter than the original, so where did the missing bytes go :)? A quick look at the raw sizes gives the answer: the original .reloc accounts for 8800h and our .rsrc is actually E00h longer (problem solved). The .reloc section consists of a table of base relocations, these are adjustment values used by the loader if the module could not be loaded at its preferred address. You can normally strip away the .reloc section from .exe files as relocation is virtually never an issue (with dll's it can be); ASPack is aware of this and will only strip .exe's. The extra E00h in .rsrc looks to be an anomaly, you can verify in SoftICE that 21000h is committed to the .rsrc section so we'll chance ignoring the extra length.
Our fun task will now be fixing the first 400h (better known as the PE header). For convenience you can make all of the very obvious changes with ProcDump, however I'm going to do it the hard way with Hiew. Firstly kill the last 2 sections used by ASPack (insert nulls from offset 0x338 up to 0x388), next update the number of sections stored at offset 0x106 (from 0Ah to 08h).
Working through the header you'll next need to fix the Entry Point RVA (offset 0x128-0x12A). You should be able to find this pretty easily (it's 0x7A444), just bpx for the code after the loop through the imports, then trace a few instructions. The code looks something like this:-
PUSH EAX <-- Real Entry Point.
......
POPAD <-- Restore all registers.
RET
Next to edit will be the Size Of Image (offset 0x151, the second byte of the
DWORD at 0x150), you can make use of the 0xAF000 from the packed file here. As
we killed the 2 sections at the end this value has to be updated, so subtract
the Virtual Sizes of the .adata & .udata sections (0x3000 in total), hence
0xAF000-0x3000 = 0xAC000.
Moving on, next up is the Import Table RVA (offset 0x180-0x182), good job you noted the RVA of the sections in the table at the start :) (it's 0x7D000). The size of the Import Table also needs updating (offset 0x184-0x185), the safest thing you can probably do is set it to the raw size of the section (0x2600). You should also check the RVA of the .rsrc section and update its size to 0x21000. I should point out that it is perfectly safe to leave all of the section characteristics set to C0000040 as per ASPack.
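For those who would rather script the header edits than type them into Hiew, here is a hedged Python sketch. It assumes the PE signature sits at file offset 0x100 (inferred from the offsets quoted in the text, not read from the file) and the standard PE32 layout; the helper names patch and kill_sections are my own, and a blank buffer stands in for the real header.

```python
import struct

PE_OFFSET = 0x100              # assumption: e_lfanew, inferred from the text's offsets
OPT = PE_OFFSET + 0x18         # start of the PE32 optional header
SECTION_TABLE = OPT + 0xE0     # standard PE32 optional header size
SECTION_HEADER_SIZE = 0x28

# field name -> (file offset, struct format)
FIELDS = {
    "NumberOfSections":    (PE_OFFSET + 0x06, "<H"),
    "AddressOfEntryPoint": (OPT + 0x10, "<I"),
    "SizeOfImage":         (OPT + 0x38, "<I"),
    "ImportTableRVA":      (OPT + 0x60 + 8, "<I"),    # data directory entry 1
    "ImportTableSize":     (OPT + 0x60 + 12, "<I"),
}

def patch(header, name, value):
    """Write one little-endian header field in place."""
    offset, fmt = FIELDS[name]
    struct.pack_into(fmt, header, offset, value)

def kill_sections(header, index, count):
    """Null out `count` section headers starting at section `index`."""
    start = SECTION_TABLE + index * SECTION_HEADER_SIZE
    end = start + count * SECTION_HEADER_SIZE
    header[start:end] = bytes(end - start)
    return start, end

header = bytearray(0x400)      # stand-in for the first 400h bytes of the file
killed = kill_sections(header, index=8, count=2)   # the 2 ASPack sections
patch(header, "NumberOfSections", 8)
patch(header, "AddressOfEntryPoint", 0x7A444)
patch(header, "SizeOfImage", 0xAF000 - 0x3000)
patch(header, "ImportTableRVA", 0x7D000)
patch(header, "ImportTableSize", 0x2600)
```

Under these assumptions the computed offsets agree with the ones given by hand: 0x106, 0x128, 0x180 and 0x184, with the Size Of Image DWORD starting at 0x150 so that the byte which actually changes is the one at 0x151.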
Note if you use ProcDump's kill facility it may shorten the PE header or insert the "unpacked with ProcDump by G-RoM etc....." text even though you might have done a lot of the hard work. I've also noticed several other curious "features", I'm reluctant to call them bugs :), just check each stage, that's all. Note also that I didn't restore any section's Virtual Size, it's not very clever having a section physically larger than the amount of memory allocated to it :).
A discussion regarding "Virtual Size" is perhaps in order. This is the amount of space in memory that will be occupied by the section; by default ASPack rounds this value upwards to the Section Alignment (1000h or 200h). Some of these sections' Virtual Sizes you can (and I say this sparingly) recover. If you look around the end of each section which has a Raw Size (duh!) you'll probably find a lot of nulls, this is the 'padding'. If you are prepared to assume that none of this is actually used then you can set the section's Virtual Size to the raw size minus this amount, but don't tweak this unless you are a purist.
At this stage our program still has no icon or version information and will not run, so we must now fix up our .rsrc table (recall that certain resources couldn't have been compressed). Pedump /I the reconstructed file and navigate to the start of the resource listing, now you've got to work out which resources need fixing. Recall that our .rsrc section was unpacked starting from RVA 0x8C6EC to RVA 0xAC000, any resource therefore with an RVA outside this range must be wrong, just scroll through the list looking at the Offset values:-
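If you do want to estimate a Virtual Size this way, counting the padding is easy to script. The following is a minimal Python sketch, assuming (as stated above) that the trailing nulls really are unused; the function names and the made-up 16-byte "section" are purely illustrative.

```python
def trailing_nulls(data):
    """Number of zero bytes at the end of a section dump."""
    return len(data) - len(data.rstrip(b"\x00"))

def candidate_virtual_size(data):
    """Raw size minus trailing padding; only valid if the padding is unused."""
    return len(data) - trailing_nulls(data)

dump = b"\x90\x90\xC3" + bytes(13)   # made-up 16-byte "section"
print(trailing_nulls(dump), candidate_virtual_size(dump))
```

A section whose real data legitimately ends in zeros would be cut short by this, which is exactly why the tweak is best left to purists.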
ResDir (VERSION) Named:00 ID:01 TimeDate:251F3BF7 Vers:0.00 Char:0
    ResDir (1) Named:00 ID:01 TimeDate:251F3BF7 Vers:0.00 Char:0
        ID: 00000409 DataEntryOffs: 000011D0
        Offset: ACAC8 Size: 00378 CodePage: 0
This resource, as the name suggests, is the version information; you'll see that at the moment the file has no Version Information. Using Hiew navigate to the raw offset of the .rsrc section (0x7D200), now add 11D0 (the DataEntryOffs) to get 0x7E3D0, then use F5 in Hiew to go to this offset. You'll be looking at where the Offset value (DWORD) is stored. Evidently this Offset isn't where the version information is, however we can refer to our compressed file and its PE dump and find exactly what our missing resource looks like.
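The two checks made here (is the Offset outside the unpacked range, and where in the file is its DWORD stored) can be sketched in Python as follows. The constants are the values quoted in this essay; the function names are hypothetical.

```python
RSRC_RAW_OFFSET = 0x7D200   # raw offset of .rsrc in the rebuilt file
UNPACKED_START = 0x8C6EC    # first RVA covered by the unpacked .rsrc data
UNPACKED_END = 0xAC000      # end of the .rsrc section

def needs_fixing(offset_rva):
    """True if a resource's Offset points outside the unpacked range."""
    return not (UNPACKED_START <= offset_rva < UNPACKED_END)

def data_entry_file_offset(data_entry_offs):
    """File offset of the DWORD holding a resource's Offset value."""
    return RSRC_RAW_OFFSET + data_entry_offs

print(needs_fixing(0xACAC8))                  # the VERSION resource
print(hex(data_entry_file_offset(0x11D0)))    # 0x7e3d0
```

Running every Offset from the PEDUMP listing through needs_fixing gives you the list of entries to repair.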
Here's how:
Pedump /I the compressed file and note the offset of the VERSION resource (it's 0xACAC8), find this in Hiew and note the first few bytes "78 03 34....." (it's a good idea to note the size and use as long a search string as you can when doing this). Search for these bytes in the unpacked file, you'll find them in this example starting at 0xAAD1C, now update the DWORD at 0x7E3D0 to reflect this. There are a further 5 fixes you'll need to make (all eminently similar to this one), here are the offset details.
0007E030: 5C
0007E031: ED
0007E032: 08
0007E040: 44
0007E041: F0
0007E042: 08
0007E050: 6C
0007E051: F1
0007E052: 08
0007E3B0: E4
0007E3B1: AC
0007E3C0: 08
0007E3C1: AD
0007E3D0: 1C <-- The example shown above.
0007E3D1: AD
Now run the file and cross your fingers, it will work :). Perfectionists might like to shorten the .rsrc section, there's really no point though.
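The byte list can also be applied by script. The Python sketch below works on a blank stand-in buffer rather than the real file, with only the VERSION entry pre-filled with its old Offset (0xACAC8) so you can see how two patched bytes turn it into 0xAAD1C; apply_patches and read_dword are names of my own.

```python
import struct

# Byte patches exactly as listed above: file offset -> new byte value.
PATCHES = {
    0x7E030: 0x5C, 0x7E031: 0xED, 0x7E032: 0x08,
    0x7E040: 0x44, 0x7E041: 0xF0, 0x7E042: 0x08,
    0x7E050: 0x6C, 0x7E051: 0xF1, 0x7E052: 0x08,
    0x7E3B0: 0xE4, 0x7E3B1: 0xAC,
    0x7E3C0: 0x08, 0x7E3C1: 0xAD,
    0x7E3D0: 0x1C, 0x7E3D1: 0xAD,
}

def apply_patches(image, patches):
    """Overwrite single bytes of `image` in place."""
    for offset, value in patches.items():
        image[offset] = value

def read_dword(image, offset):
    """Read a little-endian DWORD."""
    return struct.unpack_from("<I", image, offset)[0]

image = bytearray(0x9E200)                        # stand-in for the rebuilt file
struct.pack_into("<I", image, 0x7E3D0, 0xACAC8)   # old VERSION Offset
apply_patches(image, PATCHES)
print(hex(read_dword(image, 0x7E3D0)))            # 0xaad1c
```

Only the low two bytes of the VERSION entry change because old and new values share the same upper bytes (0A 00); the three-byte patches in the list are the entries where the third byte differs as well.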
ASPack v1.08.03 Phase 2 - File-Folder Description Center v3.6.0.0.
When I did some of this unpacking I sometimes ran into a very strange problem when moving large chunks of memory, even though said area was easily large enough and committed, if you get this problem I recommend assembling in instructions to do the job for you as opposed to moving it with the 'm' command. Just place the start of the mapping file in EDI, the unpacked start in ESI and the number of iterations to move and then REPZ MOVSB/SW/SD as required :). Of course you'll almost certainly crash after but it should be recoverable.