home *** CD-ROM | disk | FTP | other *** search
-
-
- UFC-crypt: ultra fast 'crypt' implementation
- ============================================
-
- @(#)README 2.3 10/04/91
-
- Design goals/non goals:
- -----------------------
-
- - Crypt implementation plugin compatible with crypt(3)/fcrypt
-
- - Extremely high performance when used for password cracking.
-
- - Portable to most 32 bit machines
-
- - Startup time/mixed salt performance not critical
-
- Features of the implementation:
- ------------------------------
-
- - Runs 25-45 times faster than crypt(3) when invoked repeated
- times with the same salt and varying passwords.
- With alternating salts, performance is only about 4 times
- that of crypt(3).
-
- - Tested on 68000,386,SPARC,MIPS,HP-PA and RS/6000 systems
-
- - Requires 280 kb for tables.
-
- Author & licensing etc
- ----------------------
-
- UFC-crypt is written by Michael Glad, email: glad@daimi.aau.dk.
- It is covered by the GNU library license version 2, see the file 'COPYING'.
-
- Installing
- ----------
-
- Edit the Makefile setting the variables
-
- CRYPT: The encryption module to use; crypt.o should always work.
- If running on one of the machines for which special support
- is available, select the appropriate module.
-
- FL: If your OS does not have the routines bcopy/bzero in libc,
- try '-DSYSV'.
-
- CC: The compiler to use
-
- OFLAG: The highest level of optimization available
-
- Now run 'make'. The files 'ufc' and 'libufa.c' are produced.
- Try 'ufc <number>' to test proper operation and to do benchmarks.
- If you link 'ufc' without 'libufc.a' you can benchmark your crypt(3).
-
- 'libufc.a' can be linked into your applications. It is compatible with
- both crypt(3) and the fcrypt shipped with Alec Muffett's Crack program.
-
- For use in Crack: in 'Sources/Makefile', substitute crack-fcrypt.o with
- a path to libufc.a in the CRACKMOD declaration. For benchmarking, use
- the 'speedcrypt' program in the same directory.
-
- Benchmark table:
- ---------------
-
- If special assembly language support was used, the machine name is marked
- with '*'. Times are in milliseconds, a line gives how many encryptions UFC-crypt
- can perform per second on the various architectures.
-
- |----------|---------------------------------------------|
- |Machine | SUN* SUN* HP* DIGITAL HP |
- | | 3/50 ELC 9000/425e DecStation 9000/720 |
- |----------|---------------------------------------------|
- | Crypt(3) | 230 37 60 38 17 |
- | Ufc | 5.2 1.5 1.4 1.2 0,36 |
- | Ufc/sec | 190 670 715 830 2700 |
- |----------|---------------------------------------------|
- | Speedup | 45 25 43 32 47 |
- |----------|---------------------------------------------|
-
- It seems as if performance is limited by CPU bus and data cache capacity.
- E.g. on a SUN4, UFC-crypt reads almost 10 mb/sec into the CPU
- registers. As 128kb of memory are swept during an encryption, it may also
- misbehave with datacaches.
-
- Optimizations:
- -------------
-
- Here are the optimizations used relative to an ordinary implementation
- such as the one said to be used in crypt(3).
-
- Major optimizations
- *******************
-
- - Keep data packed as bits in integer variables -- allows for
- fast permutations & parallel xor's in CPU hardware.
-
- - Let adjacent final & initial permutations collapse.
-
- - Keep working data in 'E expanded' format all the time.
-
- - Implement DES 'f' function mostly by table lookup
-
- - Calculate the above function on 12 bit basis rather than 6
- as would be the most natural. This eats core for tables
- but it's critical for performance.
-
- - Implement setup routines so that performance is limited by the DES
- inner loops only. At the moment the key calculation routine accounts for
- a rather unavoidable 16% of the CPU usage.
-
- Minor (dirty) optimizations
- ***************************
-
- - combine iterations of DES inner loop so that DES only loops
- 8 times. This saves a lot of variable swapping.
-
- - Implement key access by a walking pointer rather than coding
- as array indexing.
-
- - As described, the table based f function uses a 3 dimensional array:
-
- sb ['number of 12 bit segment']['12 bit index']['48 bit half index']
-
- Code the routine with 4 (one dimensional) vectors.
-
- - Design the internal data format & uglify the DES loops so that
- the compiler does not need to do bit shifts when indexing vectors.
-
- - For selected architectures, code the quite little crypt function itself
- in assembly language but keep the support routines in C.
-
-