home *** CD-ROM | disk | FTP | other *** search
- Newsgroups: comp.sources.misc,alt.binaries.pictures.utilities
- From: jstevens@teal.csn.org (John W.M. Stevens)
- Subject: v36i113: unpost - Smart multi-part uudecoder v2.1.2, Part00/07
- Message-ID: <csm-v36i113=unpost.001556@sparky.IMD.Sterling.COM>
- X-Md4-Signature: 687ba8356663ae94c969c211616c69b9
- Date: Mon, 19 Apr 1993 05:18:22 GMT
- Approved: kent@sparky.imd.sterling.com
-
- Submitted-by: jstevens@teal.csn.org (John W.M. Stevens)
- Posting-number: Volume 36, Issue 113
- Archive-name: unpost/part00
- Environment: UNIX, MS-DOS, OS/2, Windows, MacIntosh, Amiga, Vax/VMS
-
- UNPOST Version 2.1.2
-
- More bug fixes, documentation updates, a new configuration file, etc.
-
- UNPOST is a "smart" uudecoder that is designed to extract binaries from
- multi-part USENET or email uuencoded binaries.
-
- Features:
-
- 1) PORTABILITY! UNPOST has been compiled and sucessfully run on
- MS-DOS, OS/2, Windows, Unix workstations, MacIntoshes, Amiga's
- and VAX/VMS systems.
-
- The code is written to be pure ANSI C, within reasonable limits.
- (some ANSI C capabilities are not used where they would be
- appropriate due to lagging compliance in most compilers. Hey,
- Unix types! MS-DOS (Borland C++ 3.1) is a MUCH better compiler
- than anything I've seen on a Unix workstation! And their debugger
- is the best I've used, as well). Unfortunately, there are still
- a lot of Unix boxes that have only a K&R compiler, so it may
- not port well to those. I personally check to make sure that it
- will compile and run on an MS-DOS box running MS-DOS 5 and Windows
- 3.1, using the Borland 3.1 C++ compiler, as well as a Sun (running
- SunOs 4.1.1 sun4c) using the gcc compiler (version 2.1). I know
- for a fact that the Sun cc compiler will NOT compile UNPOST
- succesfully.
-
- K&R compatibility is being considered, but it is a low priority
- feature.
-
- 2) CONFIGURABILITY! UNPOST comes with a default set of rules for
- detecting and parsing a VERY wide range of possible Subject:
- line formats, but no configuration can be correct for every
- situation.
-
- With that in mind, UNPOST can be configured by the user by creating
- a text file that contains the regular expressions, etc. that
- UNPOST uses to recognize, parse, etc. WARNING! UNPOST depends
- almost ENTIRELY on the contents of it's configuration file for
- correct operation.
-
- Regular expressions are complex, and writing one that works the
- way you expect it to takes care and, most importantly,
- experimentation.
-
- To this end, the standard UNPOST installation creates both the
- UNPOST executable and a regular expression test program called
- RETEST. RETEST is like grep, feed it a regular expression and
- a file, and RETEST will tell you what it matched and the sub
- strings that it extracted.
-
- 3) INTELLIGENCE! UNPOST uses every trick in the book to try to
- guess what the poster/sender REALLY meant.
-
- Also, UNPOST is not limited to finding all of it's information
- on a single line, or even in the header of a posting/letter.
-
- UNPOST has succesfully extracted binaries from postings that had,
- as a subject line,
-
- Subject: aaaa
-
- because UNPOST recognized the signature placed into the body of
- the article by a uuencode/split program.
-
- 4) FLEXIBILITY! UNPOST has switches that allow it to be configured
- to do different things for different tastes. For instance, UNPOST
- will intelligently sort out articles into four different classes:
-
- 1) Articles that are part of a complete and correct binary in
- the input file. These are sorted, concatenated, uudecoded
- and written out to a file name that is the same as that
- on the uuencode begin line.
-
- Depending on the setting of the file name switch, the file
- name of the binary may be modified. See below.
-
- 2) Articles that are pure text (no uuencoded data in them).
-
- If the -t switch and a file name are specified, these
- articles will be written out to the file for reading.
-
- Obviously, these articles should NEVER be encountered in
- a binaries news group, but not a single day has ever gone
- by that I did not see non-binary postings to binary news
- groups.
-
- 3) Articles that are part of incomplete postings (four parts,
- but only three have shown up so far), or that comprise
- a complete binary, but one that had an error in uudecoding,
- interpretation, etc.
-
- If the -i flag and a file name are specified, these articles
- will be written out to the file. If the -b switch is
- on, incompletes will be written to separate files. If
- both are on, those incompletes that can be guessed at
- as having a file name will be written to a separate file,
- all else will be written to the file named by the -i
- switch.
-
- In my experience, two types of articles end up in an
- incompletes file, those that have missing parts, and
- those that have been misinterpreted by UNPOST as belonging
- to a different binary than they really do.
-
- 4) Articles that are pure text that describe a posting
- (these are usually found only in the pictures groups).
-
- If the -d flag is set, and the binary to which they
- belong is correct and complete, this article, as well as
- the header and body up to the uuencode begin line of the
- first article, will be written to a file that has the same
- base name as the binary, but with the extension .inf.
-
- UNPOST automatically mungles binary file names to be MS-DOS
- compatible (the lowest common denominator). This is switch
- controllable, and can be turned on or off (depending on the
- default setting selected by the person who compiled UNPOST).
-
- UNPOST also has two lesser modes, sorted mode and uudecode mode.
-
- In sorted mode, UNPOST assumes that the articles still have
- headers, and that there may be un-uuencoded lines in the middle
- of a uuencoded file that have to be filtered out, but it assumes
- that all parts are present, and that they are in order. Header
- information, however, is ignored.
-
- If you use the incompletes file capability of UNPOST, you will
- notice that it writes out the segments that it did interpret
- correctly in sorted order.
-
- In uudecode mode, UNPOST acts like a simple uudecoder. UUencoded
- files must be complete, with a begin and end line, and no
- un-uuencoded lines can appear between the begin and end lines.
-
- However, uudecode mode is the ONLY mode where UNPOST will accept
- a short line (one that was space terminated, but had the spaces
- chopped off) as a legal uuencoded line and properly decode it.
-
- 5) INFORMATIVE! UNPOST is a very talkative program. It detects
- and reports many kinds of problems, tells you what it thinks
- is going on, and tells you what it is doing. All this information
- is written to standard error, or if the -e switch and a file
- name are specified, written to that file.
-
- Changes for UNPOST Version 2.1.2
- --------------------------------
-
- 1) Bug fix. I screwed up the regular expression compilation for the
- -r switch. Fixed. See note 5 for version 2.1.1 in the changes.doc
- file for information on how to select one of four sources for your
- news.
-
- jstevens@csn.org
-
- exit 0 # Just in case...
-