                                    HTTX

                           HTML to TEXT converter

                         Created by Gabriele Favrin
               (E-Mail: favrin@tin.it - FidoNet: 2:333/726.8)

                        Version 2.0b - December 1999



Index:

  1. Utilization terms
  2. Property of HTTX and distribution terms
  3. Introduction
  4. Hardware requirements and installation
  5. How to use
     5.1 Command line parameters
     5.2 External configuration
  6. Error Messages and AmigaDOS Return Codes
  7. FAQ (advices, interfacing with other programs and more)
  8. Technical informations
     8.1 What is supported, what is not (yet) and implementation
     8.2 Notes about ANSI conversion
     8.3 Notes about conversion of <PRE>, <XMP>, <LISTING> and <SCRIPT>
         contents
  9. How to contact the Author
 10. Greetings
 11. Program history
 12. Future versions



-----------------------------------------------------------------------------

                              *** CHILDWARE ***

This software is "CHILDWARE". The author explicitly asks whoever uses this
program to make a donation to a charitable organization that helps children
in some way.

If you don't know of any, ask at your local post office how to make a
donation to UNICEF.

The amount of the donation is up to you, but please do it!

-----------------------------------------------------------------------------



1. Utilization terms
--------------------

Before  running  this  program  on  your  computer, please read carefully the
following  paragraph,  and  continue only if you agree with the terms written
below.

THE  HTTX  AUTHOR  IS IN NO WAY RESPONSIBLE FOR MORAL AND/OR MATERIAL DAMAGES
THAT  HIS PROGRAM MAY CAUSE TO PEOPLE OR THINGS. THE PROGRAMMER GAVE HIS BEST
TO  LIMIT  THE  PROBLEMS THAT HTTX MAY CAUSE, BUT HE IS NOT ABLE TO GUARANTEE
ITS  EFFICIENCY  IN  ALL  THE  SITUATIONS.  USING  HTTX,  YOU,  THE USER, ARE
RESPONSIBLE FOR ALL MORAL, MATERIAL, CIVIL AND PENAL THINGS.


WARNING:

Many HTML documents are under copyright and may not be freely distributed,
even if converted to plain text format. The author declines all
responsibility for the use of files generated with HTTX.

All  of  the  programs  mentioned  in  this  document are properties of their
respective owners.



2. Property of HTTX and distribution terms
------------------------------------------

The executable program, the source code and the ideas that are its basis are
the EXCLUSIVE PROPERTY of Gabriele Favrin. All rights reserved.

HTTX is freeware, NOT public domain. It may be distributed only if the
executable files and the documentation remain unchanged. Distribution of the
files in archive formats other than LhA is permitted, but compressing the
individual files using PowerPacker or similar tools is not.

Commercial use of this package is exclusively granted to AmiTrix who may
spread it with complete or partial versions of the AWeb WWW Browser.

The staff of Aminet, Fred Fish, Meeting Pearls, Amy Resource and Amiga
magazines which have a cover CD are authorized to include HTTX in their
public domain software collections.



3. Introduction
---------------

HTTX (HTml > TXt) is a program that converts files from HTML, the format used
for documents on the World Wide Web, to pure ASCII. There are similar
products, but since none completely satisfied my needs, I started to write
one myself.

I don't claim this is the best or the fastest one, but it certainly has some
features not found in similar Amiga programs so far.



4. Hardware requirements and installation
-----------------------------------------

Required system:
Amiga, 512K, Kickstart 2.04 (37.175) or above.

Required memory:
The size of the file to convert, plus about 15K for buffers and other data.

Install:
Copy HTTX to your C: directory or a directory in your current path.

HTTX is compatible with AmigaOS 3.5 and U.A.E.



5. How to use HTTX
------------------

HTTX can only be used from the Shell. AWeb users, please note that a special
ARexx plugin to fully control HTTX from the browser is included in this
distribution. Please also read section 7.

Command syntax:

HTTX InputFile [OutputFile] [options]

The parameters in square brackets are optional. You are only required to
specify a valid HTML file ("InputFile").

If there is no OutputFile specified, it defaults to 'InputFile'.txt (eg.
"test.html" will be saved as "test.txt"). If a path is specified for
OutputFile, that file will be saved to that path.

Examples:

-> HTTX data:txt/html/abox.html

The file "abox.html" will be converted and saved
as "data:txt/html/abox.txt"

-> HTTX data:txt/html/abox.html ram:abox.txt

The file "abox.html" will be converted and saved
as "ram:abox.txt"

-> HTTX data:txt/html/abox.html data:txt/

The file "abox.html" will be converted and saved
as "data:txt/abox.txt"
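
The naming rule shown in the examples above can be sketched in Python as
follows. This is only an illustration, not HTTX's actual code (HTTX is
written in AmigaE): the function name default_output_name() is made up here,
and treating an OutputFile that ends in "/" or ":" as a directory is an
assumption, since this document does not say how HTTX recognizes one.

```python
def default_output_name(input_file, output_file=None):
    """Return the name of the text file for a given HTML input file."""
    # Split the AmigaDOS path after the last "/" or ":" (volume separator).
    cut = max(input_file.rfind("/"), input_file.rfind(":")) + 1
    directory, name = input_file[:cut], input_file[cut:]

    # Replace the extension (if any) with ".txt".
    stem = name.rsplit(".", 1)[0] if "." in name else name
    txt_name = stem + ".txt"

    if output_file is None:
        # No OutputFile: save next to the input file.
        return directory + txt_name
    if output_file.endswith(("/", ":")):
        # Assumption: OutputFile names a directory or volume.
        return output_file + txt_name
    # OutputFile is a complete file name: use it unchanged.
    return output_file
```

Called with "data:txt/html/abox.html" and "data:txt/", the sketch returns
"data:txt/abox.txt", matching the third example.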



5.1 Command line parameters
---------------------------

HTTX offers many options to control the conversion process.


LEN
   Maximum line length of the output file.

   Default: 77 - Minimum: 15 - Maximum: 255


INDENT or IN
   Number of spaces used to indent (shift to the right) the <UL>, <OL> and
   <DL> lists. The specified value must allow at least three levels of
   indentation within the line length specified with the LEN option.

   Default: 3 - Minimum: 1 - Maximum: (LEN value - 10) / 3
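
   As a rough sketch of the limits above (Python, for illustration only:
   the function names are hypothetical, and rounding the maximum down to a
   whole number is an assumption), the allowed INDENT range for a given LEN
   could be checked like this:

```python
def max_indent(line_length):
    """Largest INDENT for a given LEN: (LEN - 10) / 3, rounded down."""
    if not 15 <= line_length <= 255:
        raise ValueError("LEN must be between 15 and 255")
    return (line_length - 10) // 3


def indent_is_valid(indent, line_length=77):
    """True if INDENT is at least 1 and not above the maximum for LEN."""
    return 1 <= indent <= max_indent(line_length)
```

   With the default LEN of 77 this gives a maximum INDENT of 22.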


ANSIMODE
   ANSI conversion of HTML styles and LINKS (HREF and NAME) and optimization
   of alignment functions.

   Not to be used if the converted text will go on message areas, like
   Fidonet or Usenet newsgroups.

   Please read section 8.2 (ANSI conversion) for important information
   about ANSI sequences and general compatibility issues.

   The option supports three modes:

   ANSIMODE=0
      No ANSI codes are used. This is the same as omitting the option from
      both the command line and the configuration.

   ANSIMODE=1
      Standard ANSI codes are used. This is the only ANSI conversion type
      available while printing.

   ANSIMODE=2
      VT100 cursor control sequences are used to reduce size of the file by
      compressing spaces. This could cause incompatibility with some programs
      (see section 8.2).

   Default: ANSIMODE=0 (styles are not converted).


ANSICOL
   Color used to render the links during ANSI conversion.
   Accepted values range from 0 to 9. The colour shown may depend on the
   terminal program used.

   Default: 3 (file save) or 4 (printing).


7BIT
   Conversion of HTML entities (accented letters, symbols and so on) to ASCII
   codes lower than 128. This is required for text forwarded on networks like
   FidoNet, where the allowed character codes can only range between
   32 and 127.

   IMPORTANT: remember that the ANSIMODE option adds Escape codes (ASCII 27),
   which are forbidden on FidoNet and strongly discouraged for any
   non-personal (broadcast) use of converted text.

   Default: OFF (8 bit chars are not converted).
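
   The idea of 7 bit conversion can be illustrated with the short Python
   sketch below. It does NOT reproduce HTTX's own conversion table, which is
   not described in this document: it uses Unicode decomposition (a
   different technique) to strip accents, and replacing characters that have
   no ASCII equivalent with "?" is an assumption made for the example.

```python
import unicodedata


def to_7bit(text):
    """Replace every character above 127 with a plain ASCII stand-in."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
            continue
        # Decompose (e.g. accented letter -> letter + accent), keep ASCII part.
        base = unicodedata.normalize("NFKD", ch)
        base = base.encode("ascii", "ignore").decode("ascii")
        # Assumption: "?" marks characters with no ASCII equivalent.
        out.append(base or "?")
    return "".join(out)
```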


HRMODE or HR
   HTML documents often contain the <HR> TAG, which defines a separating
   line between paragraphs. HTTX allows these lines to be handled in
   three ways:

   HRMODE=0
      No lines drawn.
      It was NOHR in HTTX versions before 1.5.

   HRMODE=1
      Lines are drawn using the minus "-" character.

   HRMODE=2
      Lines are drawn using underlined spaces (in ANSI). This mode
      generates a nicer line but introduces ANSI codes, so it must be
      avoided if the text will go on FidoNet or Usenet newsgroups.

   Default: HRMODE=1 (lines are inserted using the minus "-" character).


NOALIGN or NA
   HTTX supports (right or center) alignments of texts and separators (<HR>).

   Examples:

                                centered text
                                                           right-aligned text

   If the NOALIGN option is ON, both of the above lines will start at the
   left margin, which saves characters.

   Default: OFF (alignment is converted).
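
   Centering and right alignment within the line length can be sketched as
   below (Python, for illustration only; the function name and the rounding
   used for centering are assumptions, not taken from HTTX):

```python
def align(text, width=77, mode="left"):
    """Pad text on the left so it is centered or right-aligned in width."""
    if mode == "center":
        # Half of the free space goes to the left; no trailing spaces added.
        return " " * ((width - len(text)) // 2) + text
    if mode == "right":
        return text.rjust(width)
    # "left" corresponds to the NOALIGN behaviour: no padding at all.
    return text
```

   The padding is made of plain spaces, which is why NOALIGN produces a
   smaller file.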


TDEOL
   Insertion of an EOL between cells of an HTML table. Using this option
   can improve the layout of some tables.

   Default: OFF (between each cell a space is inserted)


SETNOTE or SN
   Use the document title (<TITLE> TAG) or its url as output file
   comment. This option is ignored when options PRINT or STDIO
   (display on console) are used.

   The option supports the following values:

   SETNOTE=0
      No comment is added.

   SETNOTE=1
      Title of the document, if any, is added as comment of the output file.
      Only the first 64 characters are saved, as the HTML standard defines.

   SETNOTE=2
      URL of the document, if available, is added as comment of the output
      file.

   Default: OFF (file comment is not set).


SITE
   Insertion of the specified source URL in the output file.
   It may be useful to know which document the file was created from.

   Example:
   HTTX ram:children.html SITE=http://www.unicef.org

   will start the file with "URL  : http://www.unicef.org"

   Note: SITE has priority over GETNOTE, so specifying a site this way will
   override that option if it is active.

   Default: OFF (without this option the URL will not be added).


GETNOTE or GN
   Use the input file comment as the URL. This option is an alternative to
   SITE and is useful with files created using AWeb or other browsers and
   programs which save the URL in the comment.

   The AmigaDOS comment length limit is 80 chars.

   Default: OFF


NOHEADER or NOHEAD
   Don't insert HTTX version information, title (<TITLE>) and URL in the
   converted file. This option automatically turns off the SITE and GETNOTE
   options if present.

   Default: OFF (HTTX version, optional title and URL may be added).


HREF or LINK
   Adds the addresses of links (<A> TAG).
   Very useful if the document contains links that you want to keep.

   Default: OFF (links aren't added).


IMG
   Insertion of the ALT-text of images (<IMG> TAG) in the output file.
   Useful if the document contains images with descriptions.

   Default: OFF (ALT-text isn't added).


SCRIPT or SCP
   Insertion of the content of <SCRIPT> TAG, eg. JavaScript, in the output
   file.

   Note: this option adds the script itself, not its result!

   Please read the section 8.3 for important information about conversion of
   this type of text.

   Default: OFF (<SCRIPT> content is skipped).


BADHTML or BAD
   Partial support for documents created outside of the HTML standards. Use
   this option only if parts of the converted page are missing. Using this
   option with correct HTML documents may cause unpredictable results in the
   converted document.

   Default: OFF (HTTX uses standard DTD rules).


FORCE
   Forces conversion of input file without checking if it is an HTML
   document.

   USE IT AT YOUR OWN RISK: conversion of normal text or binary files may
   cause unpredictable results.

   Normally, HTTX considers a file to be valid HTML when one or more of the
   following conditions is true:

   - file extension is .html or .htm
   - the starting TAG is <HTML>
   - the starting TAG is <!DOCTYPE ...>

   This option must be specified if all three of the above conditions are
   false, even if the file IS an HTML document.

   Default: OFF (automatic check of file).
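
   The automatic check can be pictured with the following Python sketch. It
   is an illustration only: the function name is hypothetical, and ignoring
   letter case and leading blanks before the first TAG are assumptions that
   this document does not confirm.

```python
def looks_like_html(filename, text):
    """Apply the three conditions: extension, <HTML> or <!DOCTYPE ...>."""
    # Condition 1: file extension is .html or .htm
    if filename.lower().endswith((".html", ".htm")):
        return True
    # Conditions 2 and 3: the starting TAG is <HTML> or <!DOCTYPE ...>
    start = text.lstrip().lower()
    return start.startswith("<html") or start.startswith("<!doctype")
```

   A file for which the sketch returns False would need the FORCE option.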


STDIO
   Display the converted file on screen instead of saving it on disk. This
   option automatically enables the QUIET option.

   Default: OFF (converted file is saved to disk).


PRINT or PRT
   Print the document instead of displaying or saving it.

   The printer.device will convert standard ANSI codes and end-of-lines to
   the ones used by the Printer set in your Preferences.

   This option should be used if you want to print the converted document.
   If ANSIMODE is enabled (a value other than 0) it will be set to 1 so
   that standard ANSI codes are used. Also, if not specified, the value of
   the ANSICOL option will be set to 4 (blue color, as per the Commodore
   specifications regarding the printer.device).

   Older versions of HTTX used a solution like "HTTX abox.html prt:",
   which should now be avoided.

   This option automatically enables QUIET, and turns off the SETNOTE
   (formerly FILENOTE) and STDIO options.

   Default: OFF (document is displayed on screen or saved to a file).


APPEND
   Normally HTTX will overwrite an existing file.

   If APPEND is ON, the converted text will be added to the end of the
   specified file.

   Default: OFF (overwrite output file if it already exists).


NOCFG
   HTTX loads a default configuration from ENV:httx.prefs (unless another
   one is specified with the CFG option). If NOCFG is ON, no configuration
   file is loaded: HTTX uses only its default values and the parameters
   specified on the command line.

   For more information on external configuration, see section 5.2.

   Default: OFF (HTTX searches for default or specified configuration).


CFG
   With this option it is possible to specify the name of the configuration
   file to be used by HTTX. This file MUST be located in the ENV: directory.

   This option turns NOCFG OFF.
   For more information on external configuration, see section 5.2.

   Default: OFF (HTTX loads the httx.prefs configuration file).


INCLUDE
   Using this option it is possible to include text at the start
   of each output file, before the converted HTML data.

   The included text file is NOT ALTERED IN ANY WAY: no 8 bit character
   conversion, wordwrap, ANSI codes and so on. HTTX will not warn if 8 bit
   characters are included.

   Remember this, especially if the converted text will go on message areas,
   like FidoNet or Usenet newsgroups.

   Default: OFF (no text file is included in the output file).


QUIET
   Don't display any HTTX messages. This option is useful when HTTX is used
   within a script.

   WARNING: if active this option also hides error messages, but the AmigaDOS
   error codes are always returned.

   Default: OFF (HTTX output is displayed).


If not specified, HTTX uses the default settings.

When conversion is finished, if QUIET option is OFF, HTTX will show:

 - the size of input and output files.
 - the presence of 8 bit chars and their conversion, if active.
   (relative only to HTML content, not the optional included file).
 - TABs or ASCII chars less than 32 not converted because they are included
   in pre-formatted text.
 - non-standard HTML comments, which may make parts of document invisible.
   If the converted file appears incomplete, try using BADHTML option.



5.2 External configuration
--------------------------

HTTX supports an external configuration: a text file that includes the most
used options, so they do not need to be typed every time you use the
program.

By default (unless the NOCFG option is set, or the CFG option names a
different file) HTTX looks for the file "ENV:httx.prefs".

It's possible to create multiple configurations, for example one for file
conversion and another for printing, by creating different configuration
files and passing the name of the file with the CFG option (do not specify
the path, it is always 'ENV:').

Example:

-> HTTX abox.html

Converts "abox.html" using default configuration (ENV:httx.prefs).

-> HTTX abox.html PRINT CFG=httxprt.prefs

Converts "abox.html" using the configuration file ENV:httxprt.prefs



Allowed parameters
------------------

The external configuration file supports a subset of available command line
options. Each option MUST be specified in its extended form (for example
HRMODE, not HR, INDENT instead of IN, and so on).

The file must contain only the options and their parameters. Each option may
be placed on a separate line for better readability.

Available options (for description see section 5.1):

LEN      - the maximum length for output lines.
INDENT   - the indentation size.
ANSIMODE - type of ANSI sequences used during conversion.
ANSICOL  - colour used for links during ANSI conversion.
7BIT     - conversion of 8 bit HTML entities to 7 bit chars.
HRMODE   - line drawing mode.
NOALIGN  - ignore center and right alignment.
TDEOL    - insertion of an EOL between table cells.
SETNOTE  - use HTML document title or URL as the output file comment.
GETNOTE  - use the source file comment as URL of destination file.
NOHEADER - skip header (HTTX version, URL and title of original
           document) in the converted file.
HREF     - insertion of links (<A>) in the destination file.
IMG      - insertion of the ALT-Text of images in the destination file.
SCRIPT   - insertion of the content of <SCRIPT> element in the
           destination file.
BADHTML  - partial support for badly written HTML.

Parameters specified on the command line are applied after the parameters
specified in the configuration file. They can therefore override (or toggle
a second time) one or more options.

Examples:

If a configuration file has the following line:

IMG GETNOTE LEN=70

and on command line you write:

-> HTTX abox.html IMG

IMG is first turned on because it's present in the configuration file, then
turned off again because it's also present on the command line.

-> HTTX abox.html LEN=74

LEN is present both in the configuration file and on the command line; the
command line value overrides the previous one, so LEN is now set to 74.
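
The two examples above can be modelled with the Python sketch below. It is
only a model of the behaviour described here, not HTTX's parser: the
function name is hypothetical, options are assumed to be separated by blanks,
option names are assumed to be case-insensitive, and the input file name is
assumed to have been removed from the command line already.

```python
def apply_options(config_line, command_line):
    """Apply configuration options first, then command line options."""
    switches = set()   # options that are currently ON
    values = {}        # options that carry a value, e.g. LEN=70
    for token in config_line.split() + command_line.split():
        if "=" in token:
            # A later value overrides an earlier one.
            name, value = token.split("=", 1)
            values[name.upper()] = value
        else:
            # A switch given twice is toggled back OFF.
            switches ^= {token.upper()}
    return switches, values
```

With the configuration "IMG GETNOTE LEN=70", the command line "IMG" leaves
only GETNOTE switched on, while "LEN=74" keeps both switches on and sets LEN
to 74.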



How to create an external configuration
---------------------------------------

External configuration files are in effect system variables and are located
in the ENVARC: directory (on disk) and in ENV: (generally in RAM). The
contents of ENV: are valid only for the current session, while the contents
of ENVARC: are also valid after a reset.

To permanently save a configuration file, copy it to both ENV: and ENVARC:

Use your favorite text editor (Ed, Cygnus Editor, GoldEd, and so on) to
create your prefs file; httx.prefs is the default filename. Save it in ENV:
and also in ENVARC: so it will not be lost when you reboot.
Temporary changes can be made by editing just the ENV: file.

HTTX configuration may be fully managed using the plugin for the AWeb WWW
browser.



6. Error Messages and AmigaDOS Return Codes
-------------------------------------------

When execution terminates, HTTX returns the appropriate AmigaDOS Return Code
(RC), usable within scripts to determine if the conversion was successful.
See your AmigaDOS handbook for a complete list of error codes.

In case of error, if QUIET is off, the appropriate AmigaDOS message will be
displayed.

Following is a list of the most common errors. If the system is localized,
messages are displayed in the appropriate language. See your AmigaDOS manual
for further information.


Argument line invalid or too long
   Arguments entered in a wrong way.

*** Break
   The user has pressed Control-C keys, interrupting the conversion, and the
   output file has been removed.

Not enough memory available
   There is not enough memory available to allocate the buffers used by HTTX.
   This can happen if the HTML file to convert or the text file to include
   is too big or the memory is too fragmented.
   Try rebooting your Amiga.

Object not found
   Specified file doesn't exist or it's not accessible.
   Check file name and path.

Object is not of the required type
   The input file does not seem to be an HTML document.
   Try using the FORCE option.


HTTX can display other errors (in English only) due to wrong use of commands
or options:


The line length must be between 15 and 255 characters (current is NN)
   The line size specified with LEN parameter is a number less than 15 or
   more than 255.

Indentation size must be at least 1 (current is NN)
   The indentation size must be at least one character.

With line length XX, indentation size YY, max indent level is ZZ.
You must allow at least 3 indentations.
   The maximum indentation level, with specified line size and indent value,
   is less than 3.

HRMODE value must be 0, 1 or 2
   Value set for HRMODE is not valid. It must be 0, 1 or 2.


Finally, there are a few warnings which may be displayed. The conversion will
take place but there may be situations altering the final result:


Error in env config. HTTX will use its defaults.
   External configuration file has errors. HTTX will use the default settings
   and the parameters passed on the command line.

   Check external configuration file (see section 5.2).

ENV config 'NAME' not found.
   The configuration file specified with CFG option was not found.
   HTTX will use the default settings and the parameters passed on command
   line.

   Remember that the configuration file must be located in ENV:
   Remember also to copy it again to ENVARC: when you edit it.

Found non-ASCII chars in preformatted text!
   HTTX has found some 8 bit characters in a preformatted text section.
   Do not ignore this warning if the converted text will be posted in
   Fidonet conferences or Usenet newsgroups.

This file contains non standard HTML comment(s)!
   The file may not have been completely converted, since non standard HTML
   comments were found. If this is the case, try using the BADHTML option.

Include file could not be added.
   File specified with INCLUDE option can't be added because it doesn't
   exist or it's not accessible.
   Check file name and path.



7. FAQ (advice, interfacing with other programs and more)
---------------------------------------------------------

Q. ANSI styles (bold, italic, underline, blue) stop after first line.
A. See section 8.2.

Q. Converted text isn't centered, but in the original document it is.
A. This can happen if the text in a table row (<TR>) or cell (<TD>) is
   defined as centered. To maintain compatibility with some programs used
   with HTTX, this version doesn't yet support alignment defined in those
   elements. This will be added in future versions that will have more table
   support. Otherwise it may be an ANSI compatibility problem
   (check section 8.2).

Q. Sometimes alignment doesn't work, wordwrap and lists are not correctly
   formatted or HTML TAGS are shown.
A. This happens with text inside a <PRE> element. HTTX copies this text as
   is, without formatting. This choice was made because such text often
   contains material that the document's author probably wishes to show
   unaltered.

   In <LISTING> and <XMP> the TAGS are left as they are, as the
   specifications for those elements define. Although its use is deprecated
   in HTML 4.0, <XMP> is still widely used for examples in many documents,
   e.g. in the Netscape JavaScript specifications.

Q. Some pages are not correctly converted...
A. There could be many reasons: layout based on tables (not fully supported,
   see section 8.1), errors in the HTML source (HTTX is quite tolerant, but
   there are limits) or errors in the HTTX engine. If you think the page is
   correct, send me an E-mail with the URL.
   (E-Mail: favrin@tin.it, FidoNet: 2:333/726.8)

Q. Can I use HTTX from other programs?
A. In this archive an ARexx plugin is provided to use HTTX with the AWeb
   browser.

   Regarding external programs, HTTX can be easily used from Directory Opus
   by creating a button configured as follows (Directory Opus 4.12):

   New Entry/AmigaDOS:
     C:HTTX {f} {d}
     (replace C: with path for HTTX)

   With this configuration, a file selected from "source" directory will be
   converted to text and saved to the "destination" directory.

   By activating "Do all files" flag it's possible to convert more than one
   file, by selecting them and clicking the HTTX button.

Q. Is there a GUI for HTTX?
A. Alfonso Ranieri (alfier@iol.it) has written an ARexx script that offers a
   StormWizard interface to configure HTTX.

   The AWeb plugin contains a Reaction interface for configuration. A
   separate version to be used from Workbench will be available soon.

Q. I'm using a program that refers to HTTX for printing html (eg. MoreHTML).
   If I install versions newer than HTTX 1.1b the printing doesn't work any
   more.
A. Various programs that use HTTX were written to work with HTTX version
   1.1b. They use a command like "HTTX <filename> prt:" to print the
   converted text. This doesn't work any more, especially if the ANSIMODE
   option is used. HTTX versions 1.5 and newer may use some ANSI codes that
   are not compatible with this printing method.

   The solution is to ask the authors of these programs to change the
   template they use to call HTTX, if it isn't directly configurable by the
   user.

Q. How can I improve the performance of HTTX?
A. To speed up the conversion, try using a filesystem with blocks of 1024
   bytes, like the RAM disk. Note that if memory is almost full or
   fragmented, saving to the RAM disk may slow down the conversion process.

Q. Can I make HTTX resident?
A. Starting from version 2.0 HTTX "should" be resident-able. However, it
   hasn't been possible to fully test its behavior in this state, so, to
   avoid problems for users, this option is not officially supported.

   Whoever wants to try can type the following command in a shell:

   Resident C:HTTX FORCE
   (replace C: with path for HTTX)



8. Technical information
------------------------

This section covers some aspects of HTML and their implementation in HTTX.
Although reading it is not required to learn how to use HTTX, it contains
important information about conversion that you should read if you plan to
distribute your converted texts.



8.1 What is supported, what is not (yet) and implementation
-----------------------------------------------------------

Supported HTML:

 - Entities described in RFC 1866, &copy; and &reg; (NHTML), numeric entities
   (both decimal and hex), Win'95 numeric entities and special characters.
 - Separators (<CENTER>, <DIV>, <BR>, <P>, <HR>) and font height change
   (from <H1> to <H6>).
 - Alignment (center and right) of text (headers and paragraphs) and
   separators.
 - Physical and logical styles.
 - Numbered lists (<OL> with possible START attribute) of numeric,
   alphabetical and mixed type, unnumbered (<UL>) and definition (<DL>) lists
   to a maximum of 255 levels.
 - Document title (<TITLE>).
 - Links (<A>), user maps and inline images (<IMG> with optional
   ALT-text).
 - Pre-formatted text (<PRE>, <XMP> and <LISTING>).
 - Scripts (content of the <SCRIPT> TAG).
 - Non standard use of "<" and ">" in a preformatted text (this may be
   changed).

What is (not yet fully) supported:

 - Tables (<TABLE>). Currently each table cell is treated as a separate
   document with its alignment, list indentation level, styles and so on.
 - <APPLET>, <STYLE> and <SELECT>: content of these elements is skipped.

Implementation of the standard:

 - Unknown TAGS are ignored.
 - Double spaces, trailing and leading blanks for each line are removed.
 - Unprintable ASCII chars (lower than 32) are converted to spaces.
 - PC end-of-lines (CR+LF) and MAC (CR) are converted to Amiga format (LF).
 - For better readability of text in tables, <TD> is converted to a space and
   <TR> TAGS are converted to EOL. Between two <TR> (a table row) a
   maximum of one separator (<HR>) is shown.
 - Consecutive EOLs are reduced to one EOL (except for <BR>).

These rules are followed except in the <SCRIPT>, <PRE>, <LISTING> and <XMP>
elements. See section 8.3 for more information about this.
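
Some of the character-level rules listed above can be tried out with the
Python sketch below. It is a loose illustration under stated assumptions: it
works on text that has already been extracted, knows nothing about TAGS,
tables or <BR>, and the function name is hypothetical.

```python
import re


def normalize_text(text):
    """Apply the end-of-line, control character and blank rules."""
    # PC (CR+LF) and MAC (CR) end-of-lines become Amiga format (LF).
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Unprintable chars lower than 32 (except LF) are converted to spaces.
    text = re.sub(r"[\x00-\x09\x0b-\x1f]", " ", text)
    # Double spaces, leading and trailing blanks are removed on each line.
    lines = [re.sub(r" {2,}", " ", line).strip(" ")
             for line in text.split("\n")]
    # Dropping empty lines reduces every run of EOLs to a single one
    # (it also removes EOLs at the very start and end of the text).
    return "\n".join(line for line in lines if line)
```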



8.2 Notes about ANSI conversion
-------------------------------

If the ANSIMODE option is enabled (values 1 or 2), HTTX uses ANSI escape
sequences for converting HTML styles (such as bold, italic, underlined and so
on), links (rendered as underlined blue), centering and indentation of text.

The ANSI codes used are taken from standard ANSI specifications and should be
supported by any program (should be...).

These are the sequences (ESC is replaced by "\e"):

Bold         \e[1m
Italic       \e[3m
Underlined   \e[4m
Color (Blue) \e[33m (\e[34m with the PRINT option)

Multiple ANSI attributes are combined using ";", i.e. to set bold and italic
HTTX uses "\e[1;3m".

If the ANSIMODE option is set to 2, for list indentation and text alignment
HTTX uses the cursor position sequence "\e[nnC" where nn is the number of
characters to move right. This sequence is not used when printing.
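
The sequences listed above can be built as in the following Python sketch.
The function and dictionary names are hypothetical and the code is not taken
from HTTX; only the escape sequences themselves come from this section.

```python
ESC = "\x1b"  # the Escape character, written "\e" in this document

# Style codes from the table above.
STYLE_CODES = {"bold": 1, "italic": 3, "underlined": 4}


def style_sequence(*styles):
    """Combine one or more styles in a single sequence, joined by ';'."""
    codes = ";".join(str(STYLE_CODES[name]) for name in styles)
    return f"{ESC}[{codes}m"


def cursor_right(count):
    """Cursor position sequence used with ANSIMODE=2: move right count chars."""
    return f"{ESC}[{count}C"
```

For example, moving 30 characters to the right takes 5 bytes with the cursor
sequence instead of 30 spaces, which is where the size reduction of
ANSIMODE=2 comes from.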


Compatibility problems:

- The ANSI standard says that the end of a line doesn't cause a style to
  stop, so if a style is terminated after the first line the problem is in
  the program used to display the text. The standard Shell and MultiView
  work correctly.

- Some text viewers (like MultiView) don't support VT100 cursor positioning
  sequences that are used when ANSIMODE=2 (optimized ANSI) is enabled. If a
  page contains lists or aligned text that appears badly converted, try to
  convert it using the ANSIMODE=1 option or use a different text viewer.



8.3 Notes about conversion of <PRE>, <XMP>, <LISTING> and <SCRIPT> contents
---------------------------------------------------------------------------

The rules specified for implementation of the standard (section 8.1),
wordwrap of text and 7 bit conversion of 8 bit characters don't fully apply
to some elements. The HTML specifications require them to be treated
differently.

- <PRE> element (preformatted text)
  In this mode wordwrap is not done. Parsing of HTML TAGS (except lists and
  alignment) and entities is done. Numeric entities lower than 32 aren't
  converted to avoid problems with uuencoded files contained in HTML pages.

- <XMP>, <LISTING> and <SCRIPT> elements.
  The contents of these elements are left unchanged. No wordwrap, entities
  conversion, or TAGS parsing is done at all. If the 7BIT option is used,
  ASCII characters lower than 32 are converted to spaces. Characters
  higher than 127 are left as they are and Win'95 entities are not remapped.
  Remember this, if the converted text will go on message areas, like Fidonet
  or Usenet newsgroups!



9. How to contact the Author
----------------------------

Besides any communication, problem, bug report, advice or anything else,
comments about HTTX will be appreciated, as will news of any donation made
to organizations that take care of children (see CHILDWARE).

E-Mail : favrin@tin.it
FidoNet: 2:333/726.8

Please write to me in Italian or English, thank you.

HTTX support page:
http://www.freeweb.org/varie/poing/soft/httx/index.html



10. Greetings
-------------

Beta testing:

William Parker        bill@amitrix.com
Giuseppe Pasanisi     amicus@net-service.net
Neil Bothwick         aweb@wirenet.co.uk
Marco L. Buschini     shido@mclink.it
Claudio Mazzuco       kirk@maya.dei.unipd.it
Giuseppe Ammendolia   ryuga@freenet.hut.fi

English documentation and spell check:

Fabio Belli           zak@anturio.com
William Parker        bill@amitrix.com

Beta testing of AWeb Plugin:

William Parker        bill@amitrix.com
Neil Bothwick         aweb@wirenet.co.uk
Dale Currie           dalec@zorro.amitrix.com

Special thanks:

To Yvon Rozijn for having wanted HTTX inside AWeb II and for all the help he
gives me!

To Bill Parker for all the help, his suggestions and his friendship.

To Wouter van Oortmerssen for the splendid AmigaE and to Tomasz Wiszkowski
for its evolution, CreativE.

To Enrico Altavilla for his programming hints.

To the Italian music group Pooh (http://www.pooh.it) for the emotions they
have given me for years with their music. Maybe HTTX was born thanks to them
too...

Finally special thanks to those who wrote to me about HTTX and to those who
use it!



11. Program history
-------------------

V2.0b (December 1999)
  -Fixed a bug that could cause insertion of incomplete ANSI codes
   in standard ANSI conversion.
  +Made the source compatible with the new CreativE compiler.

V2.0  (September 1999)
  +Reorganized the HTML parser in a modular way; it is now faster and more
   expandable.
  +Style HTML tags are now recognized only when ANSI conversion is enabled,
   thus making non ANSI conversion a bit faster.
  +Enlarged the file writing buffer. This makes HTTX faster on fast devices
   (such as RAM or SCSI HDs) but slower on slow ones (like Zip drives).
  +Added full support for alphabetical and mixed (numeric/alphabetical)
   lists.
  +Added options ANSIMODE (choice of the ANSI mode to use), ANSICOL (colour
   of the links), TDEOL (insertion of an EOL between the cells of a table)
   and SETNOTE (type of the file note added to the saved file).
  +Reorganized the display of current settings.
  +Content of the <SELECT> element is now skipped.
  +If ANSI conversion is enabled, content of the <TH> element is now rendered
   in bold.
  +Added an easter egg ;-)
  -Fixed (hopefully) all the wordwrap bugs and their side effects, such as
   duplication of part of the text, bad line length sizing, and so on...
  -If the SCRIPT option was enabled, a wrong output file length could be
   shown.
  -After an ANSI reset, with underlined text (<U> HTML element), text could
   be shown in underlined blue.
  -Improved ANSI compatibility while printing: now starting spaces are always
   preceded by an ANSI reset sequence.
  -The settings shown when no argument was given were wrong.
  -The error 'ENV config 'NAME' not found.' could be shown even with the
   QUIET or STDIO options.
  -Each cell in a table could contain only one <HR>.
  -FILENOTE option disabled insertion of URL and TITLE of the document even
   when NOHEADER option wasn't used.
  -When printing, links were shown in yellow.

V1.7a (June 1998)
  +Added alias PRT for PRINT.
  -Fixed a dangerous bug that could cause HTTX to crash when a (too) long
   line was aligned.
  -Fixed an error in wordwrap that caused layout problems after some lines.

V1.7  (January 1998)
  +Rewrote in a clearer way various important parts of the source code, such
   as wordwrap and ANSI management.
  +Made HTML parser more SGML compliant.
  +Improved table support by managing each table cell as a separate document.
  +Added support for hex numeric entities (&#xnn) up to 255 and Win'95
   numeric entities and special characters in the range 128-159.
  +Added support for <SELECT> and <XMP>, and improved <LISTING> support.
  +Text inside a <BLOCKQUOTE> (or <BQ>) element is now indented.
  +In <PRE> mode numeric entities (&#nn) lower than 32 are left as they are.
  +Added options SCRIPT and INCLUDE.
  +Now when option NOHEADER is used the HTTX version string is not added.
  +Improved support for bad HTML coding.
  +And many other things!
  -Fixed various small aesthetic bugs and bugs related to bad HTML coding
   in documents.
  -Fixed many small problems with the old wordwrap (the most relevant was
   that if a line didn't contain spaces and ended with an entity, wordwrap
   didn't work correctly).
  -Numeric entities lower than ASCII 32 weren't converted to spaces.
  -In case of bad arguments, false error messages were displayed.
  -Last character of file could be skipped in some cases.
  -Various other bugs wiped out!

V1.5  (May 1997)
  +Greatly sped up the program, thanks to a complete rewrite of the HTML
   parser and optimization of many functions.
  +Added options HRMODE, NOALIGN, PRINT, APPEND, NOCFG, CFG.
  +Added alignment support (center and right) of text and separators.
  +Added support for various TAGS and HTML attributes (like START in numbered
   lists).
  +Optimized the ANSI output.
  +Added support for ANSI separators.
  +Added support for external configuration.
  +Improved entity support: entities are now identified regardless of the
   closing character.
  +Clearer HTTX internal error codes. Custom DOS error messages replaced
   with AmigaDOS PrintFault().
  +Added support for "HTTX source_filename destination_path".
  +Improved support of separators in tables.
  +HTTX now follows the HTML DTD more faithfully for many TAGS.
  -Fixed some bugs in wordwrap.
  -Fixed all bugs reported in version 1.1b.
  +... Many many other improvements!!!

V1.1b (January 1997)
  -Fixed management of some entities.

V1.1a (January 1997)
  -Removed a stupid and rare bug in TAGS management.
  -Fixed support for MAC EOLs.

V1.1  (November 1996)
  +Improved speed.
  +Added STDIO, QUIET, BADHTML, NOHEADER, GETNOTE options.
  +Added AmigaDOS return codes support.
  +Added <ADDRESS> and <LISTING> TAG support.
  +Modified <BR> and <LI> management, as required by the HTML 3.2 standard.
  +Rewritten entities management: now they will be converted even if they
   are not closed with ";".
  +Extended 7 bit conversion, now faster, more complete (almost all
   characters are converted) and more accurate (no accented letters are left
   in words; for example, "HTTX è bello" becomes "HTTX e` bello" while
   "Belphégor" becomes "Belphegor").
  +By popular demand, the 7BIT option is now OFF by default.
  +Improved support for multiline comments and badly written HTML.
  +Added support for TAGS or ALT-Text that contain LF or "<" and ">".
  -Fixed many small aesthetic bugs in conversion (double spaces in some
   cases, bad chars when exiting <PRE> and <SCRIPT>, and others).
  -Fixed an error that could leave a lock on the file being converted.
  -Fixed an error in FILENOTE option.

V1.0  (July 1996)
   First public release.



12. Future versions
-------------------

HTTX is a program in continuous growth: because I use it daily, I keep
noticing shortcomings and possible improvements. This version will be the
basis for a new, even better version that will be out... soon.


