What's on the CD-ROM

The CD that comes with this book should be readable on Mac, Solaris, Linux, and 
Windows 95/98/Me/NT/2000 systems. Put the CD in the drive and mount it using whatever 
method you normally use to load a CD on your platform. On Solaris that's probably 
filemanager. On a Mac, Linux, or Windows, just stick the disc in the drive. There's no 
fancy installer. You can browse the directories as you would a hard drive.

All CD-ROM files are read-only. Therefore, if you open a file from the CD-ROM and 
make any changes to it, you'll need to save it to your hard drive. If you copy a file from 
the CD-ROM to your hard drive on Windows, the file retains its read-only attribute. To 
change this attribute after copying a file, right-click the file name or icon and select 
Properties from the shortcut menu. In the Properties dialog box, click the General tab and 
remove the checkmark from the Read-only checkbox.
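
If you've copied a lot of files, clearing the attribute one file at a time gets tedious. The following is a minimal sketch, not something included on the CD, of doing the same thing from Java. It assumes Java 6 or later (for File.setWritable), and the class name MakeWritable is purely illustrative.

```java
import java.io.File;

// Hypothetical helper (not on the CD): clears the read-only flag on files
// that were copied from the CD-ROM to a hard drive.
final class MakeWritable {

    // Returns true if the file exists and is writable after the call.
    static boolean makeWritable(File file) {
        if (!file.exists()) {
            return false;
        }
        // setWritable(true) clears the read-only attribute on Windows and
        // adds owner write permission on Unix-like systems.
        return file.canWrite() || file.setWritable(true);
    }

    public static void main(String[] args) {
        for (String name : args) {
            boolean ok = makeWritable(new File(name));
            System.out.println(name + (ok ? ": writable" : ": could not be made writable"));
        }
    }
}
```

Pass the names of the copied files on the command line. The files on the CD itself stay read-only no matter what you do.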

The CD is divided into seven main directories:

*	Browsers
*	Parsers
*	Specifications
*	Examples
*	Source Code
*	Utilities
*	PDFs

Browsers

This directory contains a number of Web browsers that support XML to a greater or 
lesser extent including:

*	Microsoft Internet Explorer 5.5 for Windows
*	Microsoft Internet Explorer 5.0 for MacOS
*	Mozilla 0.8 (various platforms)
*	Amaya 4.3 (Linux, Windows 95/98/NT)

Parsers

This directory contains a variety of open source XML parsers including:

*	The Xerces Java XML parser
*	The Xerces-C XML parser for C++
*	The Xerces-Perl XML parser for Perl
*	The expat parser for C

Most of the examples in this book that use a specific parser use Xerces Java, 
in particular the sax.SAXCount program. To install it, just copy the xerces.jar and 
xercesSamples.jar archives to your jre\lib\ext directory. You'll need the Java Runtime 
Environment (JRE) 1.2 or later, which you can download from http://java.sun.com/. If 
you've installed the Java Development Kit (JDK) instead of the JRE on Windows, you 
may have two ext directories, one somewhere like C:\jdk1.3\jre\lib\ext and the other 
somewhere like C:\Program Files\Javasoft\jre\1.3\lib\ext. You need to copy both jar 
archives into both ext directories. Putting a copy in one directory and an alias to it in 
the other does not work. You must place complete, actual copies in each ext 
directory.
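
A quick way to confirm that your Java installation can parse XML is to count the elements in a document, roughly as sax.SAXCount does. The sketch below is not the SAXCount source. It is an illustrative stand-in that assumes the standard JAXP and SAX APIs (javax.xml.parsers and org.xml.sax), which current JDKs bundle, and it uses whichever SAX parser the JAXP factory finds. The class name ElementCounter is hypothetical.

```java
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for a SAXCount-style check: counts elements and
// attributes, and fails with an exception if the document is malformed.
final class ElementCounter extends DefaultHandler {
    private int elements;
    private int attributes;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        elements++;
        attributes += atts.getLength();
    }

    int elements() { return elements; }

    int attributes() { return attributes; }

    // Parses the source with the parser JAXP locates and returns the tallies.
    static ElementCounter count(InputSource source) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        SAXParser parser = factory.newSAXParser();
        ElementCounter counter = new ElementCounter();
        parser.parse(source, counter);
        return counter;
    }

    public static void main(String[] args) throws Exception {
        for (String uri : args) {
            ElementCounter c = count(new InputSource(uri));
            System.out.println(uri + ": " + c.elements() + " elements, "
                    + c.attributes() + " attributes");
        }
    }
}
```

Pass one or more file names or URLs on the command line. If a document is not well-formed, the parser throws an exception instead of printing a count.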

Specifications

This directory contains XML-related specifications from the World Wide Web Consortium 
(W3C) including:

*	XML 1.0, second edition
*	Namespaces in XML
*	CSS Level 1
*	CSS Level 2
*	XSLT 1.0
*	XPath 1.0
*	HTML 4.0
*	XHTML 1.0
*	MathML 2.0
*	The Resource Description Framework
*	SMIL

These are all included in HTML format, and most are available in XML as well. Some 
are also provided in additional formats such as PDF or plain text. Many technologies 
discussed in this book are not yet finalized (for example, XLinks). You can find the 
current draft specifications for these on the World Wide Web Consortium (W3C) Web 
site at http://www.w3.org/TR/.

Examples

This directory contains several examples of large XML files and large collections of 
XML documents. Some (but not all) of these are based on smaller examples printed in the 
book. For instance, you'll find complete statistics for the 1998 Major League Baseball 
season including all players and teams. Examples include:

*	The 1998 Major League Baseball season
*	The complete works of Shakespeare (courtesy of Jon Bosak)
*	The Old Testament (courtesy of Jon Bosak)
*	The New Testament (courtesy of Jon Bosak)
*	The Koran (courtesy of Jon Bosak)
*	The Book of Mormon (courtesy of Jon Bosak)
*	The periodic table of the elements

Source Code

All complete numbered code listings from this book are on the CD-ROM in a directory 
called source. They are organized by chapter. Very simple HTML indexes are provided 
for the examples in each chapter. However, because most of the examples are raw XML 
files and because most don't have style sheets, some Web browsers won't display them 
very well. Internet Explorer 5.x probably does the best job with most of these files. 
Otherwise, you're probably better off just opening the directories in Windows Explorer, 
the Finder or the equivalent on your platform of choice, and reading the files with a text 
editor.

Most of the files are named according to the listing number in the book (for example, 
6-1.xml, 27-1.cdf). However, in the few cases in which the book uses a specific name, 
such as family.dtd or family.xml, that name is also used on the CD. The files on the 
CD appear exactly as they do in the book's listings.
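
If you want to process the source directory with a script, the naming convention is easy to recognize mechanically. The sketch below is only an illustration. It assumes that the first number in a name like 6-1.xml is the chapter and the second is the listing within that chapter, and the class name ListingName is hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: splits a numbered listing file name such as 6-1.xml
// into its parts. Files with specific names (family.dtd) do not match.
final class ListingName {
    // Assumed pattern: chapter, hyphen, listing number, dot, extension.
    private static final Pattern NUMBERED =
            Pattern.compile("(\\d{1,4})-(\\d{1,4})\\.(\\w+)");

    final int chapter;
    final int listing;
    final String extension;

    private ListingName(int chapter, int listing, String extension) {
        this.chapter = chapter;
        this.listing = listing;
        this.extension = extension;
    }

    // Returns the parsed name, or an empty Optional for non-numbered files.
    static Optional<ListingName> parse(String fileName) {
        Matcher m = NUMBERED.matcher(fileName);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new ListingName(
                Integer.parseInt(m.group(1)),
                Integer.parseInt(m.group(2)),
                m.group(3)));
    }

    public static void main(String[] args) {
        for (String name : new String[] {"6-1.xml", "27-1.cdf", "family.dtd"}) {
            String description = parse(name)
                    .map(l -> "chapter " + l.chapter + ", listing " + l.listing)
                    .orElse("specifically named file");
            System.out.println(name + ": " + description);
        }
    }
}
```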

Utilities

The utilities directory contains assorted programs that will be useful for processing XML 
documents of one type or another. These include:

*	Dave Raggett's HTML Tidy, compiled for a variety of platforms. Tidy can clean 
	up most HTML files so that they become well-formed XML. Tidy can correct 
	many common problems and warn you about the ones you need to fix yourself. 
	The latest version can be found at http://www.w3.org/People/Raggett/tidy.
*	The Xalan-J XSLT Processor from the XML Apache Project 
*	Michael Kay's SAXON XSLT Processor 
*	James Clark's XT XSLT Processor
*	The Batik SVG Viewer from the XML Apache Project 
*	FOP, a print formatter driven by XSL formatting objects, from the XML Apache 
	Project
	
PDFs

The pdf directory contains Acrobat PDF files for this entire book. To read them, you'll 
need the free Acrobat Reader software included on the CD-ROM. Feel free to put them 
on your local hard disk for easy access. I don't really care if you loan the CD-ROM to 
some cash-strapped undergrad who finds it cheaper to tie up a school printer for a few 
hours printing all 1200+ pages rather than spend $49.99 for a printed copy. (If you're 
using your own printer, toner, and paper, it's much cheaper to buy the book.) However, I 
would very much appreciate it if you do not place these files on Web, FTP, Gnutella, 
Publius, or any other servers. This includes intranet servers, password-protected sites, and 
other things that aren't meant for the public at large. Most local sites and intranets are far 
more exposed to the broader Internet than most people think. Today's search engines are 
very good at locating content that is supposed to be hidden. Putting mirror copies of these 
files around the Web makes it extremely difficult to keep all the files up to date and to 
make sure that search engines find the right copies.