-
- Date: 29 Oct 93 12:20 -0600
- From: Rob Slade <roberts@decus.arc.ab.ca>
- Subject: Book Review: "The Unicode Standard"
-
-
- BKUNICOD.RVW 980921
-
- Addison-Wesley Publishing Co.
- P.O. Box 520
- 26 Prince Andrew Place
- Don Mills, Ontario M3C 2T8
- 416-447-5101 fax: 416-443-0948
- or
- 1 Jacob Way
- Reading, MA 01867-9984
- 800-527-5210 617-944-3700
- 5851 Guion Road
- Indianapolis, IN 46254
- 800-447-2226
- or
- Unicode, Inc.
- 1965 Charleston Road
- Mountain View, CA 94043
- (415) 961-4189 Fax: (415) 966-1637
-
- "The Unicode Standard", U$32.95/C$42.95
- steve@unicode.org unicode-inc@unicode.org rick_mcgowan@next.com
-
- In the dim and distant past, the late, and generally unlamented, SUZY
- information system was born in Vancouver. Rather an oddball as far as
- online services went, one "feature" was that the programmer had tried
- to allow for the use of all of the IBM graphics characters. This led
- to an entirely new field of "smiley" or "emoticon" (emotional icon)
- endeavours. Instead of the usual sideways happy face of the colon,
- hyphen and right parenthesis, ":-)", we were able to use the "Ctrl-A"
- alternative of the IBM PC character set. Having a decimal value of
- one, this character is an upright happy face. This allowed other
- expansions, such as Ctrl-A and the right square bracket, which looks
- like a face and a telephone handset, and was used (usually in the
- "chat" modes) for "I am on the phone."
-
- "How nice," I hear you mutter between clenched teeth. "Can we now get
- on with the review?" Patience, stout nerds. This *is* the review.
-
- As SUZY users, particularly those who had been introduced to computer
- communications on the system, moved on to other services or local
- bulletin boards, they were usually quite shocked to find that their
- favourite symbols no longer worked. The little heart (Ctrl-C) would
- kill a message on a VAX. Fidonet users might find that the cute
- tagline they had formed from graphics characters completely
- disappeared when they sent the message through an Internet gateway.
-
- ASCII (the American Standard Code for Information Interchange) is
- widely, and mistakenly, believed to define two hundred and fifty-six
- characters. It doesn't. Furthermore, of the hundred and twenty-eight
- characters it does define, many are "control" rather than printable
- characters. (The "card suit" symbols on the IBM PC graphics set are
- defined as "end of text", "end of transmission", "enquiry" and
- "acknowledgement" under the real ASCII standard.) In addition, many
- believe ASCII to be a universal standard; also not true. An octet
- with the decimal value thirty-five, for example, is the number sign
- (sometimes called an "octothorpe") in the United States, but a pound
- sign (the British currency) in Britain. As with most fields of
- computer endeavour, the nice thing about standards is that there are so
- many to choose from. Many vary only slightly -- but they vary.
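-
- As a minimal sketch of the clash described above, the following
- Python fragment sets a few real ASCII control-code names beside the
- glyphs the IBM PC shows for the same values. The table and function
- names are illustrative assumptions, not part of any standard or of
- the book under review, and only a handful of codes are listed.

```python
# ASCII names for a few control codes (ASCII is a 7-bit code: 0..127).
ASCII_CONTROL_NAMES = {
    1: "SOH (start of heading)",
    3: "ETX (end of text)",
    4: "EOT (end of transmission)",
    5: "ENQ (enquiry)",
    6: "ACK (acknowledge)",
}

# Glyphs the IBM PC display draws for the same values (illustrative subset).
IBM_PC_GLYPHS = {
    1: "\u263a",  # upright happy face
    3: "\u2665",  # heart
    4: "\u2666",  # diamond
    5: "\u2663",  # club
    6: "\u2660",  # spade
}

# Value 35 in two national 7-bit variants (hypothetical table name).
CODE_35_BY_VARIANT = {
    "US": "#",        # number sign
    "GB": "\u00a3",   # pound sterling sign
}


def is_ascii_control(code):
    """True for the ASCII control codes: 0..31 and 127 (DEL)."""
    return 0 <= code <= 31 or code == 127


def count_ascii_classes():
    """Return (control, non-control) counts over the 128 ASCII codes."""
    control = sum(1 for c in range(128) if is_ascii_control(c))
    return control, 128 - control


def ctrl_key_code(letter):
    """Ctrl-<letter> produces the letter's code with only the low 5 bits kept."""
    return ord(letter.upper()) & 0x1F


if __name__ == "__main__":
    print("control / non-control:", count_ascii_classes())
    for key in "ACDEF":
        code = ctrl_key_code(key)
        print("Ctrl-%s = %d: ASCII %s, IBM PC %s"
              % (key, code, ASCII_CONTROL_NAMES[code], IBM_PC_GLYPHS[code]))
```

- Under these assumptions, ctrl_key_code("A") gives 1, the value the
- IBM PC draws as an upright face and ASCII calls "start of heading".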
-
- The point is that there are a number of symbols which we commonly
- know, but which cannot be consistently displayed on terminals or
- printers. Certain terminals will have certain "international"
- character sets, but not all are identical. Accents and other phonetic
- modifiers may be difficult to handle: entire character sets are given
- over strictly to accented characters. (In Canada we are acutely aware
- of the problems, with "French" keyboards used at many sites. On one,
- I was having difficulty finding some necessary punctuation marks for
- network addressing, and asked a Francophone programmer for help. "Who
- knows," he growled, "I never use the ____ things!")
-
- Unicode seeks to address this problem. Beyond the variations on the
- Latin alphabet, Unicode incorporates Greek, Cyrillic, Hebrew and
- other alphabets. It also includes punctuation, diacriticals,
- mathematical and scientific symbols and miscellaneous graphics. Asian
- ideographs are also assigned codes. A repertoire this large is no
- longer suitable, of course, for a seven-bit code, and Unicode is based
- on a sixteen-bit address space.
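-
- To make the sixteen-bit idea concrete, here is a short Python sketch
- that looks up one sample character from each of several scripts and
- checks that its code fits in sixteen bits. The sample list and the
- helper names are assumptions chosen for illustration; Python's
- unicodedata module follows a much later version of the standard than
- the one reviewed here, so treat this as a rough demonstration only.

```python
import unicodedata

# One illustrative character per script or symbol class (assumed samples).
SAMPLES = [
    "A",        # Latin
    "\u00e9",   # Latin with an accent
    "\u03a9",   # Greek
    "\u0416",   # Cyrillic
    "\u05d0",   # Hebrew
    "\u222b",   # mathematical symbol
    "\u4e2d",   # Han ideograph
]


def describe(ch):
    """Return the code value in U+XXXX form followed by the character name."""
    return "U+%04X %s" % (ord(ch), unicodedata.name(ch))


def fits_in_16_bits(ch):
    """True when the character's code value is at most 0xFFFF."""
    return ord(ch) <= 0xFFFF


if __name__ == "__main__":
    for ch in SAMPLES:
        print(describe(ch), "| fits in 16 bits:", fits_in_16_bits(ch))
```

- Each sample here occupies exactly two bytes when written out as a
- sixteen-bit value, whatever script it belongs to.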
-
- The book gives some background and plans (chapter one), and general
- principles and rules for conformance (chapter two). To comment on
- these in any meaningful way would be to rewrite these chapters. This
- is technical material, though not the same technology that computer
- types are used to. Some background study in linguistics would be a
- good idea, although it is not strictly necessary to understand and use
- the Unicode standard. There is, however, a wealth of symbols,
- punctuation marks and typesetting codes which Unicode gives
- standardized access to. On the other hand, any application which used
- the standard in a significant way would likely require a linguistics
- background in any case.
-
- The bulk of the books (two volumes) is, of course, taken up with the
- actual code charts. (Volume two, in fact, is almost completely
- concerned with Han ideographs. In spite of the recent widespread use
- of the English alphabet, these are still the standard written form
- of Chinese, Japanese and Korean: CJK in Unicode terminology.) The
- charts are augmented with verbal definitions of the symbols, and with
- cross references to similar forms.
-
- The Unicode standard is recent. In comparative terms its current
- usage is negligible. However, it is the de facto standard for broadly
- based international character sets. With the recent rejection of the
- proposed ISO thirty-two-bit standard, and the recasting of that
- standard to follow Unicode's lead, Unicode is a significant factor in
- the development of any international applications.
-
- copyright Robert M. Slade, 1993 BKUNICOD.RVW 980921
-
- (Postscriptum - Unicode Inc. maintains an FTP site at unicode.org
- (192.195.185.2). Some of the mapping tables, and the Han cross
- reference lists are available. Some tables are also available on IBM
- PC or Mac compatible floppy disks.)
-
- Permission granted to distribute only with unedited copies of TELECOM
- Digest and associated newsgroups/mailing lists.
-
-
- DECUS Canada Communications, Desktop, Education and Security group newsletters
- Editor and/or reviewer ROBERTS@decus.ca, RSlade@sfu.ca, Rob Slade at 1:153/733
- DECUS Symposium '94, Vancouver, BC, Mar 1-3, 1994, contact: rulag@decus.ca
-