Date: 29 Oct 93 12:20 -0600
From: Rob Slade <roberts@decus.arc.ab.ca>
Subject: Book Review: "The Unicode Standard"
BKUNICOD.RVW 980921
Addison-Wesley Publishing Co.
P.O. Box 520
26 Prince Andrew Place
Don Mills, Ontario M3C 2T8
416-447-5101 fax: 416-443-0948
or
1 Jacob Way
Reading, MA 01867-9984
800-527-5210 617-944-3700
5851 Guion Road
Indianapolis, IN 46254
800-447-2226
or
Unicode, Inc.
1965 Charleston Road
Mountain View, CA 94043
(415) 961-4189 Fax: (415) 966-1637
"The Unicode Standard", U$32.95/C$42.95
steve@unicode.org unicode-inc@unicode.org rick_mcgowan@next.com
In the dim and distant past, the late, and generally unlamented, SUZY
information system was born in Vancouver. Rather an oddball as far as
online services went, one "feature" was that the programmer had tried
to allow for the use of all of the IBM graphics characters. This led
to an entirely new field of "smiley" or "emoticon" (emotional icon)
endeavours. Instead of the usual sideways happy face of the colon,
hyphen and right parenthesis, ":-)", we were able to use the "Ctrl-A"
alternative of the IBM PC character set. Having a decimal value of
one, this character is an upright happy face. This allowed other
expansions, such as Ctrl-A and the right square bracket, which looks
like a face and a telephone handset, and was used (usually in the
"chat" modes) for "I am on the phone."
"How nice," I hear you mutter between clenched teeth. "Can we now get
on with the review?" Patience, stout nerds. This *is* the review.
As SUZY users, particularly those who had been introduced to computer
communications on the system, moved on to other services or local
bulletin boards, they were usually quite shocked to find that their
favourite symbols no longer worked. The little heart (Ctrl-C) would
kill a message on a VAX. Fidonet users might find that the cute
tagline they had formed from graphics characters completely
disappeared when they sent the message through an Internet gateway.
ASCII (the American Standard Code for Information Interchange) is
widely, and mistakenly, believed to define two hundred and fifty-six
characters. It doesn't. Furthermore, of the hundred and twenty-eight
characters it does define, many are "control" rather than printable
characters. (The "card suit" symbols on the IBM PC graphics set are
defined as "end of text", "end of transmission", "enquiry" and
"acknowledgement" under the real ASCII standard.) In addition, many
believe ASCII to be a universal standard; also not true. An octet
with the decimal value thirty-five, for example, is the number sign
(sometimes called an "octothorpe") in the United States, but a pound
sign (the British currency) in Britain. As with most fields of
computer endeavour, the nice thing about standards is that there are so
many to choose from. Many vary only slightly -- but they vary.
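
The counts above are easy to check. What follows is only a minimal
Python sketch, assuming nothing more than that ASCII is the seven-bit
code with values 0 through 127. The names (ASCII_RANGE, control,
printable, ASCII_NAMES) are made up for illustration, and the space is
counted as printable here, which is a choice rather than a rule.

```python
# Seven-bit ASCII covers the values 0..127 only.
ASCII_RANGE = range(128)

# Control characters: 0..31 plus DEL (127). Everything else prints
# (the space, value 32, is counted as printable in this sketch).
control = [c for c in ASCII_RANGE if c < 32 or c == 127]
printable = [c for c in ASCII_RANGE if 32 <= c < 127]

# What the ASCII standard says the values 3..6 mean. These are the
# positions where the IBM PC graphics set shows its card suits.
ASCII_NAMES = {
    3: "end of text",
    4: "end of transmission",
    5: "enquiry",
    6: "acknowledgement",
}

if __name__ == "__main__":
    print(len(ASCII_RANGE), "characters defined")
    print(len(control), "control,", len(printable), "printable")
    for value in sorted(ASCII_NAMES):
        print(value, ASCII_NAMES[value])
    # Value 35 is the number sign in the United States version.
    print(35, chr(35))
```

Nothing in Python's standard character handling will show the British
reading of value thirty-five; that would need a hand-built table for
the national variant.
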
The point is that there are a number of symbols which we commonly
know, but which cannot be consistently displayed on terminals or
printers. Certain terminals will have certain "international"
character sets, but not all are identical. Accents and other phonetic
modifiers may be difficult to handle: entire character sets are given
over strictly to accented characters. (In Canada we are acutely aware
of the problems, with "French" keyboards used at many sites. On one,
I was having difficulty finding some necessary punctuation marks for
network addressing, and asked a Francophone programmer for help. "Who
knows," he growled, "I never use the ____ things!")
Unicode seeks to address this problem. It includes not only the
variations on the Latin alphabet but also Greek, Cyrillic, Hebrew
and other alphabets, along with punctuation, diacriticals,
mathematical and scientific symbols and miscellaneous graphics.
Asian ideographs are also assigned codes. A repertoire this large
no longer fits, of course, in a seven-bit code, and Unicode is based
on a sixteen-bit address space.
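
To put the jump from seven bits to sixteen in concrete terms, here is
a rough Python sketch. It leans on Python's built-in Unicode strings,
which are not something the book describes, and the sample characters
and the helper name code_value are my own picks for illustration. It
shows only that one character from each of several scripts has a code
value that fits in sixteen bits.

```python
# A sixteen-bit code has 2**16 distinct values, against 2**7 for ASCII.
SEVEN_BIT_VALUES = 2 ** 7
SIXTEEN_BIT_VALUES = 2 ** 16

# One sample character from several of the scripts named above
# (illustrative choices, not taken from the book).
samples = {
    "Latin": "A",
    "Greek": "\u03a9",     # GREEK CAPITAL LETTER OMEGA
    "Cyrillic": "\u0416",  # CYRILLIC CAPITAL LETTER ZHE
    "Hebrew": "\u05d0",    # HEBREW LETTER ALEF
    "Han": "\u4e2d",       # a CJK ideograph
}


def code_value(ch):
    """Return the character's code value as four hexadecimal digits."""
    return format(ord(ch), "04X")


if __name__ == "__main__":
    print(SEVEN_BIT_VALUES, "values in seven bits")
    print(SIXTEEN_BIT_VALUES, "values in sixteen bits")
    for script, ch in samples.items():
        # Encoded as a big-endian 16-bit unit, each sample is two octets.
        octets = len(ch.encode("utf-16-be"))
        print(script, code_value(ch), octets, "octets")
```
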
The book gives some background and plans (chapter one), and general
principles and rules for conformance (chapter two). To comment on
these in any meaningful way would be to rewrite these chapters. This
is technical material, though not the same technology that computer
types are used to. Some background study in linguistics would be a
good idea, although it is not strictly necessary to understand and use
the Unicode standard. There is, however, a wealth of symbols,
punctuation marks and typesetting codes which Unicode gives
standardized access to. On the other hand, any application which used
the standard in a significant way would likely require a linguistics
background in any case.
The bulk of the two volumes is, of course, taken up with the
actual code charts. (Volume two, in fact, is almost completely
concerned with Han ideographs. In spite of the recent widespread use
of the English alphabet, these are still the standard written form
of Chinese, Japanese and Korean: CJK in Unicode terminology.) The
charts are augmented with verbal definitions of the symbols, and with
cross references to similar forms.
The Unicode standard is recent. In comparative terms its current
usage is negligible. However, it is the de facto standard for broadly
based international character sets. With the recent rejection of the
proposed ISO thirty-two bit standard, and the recasting of that
standard to follow Unicode's lead, Unicode is a significant factor in
the development of any international applications.
copyright Robert M. Slade, 1993 BKUNICOD.RVW 980921
(Postscriptum - Unicode Inc. maintains an FTP site at unicode.org
(192.195.185.2). Some of the mapping tables and the Han cross
reference lists are available there. Some tables are also available
on IBM PC or Mac compatible floppy disks.)
Permission granted to distribute only with unedited copies of TELECOM
Digest and associated newsgroups/mailing lists.
DECUS Canada Communications, Desktop, Education and Security group newsletters
Editor and/or reviewer ROBERTS@decus.ca, RSlade@sfu.ca, Rob Slade at 1:153/733
DECUS Symposium '94, Vancouver, BC, Mar 1-3, 1994, contact: rulag@decus.ca