OpenAmiga

From: Rudi Chiarito
Date: 31 Aug 2000 at 08:33:48
Subject: Re: AMIOPEN: Serious problem

On Mon, Aug 28, 2000 at 08:56:32AM -0500, Marc Culler wrote:
> First of all, the documentation on the cooker
> (/dev/keyboard/cooked/usr.html) is terrible.

I reported this in the past and was promised better documentation from
the maintainer of the keyboard subsystem.

> F1: ESC 5b 31 31 7e
> F2: ESC 5b 31 31 7e
> F3: ESC 5b 31 31 7e
> F4: ESC 5b 31 31 7e
> F5: ee 80 85
> F6: ee 80 86

For F1-F4, I get 'ee 80 81..84'.

> What is wrong with this, apart from the fact that it is completely
> random and non-standard and uses five bytes in some cases where two or

It isn't random here. I've asked Tao if they can think of any
explanations for the different behaviour.

> three would suffice? The byte ee is being used to introduce a
> multi-byte sequence of variable length. In other words, it is an

No, it isn't...

> escape character. That is the job of ESC. I don't see any need to create
> a new escape character. Besides, it seems like a bad idea to just pick a
> number greater than 128 and use it as an escape character. Nowadays
> those bytes may have other meanings. In the Latin-1 encoding the
> byte ee is used for the circumflex i. If you randomly decide to use

As I was saying, it isn't an escape character, and it isn't meant to be
interpreted as a Latin-1 character either.

The key to this is the file <file:/usr/doc/intent/dev/keyboard/api.html>.
At the bottom, it says:

"By convention for raw keyboard devices, this integer is an Elate raw
keycode. These are defined in dev/keyboard/keyboard.inc. Bit 31 of this
raw keycode is set if the packet represents the release of the key."

And now the important part:

"For the cooking device this keycode is a valid Unicode character. Keys
which are not representable by Unicode characters (such as the cursor
keys) have codes defined in the user space of Unicode. The Elate codes
are between 0xE000 and 0xEFFF."

So what you're seeing are Unicode characters. How are they encoded,
though? The answer is on page 56 of the SDK manual: UTF-8. ASCII codes
(0-127) are encoded as they are. Codes between 128 and 2047 are encoded
in two bytes, %110xxxxx %10xxxxxx. Codes between 2048 and 65535 are
encoded in three bytes, %1110xxxx %10xxxxxx %10xxxxxx. After some masking
and shifting, one can work out that ee 80 83 is actually the character
U+E003 ($e003).
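
For anyone who wants to check the masking and shifting by hand, here is a
minimal sketch in Python (not Elate code). The function name is made up
for illustration, and it only handles the three-byte form described
above; it does not reject overlong encodings or surrogates, which a real
decoder should.

```python
def decode_utf8_3byte(b0, b1, b2):
    """Decode one three-byte UTF-8 sequence: %1110xxxx %10xxxxxx %10xxxxxx."""
    # Check the fixed marker bits of the lead byte and both continuation bytes.
    if (b0 & 0xF0) != 0xE0 or (b1 & 0xC0) != 0x80 or (b2 & 0xC0) != 0x80:
        raise ValueError("not a three-byte UTF-8 sequence")
    # Keep the payload bits (4 + 6 + 6) and shift them into place.
    return ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)


code = decode_utf8_3byte(0xEE, 0x80, 0x83)
print(hex(code))  # 0xe003
```

Python's own decoder agrees: bytes([0xEE, 0x80, 0x83]).decode("utf-8")
yields the single character U+E003.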

The list of keys and their associated custom Unicode characters is in
/dev/keyboard/keyboard.h (and keyboard.inc).

The so-called user space of Unicode (the standard calls it the Private
Use Area) is a range set aside by the standard for applications to use.
You can associate any of those numbers with whatever you want, as long as
that association and those numbers are used only inside programs. You
shouldn't store or interchange data that contains them, because other
applications will have no idea what they're supposed to mean.
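
As a rough illustration of keeping those codes inside the program, the
following Python sketch drops anything in the Elate range before the text
would be stored. The range comes from the api.html passage quoted above;
the function names are hypothetical and not part of any Elate API.

```python
# Range quoted from api.html above; the names below are illustrative only.
ELATE_FIRST, ELATE_LAST = 0xE000, 0xEFFF


def is_elate_key(code):
    """True if the code point lies in the Elate key range."""
    return ELATE_FIRST <= code <= ELATE_LAST


def strip_key_codes(text):
    """Remove Elate key codes so they never reach stored or exchanged data."""
    return "".join(ch for ch in text if not is_elate_key(ord(ch)))


# 'a', then F5 (ee 80 85), then i-circumflex, which is c3 ae in UTF-8.
raw = bytes([0x61, 0xEE, 0x80, 0x85, 0xC3, 0xAE]).decode("utf-8")
clean = strip_key_codes(raw)
print([hex(ord(ch)) for ch in clean])  # ['0x61', '0xee']
```

Note that the circumflex i survives: in UTF-8 it is the two bytes c3 ae,
so it never collides with the ee lead byte of the key codes.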

So, once you have your UTF-8 character/string, you should call writeascii
or writeutf8 (check /usr/doc/intent/dev/display/api.html).

> ee as an escape character then you have removed the circumflex i from
> somebody's alphabet. That seems like a dumb thing to do.

Nobody did that. :)

> AMIGA folks: Please have a look at this. At least explain the reasoning
> behind this setup.

I hope the above resolves your doubts. It goes without saying that more
documentation is needed, though.



"Without deviation from the norm, progress is not possible." (F. Zappa)
Rudi Chiarito SGML/XML, user interface, i18n Amiga Inc.
rudi@amiga.com http://amiga.com/