[links-list] Re: Display of SGML Greek characters

Steve White swhite at zipcon.net
Mon Aug 12 03:56:11 PDT 2002


On Mon, 12 Aug 2002 02:12:50 -0400 (EDT),
Kragen Sitaker <kragen at pobox.com> wrote:
>Have you considered using code points in the Private Use Area?  You
>only need, what, 44?  U+E000 to U+F8FF is a 16-bit private use area
>for situations just like this one; this is probably "corporate use",
>so it should probably be somewhere near the top of this area.
>
I just figured out why not.

As I understand it, the internal issue isn't whether the scalar value
(such as U+E000) fits in 16 bits, but whether its UTF-8 encoding
fits in 16 bits.  Have a look at page 47 of
<http://www.unicode.org/unicode/uni2book/ch03.pdf>

The UTF-8 representation of U+E000 is
	11101110 10000000 10000000
which is three bytes (24 bits), not 16.
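
The bit pattern is easy to double-check. Here is a minimal sketch in
Python; the helper names utf8_bits and utf8_length are made up for
illustration and are not part of Links. It also shows where the
two-byte limit falls: U+07FF is the last code point whose UTF-8 form
fits in two bytes.

```python
def utf8_bits(code_point: int) -> str:
    """Return the UTF-8 encoding of a code point as binary octets."""
    encoded = chr(code_point).encode("utf-8")
    return " ".join(f"{byte:08b}" for byte in encoded)


def utf8_length(code_point: int) -> int:
    """Return the number of bytes in the UTF-8 encoding of a code point."""
    return len(chr(code_point).encode("utf-8"))


# First code point of the Private Use Area: three bytes, i.e. 24 bits.
print(utf8_bits(0xE000))        # 11101110 10000000 10000000
print(utf8_length(0xE000) * 8)  # 24

# The two-byte boundary: U+07FF needs 2 bytes, U+0800 needs 3.
print(utf8_length(0x07FF), utf8_length(0x0800))  # 2 3
```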

Are there any other holes in lower Unicode (below U+0800, where the
UTF-8 form is at most two bytes) that could be exploited?
How badly do we want this?
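
One rough way to look for such holes is sketched below. It assumes
"lower Unicode" means code points whose UTF-8 form is two bytes
(U+0080 to U+07FF) and that a "hole" means a code point that the
Unicode database bundled with your Python marks as unassigned
(category Cn). The function name is hypothetical. The result depends
on the Unicode version Python ships with, and unassigned code points
may be assigned by later versions of the standard.

```python
import unicodedata


def unassigned_two_byte_code_points() -> list[int]:
    """List code points in U+0080..U+07FF with general category 'Cn'.

    Every code point in this range encodes to exactly two UTF-8 bytes.
    'Cn' means "not assigned" in the Unicode database of this Python.
    """
    return [
        cp
        for cp in range(0x80, 0x800)
        if unicodedata.category(chr(cp)) == "Cn"
    ]


holes = unassigned_two_byte_code_points()
print(unicodedata.unidata_version)           # Unicode version consulted
print(len(holes))                            # how many holes were found
print([f"U+{cp:04X}" for cp in holes[:10]])  # the first few, if any
```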

