HTMLGoodies

Internationalization


By Sue Charlesworth

Despite its name, the World Wide Web has had difficulty reaching beyond Western languages and alphabets. Character representation in HTML was largely confined to the ISO 8859-1 (Latin-1) character set. This set contains the letters needed for English, French, Spanish, German, and the Scandinavian languages, but no Greek, Hebrew, Arabic, or Cyrillic characters, among others, and few scientific and mathematical symbols. Latin-1 also makes no provision for marking reading direction.

Part of the problem with Latin-1 is that it simply doesn't have room to handle all the alphabets and languages of the world. It is an 8-bit, single-byte coded graphic character set and, as such, can represent only up to 256 characters.
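The one-byte limit is easy to see in practice. The following is a minimal Python sketch, not part of the original text: the helper name fits_latin1 is hypothetical, and it relies on Python's built-in "latin-1" codec as a stand-in for ISO 8859-1.

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character in text has a Latin-1 code (0-255)."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# One byte per character: the encoded length equals the character count.
french = "café"
assert len(french.encode("latin-1")) == len(french)

print(fits_latin1("café"))      # Western European text fits in Latin-1
print(fits_latin1("Ελληνικά"))  # Greek letters have no Latin-1 code
```

Any character outside the 256 available codes simply cannot be written in a Latin-1 document without some escape mechanism.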

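Enter Unicode. Unicode is a character-encoding standard originally built around a 16-bit character set, which raises the number of available code positions to 65,536. Since version 2.0, Unicode also defines surrogate pairs, which combine two 16-bit values to reach more than a million code points.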
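To make the 16-bit idea concrete, here is a small Python sketch. It is an illustration under assumptions: the sample characters and the helper utf16_units are chosen for this example, and Python's "utf-16-be" codec is used as the 16-bit representation.

```python
CODE_POSITIONS_16_BIT = 2 ** 16  # number of distinct 16-bit values

# One sample letter from scripts that Latin-1 cannot represent.
samples = {"Greek": "Ω", "Hebrew": "א", "Cyrillic": "Я"}


def utf16_units(ch: str) -> int:
    """Number of 16-bit code units needed for ch in UTF-16."""
    return len(ch.encode("utf-16-be")) // 2


print(CODE_POSITIONS_16_BIT)
for script, ch in samples.items():
    # Show the script name, the code point in U+XXXX form, and the unit count.
    print(script, f"U+{ord(ch):04X}", utf16_units(ch))
```

Each of these letters fits in a single 16-bit unit, which is exactly what an 8-bit set like Latin-1 has no room for.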

HTML 4.0 uses the Universal Character Set (UCS), defined in ISO/IEC 10646, as its document character set. UCS is character-by-character equivalent to Unicode 2.0.
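One practical consequence, not spelled out in the original text, is that a numeric character reference in HTML names a UCS code point regardless of how the file itself is encoded. The sketch below assumes Python's standard html module and the "xmlcharrefreplace" error handler; the helper name to_char_ref is hypothetical.

```python
import html


def to_char_ref(text: str) -> str:
    """Replace each non-ASCII character with a decimal reference to its code point."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


encoded = to_char_ref("Ω")        # reference built from the code point of omega
decoded = html.unescape(encoded)  # a parser maps the reference back to the character
print(encoded, decoded)
```

A page saved as plain ASCII or Latin-1 can therefore still display Greek, Hebrew, or Cyrillic text by referring to characters by number.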

from Special Edition Using HTML 4: Appendix A
What's New in HTML 4.0

© Copyright Macmillan Computer Publishing. All rights reserved.
