Difference between revisions of "International Characters Support"

From Octave

Revision as of 01:52, 1 April 2014

ASCII

The first widely used character set was 7-bit ASCII (standardized by ANSI as X3.4), with values ranging from 0 to 127. Being developed for English, it uses the Latin alphabet, but without accented letters and with only basic punctuation signs.
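As a minimal sketch of that limit, the following Python snippet encodes one plain and one accented word with the built-in "ascii" codec. The sample words and variable names are illustrative assumptions, not taken from Octave itself.

```python
# 7-bit ASCII only defines code values 0..127.
plain = "Octave"
accented = "caf\u00e9"  # "café": the last letter is outside ASCII

# Every character of the plain word fits in 7 bits.
ascii_bytes = plain.encode("ascii")
print(list(ascii_bytes))

# The accented word cannot be represented at all.
try:
    accented.encode("ascii")
    encodable = True
except UnicodeEncodeError:
    encodable = False

print("accented word encodable in ASCII:", encodable)
```

The plain word yields only values below 128, while the accented word is rejected by the codec.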

In the 1980s, extensions were provided by 8-bit character tables, in which values 128 to 255 were used to encode the missing characters. But there were so many missing characters that those 128 values were not enough, so a number of maps were defined. For instance, ISO-8859-1 covers Western European languages, with letters for French (é) and Nordic languages (Ø), a few symbols (½), and so on. Typical computer support consisted of loading the appropriate character map early on, after which glyphs were rendered correctly.

The first issue with this approach is conversion. Viewing text in Greek or Cyrillic on a display configured for Western European languages requires switching back and forth between code pages.
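The problem can be sketched in Python, whose standard codecs include the ISO-8859 family. The byte value 0xE9 and the helper name `decode_with` are illustrative assumptions. The same byte is decoded under three different maps, and then a Greek letter is converted to the Western European map.

```python
# One and the same byte, as it might be stored in an 8-bit text file.
raw = bytes([0xE9])

def decode_with(codepage: str, data: bytes) -> str:
    """Hypothetical helper: interpret the bytes under the given character map."""
    return data.decode(codepage)

western = decode_with("iso-8859-1", raw)   # Western European map
greek = decode_with("iso-8859-7", raw)     # Greek map
cyrillic = decode_with("iso-8859-5", raw)  # Cyrillic map

for name, ch in (("western", western), ("greek", greek), ("cyrillic", cyrillic)):
    print(f"{name}: U+{ord(ch):04X}")

# Converting between maps only works for characters that both maps contain.
try:
    greek.encode("iso-8859-1")
    convertible = True
except UnicodeEncodeError:
    convertible = False

print("Greek letter representable in ISO-8859-1:", convertible)
```

The byte alone does not say which character is meant. The reader has to know which map was in effect when the text was written.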

Unicode

Unicode is a standard and an effort to encode the symbols of every writing system existing or having existed on Earth. It currently covers about 109,000 characters from 93 scripts (the figures for Unicode 6.0). Its character repertoire is kept equivalent to that of ISO/IEC 10646.
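A short Python sketch shows what this means in practice: French, Greek and Cyrillic letters each have their own code point and can sit in the same string. The sample letters are illustrative, and UTF-8 is used only as one common byte encoding of Unicode, which this page does not prescribe.

```python
import unicodedata

# One string mixing a French, a Greek and a Cyrillic letter.
mixed = "\u00e9 \u03b9 \u0449"

# Each character has a single code point, independent of any code page.
code_points = [ord(ch) for ch in mixed if ch != " "]
for ch in mixed:
    if ch != " ":
        print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# Encoding to bytes and back loses nothing, whatever the scripts involved.
utf8_bytes = mixed.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")
print("round trip preserved:", round_trip == mixed)
```

No switching between character maps is needed, since all three letters live in one shared code space.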