27.19 Charsets
Emacs groups all supported characters into disjoint charsets.
Each character code belongs to one and only one charset. For
historical reasons, Emacs typically divides an 8-bit character code
for an extended version of ASCII into two charsets:
ASCII, which covers the codes 0 through 127, plus another
charset which covers the “right-hand part” (the codes 128 and up).
For instance, the characters of Latin-1 include the Emacs charset
ascii plus the Emacs charset latin-iso8859-1.
Emacs characters belonging to different charsets may look the same,
but they are still different characters. For example, the letter
‘o’ with acute accent in charset latin-iso8859-1, used for
Latin-1, is different from the letter ‘o’ with acute accent in
charset latin-iso8859-2, used for Latin-2.
There are two commands for obtaining information about Emacs charsets. The command M-x list-charset-chars prompts for the name of a charset and displays all the characters in that charset. The command M-x describe-character-set prompts for a charset name and displays information about that charset, including its internal representation within Emacs.
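Both commands can also be called from Lisp, for example with M-: or in the *scratch* buffer. The following is a minimal sketch, assuming that the commands accept a charset symbol when called non-interactively and that the charset ascii is defined in your Emacs; the set of other charset names varies between Emacs versions.

```elisp
;; Display every character in the charset `ascii'
;; (what M-x list-charset-chars does after prompting).
(list-charset-chars 'ascii)

;; Show a description of the same charset in a help buffer
;; (what M-x describe-character-set does after prompting).
(describe-character-set 'ascii)

;; Check whether a symbol names a charset before passing it on.
;; Returns non-nil only if the charset is defined.
(charsetp 'latin-iso8859-1)
```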
To find out which charset a character in the buffer belongs to, put point before it and type C-u C-x =.
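The charset of a character can also be queried from Lisp with char-charset. The following is a minimal sketch; the command name my-charset-at-point is hypothetical and chosen only for illustration, and the charset reported for a non-ASCII character may depend on your Emacs version and configuration.

```elisp
(defun my-charset-at-point ()
  "Echo the charset of the character after point.
The name `my-charset-at-point' is hypothetical, used for illustration."
  (interactive)
  (let ((ch (char-after)))              ; nil when point is at end of buffer
    (if ch
        (message "%c belongs to charset %s" ch (char-charset ch))
      (message "No character after point"))))
```

After evaluating the definition, put point before a character and type M-x my-charset-at-point. C-u C-x = remains the better choice when you want the full description of the character rather than just the charset name.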