File:,  Node: Using Symbols,  Next: Character Classes,  Prev: Font Positions,  Up: Using Fonts

5.19.4 Using Symbols

A "glyph" is a graphical representation of a "character".  While a
character is an abstraction of semantic information, a glyph is
something that can be seen on screen or paper.  A character has many
possible representation forms (for example, the character 'A' can be
written in an upright or slanted typeface, producing distinct glyphs).
Sometimes, a sequence of characters maps to a single glyph: this is a
"ligature"--the most common is 'fi'.

   Space characters never become glyphs in GNU 'troff'.  If not
discarded (as when trailing on text lines), they are represented by
horizontal motions in the output.

   A "symbol" is simply a named glyph.  Within 'gtroff', all glyph names
of a particular font are defined in its font file.  If the user requests
a glyph not available in this font, 'gtroff' looks up an ordered list of
"special fonts".  By default, the PostScript output device supports the
two special fonts 'SS' (slanted symbols) and 'S' (symbols) (the former
is looked up before the latter).  Other output devices use different
names for special fonts.  Fonts mounted with the 'fonts' keyword in the
'DESC' file are globally available.  To install additional special fonts
locally (i.e., for a particular font), use the 'fspecial' request.

   Here are the exact rules by which 'gtroff' searches for a given
symbol:

   * If the symbol has been defined with the 'char' request, use it.
     This hides a symbol with the same name in the current font.

   * Check the current font.

   * If the symbol has been defined with the 'fchar' request, use it.

   * Check whether the current font has a font-specific list of special
     fonts; test all fonts in the order of appearance in the last
     'fspecial' call if appropriate.

   * If the symbol has been defined with the 'fschar' request for the
     current font, use it.

   * Check all fonts in the order of appearance in the last 'special'
     call.

   * If the symbol has been defined with the 'schar' request, use it.

   * As a last resort, consult all fonts loaded up to now for special
     fonts and check them, starting with the lowest font number.  This
     can sometimes lead to surprising results since the 'fonts' line in
     the 'DESC' file often contains empty positions, which are filled
     later on.  For example, consider the following:

          fonts 3 0 0 FOO

     This mounts font 'FOO' at font position 3.  We assume that 'FOO' is
     a special font, containing glyph 'foo', and that no font has been
     loaded yet.  The line

          .fspecial BAR BAZ

     makes font 'BAZ' special only if font 'BAR' is active.  We further
     assume that 'BAZ' is really a special font, i.e., the font
     description file contains the 'special' keyword, and that it also
     contains glyph 'foo' with a special shape fitting to font 'BAR'.
     After executing 'fspecial', font 'BAR' is loaded at font
     position 1, and 'BAZ' at position 2.

     We now switch to a new font 'XXX', trying to access glyph 'foo'
     that is assumed to be missing.  There are neither font-specific
     special fonts for 'XXX' nor any other fonts made special with the
     'special' request, so 'gtroff' starts the search for special fonts
     in the list of already mounted fonts, with increasing font
     positions.  Consequently, it finds 'BAZ' before 'FOO' even for
     'XXX', which is not the intended behaviour.
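
   The search order above can be modelled with a short Python sketch.
It is an illustration only, not 'gtroff' code: the 'state' dictionary,
its keys, and the function name 'find_glyph' are hypothetical, and font
'XXX' is assumed to be mounted at position 4.  The sample data
reproduces the 'FOO'/'BAR'/'BAZ' scenario described in the last rule.

```python
def find_glyph(name, font, state):
    """Return (source, font_name) for the first match, or None."""
    # 1. A 'char' definition hides everything else.
    if name in state["char"]:
        return ("char", None)
    # 2. The current font.
    if name in state["fonts"][font]:
        return ("font", font)
    # 3. 'fchar' fallbacks.
    if name in state["fchar"]:
        return ("fchar", None)
    # 4. Font-specific special fonts, in 'fspecial' order.
    for special in state["fspecial"].get(font, []):
        if name in state["fonts"][special]:
            return ("fspecial", special)
    # 5. 'fschar' fallbacks for the current font.
    if name in state["fschar"].get(font, set()):
        return ("fschar", font)
    # 6. Global special fonts, in 'special' order.
    for special in state["special"]:
        if name in state["fonts"][special]:
            return ("special", special)
    # 7. 'schar' fallbacks.
    if name in state["schar"]:
        return ("schar", None)
    # 8. Last resort: mounted special fonts, lowest position first.
    for position in sorted(state["mounted"]):
        mounted = state["mounted"][position]
        if mounted in state["is_special"] and name in state["fonts"][mounted]:
            return ("mounted", mounted)
    return None


# Hypothetical state mirroring the example in the text.
state = {
    "char": set(), "fchar": set(), "schar": set(), "fschar": {},
    "fonts": {"FOO": {"foo"}, "BAR": set(), "BAZ": {"foo"}, "XXX": set()},
    "fspecial": {"BAR": ["BAZ"]},
    "special": [],
    "mounted": {1: "BAR", 2: "BAZ", 3: "FOO", 4: "XXX"},
    "is_special": {"FOO", "BAZ"},
}

print(find_glyph("foo", "XXX", state))  # ('mounted', 'BAZ')
print(find_glyph("foo", "BAR", state))  # ('fspecial', 'BAZ')
```

   With 'XXX' current, the sketch reaches the last-resort step and
finds 'BAZ' at position 2 before 'FOO' at position 3, which is the
surprising result described above.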

   *Note Device and Font Description Files::, and *note Special Fonts::,
for more details.

   The 'groff_char(7)' man page houses a complete list of predefined
special character names, but the availability of any as a glyph is
device- and font-dependent.  For example, say

     man -Tdvi groff_char > groff_char.dvi

to obtain those available with the DVI device and default font
configuration.(1)  (*note Using Symbols-Footnote-1::) If you want to use
an additional macro package to change the fonts used, 'groff' (or
'gtroff') must be run directly.

     groff -Tdvi -mec -man groff_char.7 > groff_char.dvi

   Special character names not listed in 'groff_char(7)' are derived
algorithmically, using a simplified version of the Adobe Glyph List
(AGL) algorithm.  The (frozen) set of names that can't be derived
algorithmically is called the "'groff' glyph list (GGL)".

   * A glyph for Unicode character U+XXXX[X[X]] that is not a composite
     character is named 'uXXXX[X[X]]'.  X must be an uppercase
     hexadecimal digit.  Examples: 'u1234', 'u008E', 'u12DB8'.  The
     largest Unicode value is 0x10FFFF.  There must be at least four 'X'
     digits; if necessary, add leading zeroes (after the 'u').  No zero
     padding is allowed for character codes greater than 0xFFFF.
     Surrogates (i.e., Unicode values greater than 0xFFFF represented
     with character codes from the surrogate area U+D800-U+DFFF) are not
     allowed either.

   * A glyph representing more than a single input character is named

          'u' COMPONENT1 '_' COMPONENT2 '_' COMPONENT3 ...

     Example: 'u0045_0302_0301'.

     For simplicity, all Unicode characters that are composites must be
     maximally decomposed to NFD;(2) (*note Using Symbols-Footnote-2::)
     for example, 'u00CA_0301' is not a valid glyph name since U+00CA
     (LATIN CAPITAL LETTER E WITH CIRCUMFLEX) can be further decomposed
     into U+0045 (LATIN CAPITAL LETTER E) and U+0302 (COMBINING
     CIRCUMFLEX ACCENT).  'u0045_0302_0301' is thus the glyph name for
     U+1EBE (LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND ACUTE).

   * 'groff' itself maintains a table to decompose all algorithmically
     derived glyph names that are composites.  For example, 'u0100'
     (LATIN CAPITAL LETTER A WITH MACRON) is automatically decomposed
     into 'u0041_0304'.  Additionally, a glyph name of the GGL is
     preferred to an algorithmically derived glyph name; 'groff' also
     does this mapping automatically.  Example: The glyph 'u0045_0302'
     is mapped to '^E'.

   * Glyph names of the GGL can't be used in composite glyph names; for
     example, '^E_u0301' is invalid.
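
   The syntactic part of these naming rules can be checked
mechanically.  The following Python sketch is one reading of the rules
above, not code taken from 'groff'; the function names are
hypothetical.  It tests only the shape of a name (digit count, case,
padding, range, surrogates) and does not verify that a composite is
maximally decomposed, so it accepts 'u00CA_0301' even though that name
is invalid for the reason given above.

```python
import re

# Four to six uppercase hexadecimal digits.
_HEX = re.compile(r"[0-9A-F]{4,6}")


def is_valid_component(digits):
    """Check one XXXX[X[X]] component, without the leading 'u'."""
    if not _HEX.fullmatch(digits):
        return False
    # Padding is only used to reach four digits, never beyond.
    if len(digits) > 4 and digits[0] == "0":
        return False
    value = int(digits, 16)
    if value > 0x10FFFF:
        return False
    # Code points in the surrogate area are not allowed.
    if 0xD800 <= value <= 0xDFFF:
        return False
    return True


def is_valid_unicode_glyph_name(name):
    """Check 'uXXXX' and 'uXXXX_XXXX_...' names for well-formedness."""
    if not name.startswith("u"):
        return False
    return all(is_valid_component(part) for part in name[1:].split("_"))


for candidate in ("u1234", "u008E", "u12DB8", "u0045_0302_0301",
                  "u012DB8", "uD800", "u110000", "u12", "u00ca",
                  "^E_u0301"):
    print(candidate, is_valid_unicode_glyph_name(candidate))
```

   The first four candidates are accepted; the remaining six are
rejected for zero padding, a surrogate value, exceeding 0x10FFFF, too
few digits, lowercase digits, and use of a GGL name, respectively.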

 -- Escape sequence: \(nm
 -- Escape sequence: \[name]
 -- Escape sequence: \[base-glyph combining-component ...]
     Typeset a special character NAME (two-character name NM) or a
     composite glyph consisting of BASE-GLYPH overlaid with one or more
     COMBINING-COMPONENTs.  For example, '\[A ho]' is a capital letter
     "A" with a "hook accent" (ogonek).

     There is no special syntax for one-character names--the analogous
     form '\N' would collide with other escape sequences.  However, the
     four escape sequences '\'', '\-', '\_', and '\`', are translated on
     input to the special character escape sequences '\[aa]', '\[-]',
     '\[ul]', and '\[ga]', respectively.

     A special character name of length one is not the same thing as an
     ordinary character: that is, the character 'a' is not the same as
     '\[a]'.

     If NAME is undefined, a warning in category 'char' is produced and
     the escape is ignored.  *Note Warnings::, for information about the
     enablement and suppression of warnings.

     GNU 'troff' resolves '\[...]' with more than a single component as
     follows:

        * Any component that is found in the GGL is converted to the
          'uXXXX' form.

        * Any component 'uXXXX' that is found in the list of
          decomposable glyphs is decomposed.

        * The resulting elements are then concatenated with '_' in
          between, dropping the leading 'u' in all elements but the
          first.

     No check for the existence of any component (similar to the 'tr'
     request) is done.

     Examples:


     '\[A ho]'
          'A' maps to 'u0041', 'ho' maps to 'u02DB', thus the final
          glyph name would be 'u0041_02DB'.  This is not the expected
          result: the ogonek glyph 'ho' is a spacing ogonek, but for a
          proper composite a non-spacing ogonek (U+0328) is necessary.
          Looking into the file 'composite.tmac', one can find
          '.composite ho u0328', which changes the mapping of 'ho' while
          a composite glyph name is constructed, causing the final glyph
          name to be 'u0041_0328'.

     '\[^E u0301]'
     '\[^E aa]'
     '\[E a^ aa]'
     '\[E ^ ']'
          '^E' maps to 'u0045_0302', thus the final glyph name is
          'u0045_0302_0301' in all forms (assuming proper calls of the
          'composite' request).

     It is not possible to define glyphs with names like 'A ho' within a
     'groff' font file.  This is not really a limitation; instead, you
     have to define 'u0041_0328'.
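
     The resolution of a multi-component '\[...]' name can be sketched
     in Python as follows.  This is a simplified model, not the GNU
     'troff' implementation.  The three tables are tiny illustrative
     excerpts: only '.composite ho u0328' is quoted from
     'composite.tmac' above, and the other 'COMPOSITE' entries are
     assumptions chosen so that the examples above come out as stated.
     Applying the 'composite' mapping to every component, before the
     GGL lookup, is likewise an assumption of this sketch.

```python
import re

# Assumed '.composite' mappings (only 'ho' is confirmed by the text).
COMPOSITE = {"ho": "u0328", "aa": "u0301", "a^": "u0302",
             "^": "u0302", "'": "u0301"}
# Excerpt of GGL names and their 'uXXXX' forms.
GGL = {"A": "u0041", "E": "u0045", "ho": "u02DB", "^E": "u0045_0302"}
# Excerpt of the table of decomposable glyphs.
DECOMPOSE = {"u00CA": "u0045_0302", "u0100": "u0041_0304"}

_UNICODE_NAME = re.compile(r"u[0-9A-F]{4,6}(_[0-9A-F]{4,6})*")


def composite_glyph_name(components):
    """Build the final glyph name for '\\[c1 c2 ...]'."""
    if len(components) < 2:
        raise ValueError("need a base glyph and at least one component")
    elements = []
    for name in components:
        name = COMPOSITE.get(name, name)  # strict '.composite' rewriting
        name = GGL.get(name, name)        # GGL name -> 'uXXXX' form
        name = DECOMPOSE.get(name, name)  # decompose if decomposable
        if not _UNICODE_NAME.fullmatch(name):
            raise ValueError("cannot convert %r" % name)
        elements.append(name)
    # Join with '_', keeping a single leading 'u'.
    return "u" + "_".join(element[1:] for element in elements)


print(composite_glyph_name(["A", "ho"]))        # u0041_0328
print(composite_glyph_name(["^E", "u0301"]))    # u0045_0302_0301
print(composite_glyph_name(["E", "a^", "aa"]))  # u0045_0302_0301
```

     Without the 'ho' entry in 'COMPOSITE', the first call would yield
     'u0041_02DB', the spacing-ogonek result that the first example
     above warns about.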

 -- Escape sequence: \C'xxx'
     Typeset the glyph of the special character XXX.  Normally, it is
     more convenient to use '\[XXX]', but '\C' has some advantages: it
     is compatible with AT&T device-independent 'troff' (and therefore
     available in compatibility mode(3) (*note Using
     Symbols-Footnote-3::)) and can interpolate special characters with
     ']' in their names.  The delimiter need not be a neutral
     apostrophe; see *note Delimiters::.

 -- Request: .composite id1 id2
     Map special character name ID1 to ID2 if ID1 is used in '\[...]'
     with more than one component.  See above for examples.  This is a
     strict rewriting of the special character name; no check is
     performed for the existence of a glyph for either.  A set of
     default mappings for many accents can be found in the file
     'composite.tmac', loaded by the default 'troffrc' at startup.

 -- Escape sequence: \N'n'
     Typeset the glyph with code N in the current font ('n' is _not_ the
     input character code).  The number N can be any non-negative
     decimal integer.  Most devices only have glyphs with codes between
     0 and 255; the Unicode output device uses codes in the range
     0-65535.  If the current font does not contain a glyph with that
     code, special fonts are _not_ searched.  The '\N' escape sequence
     can be conveniently used in conjunction with the 'char' request:

          .char \[phone] \f[ZD]\N'37'

     The code of each glyph is given in the fourth column in the font
     description file after the 'charset' command.  It is possible to
     include unnamed glyphs in the font description file by using a name
     of '---'; the '\N' escape sequence is the only way to use these.

     No kerning is applied to glyphs accessed with '\N'.  The delimiter
     need not be a neutral apostrophe; see *note Delimiters::.

   A few escape sequences are also special characters.

 -- Escape sequence: \'
     An escaped neutral apostrophe is a synonym for '\[aa]' (acute
     accent).

 -- Escape sequence: \`
     An escaped grave accent is a synonym for '\[ga]' (grave accent).

 -- Escape sequence: \-
     An escaped hyphen-minus is a synonym for '\[-]' (minus sign).

 -- Escape sequence: \_
     An escaped underscore ("low line") is a synonym for '\[ul]'
     (underrule).  On typesetting devices, the underrule is
     font-invariant and drawn lower than the underscore '_'.

 -- Request: .cflags n c1 c2 ...
     Assign properties encoded by the number N to characters C1, C2, and
     so on.

     Input characters, including special characters introduced by an
     escape, have certain properties associated with them.(4)  (*note
     Using Symbols-Footnote-4::) These properties can be modified with
     this request.  The first argument is the sum of the desired flags
     and the remaining arguments are the characters to be assigned those
     properties.  Spaces between the CN arguments are optional.  Any
     argument CN can be a character class defined with the 'class'
     request rather than an individual character.  *Note Character
     Classes::.

     The non-negative integer N is the sum of any of the following.
     Some combinations are nonsensical, such as '33' (1 + 32).

     '1'
          Recognize the character as ending a sentence if followed by a
          newline or two spaces.  Initially, characters '.?!' have this
          property.

     '2'
          Enable breaks before the character.  A line is not broken at a
          character with this property unless the characters on each
          side both have non-zero hyphenation codes.  This exception can
          be overridden by adding 64.  Initially, no characters have
          this property.

     '4'
          Enable breaks after the character.  A line is not broken at a
          character with this property unless the characters on each
          side both have non-zero hyphenation codes.  This exception can
          be overridden by adding 64.  Initially, characters
          '\-\[hy]\[em]' have this property.

     '8'
          Mark the glyph associated with this character as overlapping
          other instances of itself horizontally.  Initially, characters
          '\[ul]\[rn]\[ru]\[radicalex]\[sqrtex]' have this property.

     '16'
          Mark the glyph associated with this character as overlapping
          other instances of itself vertically.  Initially, the
          character '\[br]' has this property.

     '32'
          Mark the character as transparent for the purpose of
          end-of-sentence recognition.  In other words, an
          end-of-sentence character followed by any number of characters
          with this property is treated as the end of a sentence if
          followed by a newline or two spaces.  This is the same as
          having a zero space factor in TeX.  Initially, characters
          '"')]*\[dg]\[dd]\[rq]\[cq]' have this property.

     '64'
          Ignore hyphenation codes of the surrounding characters.  Use
          this in combination with values 2 and 4 (initially, no
          characters have this property).

          For example, if you need an automatic break point after the
          en-dash in numeric ranges like "3000-5000", insert

               .cflags 68 \[en]

          into your document.  However, this practice can lead to bad
          layout if done thoughtlessly; in most situations, a better
          solution instead of changing the 'cflags' value is to insert
          '\:' right after the hyphen at the places that really need a
          break point.

     The remaining values were implemented for East Asian language
     support; those who use alphabetic scripts exclusively can disregard
     them.

     '128'
          Prohibit a line break before the character, but allow a line
          break after the character.  This works only in combination
          with flags 256 and 512 and has no effect otherwise.
          Initially, no characters have this property.

     '256'
          Prohibit a line break after the character, but allow a line
          break before the character.  This works only in combination
          with flags 128 and 512 and has no effect otherwise.
          Initially, no characters have this property.

     '512'
          Allow line break before or after the character.  This works
          only in combination with flags 128 and 256 and has no effect
          otherwise.  Initially, no characters have this property.

     In contrast to values 2 and 4, the values 128, 256, and 512 work
     pairwise.  If, for example, the left character has value 512, and
     the right character 128, no break will be automatically inserted
     between them.  If we use value 6 instead for the left character, a
     break after the character can't be suppressed since the neighboring
     character on the right doesn't get examined.
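
     Because N is a sum of powers of two, it can be taken apart again
     bit by bit.  The Python sketch below does so; it is an
     illustration, not part of 'groff'.  The function name and the
     short descriptions are hypothetical, and the values attached to
     them follow the cross references in the text above (for instance
     33 = 1 + 32 and 68 = 4 + 64).

```python
# Flag values and abbreviated meanings, lowest bit first.
CFLAGS = {
    1: "ends a sentence",
    2: "break allowed before",
    4: "break allowed after",
    8: "overlaps horizontally",
    16: "overlaps vertically",
    32: "transparent for end-of-sentence recognition",
    64: "ignore surrounding hyphenation codes",
    128: "prohibit break before",
    256: "prohibit break after",
    512: "allow break before or after",
}


def decode_cflags(n):
    """Return the descriptions of all flags summed into N."""
    if n < 0 or n >= 1024:
        raise ValueError("N must be a sum of the values 1 to 512")
    return [text for bit, text in CFLAGS.items() if n & bit]


print(decode_cflags(68))
# ['break allowed after', 'ignore surrounding hyphenation codes']
print(decode_cflags(33))
# ['ends a sentence', 'transparent for end-of-sentence recognition']
```

     The first call decodes the argument of '.cflags 68 \[en]'; the
     second shows why 33 is a nonsensical combination.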

 -- Request: .char c [contents]
 -- Request: .fchar c [contents]
 -- Request: .fschar f c [contents]
 -- Request: .schar c [contents]
     Define a new character or glyph C to be CONTENTS, which can be
     empty.  More precisely, 'char' defines a 'groff' object (or
     redefines an existing one) that is accessed with the name C on
     input, and produces CONTENTS on output.  Every time glyph C needs
     to be printed, CONTENTS is processed in a temporary environment and
     the result is wrapped up into a single object.  Compatibility mode
     is turned off and the escape character is set to '\' while CONTENTS
     is processed.  Any emboldening, constant spacing, or track kerning
     is applied to this object rather than to individual glyphs in
     CONTENTS.

     An object defined by these requests can be used just like a normal
     glyph provided by the output device.  In particular, other
     characters can be translated to it with the 'tr' or 'trin'
     requests; it can be made the leader character with the 'lc'
     request; repeated patterns can be drawn with it using the '\l' and
     '\L' escape sequences; and words containing C can be hyphenated
     correctly if the 'hcode' request is used to give the object a
     hyphenation code.

     There is a special anti-recursion feature: use of the object within
     its own definition is handled like a normal character (not defined
     with 'char').

     The 'tr' and 'trin' requests take precedence if 'char' accesses the
     same symbol.

          .tr XY
          X
              => Y
          .char X Z
          X
              => Y
          .tr XX
          X
              => Z

     The 'fchar' request defines a fallback glyph: 'gtroff' only checks
     for glyphs defined with 'fchar' if it cannot find the glyph in the
     current font.  'gtroff' carries out this test before checking
     special fonts.

     'fschar' defines a fallback glyph for font F: 'gtroff' checks for
     glyphs defined with 'fschar' after the list of fonts declared as
     font-specific special fonts with the 'fspecial' request, but before
     the list of fonts declared as global special fonts with the
     'special' request.

     Finally, the 'schar' request defines a global fallback glyph:
     'gtroff' checks for glyphs defined with 'schar' after the list of
     fonts declared as global special fonts with the 'special' request,
     but before the already mounted special fonts.

     *Note Character Classes::.

 -- Request: .rchar c ...
 -- Request: .rfschar f c ...
     Remove definition of each ordinary or special character C, undoing
     the effect of a 'char', 'fchar', or 'schar' request.  Those
     supplied by font description files cannot be removed.  Spaces and
     tabs may separate C arguments.

     The request 'rfschar' removes glyph definitions defined with
     'fschar' for font F.

