
The distinction between levels 0 and 1

The preceding examples demonstrate the main effects of using cluster levels 0 and 1. The only difference between the two levels is this: in level 0, at the very beginning of the shaping process, HarfBuzz merges the cluster of each base character with the clusters of all Unicode marks (combining or not) that follow it.

For example, let us start with the following character sequence (top row) and accompanying initial cluster values (bottom row):

      A,acute,B
      0,1    ,2
    

The acute is a Unicode mark. If HarfBuzz is using cluster level 0 on this sequence, then the A and acute clusters will merge, and the result will be:

      A,acute,B
      0,0    ,2
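
The initial merging step can be modeled in a few lines. The following is a minimal sketch, not HarfBuzz code: `initial_clusters` is a hypothetical helper, and it assumes that "Unicode mark" means any character whose general category starts with `M` and that the initial cluster value of each character is its index in the string.

```python
import unicodedata


def initial_clusters(text, level):
    """Return one initial cluster value per character (hypothetical model).

    Level 1: every character keeps its own index as its cluster value.
    Level 0: each Unicode mark (general category Mn, Mc, or Me) takes the
    cluster value of the character before it, so a base character and all
    marks that follow it share one cluster.
    """
    clusters = list(range(len(text)))
    if level == 0:
        for i in range(1, len(text)):
            if unicodedata.category(text[i]).startswith("M"):
                clusters[i] = clusters[i - 1]
    return clusters


text = "A\u0301B"  # A, combining acute accent, B
print(initial_clusters(text, 0))  # [0, 0, 2]
print(initial_clusters(text, 1))  # [0, 1, 2]
```

Because each mark copies the value of its predecessor, a run of several marks after one base character ends up in that base character's cluster.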
    

This initial cluster merging is the default behavior of the Windows shaping engine, and the old HarfBuzz codebase copied that behavior to maintain compatibility. Consequently, it has remained the default behavior in the new HarfBuzz codebase.

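But this initial cluster-merging behavior makes it impossible to color diacritic marks differently from their base characters: once a mark shares a cluster value with its base, the client can no longer tell which glyph came from which character. That is why, in level 1, HarfBuzz does not perform the initial merging step.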
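
To see why the merge gets in the way of coloring, consider a client that looks up each glyph's color through its cluster value. This is an illustrative sketch only: `glyph_colors` and the per-character color list are hypothetical names, and it assumes one glyph per character, as in the A, acute, B example above.

```python
def glyph_colors(glyph_clusters, char_colors):
    """Pick a color for each glyph from the character its cluster points to."""
    return [char_colors[cluster] for cluster in glyph_clusters]


# The client wants the acute (character index 1) drawn in red.
char_colors = ["black", "red", "black"]

# Level 0: the acute glyph reports cluster 0, so it inherits the color of A.
print(glyph_colors([0, 0, 2], char_colors))  # ['black', 'black', 'black']

# Level 1: the acute glyph keeps cluster 1 and can be colored on its own.
print(glyph_colors([0, 1, 2], char_colors))  # ['black', 'red', 'black']
```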

For client programs that rely on HarfBuzz cluster values to perform cursor positioning, level 0 is more convenient. But relying on cluster boundaries for cursor positioning is wrong: cursor positions should be determined based on Unicode grapheme boundaries, not on shaping-cluster boundaries. As such, level 1 clusters are preferred.

One last note about levels 0 and 1. HarfBuzz currently does not allow a MultipleSubst lookup to replace a glyph with zero glyphs (in other words, to delete a glyph). But, in some other situations, glyphs can be deleted. In those cases, if the glyph being deleted is the last glyph of its cluster, HarfBuzz makes sure to merge the cluster with a neighboring cluster.

This is done primarily to guarantee that the first cluster of a run always has a cluster value pointing to the start of the text for that run; more than one client currently relies on this guarantee.
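
The effect of this merge can be sketched with a simplified model. This is not the HarfBuzz implementation: `delete_glyph` is a hypothetical helper, and it assumes left-to-right text, non-decreasing cluster values, and that a cluster covers every character from its own value up to the next cluster's value.

```python
def delete_glyph(glyphs, index):
    """Delete glyphs[index] from a list of (name, cluster) pairs.

    Simplified model: if the deleted glyph was the last glyph of its
    cluster, that cluster is merged with a neighbor so its characters
    stay covered.
    """
    _, cluster = glyphs[index]
    remaining = glyphs[:index] + glyphs[index + 1:]

    # Another glyph still carries this cluster value: nothing to merge.
    if any(c == cluster for _, c in remaining):
        return remaining

    # A preceding glyph exists: its cluster value is lower, so it now
    # covers the deleted glyph's characters without any value changing.
    if index > 0 or not remaining:
        return remaining

    # The first glyph was deleted: merge forward, so the following
    # cluster takes over the lower value and still points to the start.
    next_cluster = remaining[0][1]
    return [(n, cluster if c == next_cluster else c) for n, c in remaining]


run = [("a", 0), ("b", 1), ("c", 2)]
print(delete_glyph(run, 0))  # [('b', 0), ('c', 2)]
print(delete_glyph(run, 1))  # [('a', 0), ('c', 2)]
```

In both cases the first remaining glyph reports cluster 0, which is the guarantee described above.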

Incidentally, Apple's CoreText does something else to maintain the same promise: it inserts a glyph with id 65535 at the beginning of the glyph string if the glyph corresponding to the first character in the run was deleted. HarfBuzz might do something similar in the future.
