HarfBuzz's level 2 cluster behavior uses a significantly different model than that of level 0 and level 1.
The level 2 behavior is easy to describe, but it may be difficult to understand in practical terms. In brief, level 2 performs no merging of clusters whatsoever.
When glyphs form a ligature (or when some other feature substitutes multiple glyphs with one glyph), the cluster value of the first glyph is retained as the cluster value for the ligature. However, no subsequent clusters — including marks and modifiers — are affected.
Level 2 cluster behavior is less complex than level 0 or level 1, but there are a few cases in which processing cluster values produced at level 2 may be tricky.
The first example of how HarfBuzz's level 2 cluster behavior can be tricky is when the text to be shaped includes combining marks attached to ligatures.
Let us start with an input sequence with the following characters (top row) and initial cluster values (bottom row):
    A,acute,B,breve,C,circumflex
    0,1    ,2,3    ,4,5
If the sequence A,B,C forms a ligature, then these are the cluster values HarfBuzz will return under the various cluster levels:
Level 0:
    ABC,acute,breve,circumflex
    0  ,0    ,0    ,0
Level 1:
    ABC,acute,breve,circumflex
    0  ,0    ,0    ,5
Level 2:
    ABC,acute,breve,circumflex
    0  ,1    ,3    ,5
Of the three, the level 2 result is the hardest for a client program to interpret, because nothing in the cluster values indicates that B and C formed a ligature with A.
In contrast, the "merged" cluster values of the mark glyphs that are seen in the level 0 and level 1 output are evidence that a ligature substitution took place.
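The three results above can be reproduced with a small toy model. The following Python sketch is not HarfBuzz code and does not call the HarfBuzz API; the function `shape_ligature` and its parameters are hypothetical names chosen for illustration. It assumes that level 0 first groups each mark with its preceding base, that levels 0 and 1 merge the whole range spanned by the ligature components (plus any glyphs already sharing the last component's cluster), and that level 2 merges nothing.

```python
def shape_ligature(chars, marks, ligature_name, level):
    """Toy model: ligate every base character in `chars` into one glyph.

    Returns a list of (glyph_name, cluster) pairs. This is an
    illustrative simplification, not HarfBuzz's implementation.
    """
    # Initial cluster values: one per input character.
    clusters = list(range(len(chars)))

    if level == 0:
        # Assumption: level 0 groups each mark with its preceding base
        # before any substitution takes place.
        for i, ch in enumerate(chars):
            if ch in marks and i > 0:
                clusters[i] = clusters[i - 1]

    base_idx = [i for i, ch in enumerate(chars) if ch not in marks]
    first, last = base_idx[0], base_idx[-1]

    if level in (0, 1):
        # Merge the range from the first to the last component, extended
        # over glyphs that already share the last component's cluster.
        end = last + 1
        while end < len(chars) and clusters[end] == clusters[end - 1]:
            end += 1
        merged = min(clusters[first:end])
        for i in range(first, end):
            clusters[i] = merged
    # Level 2: no merging; every glyph keeps its own cluster value.

    # The ligature takes the first component's cluster; marks follow it.
    out = [(ligature_name, clusters[first])]
    out += [(ch, clusters[i]) for i, ch in enumerate(chars) if ch in marks]
    return out


chars = ["A", "acute", "B", "breve", "C", "circumflex"]
marks = {"acute", "breve", "circumflex"}
for level in (0, 1, 2):
    print(level, shape_ligature(chars, marks, "ABC", level))
```

In this model, only the level 2 output leaves the marks at 1, 3 and 5, which is why the cluster values alone no longer reveal that a ligature was formed.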
Another example of how HarfBuzz's level 2 cluster behavior can be tricky is when glyphs reorder. Consider an input sequence with the following characters (top row) and initial cluster values (bottom row):
    A,B,C,D,E
    0,1,2,3,4
Now imagine D moves before B in a reordering operation. The cluster values will then be:
    A,D,B,C,E
    0,3,1,2,4
Next, if D forms a ligature with B, the output is:
    A,DB,C,E
    0,3 ,2,4
However, in a different scenario, in which the shaping rules of the script instead caused A and B to form a ligature before D was reordered, the result would be:
    AB,D,C,E
    0 ,3,2,4
There is no way for a client program to differentiate between these two scenarios based on the cluster values alone. Consequently, client programs that use level 2 might need to undertake additional work in order to manage cursor positioning, text attributes, or other desired features.
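The ambiguity can be seen by replaying both scenarios in a toy model. The Python sketch below is an illustration only, not HarfBuzz code; the helpers `move` and `ligate` are hypothetical and simply apply the level 2 rule described earlier (a ligature keeps the cluster value of its first glyph, and nothing else changes).

```python
def move(glyphs, src, dst):
    """Reorder: take the glyph at index `src` and reinsert it at `dst`."""
    g = list(glyphs)
    g.insert(dst, g.pop(src))
    return g


def ligate(glyphs, start, count):
    """Level 2 rule: the ligature keeps the first glyph's cluster value."""
    name = "".join(n for n, _ in glyphs[start:start + count])
    cluster = glyphs[start][1]
    return glyphs[:start] + [(name, cluster)] + glyphs[start + count:]


initial = [("A", 0), ("B", 1), ("C", 2), ("D", 3), ("E", 4)]

# Scenario 1: D moves before B, then D and B form a ligature.
scenario_1 = ligate(move(initial, 3, 1), 1, 2)

# Scenario 2: A and B form a ligature, then D moves before C.
scenario_2 = move(ligate(initial, 0, 2), 2, 1)

clusters_1 = [c for _, c in scenario_1]
clusters_2 = [c for _, c in scenario_2]

print(scenario_1)
print(scenario_2)
print(clusters_1 == clusters_2)
```

The glyph sequences differ (A,DB,C,E versus AB,D,C,E), but the cluster arrays are the same, so a client that needs to tell the two cases apart has to consult something other than the cluster values, such as the glyphs themselves.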
Other problems may arise with ligatures under level 2, for instance when the direction of the text is forced to the opposite of its natural direction (for example, left-to-right Arabic). Generally speaking, however, these scenarios are minor corner cases that are too obscure for most client programs to need to worry about.