info aspell

C.1 Compound Words

In some languages, such as German, it is acceptable to string two words together, thus forming a compound word. However, there are rules for when this can be done. Furthermore, it is not always sufficient to simply concatenate the two words. For example, sometimes a letter is inserted between the two words. Aspell currently has support for unconditionally stringing words together. I tried implementing more sophisticated support for compound words in Aspell, but it was too limiting and no one used it.

After receiving feedback from several people, it seems that acceptable support for compound words involves two basically independent parts. If this is not sufficient for your language, please let me know.

Part One

Describes how the word needs to be changed when forming a compound.

 
CMP <flag> <strip> <add> <cond> <cond2>

<flag>  is the compound flag
<strip> is the string to strip or 0 for the null string
<add>   is the string to add or 0 for the null string
<cond>  is the condition to match at the end of the current word
<cond2> is the condition to match at the beginning of the next word

All but the last field are the same as in a suffix entry in the existing affix code.
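A minimal Python sketch of how such an entry might be read and applied. This is not Aspell code: the names CompoundRule, parse_cmp and link_form, the flag letters S and E, and the sample words are all hypothetical, and the two conditions are stored but not checked here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompoundRule:
    flag: str
    strip: str
    add: str
    cond: str
    cond2: str


def parse_cmp(line):
    """Parse one 'CMP <flag> <strip> <add> <cond> <cond2>' line."""
    fields = line.split()
    if len(fields) != 6 or fields[0] != "CMP":
        raise ValueError(f"not a CMP entry: {line!r}")
    _, flag, strip, add, cond, cond2 = fields
    # "0" stands for the null string in the strip and add fields.
    strip = "" if strip == "0" else strip
    add = "" if add == "0" else add
    return CompoundRule(flag, strip, add, cond, cond2)


def link_form(rule, word):
    """Return the word as it would look in front of the next compound part.

    Only strip and add are applied; cond and cond2 are ignored in this sketch.
    """
    if rule.strip:
        if not word.endswith(rule.strip):
            raise ValueError(f"{word!r} does not end in {rule.strip!r}")
        word = word[: -len(rule.strip)]
    return word + rule.add


# Hypothetical entries: insert an "s", or drop a final "e".
insert_s = parse_cmp("CMP S 0 s t .")
drop_e = parse_cmp("CMP E e 0 e .")

print(link_form(insert_s, "arbeit") + "zeit")  # arbeitszeit
print(link_form(drop_e, "schule") + "buch")    # schulbuch
```

The words are written in lower case to sidestep capitalization, which the entry format above does not address.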

<cond> is a simplified regular expression. Some examples:

 
. (for anything)
e
[^aeiou]y
[^ey]
[aeiou]y
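The conditions above could be checked roughly as in the following Python sketch. It assumes that a condition consists only of literal characters, `.`, and bracketed sets such as `[aeiou]` or `[^ey]`, each matching exactly one character, and that <cond2> uses the same syntax anchored at the beginning of the next word. The function names are hypothetical and this is not the matcher used by the existing affix code.

```python
def parse_cond(cond):
    """Split a simplified condition into one (negated, chars) pair per character."""
    items = []
    i = 0
    while i < len(cond):
        ch = cond[i]
        if ch == "[":
            end = cond.index("]", i)  # raises ValueError if the set is unclosed
            body = cond[i + 1:end]
            negated = body.startswith("^")
            items.append((negated, set(body[1:] if negated else body)))
            i = end + 1
        elif ch == ".":
            # "not in the empty set" accepts any character
            items.append((True, set()))
            i += 1
        else:
            items.append((False, {ch}))
            i += 1
    return items


def _matches(items, text):
    return len(text) == len(items) and all(
        (c in chars) != negated for (negated, chars), c in zip(items, text)
    )


def matches_end(cond, word):
    """True if cond matches the last characters of word (the <cond> field)."""
    items = parse_cond(cond)
    return len(word) >= len(items) and _matches(items, word[len(word) - len(items):])


def matches_start(cond, word):
    """True if cond matches the first characters of word (the <cond2> field)."""
    items = parse_cond(cond)
    return len(word) >= len(items) and _matches(items, word[:len(items)])


print(matches_end("[^aeiou]y", "happy"))  # True: "p" is not a vowel
print(matches_end("[^aeiou]y", "day"))    # False: "a" is a vowel
print(matches_start("[^ey]", "zeit"))     # True: "z" is neither "e" nor "y"
```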

It does not seem necessary to change the beginning of a word when forming compounds.

Part Two

Describes the position a word can appear in (beginning, middle, or end) and with which words.

To do this, each word can be assigned a category. Each category can then be given a set of rules describing how it can be used in a compound word. For example:

 
A + B: indicates that category A may appear at the beginning of a
  word when followed by a category B word.  When combined it is then
  considered a category B word.
A + C + B: here a C word may only appear between an A word and a B word
A + A + B
A + A
A + A + A
etc.
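One way to picture these rules is as a set of allowed category sequences, as in the Python sketch below. The word list and its category assignments are invented purely for illustration, the function name is hypothetical, and the sketch assumes one category per word. It also assumes that a compound always takes the category of its last part, which the text above states only for the A + B case.

```python
# The rules listed above, written as allowed sequences of categories.
RULES = {
    ("A", "B"),
    ("A", "C", "B"),
    ("A", "A", "B"),
    ("A", "A"),
    ("A", "A", "A"),
}

# Hypothetical lexicon: arbitrary category assignments for illustration only.
CATEGORIES = {"haus": "A", "boot": "B", "bau": "C"}


def compound_category(parts, categories=CATEGORIES, rules=RULES):
    """Return the category of the compound made of parts, or None if not allowed."""
    try:
        sequence = tuple(categories[part] for part in parts)
    except KeyError:
        return None  # a part is not in the lexicon
    if sequence in rules:
        # Assumption: the compound inherits the category of its last part.
        return sequence[-1]
    return None


print(compound_category(["haus", "boot"]))         # B
print(compound_category(["haus", "bau", "boot"]))  # B
print(compound_category(["boot", "haus"]))         # None: no rule starts with B
```

Storing whole sequences keeps the check to a single set lookup. The cost is that every permitted length has to be listed explicitly, as the A + A and A + A + A lines above already do.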

I have not decided whether a word should be allowed to belong to more than one category, since a new category can be created if necessary to mean, for example, words in both category A and category B.

