Normalize a word according to configuration specifications and built-in transformations. Every word inserted in the inverted index goes through this function. If a word is rejected (the return value has the WORD_NORMALIZE_NOTOK bit set), it is not inserted in the index. If a word is accepted (the return value has the WORD_NORMALIZE_OK bit set), it is inserted in the index. In addition to these two bits, the return value carries informational bits that describe the processing done on the word. The bit field values and their meanings are as follows:
The word length exceeds the value of the wordlist_maximum_word_length configuration parameter.
The word length is smaller than the value of the wordlist_minimum_word_length configuration parameter.
The word contained capital letters and has been converted to lowercase. This bit is only set if the wordlist_lowercase configuration parameter is true.
The word contains digits and the configuration parameter wordlist_allow_numbers is set to false.
The word contains control characters.
The word is listed in the file pointed to by the wordlist_bad_word_list configuration parameter.
The word is a zero-length string.
At least one character listed in the wordlist_valid_punctuation configuration parameter was removed from the word.
The word does not contain any alphanumeric character.
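The checks listed above can be sketched as a small standalone function. This is only an illustration, not the library's implementation: the `SK_*` constants, their values, the `SketchConfig` struct and its defaults, and the `sketch_normalize` and `sketch_accepted` helpers are all hypothetical. Which conditions reject a word, the order of the checks, and the truncation of over-long words are likewise assumptions made for the example.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical flag values, for illustration only. The library defines its
// own constants; their names and values are not listed in this section.
constexpr unsigned SK_TOOLONG     = 1u << 0;
constexpr unsigned SK_TOOSHORT    = 1u << 1;
constexpr unsigned SK_CAPITAL     = 1u << 2;
constexpr unsigned SK_NUMBER      = 1u << 3;
constexpr unsigned SK_CONTROL     = 1u << 4;
constexpr unsigned SK_BAD         = 1u << 5;
constexpr unsigned SK_NULL        = 1u << 6;
constexpr unsigned SK_PUNCTUATION = 1u << 7;
constexpr unsigned SK_NOALPHA     = 1u << 8;

// Assumption: these conditions reject the word; the rest are informational.
constexpr unsigned SK_NOTOK =
    SK_TOOSHORT | SK_NUMBER | SK_CONTROL | SK_BAD | SK_NULL | SK_NOALPHA;

// Fields mirror the configuration parameters named above; the default
// values are arbitrary choices for this sketch.
struct SketchConfig {
    std::size_t maximum_word_length = 25;
    std::size_t minimum_word_length = 3;
    bool lowercase = true;
    bool allow_numbers = true;
    std::string valid_punctuation = ".-_/!#$%^&'";
    std::set<std::string> bad_words;  // stands in for the bad word list file
};

// Normalizes `word` in place and returns a bit field describing what happened.
unsigned sketch_normalize(std::string& word, const SketchConfig& cfg) {
    if (word.empty()) return SK_NULL;
    unsigned flags = 0;

    if (cfg.lowercase) {
        for (char& c : word) {
            unsigned char u = static_cast<unsigned char>(c);
            if (std::isupper(u)) {
                c = static_cast<char>(std::tolower(u));
                flags |= SK_CAPITAL;
            }
        }
    }

    // Remove every character listed in valid_punctuation.
    std::string kept;
    for (char c : word)
        if (cfg.valid_punctuation.find(c) == std::string::npos) kept += c;
    if (kept.size() != word.size()) {
        flags |= SK_PUNCTUATION;
        word = kept;
    }

    if (word.size() > cfg.maximum_word_length) {
        flags |= SK_TOOLONG;
        word.resize(cfg.maximum_word_length);  // assumption: truncate and keep
    }
    if (word.size() < cfg.minimum_word_length) flags |= SK_TOOSHORT;

    bool has_alnum = false;
    for (char c : word) {
        unsigned char u = static_cast<unsigned char>(c);
        if (std::isdigit(u) && !cfg.allow_numbers) flags |= SK_NUMBER;
        if (std::iscntrl(u)) flags |= SK_CONTROL;
        if (std::isalnum(u)) has_alnum = true;
    }
    if (!has_alnum) flags |= SK_NOALPHA;
    if (cfg.bad_words.count(word) != 0) flags |= SK_BAD;
    return flags;
}

// A word is accepted in this sketch when no rejecting bit is set.
bool sketch_accepted(unsigned flags) { return (flags & SK_NOTOK) == 0; }
```

A caller would test individual bits with `&`, for example `flags & SK_CAPITAL`, to find out whether the word was lowercased before it was stored.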
Returns a string explaining the return flags of the Normalize method.
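One way to turn such a bit field into an explanatory string is a table that pairs each bit with a label. The sketch below is illustrative only: the bit values, the labels, the "GOOD" text for an empty bit field, and the `sketch_status` function are all hypothetical and do not reflect the actual output format of the library.

```cpp
#include <string>

// Pairs one bit of the flag field with a human-readable label.
struct SketchFlagText {
    unsigned bit;
    const char* text;
};

// Hypothetical bits and labels, for illustration only.
static const SketchFlagText kSketchFlagTexts[] = {
    {1u << 0, "TOOLONG"},
    {1u << 1, "TOOSHORT"},
    {1u << 2, "CAPITAL"},
};

// Builds a space-separated list of the labels whose bits are set in `flags`.
std::string sketch_status(unsigned flags) {
    std::string out;
    for (const SketchFlagText& entry : kSketchFlagTexts) {
        if (flags & entry.bit) {
            if (!out.empty()) out += ' ';
            out += entry.text;
        }
    }
    // Assumption: an empty bit field is reported with a fixed word.
    return out.empty() ? std::string("GOOD") : out;
}
```

Keeping the labels in a table means a new flag needs only one new row, with no change to the loop.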