I've added 'cosmos' and 'atlas' to porter2 as exceptions, just to show how
amenable I am. The -ive endings are a different matter: the word pairs share
polysemic forms, and the question is whether the shared meaning is small
enough to warrant separating the words (see the introductory paper). To
solve this problem one should be using the stemmer in conjunction with a
dictionary. Anyway, there are many worse examples: I discover that
'combative' stems to 'comb'!
As I said, you can always add pet exceptions to a private version.
Incidentally, what was the context in which you noticed these weaknesses of
the stemmer? I'd be interested to know.
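
For anyone wanting to try the private-exceptions route, here is a minimal
sketch in Python. It is only an illustration, not how porter2 itself stores
its exceptions: with_exceptions, toy_stem and PRIVATE_EXCEPTIONS are
hypothetical names, toy_stem is a crude stand-in for a real stemmer, and
mapping 'combative' to 'combat' is just one choice a private list might make.

```python
from typing import Callable, Dict


def with_exceptions(stem: Callable[[str], str],
                    exceptions: Dict[str, str]) -> Callable[[str], str]:
    """Wrap a stemming function so listed words bypass it entirely."""
    def stem_word(word: str) -> str:
        key = word.lower()
        # Exceptions are consulted first; everything else goes to the stemmer.
        if key in exceptions:
            return exceptions[key]
        return stem(key)
    return stem_word


def toy_stem(word: str) -> str:
    """Stand-in stemmer (NOT porter2): strips a trailing 'ative' or 's'."""
    for suffix in ("ative", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[:-len(suffix)]
    return word


# Hypothetical private list: two words left unchanged, one redirected.
PRIVATE_EXCEPTIONS = {
    "cosmos": "cosmos",
    "atlas": "atlas",
    "combative": "combat",
}

stem = with_exceptions(toy_stem, PRIVATE_EXCEPTIONS)

for w in ("cosmos", "atlas", "combative", "cats"):
    print(w, "->", stem(w))
```

The same wrapper should work around any function that maps a word to a
stem, so the exception list can live outside the stemmer and be maintained
separately.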
Snowball-discuss mailing list
This archive was generated by hypermail 2.1.3 : Thu Sep 20 2007 - 12:02:40 BST