Wed Nov 21 2001


I've added 'cosmos' and 'atlas' to porter2 as exceptions, just to show how
amenable I am. The -ive endings are a different matter: the word pairs share
polysemic forms, and the question is whether the overlap is small enough to
warrant separating the words (see the introductory paper). To solve this
problem properly one should use the stemmer in conjunction with a
dictionary. Anyway, there are many worse examples: I discover that
'combative' stems to 'comb'!

As I said, you can always add pet exceptions to a private version.
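
For what it's worth, here is a rough sketch of what a private exception list
might look like if you wrap the stemmer from outside rather than editing the
algorithm itself. It is in Python, not Snowball. Everything in it is an
assumption made for illustration: `with_exceptions`, `toy_stem` and
`PRIVATE_EXCEPTIONS` are hypothetical names, `toy_stem` is a crude stand-in
and not porter2, and mapping 'combative' to 'combat' is just one choice a
private version might make.

```python
from typing import Callable, Dict


def with_exceptions(stem: Callable[[str], str],
                    exceptions: Dict[str, str]) -> Callable[[str], str]:
    """Return a stemmer that consults an exception table before `stem`."""
    def wrapped(word: str) -> str:
        key = word.lower()
        # A listed word bypasses the algorithm entirely.
        if key in exceptions:
            return exceptions[key]
        return stem(word)
    return wrapped


def toy_stem(word: str) -> str:
    """Crude stand-in stemmer (NOT porter2): strips one of two suffixes."""
    w = word.lower()
    for suffix in ("ative", "s"):
        # Only strip when at least three letters would remain.
        if w.endswith(suffix) and len(w) > len(suffix) + 2:
            return w[:-len(suffix)]
    return w


# Hypothetical private exception list: word -> stem to force.
PRIVATE_EXCEPTIONS = {
    "cosmos": "cosmos",
    "atlas": "atlas",
    "combative": "combat",
}

stem = with_exceptions(toy_stem, PRIVATE_EXCEPTIONS)
```

Words in the table come back exactly as listed, and everything else falls
through to the underlying stemmer, so the list can grow without touching
the rules.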

Incidentally, what was the context in which you noticed these weaknesses of
the stemmer? I'd be interested to know.


