I have added an exception case to the French stemmer to prevent -is removal
from colis, tapis, paris.
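
As a rough illustration of what such an exception case amounts to, here is a minimal sketch in Python rather than Snowball. It is not the actual French stemmer: the function name `strip_is`, the length check, and the single toy -is rule are assumptions made for the example. Only the three exception words come from the change described above.

```python
# Hypothetical illustration only: this is NOT the Snowball French stemmer.
# The three words are the ones named in the message; everything else is assumed.
EXCEPTIONS = frozenset({"colis", "tapis", "paris"})


def strip_is(word: str) -> str:
    """Toy rule: remove a trailing -is unless the word is a listed exception."""
    word = word.lower()
    if word in EXCEPTIONS:
        # Exceptional forms are returned unchanged, before any suffix rule runs.
        return word
    if word.endswith("is") and len(word) > 4:
        # Assumed minimum length so very short words are left alone.
        return word[:-2]
    return word
```

The exception lookup runs before the suffix rule, so strip_is("colis") returns "colis" unchanged, while a word outside the list still goes through the toy -is rule.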
Unfortunately, we have received no notices of stemming howlers for languages
other than English. (Apart from one for Russian way back, which I will try to
resurrect.) The experience with English suggests that some exceptional forms
are needed, but that in practice only a small number is required. So similar
notifications for other languages would be useful.
(Important exceptions should be discovered through real IR use, like the
arsenic/arsenal example of the previous email.)
Martin