RE: [Snowball-discuss] Dutch stemmer: undouble "nn", "mm", "ff"?

From: Edwin de Jonge (ejne@rnd.vb.cbs.nl)
Date: Wed Jan 07 2004 - 07:56:02 GMT


Hi Martin,

Martin Porter wrote:
> I may have to let you pursue your study of the Dutch stemmer
> for some time before getting involved myself, as I have a
> number of things on the go. Would that be okay by you?
That is fine with me.

> What I could do finally is offer a rerelease of the stemmer, or
> perhaps support two stemmers, simple and advanced.
I like this idea: the advanced one will be slower (and more complex)
than the simple one, so users can choose.

> I do have the Kraaij-Pohlmann stemmer in Snowball (exact
> except for a handful of words in a large vocabulary). They
> created a complex piece of work, not always easy to
> understand, but it does tackle the points you have been
> raising in your emails, as well as I can recall. Would it be
> useful to you if I put up my version on the Snowball website?
I'm very interested in the Kraaij-Pohlmann stemmer; I think it would be
useful to me.

> If the Dutch stemmer is redone, I'll need to find a new
> vocabulary (mine doesn't have apostrophes in words, so stemming
> 'tje etc can't be demonstrated).

I will try to help you find/create such a vocabulary.

Edwin


