Re: Re[4]: [Snowball-discuss] an inconsistency with Russian stemmer

From: Oleg Bartunov
Date: Sun Nov 18 2001 - 15:35:29 GMT

Good news, Martin !

It's a pity I missed the discussion in the mailing list archive :-)
I'm going to subscribe now. Andrew has raised a very important question
about diminutive forms. What about the opposite (augmentative) forms,
like 'zaichishe'?



PS. btw, Andrew, we could communicate in Russian.

On Sun, 18 Nov 2001, Martin Porter wrote:

> Andrew Aksyonoff has spotted something wrong with the definition of the
> Russian stemmer, so I am putting in a new definition of a slightly modified
> algorithm.
>
> The new definition is shorter and simpler, the snowball script is slightly
> shorter, and I think more natural, and the small number of words in the
> vocabulary which are affected by the change stem better than they did before.
>
> The essential change is that the adjective ending test always precedes the
> verb ending test, which has come about through removal of the 'verbal' test
> where it was done the other way round.
>
> I got a bit concerned about the removal of the reflexive endings not being
> in the context of the preceding ending (si^a is supposed to follow consonant
> and s' to follow vowel), but careful study of the vocabulary suggests that
> it does not matter, or at least does not matter very much, so I am leaving
> that alone.
>
> I will update the website with this change shortly. There are a number of
> other changes to go in so I'm not sure it will be today.
>
> Martin
>
> Andrew, if you are extending your stemmer to include diminutives ('ik',
> 'onok' etc) our stemmer definitions will probably diverge anyway, but it
> would be interesting to hear how you get on. I have tended to avoid endings
> of this type since in Dutch for example diminutives can radically affect
> meaning, in which case one does not want to remove them as part of an IR
> process. I don't know their significance in Russian, although I realise
> diminutives are used a lot with personal names.
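
To illustrate why the order of the ending tests in the quoted message can
matter, here is a minimal Python sketch. It is not the Snowball script and
not the old 'verbal' test: the suffix lists are a tiny hand-picked subset
chosen for illustration, and the names toy_stem and strip_longest are
hypothetical. It only shows that when an adjective ending and a shorter verb
ending both match, whichever test runs first decides the stem, and that the
reflexive ending is stripped without looking at the preceding letter.

```python
# Toy suffix lists: a small hand-picked subset for illustration only,
# NOT the full ending lists of the Snowball Russian stemmer.
REFLEXIVE = ("ся", "сь")
ADJECTIVE = ("ый", "ая", "ое", "ую", "юю")
VERB = ("ить", "ила", "ил", "ю")
NOUN = ("а", "о", "ы", "у")


def strip_longest(word, suffixes):
    """Remove the longest matching suffix; return (new_word, removed_flag)."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        # Require a non-empty remainder so a word is never stripped to nothing.
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)], True
    return word, False


def toy_stem(word, order=(ADJECTIVE, VERB, NOUN)):
    """Strip a reflexive ending, then the first ending class that matches."""
    # The reflexive ending is removed without checking the preceding letter,
    # mirroring the simplification discussed in the quoted message.
    word, _ = strip_longest(word, REFLEXIVE)
    for suffixes in order:
        word, removed = strip_longest(word, suffixes)
        if removed:
            break  # only one ending class is removed in this sketch
    return word


if __name__ == "__main__":
    print(toy_stem("синюю"))                                 # adjective test first
    print(toy_stem("синюю", order=(VERB, ADJECTIVE, NOUN)))  # verb test first
```

With these toy lists, running the adjective test first turns 'синюю' into
'син', while running the verb test first strips only the final 'ю' and
leaves 'синю'.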

Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
phone: +007(095)939-16-83, +007(095)939-23-83
