Re[2]: [Snowball-discuss] question about russian stemmer

From: Oleg Bartunov (oleg@sai.msu.su)
Date: Fri Feb 13 2004 - 20:32:02 GMT


On Fri, 13 Feb 2004, "Yuri" wrote:

> Hello
>
> > The English stemmer gives a scheme for including exceptions, which you might
> > try and adapt to the Russian stemmer if the "Kiev" case was sufficiently
> > important.
>
> I'm not a linguist, I'm just a programmer. The SBL definitions look very
> unfamiliar to me, but I will try to find out where to put exceptions. Maybe
> you can help me with how to add just one exception: stem Kiev => Kiev.
>
> Or, if that is hard, as a workaround I could write my own stemmer subclass
> that looks a word up in a list of exceptions and calls Snowball only if the
> word is not listed there.

I think subclassing our Perl interface would be the best way. There are
too many exceptions, so it would be impractical to modify the Snowball
rules every time somebody complains.
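
A minimal sketch of the exception-lookup wrapper discussed above, written in
Python rather than Perl and not tied to the real Snowball interface. The
names ExceptionStemmer, stem_word and toy_stem are hypothetical, and toy_stem
is only a stand-in that mimics the reported behaviour on transliterated
input ('kiev' -> 'ki', 'kieva' -> 'kiev'); it is not the Russian stemmer.

```python
from typing import Callable, Dict, Optional


class ExceptionStemmer:
    """Wraps any stemming function and consults an exception table first."""

    def __init__(self, stem: Callable[[str], str],
                 exceptions: Optional[Dict[str, str]] = None) -> None:
        self._stem = stem
        # Keys are lower-cased so that lookups are case-insensitive.
        self._exceptions = {k.lower(): v for k, v in (exceptions or {}).items()}

    def stem_word(self, word: str) -> str:
        key = word.lower()
        # Listed words bypass the underlying stemmer entirely.
        if key in self._exceptions:
            return self._exceptions[key]
        return self._stem(key)


def toy_stem(word: str) -> str:
    """Hypothetical stand-in for the real stemmer (NOT the Snowball rules)."""
    for suffix in ("ev", "a"):
        if word.endswith(suffix):
            return word[: -len(suffix)]
    return word


# Without the wrapper the two forms get different stems.
assert toy_stem("kiev") != toy_stem("kieva")

# With one exception listed, both forms index under the same stem.
stemmer = ExceptionStemmer(toy_stem, {"Kiev": "kiev"})
assert stemmer.stem_word("Kiev") == stemmer.stem_word("Kieva")
```

The exception table stays outside the stemming rules, so it can grow per
application without anyone touching the Snowball definitions. In a real
setup the stand-in would be replaced by a call into the actual stemmer.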

>
> > You must of course realise that the stemmers are not 100% accurate, and a
> > certain rate of error is inevitable. These errors do not necessarily degrade
> > retrieval performance however (see the Introduction to Snowball).
> >
> > Are there many other words that mis-stem in a similar way?
>
> No, this was the first and only problem (at least for now).
> I'm writing a search engine which indexes every word in the text.
> I noticed that when I search for "Kieva" (Kiev's, or 'of Kiev' in English),
> my search engine does not find text containing the word "Kiev".
>
> When I started looking for the error, I found that the stemmer
> stems 'Kiev' as 'Ki', so stem('Kiev') != stem('Kieva')
> ('Ki' != 'Kiev').
>
> Thank you
>
> PS. I'm sorry for my English.

        Regards,
                Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


