Re[2]: [Snowball-discuss] question about russian stemmer

From: Yuri (ykar@list.ru)
Date: Fri Feb 13 2004 - 20:22:26 GMT


Hello

> The English stemmer gives a scheme for including exceptions, which you might
> try and adapt to the Russian stemmer if the "Kiev" case was sufficiently
> important.

I'm not a linguist, just a programmer. The SBL definitions look very unfamiliar
to me, but I will try to find out where to put exceptions. Maybe you can
help me with how to add just one exception: stem Kiev => Kiev.

Or, if that is hard, as a workaround I can write my own stemmer subclass that
looks the word up in a list of exceptions and uses the listed stem, or calls
Snowball if the word is not listed.
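
A minimal sketch of that workaround in Python, under stated assumptions: the
wrapped stemmer is any function from word to stem, and it expects lower-case
input. `ExceptionStemmer` and `fake_stem` are hypothetical names for
illustration, not part of Snowball. `fake_stem` is hard-coded to the two
outputs reported in this message so the example runs without a real stemmer.

```python
from typing import Callable, Dict, Optional


class ExceptionStemmer:
    """Wraps a stemming function and consults an exception table first."""

    def __init__(
        self,
        stem: Callable[[str], str],
        exceptions: Optional[Dict[str, str]] = None,
    ) -> None:
        self._stem = stem
        # Keys are lower-cased so that lookups ignore case.
        self._exceptions = {w.lower(): s for w, s in (exceptions or {}).items()}

    def stem(self, word: str) -> str:
        key = word.lower()
        # A listed word returns its fixed stem and never reaches the stemmer.
        if key in self._exceptions:
            return self._exceptions[key]
        # Anything else is passed to the wrapped stemmer.
        return self._stem(key)


def fake_stem(word: str) -> str:
    # Hypothetical stand-in for the real stemmer: it only reproduces the
    # behavior reported in this message and returns other words unchanged.
    reported = {"kiev": "ki", "kieva": "kiev"}
    return reported.get(word, word)


stemmer = ExceptionStemmer(fake_stem, {"kiev": "kiev"})
print(stemmer.stem("Kiev"))   # exception table is used
print(stemmer.stem("Kieva"))  # falls through to the wrapped stemmer
```

With the single exception in place, both forms end up with the same stem, so
the index and the query agree again. In real use, `fake_stem` would be replaced
by a call into the actual Snowball stemmer.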

> You must of course realise that the stemmers are not 100% accurate, and a
> certain rate of error is inevitable. These errors do not necessarily degrade
> retrieval performance however (see the Introduction to Snowball).
>
> Are there many other words that mis-stem in a similar way?

No, this was the first and only problem (at least for now).
I'm writing a search engine that indexes every word in the text.
I noticed that when I search for "Kieva" ("Kiev's" or "of Kiev" in English),
my search engine does not find text containing the word "Kiev".

When I started looking for the cause of the error, I found that the stemmer
stems 'Kiev' as 'Ki', so stem('Kiev') != stem('Kieva')
('Ki' != 'Kiev').
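
To show why the mismatch causes a missed result, here is a toy index keyed by
stem, written in Python. It is only a sketch: `reported_stem`, `build_index`,
`search` and the two sample documents are hypothetical, and `reported_stem`
is hard-coded to the two outputs described above instead of calling the real
stemmer.

```python
from collections import defaultdict
from typing import Callable, Dict, Set


def reported_stem(word: str) -> str:
    # Hypothetical stand-in: reproduces only the two stems reported above
    # and returns every other word lower-cased and otherwise unchanged.
    reported = {"kiev": "ki", "kieva": "kiev"}
    w = word.lower()
    return reported.get(w, w)


def build_index(
    docs: Dict[int, str], stem: Callable[[str], str]
) -> Dict[str, Set[int]]:
    # Map each stem to the set of documents that contain a word with that stem.
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[stem(token)].add(doc_id)
    return index


def search(
    index: Dict[str, Set[int]], query: str, stem: Callable[[str], str]
) -> Set[int]:
    # The query is stemmed the same way as the indexed words.
    return set(index.get(stem(query), set()))


docs = {1: "gorod Kiev", 2: "ulitsy Kieva"}
index = build_index(docs, reported_stem)

# Document 1 is stored under 'ki', but the query "Kieva" looks up 'kiev',
# so only document 2 comes back.
print(search(index, "Kieva", reported_stem))
```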

Thank you

PS. Sorry for my English.


