[Snowball-discuss] Turkish stemmer code

From: Olly Betts (olly@survex.com)
Date: Fri Feb 16 2007 - 22:02:32 GMT


I noticed some uses of "test" in the new Turkish stemmer which seem
redundant to me (or else I don't understand Snowball well enough, which
is quite possible).

This snippet is used 4 times (once with "non-vowel" instead of "vowel"):

    test(next (test vowel))

But the inner "test" seems redundant as the cursor will be reset after
"vowel" anyway by the outer "test", so I think this is just the same as:

    test(next vowel)

Also, this snippet is used 4 times (once with a grouping instead of
a single character literal):

    ((test 'n') next (test vowel))

But "next" just advances the cursor by one character, which matching
'n' does anyway, so isn't that the same as this:

    ('n' (test vowel))

I found a Turkish wordlist, but it's aimed at checking for insecure
passwords so it only contains ASCII characters (and the licence means
it's not suitable for distributing with Snowball anyway). But I
tried the changes above and for this restricted test set I get the
same results with and without these changes.

Cheers,
    Olly


