Re: [Snowball-discuss] a simple algorithm problem

From: Martin Porter (
Date: Wed Dec 15 2004 - 23:16:55 GMT

See the section

and look at the widechars option etc.

Quite a bit of work has, as you probably know, been done on Turkish
stemming, but it is not work I am at all familiar with.


At 22:22 15/12/2004 +0000, ayhan peker wrote:
>Hi Martin,
>Thank you very much for your reply.
>I don't think I will want to run Snowball as 8-bit ASCII, because my
>system and my database have been modified to accept Unicode characters
>(UTF-8). I tried running it that way previously, but the database was
>unable to return results to the client as Unicode, which is what I
>intend to do.
>By the way, this is for a purely Turkish search engine (when I tried to
>run it with ASCII only, my robot, database, and web interface all got
>muddled). So for me it is too late to go back to non-Unicode mode.
>Could you tell me how I can run Snowball in 16-bit char mode, or do you
>have a piece of documentation I can read about it?
>I would quite like to develop a Turkish stemming algorithm, but it is
>one of the most difficult languages in the world to do this for. What I
>am trying to do is make a start on this project (a simple start) :) . I
>intend to continue developing it and to get others to contribute.
>Best regards.
