Re: [Snowball-discuss] a simple algorithm problem

From: Martin Porter (martin.porter@grapeshot.co.uk)
Date: Wed Dec 15 2004 - 23:16:55 GMT


See the section

http://snowball.tartarus.org/q/use.html

and look at the widechars option etc.

Quite a bit of work has, as you probably know, been done on Turkish
stemming, but it is not work I am at all familiar with.

Martin

At 22:22 15/12/2004 +0000, ayhan peker wrote:
>Hi Martin,
>Thank you very much for your reply.
>I don't think I will want to run Snowball as 8-bit ASCII, because my
>system and my database have been modified to accept Unicode characters
>(UTF-8). I tried to run it previously, but the database was unable to
>return results to the client as Unicode, which is what I intend to do.
>By the way, this is for a purely Turkish search engine (when I tried to
>run it with ASCII only, my robot, database and web interface all got
>muddled). So for me it is too late to go back to non-Unicode mode.
>Could you tell me how I can run Snowball in 16-bit character mode, or do
>you have a piece of documentation I can read about it?
>
>I would quite like to develop a Turkish stemming algorithm, but Turkish
>is one of the most difficult languages in the world to do this for. What
>I am trying to do is make a start on this project, a simple start :).
>I intend to continue developing it and to get others to contribute.
>
>Best regards.
>Ayhan


