**Next message:** xiao shibin: "Re: [Snowball-discuss] Unicode version of snowball" **Previous message:** Martin Porter: "Re: [Snowball-discuss] Unicode version of snowball" **In reply to:** Martin Porter: "Re: [Snowball-discuss] Unicode version of snowball" **Next in thread:** xiao shibin: "Re: [Snowball-discuss] Unicode version of snowball" **Messages sorted by:** [ date ] [ thread ] [ subject ] [ author ] [ attachment ]

Hi Martin,

With your help, I can now compile the stemmer to process UCS-2 encoded Unicode.

But my Russian text is encoded in UTF-8, and I don't want to convert the UTF-8 data to UCS-2. Could you tell me how to modify the stemmer?

Or is there a ready-made version of Snowball that supports UTF-8?

Searching from the Snowball front page, I found "cyrillic letters in utf-8" (listed below). Is there anything else I need to modify?

stringdef a decimal '45264'

stringdef b decimal '45520'

stringdef v decimal '45776'

stringdef g decimal '46032'

stringdef d decimal '46288'

stringdef e decimal '46544'

stringdef zh decimal '46800'

stringdef z decimal '47056'

stringdef i decimal '47312'

stringdef i` decimal '47568'

stringdef k decimal '47824'

stringdef l decimal '48080'

stringdef m decimal '48336'

stringdef n decimal '48592'

stringdef o decimal '48848'

stringdef p decimal '49104'

stringdef r decimal '32977'

stringdef s decimal '33233'

stringdef t decimal '33489'

stringdef u decimal '33745'

stringdef f decimal '34001'

stringdef kh decimal '34257'

stringdef ts decimal '34513'

stringdef ch decimal '34769'

stringdef sh decimal '35025'

stringdef shch decimal '35281'

stringdef " decimal '36049'

stringdef y decimal '35793'

stringdef ' decimal '35537'

stringdef e` decimal '36305'

stringdef iu decimal '36561'

stringdef ia decimal '36817'
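For readers checking the list above: each decimal value looks like the two UTF-8 bytes of a Cyrillic letter packed into one 16-bit integer, with the first byte in the low-order position. That reading is inferred from the numbers themselves and is not stated anywhere in this thread. A minimal Python sketch under that assumption (the function names `pack_utf8_pair` and `unpack_utf8_pair` are made up for illustration and are not part of Snowball):

```python
def pack_utf8_pair(ch: str) -> int:
    """Pack the two UTF-8 bytes of ch into one integer, first byte lowest."""
    encoded = ch.encode("utf-8")
    if len(encoded) != 2:
        raise ValueError(f"{ch!r} is not a two-byte UTF-8 character")
    return int.from_bytes(encoded, "little")


def unpack_utf8_pair(value: int) -> str:
    """Inverse of pack_utf8_pair: recover the character from a listed value."""
    return value.to_bytes(2, "little").decode("utf-8")


if __name__ == "__main__":
    # A few (name, value) pairs copied from the stringdef list above.
    for name, value in [("a", 45264), ("p", 49104), ("r", 32977), ("ia", 36817)]:
        print(name, value, unpack_utf8_pair(value))
```

Decoding every listed value this way is a quick check that each name is bound to the intended letter. Under this reading, 36049 decodes to ь (soft sign) and 35537 to ъ (hard sign), so it may be worth confirming that `"` and `'` match what the Russian stemmer expects.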

Thanks for your help.

xiao shibin

----- Original Message -----

From: "Martin Porter" <martin.porter@grapeshot.co.uk>

To: "xiao shibin" <xiao.shibin@trs.com.cn>; <snowball-discuss@lists.tartarus.org>

Sent: Sunday, May 09, 2004 7:06 PM

Subject: Re: [Snowball-discuss] Unicode version of snowball

> At 13:44 09/05/2004 +0800, xiao shibin wrote:
> >>May 2002 - Unicode support added
> >
> >where can I download the unicode version?
> >
> >thanks,
> >
> >xiao shib
>
> Just download the whole thing and use the -w[idechars] option when
> compiling. If you put "unicode" in the snowball front-page search box you
> can see the emails that were passed around when 16 bit character support was
> being added, which provides useful background.
>
> Martin


*
This archive was generated by hypermail 2.1.3
: Thu Sep 20 2007 - 12:02:46 BST
*