Re: [Snowball-discuss] 16 bit characters in Snowball

From: Andreas Jung (andreas@andreas-jung.com)
Date: Fri May 24 2002 - 12:30:13 BST


I upgraded my Snowball sandbox to the latest version in CVS
and rebuilt everything from scratch. I created an input file 'test.txt'
containing the string 'splicing', converted it to UTF-16, and ran
the Porter stemmer on it. It outputs the unstemmed string:

yetix@/develop/REPOSITORY/snowball/website/porter(107)% hexdump test.txt
0000000 7300 7000 6c00 6900 6300 6900 6e00 6700
0000010 0a00
0000012
yetix@/develop/REPOSITORY/snowball/website/porter(108)% ./stemmer test.txt
splicing
1 calls to stem
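
For anyone who wants to reproduce the input file, here is a minimal Python
sketch of the conversion. It is an illustration, not part of the Snowball
distribution: the helper names utf16_bytes and hexdump_words are made up
here, and it assumes the dump above comes from a little-endian host, where
hexdump's default 16-bit word display swaps each byte pair. Under that
assumption the file is BOM-less big-endian UTF-16; on a big-endian host the
same dump would mean little-endian UTF-16 instead.

```python
def utf16_bytes(text, byteorder="be"):
    """Encode text as UTF-16 without a BOM; byteorder is "be" or "le"."""
    return text.encode("utf-16-" + byteorder)


def hexdump_words(data, host_byteorder="little"):
    """Format data as 16-bit hex words, read in the host's byte order.

    This mimics the default word-oriented hexdump display; it is a
    hypothetical helper, not a reimplementation of the hexdump tool.
    """
    words = []
    for i in range(0, len(data), 2):
        word = int.from_bytes(data[i:i + 2], host_byteorder)
        words.append("%04x" % word)
    return " ".join(words)


# 'splicing' plus the trailing newline, as in test.txt (assumed big-endian).
data = utf16_bytes("splicing\n", "be")

print(len(data))                 # 18 bytes, i.e. offset 0x12 in the dump
print(hexdump_words(data[:16]))  # first dump line
print(hexdump_words(data[16:]))  # second dump line
```

Every second byte of such a file is zero, so the 16-bit units only look like
'splicing' to a program that reads them as 16-bit characters.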

Andreas

----- Original Message -----
From: "Martin Porter" <martin_porter@softhome.net>
To: "Andreas Jung" <andreas@andreas-jung.com>
Cc: <snowball-discuss@lists.sourceforge.net>
Sent: Friday, May 24, 2002 06:50
Subject: Re: [Snowball-discuss] 16 bit characters in Snowball

>
> Andreas,
>
> I can't really help with Python, never having used it. Nevertheless, with
> the sample program as guide, I'm sure you can't be too far from a working
> version.
>
> I don't know if anyone else can help ...
>
> Martin
>
>
>
> _______________________________________________________________
>
> Don't miss the 2002 Sprint PCS Application Developer's Conference
> August 25-28 in Las Vegas -- http://devcon.sprintpcs.com/adp/index.cfm
>
> _______________________________________________
> Snowball-discuss mailing list
> Snowball-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/snowball-discuss
>



This archive was generated by hypermail 2.1.3 : Thu Sep 20 2007 - 12:02:42 BST