On Fri, 2002-05-24 at 20:47, Andreas Jung wrote:
> Seems that the problem is still not solved.
> I re-created all stemmers with and without -w option and in
> both cases snowball produced identical sources. Any ideas why?

Yes, -w doesn't change the output here. What it does is allow snowball
programs to use character values in the range 0-65535 instead of 0-255.
A snowball program which can be generated successfully without -w will
not be affected by use of -w. However, a snowball program which uses
characters outside the range 0-255 will not be generated successfully
without -w.
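
As a rough illustration of the two ranges, here is a small C sketch. The helper names are made up for this example and are not part of snowball; they only restate the 0-255 and 0-65535 limits described above.

```c
/* Hypothetical helpers, not part of snowball. */

/* Returns 1 if a character value fits in the 0-255 range
   that is available without -w, 0 otherwise. */
int fits_without_w(unsigned long code)
{
    return code <= 255UL;
}

/* Returns 1 if a character value fits in the 0-65535 range
   that -w allows, 0 otherwise. */
int fits_with_w(unsigned long code)
{
    return code <= 65535UL;
}
```

A value such as 0xE9 passes both checks, so a program using only such values is the same with or without -w. A value such as 0x0151 passes only the second check.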

If you're using -w to generate snowball output, you must also set
the typedef of "symbol" in api.h to something appropriate when you
compile the sources: see the comment at the start of api.h.
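
The following is only a sketch of the kind of choice that typedef expresses. It does not reproduce api.h. The WIDE_SYMBOLS macro and the round_trip helper are hypothetical names invented for this example, so check the comment in api.h itself for the actual guidance.

```c
/* Sketch only: WIDE_SYMBOLS is a hypothetical macro, not one that
   snowball defines. */
#ifdef WIDE_SYMBOLS
typedef unsigned short symbol;  /* at least 16 bits: for -w output */
#else
typedef unsigned char symbol;   /* 8 bits on common platforms: 0-255 */
#endif

/* Hypothetical helper: stores a character value in a symbol and reads
   it back, so the caller can see whether the value survived. */
unsigned long round_trip(unsigned long code)
{
    symbol s = (symbol)code;    /* high bits are dropped if symbol is too narrow */
    return (unsigned long)s;
}
```

Built without WIDE_SYMBOLS on a platform with 8-bit chars, round_trip(0xC1) returns 0xC1 unchanged, while round_trip(0x0430) loses its high bits. That is why the symbol type has to match the character values the generated code uses.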

Note that using -w and setting the size of symbol still doesn't
guarantee that the snowball program is using a 16 bit character set.
See the russian/stem.sbl file for an example: by default it uses KOI8-R
(in which all the character codes fit in one byte), but if you change
the comments around you can make it use Unicode instead.
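
To make the difference concrete, here is a small C sketch comparing the code for the Cyrillic small letter "a" in the two encodings. The enum names and the bits_needed helper are invented for this example; the code values are the standard KOI8-R and Unicode ones.

```c
/* The Cyrillic small letter "a" in two encodings. */
enum {
    KOI8_R_SMALL_A  = 0xC1,   /* KOI8-R: fits in one byte */
    UNICODE_SMALL_A = 0x0430  /* Unicode U+0430: needs more than 8 bits */
};

/* Hypothetical helper: number of bits needed to represent a code. */
int bits_needed(unsigned long code)
{
    int bits = 0;
    while (code != 0) {
        bits++;
        code >>= 1;
    }
    return bits;
}
```

The KOI8-R value needs 8 bits and the Unicode value needs 11, so only the Unicode variant of the stemmer actually depends on -w and a wider symbol type.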

-- Richard