Re: [Snowball-discuss] Unicode support

From: James Aylett (james-xapian@tartarus.org)
Date: Mon May 20 2002 - 14:05:13 BST


On Mon, May 20, 2002 at 06:29:35AM -0600, Martin Porter wrote:

> BOM is 'byte order mark' - some special character of termination?
> The answer then is no.

It's actually usually at the beginning: U+FEFF, which reads as 0xFEFF or,
byte-swapped, 0xFFFE, I think. Snowball doesn't want to handle this,
because checking for it would be somewhat too much overhead for each
individual stemming call, I'd have thought.
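
As a rough illustration of doing the check once in the calling code
rather than in every stemming call, here is a minimal Python sketch. It
is not part of Snowball; the helper name strip_bom is hypothetical, and
it assumes the input is UTF-8 or UTF-16 only (a UTF-32 little-endian BOM
starts with the same two bytes as the UTF-16 one and is not handled).

```python
import codecs

# Hypothetical helper, not part of Snowball: look for a byte order mark
# at the start of a buffer and drop it before any words reach a stemmer.
# Assumption: only UTF-8 and UTF-16 input is expected (UTF-32 is ignored).
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE (the byte-swapped form)
)

def strip_bom(data: bytes):
    """Return (encoding, payload); encoding is None when no BOM is found."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, data[len(bom):]
    return None, data
```

The caller would run this once per input file or stream, decode the
payload with the reported encoding, and only then split it into words
for stemming.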

James

-- 
/--------------------------------------------------------------------------\
  James Aylett                                            zap.tartarus.org
  james@tartarus.org                                        footlights.org





This archive was generated by hypermail 2.1.3 : Thu Sep 20 2007 - 12:02:42 BST