Re: [Snowball-discuss] Unicode and python bindings

From: Andreas Jung
Date: Tue May 16 2006 - 20:50:49 BST

TextIndexNG3 for Zope comes with its own Python bindings against the
latest Snowball code base. The complete implementation is based on
Unicode and has been in use for ages.


--On 16 May 2006 14:39:05 +0200 Patrick Mézard wrote:

> Hello,
> Trying to solve issues I raised in a previous post
> (<>), I
> finally rewrote parts of the original Weongyo Jeong Python bindings to
> fit my needs. The main change is that the module interface now consumes
> Python Unicode strings (UTF-16) instead of native strings. The idea is
> that code dealing with multiple languages usually first unifies the
> documents' encodings into Unicode before passing them to other modules,
> including stemming. With the original bindings, since I failed to use the
> UTF-8 interface, I had to convert back from Unicode to specific encodings,
> which was at best a pain and at worst impossible.
> The new version is temporarily available here:
> <> and I
> can provide a copy of the darcs (<>)
> repository I used to rewrite my branch.
> I think it still needs to be reviewed before any release (I am far from
> being a Python C extension expert), even if it passes the few tests I
> could think of.
> What's your opinion about this?
> --
> Patrick Mézard
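
The workflow described in the quoted message (decode every document to
Unicode once, then hand only Unicode text to the stemmer) might look roughly
like the sketch below. It is written for Python 3, where str is already
Unicode. The toy_stem function is a hypothetical placeholder and is not the
Snowball algorithm or the API of any of the bindings discussed here; a real
binding would be passed in through the stem parameter.

```python
def decode_document(raw_bytes, encoding):
    """Decode a byte string into Unicode text using its known source encoding."""
    return raw_bytes.decode(encoding)


def toy_stem(word):
    """Hypothetical stand-in for a Unicode-aware stemmer.

    NOT the Snowball algorithm: it only lowercases the word and strips a
    trailing "s", which is enough to illustrate the data flow.
    """
    word = word.lower()
    if len(word) > 3 and word.endswith("s"):
        return word[:-1]
    return word


def stem_document(raw_bytes, encoding, stem=toy_stem):
    """Unify the encoding first, then stem each whitespace-separated word."""
    text = decode_document(raw_bytes, encoding)
    return [stem(word) for word in text.split()]


# The same text stored in two different encodings.
documents = [
    ("chats élégants".encode("latin-1"), "latin-1"),
    ("chats élégants".encode("utf-8"), "utf-8"),
]

# After decoding, the stemmer sees identical Unicode input in both cases.
results = [stem_document(raw, enc) for raw, enc in documents]
```

Because the stemmer only ever receives Unicode text, the caller never has
to convert back to a language-specific encoding, which is the difficulty the
quoted message reports with the original bindings.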

