Re: [Snowball-discuss] More patches

From: Richard Boulton (richard@lemurconsulting.com)
Date: Mon Feb 12 2007 - 13:31:25 GMT


Olly Betts wrote:
> I'm currently updating Xapian to use UTF-8 stemmers generated by the
> latest version of snowball. I've patched the snowball compiler to
> generate the stemmers as C++ classes, and I'm embedding the patched
> compiler in the Xapian build system, so Xapian users can easily drop
> in new stemmers.

I'd be interested in adding a "C++" output mode to snowball, so patches
to do this would probably be accepted.

Ideally, I'd like to make a C++ version of the libstemmer library, and
maintain it in Snowball rather than Xapian. In particular, it would
seem useful to me for developers to be able to link against a
system-wide snowball dynamic library, rather than the specific version
compiled into Xapian. However, that discussion possibly belongs on the
Xapian mailing lists rather than here, and for now whatever works is
fine by me. :)

> This improves the shortcutting of backwards among - if there are fewer
> characters available than the shortest string in the among, there's
> no way it can match. It also includes a cosmetic tweak (avoiding
> generating "z->c - 0" in the output) which makes the generated source
> a little more readable (of course the C compiler will optimise the "- 0"
> away anyway):
>
> http://oligarchy.co.uk/xapian/patches/snowball-min-length-shortcut-backwards-among.patch

I've applied this patch too - I believe I've now applied all the
patches you've sent so far!
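
For anyone reading along without the patch to hand, the shortcut described above can be sketched roughly as follows. This is only an illustration of the idea, not the code the compiler generates: the struct, the function name find_among_b_sketch and the min_len parameter are all made up for this example, and the linear scan stands in for the real among search.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the stemmer's working state. */
struct env_sketch {
    const char *p;  /* text being stemmed */
    int c;          /* cursor; a backward match ends here */
    int lb;         /* backward limit; matching may not go below this */
};

/* Return the index of the longest string in among[] that ends at the
 * cursor, or -1 if none does.  min_len is assumed to be the length of
 * the shortest string in among[], worked out ahead of time. */
static int find_among_b_sketch(const struct env_sketch *z,
                               const char *const *among, int n,
                               int min_len)
{
    int avail = z->c - z->lb;   /* characters available going backwards */
    int best = -1;
    size_t best_len = 0;
    int i;

    /* The shortcut: with fewer characters available than the shortest
     * candidate, nothing can match, so skip the search entirely. */
    if (avail < min_len) return -1;

    for (i = 0; i < n; i++) {
        size_t len = strlen(among[i]);
        if (len > (size_t)avail) continue;   /* would cross the limit */
        if (len > best_len &&
            memcmp(z->p + z->c - len, among[i], len) == 0) {
            best = i;
            best_len = len;
        }
    }
    return best;
}
```

With the strings "ing", "ed" and "ingly", min_len would be 2, so a word with only one character left before the limit is rejected without comparing anything.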

-- 
Richard


