[Snowball-discuss] Rebuilding from distributed snowball_code.tgz doesn't quite work

From: Tom Lane (tgl@sss.pgh.pa.us)
Date: Sun Jun 03 2007 - 00:11:12 BST


Hi folks,
  I was looking into the possibility of incorporating snowball_code,
rather than the derived libstemmer distribution, into the sources of
another project (www.postgresql.org if you care). The reasoning was
(a) it's half the size, and (b) distributing real source code beats
distributing derived files any day; not to mention that it's required
for GPL compliance. (Postgres is BSD, but I don't like doing things
that would forbid its inclusion in a GPL project...)

Anyway, I found that I couldn't build libstemmer_c from the current
snowball_code.tgz without hackery. Specifically:

* GNUmakefile thinks that libstemmer/libstemmer.c and
libstemmer/libstemmer_utf8.c should be built from
libstemmer/libstemmer_c.in, but there is no such file in the distributed
tarball. I think you should ship the libstemmer_c.in file and not the two
derived .c files, especially given the triviality of the conversion.
(Actually, there's got to be a better way than this to compile two versions
of the same source code ... why not have one .c file and compile it with two
different #defines, for instance?)

* doc/libstemmer_c_README is also missing from the tarball.

* "cp -a" is not portable. Recommend "cp -p -r" instead.
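
To make the one-.c-file idea from the first item concrete, here is a minimal
sketch. It assumes the two variants differ only in which character encodings
they support, which the file names suggest but the tarball does not confirm.
The file name encodings.c, the macro LIBSTEMMER_UTF8_ONLY and the function
encoding_supported() are all hypothetical, made up for illustration.

```c
/* encodings.c: hypothetical single source standing in for the two
 * generated files. The idea is to compile it twice, for example:
 *
 *   cc -c encodings.c -o libstemmer.o
 *   cc -DLIBSTEMMER_UTF8_ONLY -c encodings.c -o libstemmer_utf8.o
 *
 * LIBSTEMMER_UTF8_ONLY is an assumed macro name, not one Snowball defines.
 */
#include <stddef.h>
#include <string.h>

/* Encoding table; the non-UTF-8 entries drop out of the UTF-8-only build. */
static const char *const supported_encodings[] = {
    "UTF_8",
#ifndef LIBSTEMMER_UTF8_ONLY
    "ISO_8859_1",
    "KOI8_R",
#endif
    NULL
};

/* Return 1 if this build supports the named encoding, 0 otherwise. */
int encoding_supported(const char *name)
{
    size_t i;

    for (i = 0; supported_encodings[i] != NULL; i++) {
        if (strcmp(supported_encodings[i], name) == 0)
            return 1;
    }
    return 0;
}
```

With something along these lines the makefile would need only two compile
rules with different flags, and no generated .c files would have to be
shipped at all.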

                        regards, tom lane


