
author	Richard Kettlewell <rjk@greenend.org.uk>
	Tue, 20 Nov 2007 18:13:56 +0000 (18:13 +0000)
committer	Richard Kettlewell <rjk@greenend.org.uk>
	Tue, 20 Nov 2007 18:13:56 +0000 (18:13 +0000)
commit	8818b7fca12456e62410ef914a7bef250a0633c9
tree	df1992a190796971fb46f160babd454010808851
parent	7bbe944b70a8a904dd15905fbf351b5e906224ff

utf32_word_split() and utf8_word_split() split a string into words
using the UAX #29 word boundary algorithm.  words() is therefore now a
wrapper around this.  There is scope for improvement in the use of
this function, as currently we do some needless conversion back and
forth between encoding forms.
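
As a rough illustration of what word splitting produces, here is a small
Python sketch.  It is not the UAX #29 algorithm and not the code in
lib/unicode.c: it only approximates two of the rules (letters and digits
stick together, and a single apostrophe or full stop between two word
characters does not break the word).  The name approx_word_split is
hypothetical.

```python
import re

# Approximation only: a run of word characters, optionally continued across a
# single apostrophe (ASCII or U+2019) or full stop that has word characters
# on both sides.  Real UAX #29 has many more rules and character classes.
_WORD = re.compile(r"\w+(?:['.\u2019]\w+)*")


def approx_word_split(text):
    """Return the word-like segments of text, dropping the separators."""
    return _WORD.findall(text)


if __name__ == "__main__":
    print(approx_word_split("Can't stop: pi is 3.14, e.g. roughly"))
```

A full implementation classifies every code point using the Unicode
word-break property tables, so it also handles cases this pattern gets
wrong.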

casefold() now uses the compatibility case-folding algorithm, which
seems more appropriate for searching.
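
To show why compatibility folding suits searching, the sketch below
follows the Unicode Standard's definition of compatibility caseless
matching, NFKD(toCasefold(NFKD(toCasefold(NFD(X))))), using Python's
standard library.  It assumes that casefold() here does something
equivalent; the commit does not spell out the exact steps, and
compat_casefold is a hypothetical name.

```python
import unicodedata


def compat_casefold(s):
    """Fold s for compatibility caseless matching (Unicode section 3.13)."""
    # Canonically decompose, then apply full case folding.
    folded = unicodedata.normalize("NFD", s).casefold()
    # Compatibility-decompose, fold again, and decompose once more, because
    # each step can expose characters that the other step still changes.
    folded = unicodedata.normalize("NFKD", folded).casefold()
    return unicodedata.normalize("NFKD", folded)


if __name__ == "__main__":
    # The U+FB01 ligature and a fullwidth capital A fold to plain ASCII.
    print(compat_casefold("\ufb01le"), compat_casefold("\uff21"))
```

With this folding, a search for "file" matches text containing the
U+FB01 ligature, which plain case folding alone would not achieve for
every compatibility character.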

dbversions are now integers, not strings.  Some dbversion=2
functionality can be selectively disabled for testing purposes.
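
The commit does not state why integers were chosen.  One general
property of integer versions, sketched below in Python, is that they
compare numerically, whereas strings compare character by character.
parse_dbversion is a hypothetical helper, not a function from this
commit.

```python
def parse_dbversion(raw):
    """Parse a configured dbversion value into a positive integer."""
    version = int(raw)  # raises ValueError for non-numeric input
    if version < 1:
        raise ValueError("dbversion must be at least 1")
    return version


if __name__ == "__main__":
    # As strings, "10" sorts before "2"; as integers the order is correct.
    print("10" < "2", parse_dbversion("10") < parse_dbversion("2"))
```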

README.dbversions documents the differences between the dbversions.

12 files changed:

lib/configuration.c
lib/configuration.h
lib/test.c
lib/unicode.c
lib/unicode.h
lib/vector.h
lib/words.c
server/Makefile.am
server/README.dbversions	[new file with mode: 0644]
server/rescan.c
server/trackdb.c
server/trackdb.h

Multi-user software jukebox -- https://www.greenend.org.uk/rjk/disorder/
