        README.bible
        Bible Retrieval System
        Chip Chapin, Hewlett Packard Company
        Initial release, September 5, 1989
        Last Updated: April 26, 1993


The Bible Retrieval System (BRS) consists of a textual database of the
Authorized ("King James") Version of the Old and New Testaments, a set
of libraries for finding and retrieving text, and a program ("bible")
which uses the libraries to retrieve Bible passages given references
on the command line or from standard input.  A built-in Concordance
(word search facility) is also supported.  A man page is provided.

While the raw Bible text consumes over 4.4 megabytes, the BRS stores
it in a special compressed form, requiring less than 1.8 megabytes.
Despite the compression, retrieval is very fast, and a buffer caching
scheme makes second and following references to a particular region of
the text almost instantaneous.
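
As an illustration of why repeated references can be so cheap, here is
a minimal sketch in C of a small cache of uncompressed text regions.
It is not taken from the BRS sources: the slot count, the
least-recently-used replacement rule, and all names (region_cache,
cache_init, cache_get) are assumptions made for the example.

```c
#include <assert.h>

#define CACHE_SLOTS 4   /* illustrative size, not the BRS value */

struct region_cache {
    long region[CACHE_SLOTS];          /* region held by each slot, -1 if empty */
    unsigned long stamp[CACHE_SLOTS];  /* time of last use of each slot */
    unsigned long clock;               /* advances on every lookup */
    unsigned long misses;              /* lookups that needed a reload */
};

void cache_init(struct region_cache *c)
{
    int i;

    assert(c != 0);
    for (i = 0; i < CACHE_SLOTS; i++) {
        c->region[i] = -1;
        c->stamp[i] = 0;
    }
    c->clock = 0;
    c->misses = 0;
}

/* Return the slot holding the requested region.  On a miss the least
 * recently used slot is reassigned; a real implementation would
 * uncompress the region into that slot's buffer at this point. */
int cache_get(struct region_cache *c, long region)
{
    int i, victim = 0;

    c->clock++;
    for (i = 0; i < CACHE_SLOTS; i++) {
        if (c->region[i] == region) {
            c->stamp[i] = c->clock;    /* hit: refresh the timestamp */
            return i;
        }
    }
    for (i = 1; i < CACHE_SLOTS; i++) {
        if (c->stamp[i] < c->stamp[victim])
            victim = i;
    }
    c->region[victim] = region;
    c->stamp[victim] = c->clock;
    c->misses++;
    return victim;
}
```

A second cache_get() for the same region returns the same slot without
counting a miss, so no uncompression work would be repeated.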

The concordance facility requires an additional datafile of less than
0.9 megabytes, which provides a pre-computed list of each verse in
which a word appears.  The current implementation provides a very
simple but effective way to logically combine the results of word
searches to narrow a selection.
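
To show how pre-computed verse lists can be combined, here is a
minimal sketch in C of a logical AND over two lists.  It assumes each
word's list is an ascending array of verse numbers; that layout and
the name verse_list_and are hypothetical, not the BRS file format or
interface.

```c
#include <assert.h>
#include <stddef.h>

/* Intersect two ascending verse-number lists (logical AND).
 * Writes the common entries to out, which must have room for
 * min(na, nb) entries, and returns how many were written.
 * Hypothetical helper for illustration; not part of the BRS. */
size_t verse_list_and(const long *a, size_t na,
                      const long *b, size_t nb, long *out)
{
    size_t i = 0, j = 0, n = 0;

    while (i < na && j < nb) {
        if (a[i] < b[j]) {
            i++;
        } else if (a[i] > b[j]) {
            j++;
        } else {
            out[n++] = a[i];   /* this verse contains both words */
            i++;
            j++;
        }
    }
    return n;
}
```

Because both inputs are sorted, one pass over each list is enough, and
the result is itself sorted, so it can be combined with a third word's
list in the same way to narrow the selection further.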



INSTALLATION (building from source)

You probably have two files:

    bible.data.tar      Tar file containing the two data files.
    bible.tar.Z         BRS program source distribution, including
                        READMEs and man page.

Create a directory to work in, then "cd" to it and extract files from
the distribution archives as follows:

                $ tar xvf bible.data.tar
                $ zcat bible.tar.Z | tar xvf -

Now execute "make" and wait for a while.

                $ make

When make has completed, you should be able to start up the bible
program for interactive use:

                $ bible
        or      $ ./bible

Type "?" for a summary of commands.  "Q" quits the program.
To review the man page:

                $ nroff -man bible.1 | more

If you wish to install the program, data, and man files into
system-wide locations ("/usr/local/..."), and you have the proper
permissions, type:

                $ make install

If you wish to install them somewhere else, either edit the Makefile
and change the DEST variable, or just install the files by hand.  If
you install the data files anywhere besides /usr/local/lib, you may
want to edit "bible.c" to assign an initial value to "dfpath";
otherwise, use the program's "-p" option (see the man page).



THE LIBRARIES

The Bible Retrieval System is intended to be more than just the
"bible" retrieval program.  Two libraries of routines are provided in
the BRS that may be used to construct other applications.

The "Text Storage Library" (TSL) routines could be used for *any*
textual data file; they are entirely independent of the structure of
the Bible.  They support the use of the windowed compression scheme on
any text, with fast retrieval of any particular line of the text.  The
concordance facility is also completely generic and should work with
any text.  For this release, no separate documentation is provided for
the TSL, but comments in the files tsl.c, tsl.h, makeindex.c, and
makeconcfile.c are fairly extensive.
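
One common way to get fast access to a particular line is a table of
line start offsets.  The sketch below, in C, assumes such a table; the
names text_span and line_span and the table layout are hypothetical
and do not describe the TSL's actual data structures (see tsl.h for
those).

```c
#include <assert.h>

/* Byte range [start, end) of one line of the original text. */
struct text_span {
    long start;
    long end;
};

/* line_start[i] is the byte offset at which line i begins, and
 * line_start[nlines] is the total length of the text, so the table
 * has nlines + 1 entries.  Lines are numbered from 0. */
struct text_span line_span(const long *line_start, long nlines, long line)
{
    struct text_span span;

    assert(line >= 0 && line < nlines);
    span.start = line_start[line];
    span.end = line_start[line + 1];
    return span;
}
```

With the byte range in hand, a retrieval routine only has to fetch the
part of the stored text that covers those bytes.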

The "Bible Retrieval Library" (BRL) includes routines that are
specifically oriented to the Book-Chapter-Verse structure of the Bible
text.  However, they are independent of the storage structure of the
textual data, leaving that to the TSL.  The BRL routines make
retrieval programs such as "bible" extremely simple.  For this release
no separate documentation is provided for the BRL -- see brl.c, brl.h,
brl-index.c and bible.c.
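
The split between the two libraries can be pictured as a mapping from
a chapter:verse reference to a line number, which is then handed to
the storage layer.  The C sketch below assumes one verse per line and
uses a toy table for Genesis 1-3 only; the table layout and the name
verse_to_line are hypothetical and are not the BRL interface.

```c
#include <assert.h>

/* Toy index covering only Genesis 1-3 (31, 25 and 24 verses).
 * chapter_start[c] is the line number of verse 1 of chapter c + 1;
 * the final entry is the total number of lines. */
static const long chapter_start[] = { 0, 31, 56, 80 };
#define NCHAPTERS 3

/* Convert a 1-based chapter:verse reference to a 0-based line number,
 * or return -1 if the reference does not exist in the toy index. */
long verse_to_line(int chapter, int verse)
{
    long first, count;

    if (chapter < 1 || chapter > NCHAPTERS)
        return -1;
    first = chapter_start[chapter - 1];
    count = chapter_start[chapter] - first;
    if (verse < 1 || verse > count)
        return -1;
    return first + (verse - 1);
}
```

Nothing in this mapping depends on how the text is stored, which is
the independence the paragraph above describes.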

Actually, there's also a third library of sorts.  "Compresslib"
contains a routine which may be called to uncompress a buffer of
LZW-compressed data.


THE COMPRESSION SCHEME

The text is compressed using a modified version of the
Lempel-Ziv-Welch "compress" program.  The modification is very simple,
and consists merely of forcing compress to emit checkpoints after a
fixed number of input bytes which I call a "window".  One can thus
easily determine which compressed "window" contains a particular byte
of the original text.  By keeping track of the locations of the
checkpoints in the compressed data, it is then possible to uncompress
only the windows that are needed.  By the way, the uncompression is
done by a subroutine within the library -- no exec's or temporary
files are used.
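
The window arithmetic is simple enough to sketch in C.  The names
window_pos, locate_byte and windows_needed are hypothetical; only the
idea of fixed-size windows of input bytes comes from the description
above.

```c
#include <assert.h>

/* Where one byte of the original text lives in the windowed file. */
struct window_pos {
    long window;   /* index of the compression window, from 0 */
    long offset;   /* byte offset inside that window once uncompressed */
};

/* Map a byte offset in the original text to its window.
 * window_size is the fixed number of input bytes per window. */
struct window_pos locate_byte(long text_offset, long window_size)
{
    struct window_pos pos;

    assert(text_offset >= 0 && window_size > 0);
    pos.window = text_offset / window_size;
    pos.offset = text_offset % window_size;
    return pos;
}

/* Number of windows that must be uncompressed to read the byte
 * range [start, end) of the original text. */
long windows_needed(long start, long end, long window_size)
{
    assert(start >= 0 && end > start && window_size > 0);
    return (end - 1) / window_size - start / window_size + 1;
}
```

A table of checkpoint positions, indexed by window number, would then
give the place in the compressed file at which to start uncompressing.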

Compression windows can be any size -- the size is stored in the data
file and the retrieval routines treat the file accordingly.  In the
default configuration, the windows are 64Kbytes, which was shown by
experiment to offer a reasonable compromise between efficient
compression and efficient buffer management.  If you want to
experiment, you can change the window size by editing the argument to
"squish" in the Makefile.


Some Personal Notes...

In 1979, as the owner of "Chapin Associates" in San Diego, I started a
project to create an affordable computer-based retrieval system for
Bible text.  Working in UCSD Pascal on a PDP-11/03 with 60Kbytes of
memory and two 500Kbyte RX02 floppy drives, with my associates Neil
Fraser and Jan Denser, we succeeded in prototyping a system that used
word-level Huffman-coding for the text of the New Testament.
Unfortunately, pressed between economics and the limitations of the
available hardware, I wound up abandoning the effort in 1980.

In early 1989 I gained access to one of the available freeware Bible
retrieval programs for the PC.  I immediately decided that the time
had come to "close the loop" on this particular personal dream, with
Unix as the target environment.  There really aren't any serious
technical challenges any more to producing an acceptable Bible
retrieval implementation for Unix systems.  So I snatched the Bible
text, spent a few weekends and evenings at my workstation, and here it
is.  The LZW compression scheme is much simpler than the word-level
Huffman coding, though the compression is not as good.  And it's great
being able to count memory and disk storage in MBytes instead of
KBytes.  Even so, I've really tried to keep the system's use of
resources to a minimum.  But I don't think 1.7+ megabytes of data is
too high a price to pay nowadays in most Unix environments.

So... I hope others find these tools useful.

Chip Chapin
Hewlett Packard Company
September 5, 1989

--------------------------------------------------------------------
Chip Chapin, Hewlett-Packard Company, California Language Lab
            (HP/CSO/STG/STD/CLO/CLL)
Internet:  chip@cup.hp.com             HPDesk: Chip Chapin/hp4700/um
uucp:      ... {allegra,decvax,ihnp4,ucbvax} !hplabs!hpclbis!chip
        or ... uunet!hp-sde!hpclbis!chip
USMail:    MS42U5; 11000 Wolfe Road; Cupertino, CA  95014-9804;  USA
Phone:     408/447-5735    Fax: 408/447-4924    HPTelnet: 1-447-5735
--------------------------------------------------------------------