Upper vs lower case - important note regarding `l' and `I'
----------------------------------------------------------
-We maintain separate dictionaries for upper and lower case. At the
+We maintain separate dictionaries for upper case (Upper), lower case
+(Lower), and (initial portions of) mid-phrase words (Word). At the
beginning of each cell in the table, we expect uppercase; in the
middle of a word we expect lowercase; and, unfortunately, after an
inter-word gap, we are not sure.
So any time we see a word starting with `l' or `I', the program has to
ask about it.
+After an inter-word gap, we first search for a Word entry in the
+dictionary. If there is a match we use it. Otherwise we search both
+the uppercase and lowercase dictionaries; if only one of them matches,
+or one matches a wider character than the other, we use that one.
+If that fails to resolve the ambiguity we must ask.
+
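The lookup order after an inter-word gap can be sketched as follows.
This is only an illustrative sketch, not the program's actual code: the
glyph encoding (one integer per pixel column), the names `longest_match`
and `resolve_after_gap`, and the use of plain Python dicts for the Word,
Upper and Lower dictionaries are all assumptions made for the example.

```python
from typing import Dict, Optional, Sequence, Tuple

# Assumed encoding: a glyph is a tuple of integers, one bitmask per pixel column.
Glyph = Tuple[int, ...]
Dictionary = Dict[Glyph, str]          # glyph -> recognised text
Match = Tuple[int, str]                # (width in columns, recognised text)


def longest_match(d: Dictionary, columns: Sequence[int]) -> Optional[Match]:
    """Return the widest glyph in d that is a prefix of columns, or None."""
    for width in range(len(columns), 0, -1):
        text = d.get(tuple(columns[:width]))
        if text is not None:
            return width, text
    return None


def resolve_after_gap(columns: Sequence[int], word: Dictionary,
                      upper: Dictionary, lower: Dictionary) -> Optional[Match]:
    """Pick a match after an inter-word gap; None means 'ask the user'."""
    # 1. A Word entry takes priority.
    hit = longest_match(word, columns)
    if hit is not None:
        return hit
    # 2. Otherwise consult both case dictionaries.
    up = longest_match(upper, columns)
    lo = longest_match(lower, columns)
    if up is None or lo is None:
        # Only one matched (or neither did, in which case this is None).
        return up if up is not None else lo
    if up[0] != lo[0]:
        # Both matched: prefer whichever matched the wider glyph.
        return up if up[0] > lo[0] else lo
    # 3. Same width in both dictionaries: still ambiguous, so we must ask.
    return None
```

With a `vertical stick' glyph present in both the Upper and Lower
dictionaries, the function returns None (ask), unless a Word entry such
as `Ir' covers the same columns, in which case that entry wins.
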
*Do not* make an entry in the character set dictionary mapping `vertical
stick' to `l' or `I'. Instead, select enough of the whole word in
question that no word would start with the other letter, and enter the
-whole word or part of it as a new glyph.
+whole word or part of it as a new Word glyph.
For example, in the supplied dictionary there is already a glyph for
`Iron'; this is OK because there are no words which start `lron'.
Instead, make a new glyph for the last letter of the previous word
plus the (unusually narrow) inter-word space, and end that entry with
-\x20 (yes, type \ x 20).
+a literal space ` '.
For example, you might find that `y<space>G' is treated as
`y<??lowercase>' and the G doesn't get matched. Select the `y<space>'
-region of the bitmap and type `y\x20' into the string box.
-Sorry for this rather poor UI!
+region of the bitmap and type `y ' into the string box.
Overlapping characters - ligatures
all if the dictionary contains errors, you shouldn't rely on this.
If you think you have made mistakes answering OCR queries (for
-example, the recognised data is wrong), you should download a fresh
-copy of charset-15.txt from
- http://www.chiark.greenend.org.uk/~ijackson/ypp-sc-tools/master/pctb/charset-15.txt
+example, the recognised data is wrong), you should delete the file
+#local-char*#.txt, which contains your local updates. The program will
+then use only the centrally provided (and vetted) master file, which
+by default is updated automatically when you run the PCTB client.
Enabling interactive character set update