X-Git-Url: http://www.chiark.greenend.org.uk/ucgi/~yarrgweb/git?p=ypp-sc-tools.main.git;a=blobdiff_plain;f=pctb%2FREADME.charset;h=770d3f3c29313a6b4d6c30f936e7fe7ecc6952e9;hp=25eb5d88feb9e6d8ad603533610219eb34c682b5;hb=124bfe502b5841d6d05f3f7e9fc825f918d3109d;hpb=a08a94640f4f0c02c536f337630dd234e917121f

diff --git a/pctb/README.charset b/pctb/README.charset
index 25eb5d8..770d3f3 100644
--- a/pctb/README.charset
+++ b/pctb/README.charset
@@ -140,17 +140,17 @@ errors.
 
 If you think you have made mistakes answering OCR queries (for
 example, the recognised data is wrong), you should delete the file
-#local-char*#.txt, which contains your local updates. It will then
+_local-char*.txt, which contains your local updates. It will then
 only use the centrally provided (and vetted) master file (which is
 automatically updated when you run the PCTB client, by default).
 
 It is also possible to have the OCR system reject particular strings.
-If you put a regexp in #local-reject#.txt, any OCR result which
+If you put a regexp in _local-reject.txt, any OCR result which
 matches this string will instead cause an OCR failure, invoking the
-OCR dictionary editor if appropriate. #master-reject#.txt is the
+OCR dictionary editor if appropriate. _master-reject.txt is the
 centrally maintained version of this file.
 
-Alternatively you can edit #local-char15#.txt with a text editor. The
+Alternatively you can edit _local-char*.txt with a text editor. The
 format is not documented at the moment.
 
 
@@ -161,7 +161,7 @@ Now that you have read this document, you should rerun your OCR job
 with the --edit-charset option. So run
 ./ypp-commodities --edit-charset
 In future, this option is not usually needed, because it is the
-default if there is a local character set dictionary #local-#.txt
+default if there is a local character set dictionary _local-.txt
 for the relevant character height.
 
 With --edit-charset, when the OCR finds characters it does not