While it is capturing the screenshots, do not move the mouse or use
the keyboard. Keyboard focus must stay in the YPP client window.
-*IMPORTANT*
-It may put up a window asking about characters it does not understand.
-It is important to get these inputs right or your client may
-misrecognise text in future. You *must* read the documentation in
-README.charset before answering these questions.
-
Command-line options
--------------------
--quiet Suppress progress messages
--screenshot-file F Store or read screenshots in F rather than #pages#.pnm
--window-id ID Specified X window is the YPP client - do not search
+ --edit-charset Enable character set editing. See README.charset.
Controlling what happens to the results:
--upload (default) Upload to the PCTB server
Map from commodity names to the numbers required by the PCTB
server. This is fetched and updated automatically as necessary.
It can safely be deleted as it will then be refetched.
-
* <file>.new
When any of these tools overwrite one of the persistent database
-Character set query tool, and semantics of the glyphs
------------------------------------------------------
+Handling OCR failures
+---------------------
-Sometimes the OCR will not be able to recognise some text and you will
-have to help it out. It will display the part it is having trouble
-with, showing where it has got to, and allow you to edit the character
-set database it uses for recognising the text.
+Sometimes the OCR will not be able to recognise some text. By
+default, when this happens, the program will stop with a fatal error
+and refer you to this document.
-*This is subtle* and it is important to understand the way the
-machinery works, and the possible mistakes you can make, before
-answering the program. *Please read this documentation*
-
-If you need help please ask me (ijackson@chiark.greenend.org.uk, or
-Aristarchus on Midnight in game if I'm on line, or ask any pirate of
-the crew Special Circumstances if they happen to know where I am
-and/or can get in touch).
+It is possible to fix this by editing the character set database used
+by the OCR algorithm. But it is important to get these inputs right
+or your client may misrecognise text in future. You *must* read the
+documentation here first.
Recognition algorithm
http://www.chiark.greenend.org.uk/~ijackson/ypp-sc-tools/master/pctb/charset-15.txt
+Enabling interactive character set update
+-----------------------------------------
+
+Now that you have read this document, you should rerun your OCR job
+with the --edit-charset option. You probably want to supply --same as
+well, to avoid having to wait for it to page through and recapture all
+the screenshots. So, this time,
+ ./ypp-commodities --edit-charset --same
+and in future, always run it with the --edit-charset option.
+
+With --edit-charset, when the OCR finds characters it does not
+understand, it will put up an OCR resolution query window. This will
+display the part of the text it is having trouble with, showing where
+it has got to, and allow you to edit the character set database it
+uses for recognising the text.
+
+*This is subtle* and it is important to understand the way the
+machinery works, and the possible mistakes you can make, before
+answering the program. *Please read this documentation*, which
+explains the meaning of the entries you make.
+
+If you need help please ask me (ijackson@chiark.greenend.org.uk, or
+Aristarchus on Midnight in game if I'm on line, or ask any pirate of
+the crew Special Circumstances if they happen to know where I am
+and/or can get in touch).
+
+
Send me your updates
--------------------
your charset file (ijackson@chiark.greenend.org.uk) so that I can
include your contributions in future versions. This will also let me
check that they seem right :-).
+
+In future I may have the program phone home automatically so that I
+can double-check your answers and distribute them in the next version.
const char *p;
char cb;
Pixcol pv;
+
+ if (!o_resolver)
+ fatal("OCR failed - unrecognised characters or ligatures.\n"
+ "Character set database needs to be updated or augmented.\n"
+ "See README.charset.\n");
if (!resolver) {
sysassert(! pipe(jobpipe) );
/* we know donepipe[1] is >= 4 and we have dealt with all the others
* so we aren't in any danger of overwriting some other fd 4: */
sysassert( dup2(donepipe[1],4) ==4 );
- execlp("./yppsc-ocr-resolver", "yppsc-ocr-resolver",
+ execlp(o_resolver, o_resolver,
DEBUGP(callout) ? "--debug" : "--noop-arg",
"--automatic-1",
(char*)0);