X-Git-Url: http://www.chiark.greenend.org.uk/ucgi/~yarrgweb/git?p=ypp-sc-tools.main.git;a=blobdiff_plain;f=pctb%2FREADME;h=a8dc8d7b8f59cbb58b7330b99cc117f812fd8d2f;hp=986338ad659dcd1c4669589149e9ec1ce1b0a771;hb=a16191c13cdc1eab1e43cc9662c2271481b3a9b8;hpb=cde017ed6b76840ce2ae1aa5fc740a6e06352f92

diff --git a/pctb/README b/pctb/README
index 986338a..a8dc8d7 100644
--- a/pctb/README
+++ b/pctb/README
@@ -4,14 +4,18 @@ Overview
 This tool can:
  - screenscrape the commodities trading screen
  - produce the results as a tab separated values file
- - **TODO** upload the results to PCTB
+ - upload the results to PCTB
 
 To run it, change to this directory, type `make', and then:
   ./ypp-commodities --tsv >commods.tsv
+or
+  ./ypp-commodities --upload
 
 While it is capturing the screenshots, do not move the mouse or use
 the keyboard. Keyboard focus must stay in the YPP client window.
 
+You will probably need to turn off `Use antialiased font' in the YPP
+client. This is in the Ye panel, Options, tab `General'.
 
 Command-line options
 --------------------
@@ -28,8 +32,13 @@ Options to vary the processing:
   --screenshot-file F  Store or read screenshots in F rather than #pages#.pnm
   --window-id ID       Specified X window is the YPP client - do not search
   --edit-charset       Enable character set editing. See README.dictionary.
+  --no-edit-charset    Do not edit charset even if #local-char*#.txt exists.
+  --find-island        Find and print the ocean and island. Suppresses OCR
+                        and output unless used with result processing option.
+  --test-servers       Set default servers to be the test servers, not
+                        the real live ones (doesn't affect explicit settings).
 
-Controlling what happens to the results:
+Controlling what happens to the results - only one at a time:
   --upload (default)   Upload to the PCTB server
   --tsv                Print data as clean tab-separated-values file
   --raw-tsv            Dump the raw (not deduped, unsorted) OCR'd data
@@ -44,9 +53,10 @@ Privacy options, which control conversations with the dictionary server:
  Please do not use options marked * with --upload. See README.privacy.
 
 Options to override which servers we talk to:
-  --pctb-url HOST|URL    Talk to the PCTB server at HOST or URL.
-  --dict-submit-url URL  Submit dictionary entries with HTTP POST under URL.
-  --dict-update-url URL  Fetch updated master dictionary with rsync from URL.
+  --pctb-server HOST|URL  Talk to the PCTB server at HOST or URL.
+  --dict-submit-url URL   Submit dictionary entries with HTTP POST under URL.
+  --dict-update-from SRC  Fetch updated master dictionary with rsync from SRC.
+Or set the environment variables YPPSC_PCTB{_PCTB, _DICT_UPDATE, _DICT_SUBMIT}
 
 
 Files we use and update
@@ -68,17 +78,42 @@ The program reads and writes the following files:
    it. Don't try `display vid:#pages#.pnm' as this will consume truly
    stupendous quantities of RAM - it wedged my laptop.
 
- * charset-15.txt
+ * #master-newcommods#.txt #local-newcommods#.txt
 
-   Character set dictionary. For the semantics of the contents of this
-   file see README.charset. There is not currently any accurate
+   Dictionary of newly introduced commodities. When a new commodity
+   appears in Puzzle Pirates, the PCTB server operators need to add it
+   to their database for us to be able to upload data about it.
+
+   It can sometimes take a few days to do this. In the meantime, it
+   is possible to upload partial data - data just omitting that
+   commodity. This is controlled by these files: they list
+   commodities which should be automatically ignored if the PCTB
+   server doesn't know about them. The master file is downloaded and
+   updated automatically from my server. You may create the local
+   file yourself. The format is simple: one commodity per line.
+
+   Unrecognised commodities can also be due to OCR failure so
+   double-check what you're doing before overriding the uploader by
+   telling it to ignore an unrecognised commodity.
+
+ * #master-char*#.txt #local-char*#.txt
+   #master-pixmap#.txt #local-pixmap#.txt
+
+   Character set and image dictionaries. For the semantics of the
+   char* files README.charset. There is not currently any accurate
    documentation of this dictionary format.
 
-   If you delete this file you'll have to re-enter a lot of glyph data
-   (and probably get it wrong and make the program misrecognise
-   things). If you want to undo any mistakes you may have made
-   answering OCR questions you can safely revert this to the version
-   I've supplied.
+   #master-*#.txt contain the centrally defined and approved data.
+   They are downloaded automatically from the SC PCTB server and
+   updated each run. You can safely delete this file, if everything
+   is online, if you want to fetch a fresh copy.
+
+   #local-*#.txt are a local copy of your submissions, so that they
+   will be used by your client pending approval by me. You can delete
+   this file if you think you may have made a mistake.
+
+   See README.privacy for details of the communications with the SC
+   server about the contents of these dictionaries.
 
  * #commodmap#.tsv
 
@@ -86,15 +121,21 @@ The program reads and writes the following files:
    server. This is fetched and updated automatically as necessary.
    It can safely be deleted as it will then be refetched.
 
- * .new
+ * #upload-1#.html #upload-2#.html
+
+   We screenscrape the pages from the PCTB upload server. The actual
+   HTML returned from the upload server is left in these dropping
+   files for debugging etc.
+
+ * .tmp
 
    When any of these tools overwrite one of the persistent dictionary
-   files, they temporarily write to .new.
+   files, they temporarily write to .tmp.
 
 These files are all in the current working directory. There is not
 yet any feature to have them be somewhere else. The helper programs
-  yppsc-ocr-resolver
-  yppsc-commod-processor
+  dictionary-manager
+  commod-results-processor
 must (currently) also be in the current directory. Future versions
 may have more helpers and more data files.
 
@@ -111,6 +152,7 @@ This program has quite a few dependencies:
  - pnm command line utilities for image manipulation    netpbm
  - X11 libraries, including dev files for building      libx11-dev
  - XTEST library, including dev files for building      libxtst-dev
+ - Perl-compatible regexp library, including dev files  libpcre3-dev
  - Tk interpreter /usr/bin/wish                         tk8.4
  - Perl module XML::Parser                              libxml-parser-perl
  - Perl module JSON::Parser                             libjson-perl
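
Reviewer's sketch (not part of the patch): the README text added above says the
#master-newcommods#.txt and #local-newcommods#.txt files hold one commodity per
line, and that listed commodities are ignored when the PCTB server does not know
them. The Python below is only an illustration of that rule under stated
assumptions. It is not the code in commod-results-processor; the function names,
the row layout (commodity name first) and the `known` set are all hypothetical.

```python
from pathlib import Path


def load_ignorable(paths):
    """Collect commodity names, one per line, from every file that exists.

    Hypothetical helper: the README only states the format (one commodity
    per line) and that the local file is optional.
    """
    names = set()
    for path in paths:
        p = Path(path)
        if not p.exists():
            continue  # e.g. no #local-newcommods#.txt has been created
        for line in p.read_text(encoding="utf-8").splitlines():
            name = line.strip()
            if name:  # skip blank lines
                names.add(name)
    return names


def split_for_upload(rows, known, ignorable):
    """Partition rows into (uploadable, skipped).

    rows      -- iterable of tuples whose first field is the commodity name
    known     -- names the server knows (assumed to come from elsewhere)
    ignorable -- names loaded from the newcommods files

    Anything neither known nor ignorable raises, since the README warns
    that an unrecognised commodity may simply be an OCR failure.
    """
    upload, skipped = [], []
    for row in rows:
        commodity = row[0]
        if commodity in known:
            upload.append(row)
        elif commodity in ignorable:
            skipped.append(row)
        else:
            raise ValueError(f"unrecognised commodity (OCR error?): {commodity!r}")
    return upload, skipped
```

A caller would pass both file names to load_ignorable, then hand the result to
split_for_upload together with whatever list of server-known commodities it has.
Raising on unknown names, rather than dropping them silently, mirrors the
README's advice to double-check before telling the uploader to ignore something.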