X-Git-Url: http://www.chiark.greenend.org.uk/ucgi/~yarrgweb/git?p=ypp-sc-tools.db-test.git;a=blobdiff_plain;f=pctb%2FREADME;h=4f56fd8ab0d0d9a53e773f804ed8c24386d00548;hp=0d83e3931428540292911fe1b4f6b4fa4807d125;hb=f2358d0ea7b40ba405621947513f48108ca93504;hpb=2337ae5465a29659b44037dcbdaf6fa03eb46d84

diff --git a/pctb/README b/pctb/README
index 0d83e39..4f56fd8 100644
--- a/pctb/README
+++ b/pctb/README
@@ -4,20 +4,18 @@ Overview
 This tool can:
  - screenscrape the commodities trading screen
  - produce the results as a tab separated values file
- - **TODO** upload the results to PCTB
+ - upload the results to PCTB
 
 To run it, change to this directory, type `make', and then:
   ./ypp-commodities --tsv >commods.tsv
+or
+  ./ypp-commodities --upload
 
 While it is capturing the screenshots, do not move the mouse or use
 the keyboard. Keyboard focus must stay in the YPP client window.
 
-*IMPORTANT*
-It may put up a window asking about characters it does not understand.
-It is important to get these inputs right or your client may
-misrecognise text in future. You *must* read the documentation in
-README.charset before answering these questions.
-
+You will probably need to turn off `Use antialiased font' in the YPP
+client. This is in the Ye panel, Options, tab `General'.
 
 Command-line options
 --------------------
@@ -33,14 +31,32 @@ Options to vary the processing:
   --quiet              Suppress progress messages
   --screenshot-file F  Store or read screenshots in F rather than #pages#.pnm
   --window-id ID       Specified X window is the YPP client - do not search
+  --edit-charset       Enable character set editing. See README.dictionary.
+  --find-island        Find and print the ocean and island. Suppresses OCR
+                       and output unless used with result processing option.
+  --test-servers       Set default servers to be the test servers, not
+                       the real live ones (doesn't affect explicit settings).
 
-Controlling what happens to the results:
+Controlling what happens to the results - only one at a time:
   --upload (default)   Upload to the PCTB server
   --tsv                Print data as clean tab-separated-values file
   --raw-tsv            Dump the raw (not deduped, unsorted) OCR'd data
   --best-prices        Print best buy and sell price for each commodity
   --arbitrage          Print arbitrage opportunities
 
+Privacy options, which control conversations with the dictionary server:
+  --dict-local-only *  Do not talk to the server even to fetch new dictionary.
+  --dict-read-only *   Only fetch new dictionary, do not submit new entries.
+  --dict-anon          Don't quote pirate name if submitting entries.
+  --dict-submit        Submit entries quoting my pirate name. (default)
+Please do not use options marked * with --upload. See README.privacy.
+
+Options to override which servers we talk to:
+  --pctb-server HOST|URL  Talk to the PCTB server at HOST or URL.
+  --dict-submit-url URL   Submit dictionary entries with HTTP POST under URL.
+  --dict-update-from SRC  Fetch updated master dictionary with rsync from SRC.
+Or set the environment variables YPPSC_PCTB{_PCTB, _DICT_UPDATE, _DICT_SUBMIT}
+
 
 Files we use and update
 -----------------------
@@ -61,17 +77,42 @@ The program reads and writes the following files:
    it. Don't try `display vid:#pages#.pnm' as this will consume truly
    stupendous quantities of RAM - it wedged my laptop.
 
- * charset-15.txt
+ * #master-newcommods#.txt #local-newcommods#.txt
+
+   Dictionary of newly introduced commodities. When a new commodity
+   appears in Puzzle Pirates, the PCTB server operators need to add it
+   to their database for us to be able to upload data about it.
+
+   It can sometimes take a few days to do this. In the meantime, it
+   is possible to upload partial data - data just omitting that
+   commodity. This is controlled by these files: they list
+   commodities which should be automatically ignored if the PCTB
+   server doesn't know about them. The master file is downloaded and
+   updated automatically from my server. You may create the local
+   file yourself. The format is simple: one commodity per line.
+
+   Unrecognised commodities can also be due to OCR failure so
+   double-check what you're doing before overriding the uploader by
+   telling it to ignore an unrecognised commodity.
 
-   Character set database. For the semantics of the contents of this
-   file see README.charset. There is not currently any accurate
-   documentation of this database format.
+ * #master-char*#.txt #local-char*#.txt
+   #master-pixmap#.txt #local-pixmap#.txt
 
-   If you delete this file you'll have to re-enter a lot of glyph data
-   (and probably get it wrong and make the program misrecognise
-   things). If you want to undo any mistakes you may have made
-   answering OCR questions you can safely revert this to the version
-   I've supplied.
+   Character set and image dictionaries. For the semantics of the
+   char* files README.charset. There is not currently any accurate
+   documentation of this dictionary format.
+
+   #master-*#.txt contain the centrally defined and approved data.
+   They are downloaded automatically from the SC PCTB server and
+   updated each run. You can safely delete this file, if everything
+   is online, if you want to fetch a fresh copy.
+
+   #local-*#.txt are a local copy of your submissions, so that they
+   will be used by your client pending approval by me. You can delete
+   this file if you think you may have made a mistake.
+
+   See README.privacy for details of the communications with the SC
+   server about the contents of these dictionaries.
 
  * #commodmap#.tsv
 
@@ -79,15 +120,21 @@ The program reads and writes the following files:
    server. This is fetched and updated automatically as necessary.
    It can safely be deleted as it will then be refetched.
 
+ * #upload-1#.html #upload-2#.html
+
+   We screenscrape the pages from the PCTB upload server. The actual
+   HTML returned from the upload server is left in these dropping
+   files for debugging etc.
+
  * .new
 
-   When any of these tools overwrite one of the persistent database
+   When any of these tools overwrite one of the persistent dictionary
    files, they temporarily write to .new.
 
 These files are all in the current working directory. There is not
 yet any feature to have them be somewhere else. The helper programs
-  yppsc-ocr-resolver
-  yppsc-commod-processor
+  dictionary-manager
+  commod-results-processor
 must (currently) also be in the current directory. Future versions
 may have more helpers and more data files.
 
@@ -104,6 +151,7 @@ This program has quite a few dependencies:
  - pnm command line utilities for image manipulation   netpbm
  - X11 libraries, including dev files for building     libx11-dev
  - XTEST library, including dev files for building     libxtst-dev
+ - Perl-compatible regexp library, including dev files libpcre3-dev
  - Tk interpreter /usr/bin/wish                        tk8.4
  - Perl module XML::Parser                             libxml-parser-perl
  - Perl module JSON::Parser                            libjson-perl
@@ -133,27 +181,24 @@ error messasge. I'll then be able to understand what's wrong,
 hopefully.
 
 
-Phoning home - privacy
-----------------------
+Privacy
+-------
 
 The main purpose of this program is to connect to the PCTB server and
-upload data. The program does not currently phone home at all in
-modes other than --upload, and when it does it connects to the
-PCTB server not to a system of mine.
+upload data. It will do that if you run it with --upload.
 
-However, there are some improvements which I may introduce in the
-future which may change this. I am considering:
+This program will also, by default, talk to the dictionary server I
+have set up: to download updated image dictionaries, and to upload new
+dictionary entries which you create with the PCTB client dictionary
+GUI. This feature is mentioned in and controllable in the GUI itself,
+so it won't happen without you knowing about it.
 
- * Having the ocr character resolver talk to a server run by me
-   to look for missing glpyhs, and/or upload those glyphs back
-   to that server so that they can be shared.
+The uploads will by default mention your ocean and pirate name; if you
+don't want that, pass the --dict-anon option, or untick the box in the
+GUI.
 
- * Having the upload client upload a copy of the data to a server run
-   by me, when run in --upload mode.
+See README.privacy for full details.
 
-If I do do this these new functions may be enabled by default, but it
-will be possible to turn them off, or direct them to different
-servers, with command-line options, and they will be documented here.
 
 
 - Ian Jackson
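
Reviewer's note on the newcommods files added in this diff: the README says they hold one commodity per line and list commodities to ignore when the PCTB server does not know them. The sketch below illustrates that rule in Python. It is not the tool's actual code: the function names, the row layout, and the choice to raise an error for a name that is neither known nor listed are assumptions made for illustration only.

```python
from pathlib import Path


def load_newcommods(paths):
    """Collect commodity names from files holding one commodity per line.

    Blank lines are skipped and missing files are ignored (assumption:
    the local file is optional, since the README says you "may" create it).
    """
    names = set()
    for path in map(Path, paths):
        if not path.exists():
            continue
        for line in path.read_text().splitlines():
            name = line.strip()
            if name:
                names.add(name)
    return names


def filter_for_upload(rows, known, ignorable):
    """Keep rows the server knows; drop rows listed as ignorable.

    rows      -- iterable of tuples whose first field is the commodity name
                 (hypothetical layout)
    known     -- commodity names the PCTB server accepts
    ignorable -- names loaded from the newcommods files
    Anything else may be an OCR failure, so it is reported, not dropped.
    """
    kept = []
    for row in rows:
        commodity = row[0]
        if commodity in known:
            kept.append(row)
        elif commodity in ignorable:
            continue  # new commodity the server has not added yet
        else:
            raise ValueError(f"unrecognised commodity: {commodity!r}")
    return kept
```

A caller might load both files with `load_newcommods(["#master-newcommods#.txt", "#local-newcommods#.txt"])` and pass the result as `ignorable`. Failing loudly on unlisted names follows the README's warning that an unrecognised commodity can also be an OCR failure.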