X-Git-Url: http://www.chiark.greenend.org.uk/ucgi/~yarrgweb/git?p=ypp-sc-tools.db-live.git;a=blobdiff_plain;f=pctb%2FREADME;h=49f2c28e006375d8ddd7d6c86867c3e458af5af0;hp=5f1cea2e7dfe6e6cd67f50e34ff62aaa600faf13;hb=bf23ebca674a0a7a53b1a9bfc7121a38fb8216b9;hpb=e9825058f1499f656335b3215cc3c79bf0ef4715

diff --git a/pctb/README b/pctb/README
index 5f1cea2..49f2c28 100644
--- a/pctb/README
+++ b/pctb/README
@@ -1,40 +1,204 @@
+Overview
+--------
+
 This tool can:
   - screenscrape the commodities trading screen
   - produce the results as a tab separated values file
-  - **TODO** upload the results to PCTB
+  - upload the results to PCTB
 
 To run it, change to this directory, type `make', and then:
   ./ypp-commodities --tsv >commods.tsv
 
-It may put up a window asking about characters it does not understand.
-It is important to get these inputs right or it may misrecognise
-things in future. **TODO** write actual useful instructuions to cover the
-subtleties. The results are stored in the file `charset-15.txt'.
-
-If you need to report a bug, please be sure to remember the exact
-error message and circumstances. Also, for recognition problems there
-will probably be a very useful screenshot file called `#pages#.pnm'.
-This is likely to be very large so don't just email it to me, but if
-you can put it up on a webpage for me to download that will help.
-
-Options available:
-
- Setting the operation mode:
-   --find-window-only      Just check that we can find the YPP client window.
-   --screenshot-only       Page through and take screenshots, do not OCR
-   --analyse-only | --same Process previously taken screenshots
-   --everything (default)  Take screenshots and process them
-
- Options to vary the processing:
-   --single-page           One screenful, no paging - results will be incomplete
-   --quiet                 Suppress progress messages
-   --screenshot-file F     Store or read screenshots in F rather than #pages#.pnm
-   --window-id ID          Specified X window is the YPP client - do not search
-
- Setting the output processing:
-   --raw-tsv               Dump the raw not deduped unsorted OCR'd data
-   --upload (default)      Upload to the PCTB server
-   --tsv                   Print data as clean tab-separated-values file
-   --best-prices           Print best buy and sell price for each commodity
-   --arbitrage             Print arbitrage opportunityes
+While it is capturing the screenshots, do not move the mouse or use
+the keyboard. Keyboard focus must stay in the YPP client window.
+
+You will probably need to turn off `Use antialiased font' in the YPP
+client. This is in the Ye panel, Options, tab `General'.
+
+Command-line options
+--------------------
+
+Setting the operation mode:
+  --find-window-only      Just check that we can find the YPP client window.
+  --screenshot-only       Page through and take screenshots, do not OCR
+  --analyse-only | --same Process previously taken screenshots
+  --everything (default)  Take screenshots and process them
+
+Options to vary the processing:
+  --single-page           One screenful, no paging - results will be incomplete
+  --quiet                 Suppress progress messages
+  --screenshot-file F     Store or read screenshots in F rather than #pages#.pnm
+  --window-id ID          Specified X window is the YPP client - do not search
+  --edit-charset          Enable character set editing. See README.dictionary.
+  --find-island           Find and print the ocean and island. Suppresses OCR
+                          and output unless used with result processing option.
+  --test-servers          Set default servers to be the test servers, not
+                          the real live ones (doesn't affect explicit settings).
+
+Controlling what happens to the results - only one at a time:
+  --upload (default)      Upload to the PCTB server
+  --tsv                   Print data as clean tab-separated-values file
+  --raw-tsv               Dump the raw (not deduped, unsorted) OCR'd data
+  --best-prices           Print best buy and sell price for each commodity
+  --arbitrage             Print arbitrage opportunities
+
+Privacy options, which control conversations with the dictionary server:
+  --dict-local-only *     Do not talk to the server even to fetch new dictionary.
+  --dict-read-only *      Only fetch new dictionary, do not submit new entries.
+  --dict-anon             Don't quote pirate name if submitting entries.
+  --dict-submit           Submit entries quoting my pirate name. (default)
+Please do not use options marked * with --upload. See README.privacy.
+
+Options to override which servers we talk to:
+  --pctb-server HOST|URL  Talk to the PCTB server at HOST or URL.
+  --dict-submit-url URL   Submit dictionary entries with HTTP POST under URL.
+  --dict-update-from SRC  Fetch updated master dictionary with rsync from SRC.
+Or set the environment variables YPPSC_PCTB{_PCTB, _DICT_UPDATE, _DICT_SUBMIT}
+
+
+Files we use and update
+-----------------------
+
+The program reads and writes the following files:
+
+ * #pages#.pnm
+
+   Contains one or more images (as raw ppms, end-to-end) which are the
+   screenshots taken in the last run. This is (over)written whenever
+   we take screenshots from the YPP client. You can reprocess an
+   existing set of screenshots with the --same (aka --analyse-only)
+   option; in that case we just read the screenshots file.
+
+   You can specify a different file with --screenshot-file.
+
+   If you want to display the contents of this file, `display' can do
+   it. Don't try `display vid:#pages#.pnm' as this will consume
+   truly stupendous quantities of RAM - it wedged my laptop.
+
+ * #master-newcommods#.txt #local-newcommods#.txt
+
+   Dictionary of newly introduced commodities. When a new commodity
+   appears in Puzzle Pirates, the PCTB server operators need to add it
+   to their database for us to be able to upload data about it.
+
+   It can sometimes take a few days to do this. In the meantime, it
+   is possible to upload partial data - data just omitting that
+   commodity. This is controlled by these files: they list
+   commodities which should be automatically ignored if the PCTB
+   server doesn't know about them. The master file is downloaded and
+   updated automatically from my server. You may create the local
+   file yourself. The format is simple: one commodity per line.
+
+   Unrecognised commodities can also be due to OCR failure so
+   double-check what you're doing before overriding the uploader by
+   telling it to ignore an unrecognised commodity.
+
+ * #master-char*#.txt #local-char*#.txt
+   #master-pixmap#.txt #local-pixmap#.txt
+
+   Character set and image dictionaries. For the semantics of the
+   char* files see README.charset. There is not currently any accurate
+   documentation of this dictionary format.
+
+   #master-*#.txt contain the centrally defined and approved data.
+   They are downloaded automatically from the SC PCTB server and
+   updated each run. You can safely delete this file, if everything
+   is online, if you want to fetch a fresh copy.
+
+   #local-*#.txt are a local copy of your submissions, so that they
+   will be used by your client pending approval by me. You can delete
+   this file if you think you may have made a mistake.
+
+   See README.privacy for details of the communications with the SC
+   server about the contents of these dictionaries.
+
+ * #commodmap#.tsv
+
+   Map from commodity names to the numbers required by the PCTB
+   server. This is fetched and updated automatically as necessary.
+   It can safely be deleted as it will then be refetched.
+
+ * #upload-1#.html #upload-2#.html
+
+   We screenscrape the pages from the PCTB upload server. The actual
+   HTML returned from the upload server is left in these dropping
+   files for debugging etc.
+
+ * .new
+
+   When any of these tools overwrite one of the persistent dictionary
+   files, they temporarily write to .new.
+
+These files are all in the current working directory. There is not
+yet any feature to have them be somewhere else. The helper programs
+  dictionary-manager
+  commod-results-processor
+must (currently) also be in the current directory.
+
+Future versions may have more helpers and more data files.
+
+
+Installation requirements
+-------------------------
+
+This program has quite a few dependencies:
+                                                        Package (Debian etch)
+
+ - For building, C compiler and build environment       build-essential
+ - pnm library, including dev files for building        libnetpbm10-dev
+ - pnm command line utilities for image manipulation    netpbm
+ - X11 libraries, including dev files for building      libx11-dev
+ - XTEST library, including dev files for building      libxtst-dev
+ - Perl-compatible regexp library, including dev files  libpcre3-dev
+ - Tk interpreter /usr/bin/wish                         tk8.4
+ - Perl module XML::Parser                              libxml-parser-perl
+ - Perl module JSON::Parser                             libjson-perl
+ - XTEST extension in the X server                      (part of X package)
+ - Perl interpreter and basic modules                   perl (usu.installed)
+
+On other Linux distros the packages may have different names, but
+these should be roughly right for Debian and its derivatives.
+
+
+Reporting problems
+------------------
+
+If you need to report a bug, for example an inability to recognise,
+please be sure to remember the exact error message and circumstances.
+Also, for recognition problems there will probably be a very useful
+screenshot file called `#pages#.pnm'. This is likely to be very large
+so don't just email it to me, but if you can put it up on a webpage
+for me to download that will help. At least keep a copy of it.
+
+If the problem is a failure to cope with some particular YPP client
+display and is reproducible, try running:
+  ./ypp-commodities --raw-tsv --single-page
+If this reproduces the problem, please email me the screenshot file
+#pages#.pnm, which will consist only of the single screen, plus the
+error message. I'll then be able to understand what's wrong,
+hopefully.
+
+
+Privacy
+-------
+
+The main purpose of this program is to connect to the PCTB server and
+upload data. It will do that if you run it with --upload.
+
+This program will also, by default, talk to the dictionary server I
+have set up: to download updated image dictionaries, and to upload new
+dictionary entries which you create with the PCTB client dictionary
+GUI. This feature is mentioned in and controllable in the GUI itself,
+so it won't happen without you knowing about it.
+
+The uploads will by default mention your ocean and pirate name; if you
+don't want that, pass the --dict-anon option, or untick the box in the
+GUI.
+
+See README.privacy for full details.
+
+
+ - Ian Jackson
+   ijackson@chiark.greenend.org.uk
+   Aristarchus on the Midnight ocean