 - screenscrape the commodities trading screen
 - produce the results as a tab separated values file
 - upload the results to PCTB

To run it, change to this directory, type `make', and then:
  ./ypp-commodities --tsv >commods.tsv
or
  ./ypp-commodities --upload

While it is capturing the screenshots, do not move the mouse or use
the keyboard. Keyboard focus must stay in the YPP client window.

You will probably need to turn off `Use antialiased font' in the YPP
client. This is in the Ye panel, Options, tab `General'.

Setting the operation mode:
  --find-window-only       Just check that we can find the YPP client window.
  --screenshot-only        Page through and take screenshots, do not OCR
  --analyse-only | --same  Process previously taken screenshots
  --everything (default)   Take screenshots and process them

Options to vary the processing:
  --single-page            One screenful, no paging - results will be incomplete
  --quiet                  Suppress progress messages
  --screenshot-file F      Store or read screenshots in F rather than #pages#.pnm
  --window-id ID           Specified X window is the YPP client - do not search
  --edit-charset           Enable character set editing. See README.dictionary.
  --no-edit-charset        Do not edit charset even if #local-char*#.txt exists.
  --find-island            Find and print the ocean and island. Suppresses OCR
                           and output unless used with result processing option.
  --test-servers           Set default servers to be the test servers, not
                           the real live ones (doesn't affect explicit settings).

Controlling what happens to the results - only one at a time:
  --upload (default)       Upload to the PCTB server
  --tsv                    Print data as clean tab-separated-values file
  --raw-tsv                Dump the raw (not deduped, unsorted) OCR'd data
  --best-prices            Print best buy and sell price for each commodity
  --arbitrage              Print arbitrage opportunities

Privacy options, which control conversations with the dictionary server:
  --dict-local-only *      Do not talk to the server even to fetch new dictionary.
  --dict-read-only *       Only fetch new dictionary, do not submit new entries.
  --dict-anon              Don't quote pirate name if submitting entries.
  --dict-submit            Submit entries quoting my pirate name. (default)
 Please do not use options marked * with --upload. See README.privacy.

Options to override which servers we talk to:
  --pctb-server HOST|URL   Talk to the PCTB server at HOST or URL.
  --dict-submit-url URL    Submit dictionary entries with HTTP POST under URL.
  --dict-update-from SRC   Fetch updated master dictionary with rsync from SRC.
 Or set the environment variables YPPSC_PCTB{_PCTB, _DICT_UPDATE, _DICT_SUBMIT}
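
The braces in the last line are shell-style shorthand. Assuming they
expand in the usual shell way, the three variable names can be spelled
out as in this sketch. yppsc_env_names is a made-up helper for
illustration only; it is not part of these tools.

```shell
# Hypothetical helper: print the environment variable names which the
# shorthand YPPSC_PCTB{_PCTB, _DICT_UPDATE, _DICT_SUBMIT} appears to
# stand for.  The expansion is an assumption based on shell brace syntax.
yppsc_env_names () {
    for suffix in _PCTB _DICT_UPDATE _DICT_SUBMIT; do
        printf 'YPPSC_PCTB%s\n' "$suffix"
    done
}
```

Going by the names, each variable presumably stands in for the
correspondingly named option (--pctb-server, --dict-update-from,
--dict-submit-url), but check before relying on that.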


Files we use and update
-----------------------

The program reads and writes the following files:

 * #pages#.pnm

   Contains one or more images (as raw ppms, end-to-end) which are the
   screenshots taken in the last run. This is (over)written whenever
   we take screenshots from the YPP client. You can reprocess an
   existing set of screenshots with the --same (aka --analyse-only)
   option; in that case we just read the screenshots file.

   You can specify a different file with --screenshot-file.
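
   Because --same only reads the screenshots file, one capture can
   feed several reports. Here is a minimal sketch, assuming the mode
   and result options combine the way the usage summary suggests.
   YPP, SHOTS, capture and reprocess are names invented for this
   example, and market.pnm is just an example filename.

```shell
# Program to run and screenshots file to use.  Both variable names are
# invented for this sketch; override them in the environment if needed.
YPP=${YPP:-./ypp-commodities}
SHOTS=${SHOTS:-market.pnm}

# Take the screenshots once.  Hands off mouse and keyboard while it runs.
capture () {
    "$YPP" --screenshot-only --screenshot-file "$SHOTS"
}

# Reprocess the saved screenshots without touching the YPP client.
# Extra arguments choose the output, for example --tsv or --best-prices.
reprocess () {
    "$YPP" --same --screenshot-file "$SHOTS" "$@"
}
```

   For example, run `capture' once, then `reprocess --tsv >commods.tsv'
   and `reprocess --best-prices'. With no result option the program's
   own default (--upload) applies.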

   If you want to display the contents of this file, `display' can do
   it. Don't try `display vid:#pages#.pnm' as this will consume
   truly stupendous quantities of RAM - it wedged my laptop.

 * #master-newcommods#.txt  #local-newcommods#.txt

   Dictionary of newly introduced commodities. When a new commodity
   appears in Puzzle Pirates, the PCTB server operators need to add it
   to their database for us to be able to upload data about it.

   It can sometimes take a few days to do this. In the meantime, it
   is possible to upload partial data - data just omitting that
   commodity. This is controlled by these files: they list
   commodities which should be automatically ignored if the PCTB
   server doesn't know about them. The master file is downloaded and
   updated automatically from my server. You may create the local
   file yourself. The format is simple: one commodity per line.

   Unrecognised commodities can also be due to OCR failure so
   double-check what you're doing before overriding the uploader by
   telling it to ignore an unrecognised commodity.
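
   Since the local file is just one commodity per line, adding an
   entry can be sketched as below. add_newcommod is a made-up helper,
   not part of these tools, and it assumes that a plain appended line
   is all the format needs. Check the name against the game first, for
   the OCR reason given above.

```shell
# Hypothetical helper: add_newcommod NAME [FILE]
# Append NAME to the local new-commodities file unless that exact line
# is already present.  FILE defaults to #local-newcommods#.txt in the
# current directory.
add_newcommod () {
    name=$1
    file=${2:-'#local-newcommods#.txt'}
    # -x matches whole lines only, -F treats NAME as a fixed string.
    if [ -f "$file" ] && grep -qxF -e "$name" "$file"; then
        return 0
    fi
    printf '%s\n' "$name" >>"$file"
}
```

   For example, `add_newcommod "Some new commodity"' adds one line;
   running it again changes nothing.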

 * #master-reject#.txt  #local-reject#.txt

   Dictionary of regexps: when the OCR output appears to match one of
   them, we reject it instead. At the moment this is used to stop us
   thinking that `Butterfly weed' is `Butterflyweed'. This happens if
   the character set dictionary is missing the lowercase `y ' glyph.

 * #master-char*#.txt  #local-char*#.txt
   #master-pixmap#.txt  #local-pixmap#.txt

   Character set and image dictionaries. For the semantics of the
   char* files see README.charset. There is not currently any accurate
   documentation of this dictionary format.

   #master-*#.txt contain the centrally defined and approved data.
   They are downloaded automatically from the SC PCTB server and
   updated each run. If everything is online, you can safely delete
   these files to fetch fresh copies.

   #local-*#.txt are a local copy of your submissions, so that they
   will be used by your client pending approval by me. You can delete
   these files if you think you may have made a mistake.

   See README.privacy for details of the communications with the SC
   server about the contents of these dictionaries.

   Map from commodity names to the numbers required by the PCTB
   server. This is fetched and updated automatically as necessary.
   It can safely be deleted as it will then be refetched.

 * #upload-1#.html  #upload-2#.html

   We screenscrape the pages from the PCTB upload server. The actual
   HTML returned from the upload server is left in these dropping
   files for debugging etc.


When any of these tools overwrites one of the persistent dictionary
files, it temporarily writes to <file>.tmp.
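
This README does not say how <file>.tmp becomes <file>. A common
pattern, and only an assumption about what these tools do, is to write
the whole new contents to the .tmp file and then rename it into place,
so that a reader never sees a half-written dictionary. A sketch, with
replace_file as a made-up name:

```shell
# Hypothetical helper: replace_file FILE
# Write standard input to FILE.tmp, then rename it over FILE.  The
# rename only happens if the write succeeded, so a failed write leaves
# the old FILE untouched.
replace_file () {
    target=$1
    cat >"$target.tmp" && mv -f -- "$target.tmp" "$target"
}
```

For example, `sort -u dict.txt | replace_file dict.txt' is safe even
though it reads and replaces the same file, because the old file stays
in place until the new one is complete. (dict.txt is just an example
name.)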

These files are all in the current working directory. There is not
yet any feature to have them be somewhere else. The helper programs
    commod-results-processor
must (currently) also be in the current directory.

Future versions may have more helpers and more data files.


Installation requirements
-------------------------

This program has quite a few dependencies:
                                                          Package (Debian etch)

 - For building, C compiler and build environment         build-essential
 - pnm library, including dev files for building          libnetpbm10-dev
 - pnm command line utilities for image manipulation      netpbm
 - X11 libraries, including dev files for building        libx11-dev
 - XTEST library, including dev files for building        libxtst-dev
 - Perl-compatible regexp library, including dev files    libpcre3-dev
 - Tk interpreter /usr/bin/wish                           tk8.4
 - Perl module XML::Parser                                libxml-parser-perl
 - Perl module JSON::Parser                               libjson-perl
 - XTEST extension in the X server                        (part of X package)
 - Perl interpreter and basic modules                     perl (usually installed)

On other Linux distros the packages may have different names, but
these should be roughly right for Debian and its derivatives.


If you need to report a bug, for example an inability to recognise
something, please be sure to note the exact error message and
circumstances. Also, for recognition problems there will probably be a
very useful screenshot file called `#pages#.pnm'. This is likely to be
very large so don't just email it to me, but if you can put it up on a
webpage for me to download that will help. At least keep a copy of it.

If the problem is a failure to cope with some particular YPP client
display and is reproducible, try running:
  ./ypp-commodities --raw-tsv --single-page
If this reproduces the problem, please email me the screenshot file
#pages#.pnm, which will consist only of the single screen, plus the
error message. I'll then be able to understand what's wrong,


The main purpose of this program is to connect to the PCTB server and
upload data. It will do that if you run it with --upload.

This program will also, by default, talk to the dictionary server I
have set up: to download updated image dictionaries, and to upload new
dictionary entries which you create with the PCTB client dictionary
GUI. This feature is mentioned in and controllable in the GUI itself,
so it won't happen without you knowing about it.

The uploads will by default mention your ocean and pirate name; if you
don't want that, pass the --dict-anon option, or untick the box in the
GUI.

See README.privacy for full details.


ijackson@chiark.greenend.org.uk
Aristarchus on the Midnight ocean