X-Git-Url: http://www.chiark.greenend.org.uk/ucgi/~yarrgweb/git?a=blobdiff_plain;f=pctb%2FREADME.files;fp=pctb%2FREADME.files;h=76229136c61e8cc2850a69529533d14211f628a4;hb=fc71bd8c21ff11ded6888512ec30a8506d7ed1ec;hp=0000000000000000000000000000000000000000;hpb=c7280e7f26a6fd3685fe96d94c1d76ec77812e57;p=ypp-sc-tools.db-test.git

diff --git a/pctb/README.files b/pctb/README.files
new file mode 100644
index 0000000..7622913
--- /dev/null
+++ b/pctb/README.files
@@ -0,0 +1,82 @@
+Files we use and update
+-----------------------
+
+The program reads and writes the following files:
+
+ * #pages#.pnm
+
+   Contains one or more images (as raw ppms, end-to-end) which are the
+   screenshots taken in the last run. This is (over)written whenever
+   we take screenshots from the YPP client. You can reprocess an
+   existing set of screenshots with the --same (aka --analyse-only)
+   option; in that case we just read the screenshots file.
+
+   You can specify a different file with --screenshot-file.
+
+   If you want to display the contents of this file, `display' can do
+   it. Don't try `display vid:#pages#.pnm' as this will consume
+   truly stupendous quantities of RAM - it wedged my laptop.
+
+ * #master-newcommods#.txt #local-newcommods#.txt
+
+   Dictionary of newly introduced commodities. When a new commodity
+   appears in Puzzle Pirates, the PCTB server operators need to add it
+   to their database for us to be able to upload data about it.
+
+   It can sometimes take a few days to do this. In the meantime, it
+   is possible to upload partial data - data just omitting that
+   commodity. This is controlled by these files: they list
+   commodities which should be automatically ignored if the PCTB
+   server doesn't know about them. The master file is downloaded and
+   updated automatically from my server. You may create the local
+   file yourself. The format is simple: one commodity per line.
+
+   Unrecognised commodities can also be due to OCR failure so
+   double-check what you're doing before overriding the uploader by
+   telling it to ignore an unrecognised commodity.
+
+ * #master-reject#.txt #local-reject#.txt
+
+   Dictionary of regexps which, when the OCR appears to match, we
+   reject instead. At the moment this is used to stop us thinking
+   that `Butterfly weed' is `Butterflyweed'. This happens if the
+   character set dictionary is missing the lowercase `y ' glyph.
+   See README.charset.
+
+ * #master-char*#.txt #local-char*#.txt
+   #master-pixmap#.txt #local-pixmap#.txt
+
+   Character set and image dictionaries. For the semantics of the
+   char* files see README.charset. There is not currently any accurate
+   documentation of this dictionary format.
+
+   #master-*#.txt contain the centrally defined and approved data.
+   They are downloaded automatically from the SC PCTB server and
+   updated each run. Provided everything is online, you can safely
+   delete these files if you want to fetch a fresh copy.
+
+   #local-*#.txt are a local copy of your submissions, so that they
+   will be used by your client pending approval by me. You can delete
+   these files if you think you may have made a mistake.
+
+   See README.privacy for details of the communications with the SC
+   server about the contents of these dictionaries.
+
+ * #commodmap#.tsv
+
+   Map from commodity names to the numbers required by the PCTB
+   server. This is fetched and updated automatically as necessary.
+   It can safely be deleted as it will then be refetched.
+
+ * #upload-1#.html #upload-2#.html
+
+   We screenscrape the pages from the PCTB upload server. The actual
+   HTML returned from the upload server is left in these dropping
+   files for debugging etc.
+
+ * .tmp
+
+   When any of these tools overwrite one of the persistent dictionary
+   files, they temporarily write to .tmp.
+
+Future versions may have more helpers and more data files.
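
The #pages#.pnm layout described above (raw ppms stored end-to-end) can be walked with a small reader. What follows is only a sketch in Python, which is not necessarily the language the tools themselves use. The function names are made up for illustration, and it assumes every image is a binary `P6' PPM with the next image starting immediately after the previous raster.

```python
import io


def _read_header_token(stream):
    """Read one whitespace-delimited header token, skipping '#' comments."""
    token = b""
    while True:
        ch = stream.read(1)
        if not ch:
            raise EOFError("truncated PPM header")
        if ch == b"#":
            # A comment runs to the end of the line; treat it as whitespace.
            while ch not in (b"\n", b""):
                ch = stream.read(1)
            if token:
                return token
            continue
        if ch.isspace():
            if token:
                return token
            continue
        token += ch


def read_ppm_sequence(stream):
    """Yield (width, height, maxval, raster) for each raw PPM in the stream.

    Assumption: images are binary P6 PPMs concatenated with nothing between
    them, as in a multi-image Netpbm file.
    """
    while True:
        magic = stream.read(2)
        if not magic:
            return  # clean end of file between images
        if magic != b"P6":
            raise ValueError("expected raw PPM magic P6, got %r" % magic)
        width = int(_read_header_token(stream))
        height = int(_read_header_token(stream))
        # Reading the maxval token also consumes the single whitespace
        # byte that separates the header from the raster.
        maxval = int(_read_header_token(stream))
        bytes_per_sample = 1 if maxval < 256 else 2
        size = width * height * 3 * bytes_per_sample
        raster = stream.read(size)
        if len(raster) != size:
            raise EOFError("truncated PPM raster")
        yield width, height, maxval, raster
```

To count the screenshots in a file, open it in binary mode and pass the file object to read_ppm_sequence(); each yielded tuple is one screenshot.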
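
The newcommods files described above are plain lists, one commodity per line, so combining the master and local copies is straightforward. The sketch below (Python, for illustration only) assumes that blank lines are ignored, that the two files are simply merged, and that a missing file counts as empty; none of this is confirmed by the text above. The function names and the commodity names in the usage note are invented.

```python
from pathlib import Path


def load_commodity_list(path):
    """Return the set of commodity names in a one-per-line file.

    Assumption: blank lines are skipped and a missing file is empty.
    """
    p = Path(path)
    if not p.exists():
        return set()
    lines = p.read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}


def ignorable_commodities(master_path, local_path):
    """Merge the master and local lists (assumed to be a plain union)."""
    return load_commodity_list(master_path) | load_commodity_list(local_path)


def partition_for_upload(names, known_to_server, ignorable):
    """Split names into (uploadable, skipped, unrecognised).

    A name the server does not know is skipped only if it is listed as
    ignorable; anything else is reported, since it may be an OCR failure.
    """
    uploadable, skipped, unrecognised = [], [], []
    for name in names:
        if name in known_to_server:
            uploadable.append(name)
        elif name in ignorable:
            skipped.append(name)
        else:
            unrecognised.append(name)
    return uploadable, skipped, unrecognised
```

For example, with `Example ore' listed in the local file and unknown to the server, partition_for_upload() puts it in the skipped list, while a name found in neither place ends up in the unrecognised list for you to double-check.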
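
The temporary-file step described above for the persistent dictionary files is commonly paired with a rename, so that a reader never sees a half-written file. The text above does not say that the tools rename the temporary file into place, nor exactly how the temporary file is named, so both are assumptions in this Python sketch: it writes to the target name with `.tmp' appended and then replaces the target. The function name is hypothetical.

```python
import os


def replace_via_tmp(path, text):
    """Write text to path by way of a temporary file.

    Assumptions (not stated in the README): the temporary file is named
    path + ".tmp" and is renamed over the target once fully written.
    """
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write(text)
        f.flush()
        os.fsync(f.fileno())  # make sure the data is on disk before renaming
    os.replace(tmp_path, path)  # overwrites any existing target file
```

If the write fails partway, the old dictionary file is left untouched and only the temporary file is incomplete.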