From cc332b59b99acabeae968d0b10dc12159bdf86b1 Mon Sep 17 00:00:00 2001
From: Ian Jackson
Date: Mon, 24 Jan 2011 17:49:55 +0000
Subject: [PATCH] notes for compressed archive format

---
 yarrg/notes | 164 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 164 insertions(+)
 create mode 100644 yarrg/notes

diff --git a/yarrg/notes b/yarrg/notes
new file mode 100644
index 0000000..31596dc
--- /dev/null
+++ b/yarrg/notes
@@ -0,0 +1,164 @@
+m.{153,155} Admiral, 3580s
+
+using only cols 2,3 (buy)
+
+changes in price,qty by island, commod, stall
+    295 rows out of 4281
+changes in total qty by island, commod, price
+    540 rows out of 2106
+
+stall "Couchablanca" adds lots of offers (100ish), lots else
+
+----------------------------------------
+
+m.{151,153} Admiral, 38s
+
+using only cols 2,3 (buy)
+
+changes in price,qty by island, commod, stall
+    2 rows out of 4179
+
+two stocks changed at one stall
+
+
+m.{149,151} Admiral, 84s
+
+no changes
+
+----------------------------------------
+
+m.{020,145} Turtle, 271944s
+
+by stall
+    391 / 1056      53 stall changes, 128 commod changes, if sorted
+by price
+    334 / 797
+
+----------------------------------------
+
+m.{117,145} Turtle, 47797s
+
+by stall
+    130 / 1023
+by price
+    135 / 757
+
+
+========================================
+
+very few price changes, mostly qty changes
+
+
+====================
+
+files
+    ARCHIVE.%{ocean}.lock.par               never removed
+    ARCHIVE.%{ocean}.ocean.par              updated by rename
+    ARCHIVE.%{ocean}s.%{isle}s.main.par     updated by rename
+    ARCHIVE.%{ocean}s.%{isle}s.log.par      appended, length in main
+    ARCHIVE.%{ocean}s.%{isle}s.z%d.par      create/write, length in main
+  ocean file is always updated first so lockfree readers should open
+  main, then ocean
+
+integers are in LE byte order
+vuint is one or more bytes with 7 bits each, BE first; top bit is "more bytes"
+
+format for an ocean file
+    magic                           Frame   uint8*4         59a72671
+
+    for each commodity:
+      starting with commodity 0x0001 as zero is reserved for sentinels
+        commodity name                      uint16          name length
+                                            uint8*length    name bytes
+
+format for a main file
+    magic                           Frame   uint8*4         59a72672
+    number of z files               Frame   uint32
+    length of log file in bytes     Frame   uint32
+
+    single uncompressed diff
+        representing the change from nothing to
+        the most recent uploaded data
+
+format for a log or z file
+
+    magic                           Frame   uint8*4         59a72673
+
+    series of diffs most recent last
+
+    for a log file, there may be some trailing garbage
+    not referred to in main file (see "length of log file")
+
+    for a z file, the file length is definitive and the last
+    entry is always valid if the file is referred to in the main
+    file, but any z file not mentioned in the main file is
+    garbage
+
+format for a diff
+    Each diff records a change "going backwards", ie you apply the
+    diff to a more recent state to get an earlier state; the metadata
+    corresponds to the earlier state; diffs at the physical end, ie
+    logical start, of the file, contain information without previous
+    context (ie start from empty, no offers)
+
+    diff format version             Plain   uint8           constant 0x01
+
+    timestamp delta
+      (amount by which this timestamp
+       is later than the previous upload in this file,
+       or later than 0 if there is no previous upload in this file)
+                                    Plain   vuint           time_t
+
+    for each payload stream, ie:
+    for Meta, Stall, Commod, Price, Qty:
+      in an uncompressed diff:
+        uncompressed data           Plain   some number of bytes
+          the uncompressed data is in the order shown below
+      in a compressed diff:
+        compressed length           Plain   vuint
+        compressed data             Plain   that number of bytes
+          the compressed data for each stream forms a continuous
+          compression stream within each file, starting with
+          the last diff in the file and then running backwards
+
+    diff length (reverse pointer)   Frame   uint32
+        includes length of exactly the
+        data sections marked "Plain"
+
+format inside the payload streams
+    uncompressed streams, literally in this order
+    compressed streams: read each substream in order, but ordering
+      between substreams within a diff is semantic but not physical
+
+    metadata excluding
+     ocean, island, timestamp       Meta    uint16          metadata length
+                                    Meta    uint8*length    metadata
+
+    for each stall
+
+      stall name                    Stall   uint16          name length
+                                    Stall   uint8*length    name
+
+      for each commod which has changed price
+      including ones which have been added
+
+        commodid                    Commod  uint16          commodity id
+
+        buy price delta             Price   uint16
+        sell price delta            Price   uint16
+          in case of added offers, previous price
+          is taken to be best price from previous upload
+          at this island, or 0 if previously no offers
+
+      sentinel commodid             Commod  uint16          constant 0x0000
+
+      for each commod which has changed qty
+      including ones which have been added or removed
+
+        commodid                    Commod  uint16          commodity id
+        buy qty delta               Qty     uint16
+        sell qty delta              Qty     uint16
+
+      sentinel commodid             Commod  uint16          constant 0x0000
+
+    sentinel stall name             Stall   uint16          constant 0x0000
-- 
2.30.2
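
The vuint rule in the notes above ("one or more bytes with 7 bits each, BE first; top bit is "more bytes"") can be illustrated with a small Python sketch. This is not code from the yarrg tree: the names encode_vuint and decode_vuint are hypothetical, and the sketch assumes that "BE first" means the most significant 7-bit group is written first and that the top bit is set on every byte except the last.

```python
from typing import Tuple


def encode_vuint(value: int) -> bytes:
    """Encode a non-negative integer as a vuint (assumed layout, see lead-in)."""
    if value < 0:
        raise ValueError("vuint cannot represent negative numbers")
    # Collect 7-bit groups, least significant first.
    groups = [value & 0x7F]
    value >>= 7
    while value:
        groups.append(value & 0x7F)
        value >>= 7
    # Emit most significant group first ("BE first").
    groups.reverse()
    # Top bit set means "more bytes follow"; the final byte has it clear.
    return bytes([g | 0x80 for g in groups[:-1]] + [groups[-1]])


def decode_vuint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one vuint from data at offset; return (value, next_offset)."""
    value = 0
    while True:
        if offset >= len(data):
            raise ValueError("truncated vuint")
        byte = data[offset]
        offset += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value, offset
```

Under these assumptions a value below 128 takes one byte, and a timestamp delta such as 3580 takes two. Returning the next offset lets a reader decode the "compressed length" field and then slice exactly that many bytes of compressed data.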