
documentation: parametrize the search binary data type sizes.

Needed in order to support more than 65k symbols or files larger than 16
MB. What I thought was "more than enough" during the initial design was
quickly exceeded by various projects, including my own Magnum Python
bindings.

To avoid having to either maintain two separate formats and two separate
en/decoders or needlessly inflate the format for everyone, certain data
types are parametrized based on how large the data is:

* RESULT_ID_BYTES describes how many bytes are needed to store result
   IDs. By default it's 2 (so 65536 results), but it can also be 3 (16M
   results) or 4.
* FILE_OFFSET_BYTES describes how many bytes are needed to store file
   offsets. By default it's 3 (so 16 MB), but it can also be 4.
* NAME_SIZE_BYTES describes how many bytes are needed to store various
   name lengths (prefix, suffix lengths etc.). By default it's 1 (so 256
   bytes at most), but it can also be 2.
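
As a rough illustration of the three parameters above, the following
sketch picks the smallest allowed width for each one. The helper names
(bytes_needed, pick_type_sizes) and their arguments are hypothetical and
not taken from _search.py; the only thing assumed from the list above is
which widths are allowed for which parameter.

```python
def bytes_needed(largest_value, allowed_sizes):
    """Return the smallest allowed byte width that can hold largest_value."""
    for size in allowed_sizes:
        if largest_value < 1 << (8 * size):
            return size
    raise ValueError(f"{largest_value} does not fit into {allowed_sizes[-1]} bytes")


def pick_type_sizes(largest_result_id, largest_file_offset, largest_name_length):
    """Pick per-file type sizes (hypothetical helper, not the real encoder)."""
    return {
        'RESULT_ID_BYTES': bytes_needed(largest_result_id, (2, 3, 4)),
        'FILE_OFFSET_BYTES': bytes_needed(largest_file_offset, (3, 4)),
        'NAME_SIZE_BYTES': bytes_needed(largest_name_length, (1, 2)),
    }
```

With this sketch a small data set keeps the compact defaults and only
data that actually overflows a field pays for the wider type.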

At first I tried to preserve 32-bit alignment as much as possible, but
eventually realized this is completely unimportant in the browser
environment -- there are other, much worse performance pitfalls than
reading an unaligned value. This is also why there are 24-bit integer
types, even though they're quite annoying to pack from Python.
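
To show why 24-bit values are awkward in Python: the struct module has
no format code for a 3-byte integer, so one option is to go through
int.to_bytes() and int.from_bytes() instead. This is only a sketch; the
helper names are made up for illustration and the little-endian byte
order is an assumption, not something stated in this commit.

```python
def pack_uint(value, size):
    """Pack an unsigned integer into `size` bytes (little-endian assumed).

    Raises OverflowError if the value does not fit.
    """
    return value.to_bytes(size, byteorder='little')


def unpack_uint(data, offset, size):
    """Read an unsigned `size`-byte integer at `offset`; alignment is irrelevant."""
    return int.from_bytes(data[offset:offset + size], byteorder='little')
```

The same pair of helpers covers 1-, 2-, 3- and 4-byte fields, which is
one way to avoid hardcoding a width anywhere in an encoder.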

Furthermore, the original hack to reserve 11 bits for result count at
the cost of having only 4 bits for child count was changed to instead
expand the result count to a 15-bit value if there are more than 127
results. Some endianness tricks are involved, but it's much cleaner than
before. I briefly considered having a global RESULT_COUNT_BYTES
parameter as well, but considering >90% of result counts fit into 8 bits
and this is only for weird outliers like Python __init__(), it would be
a giant waste of precious bytes.
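
One possible way to get such a variable-width count is sketched below.
It is an assumption about the general technique, not the actual layout
used by the search data: counts up to 127 take a single byte, larger
ones set the top bit of the first byte and spill the remaining bits
into a second byte (big-endian here so the marker bit is read first).
The function names are hypothetical.

```python
def encode_result_count(count):
    """Encode a count as 1 byte (<= 127) or 2 bytes (15-bit value, top bit set)."""
    if not 0 <= count <= 0x7fff:
        raise ValueError("result count does not fit into 15 bits")
    if count <= 127:
        return bytes([count])
    # The top bit of the first byte marks the two-byte form
    return (count | 0x8000).to_bytes(2, byteorder='big')


def decode_result_count(data, offset=0):
    """Return (count, bytes consumed) for a count stored at `offset`."""
    first = data[offset]
    if first & 0x80:
        return ((first & 0x7f) << 8) | data[offset + 1], 2
    return first, 1
```

Under this scheme the common case costs one byte and only the outliers
pay for the second one, which matches the reasoning above.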

The minor differences in the test file sizes are due to:

* The header expanding symbol count from 16 to 32 bits (+2B)
* The header containing type description and associated padding (+4B)
* The result map no longer packing flags and offsets together, thus
   saving one byte from flags (-1B)

To ensure there are no hardcoded type size assumptions anymore, the
tests now go through all type size combinations.
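
A minimal sketch of enumerating those combinations, assuming the
allowed widths listed earlier and the ns/ri/fo suffix scheme visible in
the test data file names of this commit. The constant and function
names are illustrative only and not taken from the test suite.

```python
import itertools

# Allowed widths as described in this commit message
NAME_SIZE_BYTES = (1, 2)
RESULT_ID_BYTES = (2, 3, 4)
FILE_OFFSET_BYTES = (3, 4)


def type_size_combinations():
    """Yield (name_size, result_id, file_offset, suffix) for every combination."""
    for ns, ri, fo in itertools.product(
            NAME_SIZE_BYTES, RESULT_ID_BYTES, FILE_OFFSET_BYTES):
        yield ns, ri, fo, f'ns{ns}-ri{ri}-fo{fo}'


suffixes = [suffix for _, _, _, suffix in type_size_combinations()]
```

That gives 2 x 3 x 2 = 12 variants, consistent with the twelve files
per test case (for example searchdata-ns1-ri2-fo3.bin) added below.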

author	Vladimír Vondruš <mosra@centrum.cz>
	Sat, 8 Jan 2022 19:49:26 +0000 (20:49 +0100)
committer	Vladimír Vondruš <mosra@centrum.cz>
	Sun, 9 Jan 2022 15:51:50 +0000 (16:51 +0100)
commit	b0cf44e4ddbf42ce79a8612563e84e00e8a75808
tree	8db744056778b82299ccf9b5e03324685e9b8d6a
parent	c661a654d3d3837dfae92e48bfccde19ceea6dad

documentation/_search.py
documentation/doxygen.py
documentation/python.py
documentation/search.js
documentation/test/_search_test_metadata.py
documentation/test/js-test-data/empty-ns1-ri2-fo3.bin	[new file with mode: 0644]
documentation/test/js-test-data/empty-ns1-ri2-fo4.bin	[new file with mode: 0644]
documentation/test/js-test-data/empty-ns1-ri3-fo3.bin	[new file with mode: 0644]
documentation/test/js-test-data/empty-ns1-ri3-fo4.bin	[new file with mode: 0644]
documentation/test/js-test-data/empty-ns1-ri4-fo3.bin	[new file with mode: 0644]
documentation/test/js-test-data/empty-ns1-ri4-fo4.bin	[new file with mode: 0644]
documentation/test/js-test-data/empty-ns2-ri2-fo3.bin	[new file with mode: 0644]
documentation/test/js-test-data/empty-ns2-ri2-fo4.bin	[new file with mode: 0644]
documentation/test/js-test-data/empty-ns2-ri3-fo3.bin	[new file with mode: 0644]
documentation/test/js-test-data/empty-ns2-ri3-fo4.bin	[new file with mode: 0644]
documentation/test/js-test-data/empty-ns2-ri4-fo3.bin	[new file with mode: 0644]
documentation/test/js-test-data/empty-ns2-ri4-fo4.bin	[new file with mode: 0644]
documentation/test/js-test-data/empty.bin	[deleted file]
documentation/test/js-test-data/manyresults-ns1-ri2-fo3.bin	[new file with mode: 0644]
documentation/test/js-test-data/manyresults-ns1-ri2-fo4.bin	[new file with mode: 0644]
documentation/test/js-test-data/manyresults-ns1-ri3-fo3.bin	[new file with mode: 0644]
documentation/test/js-test-data/manyresults-ns1-ri3-fo4.bin	[new file with mode: 0644]
documentation/test/js-test-data/manyresults-ns1-ri4-fo3.bin	[new file with mode: 0644]
documentation/test/js-test-data/manyresults-ns1-ri4-fo4.bin	[new file with mode: 0644]
documentation/test/js-test-data/manyresults-ns2-ri2-fo3.bin	[new file with mode: 0644]
documentation/test/js-test-data/manyresults-ns2-ri2-fo4.bin	[new file with mode: 0644]
documentation/test/js-test-data/manyresults-ns2-ri3-fo3.bin	[new file with mode: 0644]
documentation/test/js-test-data/manyresults-ns2-ri3-fo4.bin	[new file with mode: 0644]
documentation/test/js-test-data/manyresults-ns2-ri4-fo3.bin	[new file with mode: 0644]
documentation/test/js-test-data/manyresults-ns2-ri4-fo4.bin	[new file with mode: 0644]
documentation/test/js-test-data/manyresults.bin	[deleted file]
documentation/test/js-test-data/nested.bin
documentation/test/js-test-data/searchdata-ns1-ri2-fo3.b85	[new file with mode: 0644]
documentation/test/js-test-data/searchdata-ns1-ri2-fo3.bin	[new file with mode: 0644]
documentation/test/js-test-data/searchdata-ns1-ri2-fo4.bin	[new file with mode: 0644]
documentation/test/js-test-data/searchdata-ns1-ri3-fo3.bin	[new file with mode: 0644]
documentation/test/js-test-data/searchdata-ns1-ri3-fo4.bin	[new file with mode: 0644]
documentation/test/js-test-data/searchdata-ns1-ri4-fo3.bin	[new file with mode: 0644]
documentation/test/js-test-data/searchdata-ns1-ri4-fo4.bin	[new file with mode: 0644]
documentation/test/js-test-data/searchdata-ns2-ri2-fo3.bin	[new file with mode: 0644]
documentation/test/js-test-data/searchdata-ns2-ri2-fo4.bin	[new file with mode: 0644]
documentation/test/js-test-data/searchdata-ns2-ri3-fo3.bin	[new file with mode: 0644]
documentation/test/js-test-data/searchdata-ns2-ri3-fo4.bin	[new file with mode: 0644]
documentation/test/js-test-data/searchdata-ns2-ri4-fo3.bin	[new file with mode: 0644]
documentation/test/js-test-data/searchdata-ns2-ri4-fo4.bin	[new file with mode: 0644]
documentation/test/js-test-data/searchdata.b85	[deleted file]
documentation/test/js-test-data/searchdata.bin	[deleted file]
documentation/test/js-test-data/short.bin
documentation/test/js-test-data/unicode.bin
documentation/test/js-test-data/wrong-magic.bin
documentation/test/js-test-data/wrong-result-id-bytes.bin	[new file with mode: 0644]
documentation/test/js-test-data/wrong-version.bin
documentation/test/populate-js-test-data.py
documentation/test/test-search.js
documentation/test/test_search.py
documentation/test_doxygen/layout/pages.html
documentation/test_doxygen/layout_generated_doxyfile/index.html
documentation/test_doxygen/layout_minimal/index.html
documentation/test_doxygen/layout_search_binary/index.html
documentation/test_doxygen/layout_search_opensearch/index.html
documentation/test_doxygen/test_search.py
documentation/test_doxygen/test_undocumented.py
documentation/test_doxygen/undocumented/File_8h.html
documentation/test_doxygen/undocumented/annotated.html
documentation/test_doxygen/undocumented/classClass.html
documentation/test_doxygen/undocumented/dir_4b0d5f8864bf89936129251a2d32609b.html
documentation/test_doxygen/undocumented/files.html
documentation/test_doxygen/undocumented/group__group.html
documentation/test_doxygen/undocumented/namespaceNamespace.html
documentation/test_doxygen/undocumented/structNamespace_1_1ClassInANamespace.html
documentation/test_python/layout/index.html
documentation/test_python/layout_search_binary/index.html
documentation/test_python/layout_search_open_search/index.html
documentation/test_python/link_formatting/c.link_formatting.Class.Sub.html
documentation/test_python/link_formatting/c.link_formatting.Class.html
documentation/test_python/link_formatting/c.link_formatting.pybind.Foo.html
documentation/test_python/link_formatting/m.link_formatting.html
documentation/test_python/link_formatting/m.link_formatting.pybind.html
documentation/test_python/link_formatting/m.link_formatting.sub.html
documentation/test_python/link_formatting/p.page.html
documentation/test_python/link_formatting/s.classes.html
documentation/test_python/link_formatting/s.modules.html
documentation/test_python/link_formatting/s.pages.html
documentation/test_python/test_search.py