T E C H N I C A L  M E M O R A N D U M


Subject:        Acorn Library Format / Object Library Format

Reference:      PLG-ALF

Issue:          0.02/proto-1.00

Author:         Lee Smith, 2nd February 1989

Distribution:   Not restricted.


-----------------------------------------------------------------------------
Programming Languages Group, Acorn Computers Limited,
Fulbourn Road, Cherry Hinton, Cambridge, CB1 4JN, England.
-----------------------------------------------------------------------------


Copyright Acorn Computers Limited 1989


Neither the whole nor any part of the information contained in this technical
memorandum may be adapted or reproduced in any material form except with the
prior written approval of Acorn Computers Limited (Acorn).

The information contained in this technical memorandum relates to ongoing
developments. Whilst it is given in good faith by Acorn, it is acknowledged
that there may be errors or omissions.


                H I S T O R Y


    26-Oct-88   First written.
    02-Feb-89   Minor editing and revision;
                added description of 'old-style' libraries;
                added description of chunk files (from PLG-AOF);
                merged in description of Object Library Format.


Introduction
============

A library file contains a number of separate but related pieces of data.
In order to simplify access to these data, and to provide for a degree of
extensibility, the library file format is itself layered on another format
called "Chunk File Format", which provides a simple and efficient means of
accessing and updating distinct chunks of data within a single file. The
library file format defines four chunks: "Directory", "Time-stamp", "Version"
and "Data". There may be many "Data" chunks in a library.

The minimum size of a piece of data in both formats is four bytes, or one
word. Each word is stored in a file in "little-endian" format; that is, the
least significant byte of the word is stored first.
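
As a minimal sketch of the byte order described above, the following Python
fragment reads and writes one word. The helper names "read_word" and
"write_word" are hypothetical, and treating a word as an unsigned 32-bit
quantity is an assumption made for illustration.

```python
import struct


def read_word(buf: bytes, offset: int = 0) -> int:
    """Return the 32-bit little-endian word found at a byte offset."""
    # "<I" means little-endian, unsigned, 4 bytes: least significant byte first.
    return struct.unpack_from("<I", buf, offset)[0]


def write_word(value: int) -> bytes:
    """Encode a word as 4 bytes, least significant byte first."""
    return struct.pack("<I", value)


# The byte sequence C5 C6 CB C3 in a file is the word C3CBC6C5 hex.
word = read_word(bytes([0xC5, 0xC6, 0xCB, 0xC3]))
```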


Chunk File Format
=================

A chunk is accessed via a header at the start of the file. The header contains
the number, size, location and identity of each chunk in the file. The size of
the header may vary between different chunk files but is fixed for each file.
Not all entries in a header need be used, thus limited expansion of the number
of chunks is permitted without a wholesale copy. A chunk file can be copied
without knowledge of the contents of the individual chunks.

Graphically, the layout of a chunk file is as follows:-

             ----------------------------
             |       ChunkFileId        |
             ----------------------------
             |       maxChunks          |
             ----------------------------
             |       numChunks          |  3 words
             ============================
             |        entry1            |  4 words per entry
             |                          |
             ----------------------------
             |        entry2            |
             |                          |
             ----------------------------
                       ...
             ----------------------------
             |     entry "maxChunks"    |  End of Header
             |                          |  (3 + 4*maxChunks words)
             ============================
             |        chunk  1          |  Start of Data Chunks
             |                          |
             ----------------------------
                       ...
             ----------------------------
             |    chunk  "numChunks"    |
             |                          |
             ----------------------------

ChunkFileId marks the file as a chunk file. Its value is C3CBC6C5 hex.
The "maxChunks" field defines the number of entries in the header, fixed
when the file is created. The "numChunks" field defines how many chunks are
currently used in the file, which can vary from 0 to "maxChunks". The value
of "numChunks" is redundant as it can be found by scanning the entries.
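
The fixed part of the header could be checked along the following lines.
This is a sketch only: the function name "read_fixed_header" and the choice
of exceptions are assumptions, not part of the format.

```python
import struct

CHUNK_FILE_ID = 0xC3CBC6C5  # value of ChunkFileId given in the text


def read_fixed_header(buf: bytes):
    """Return (maxChunks, numChunks, header size in bytes) for a chunk file."""
    if len(buf) < 12:
        raise ValueError("file too short to hold the 3-word header")
    file_id, max_chunks, num_chunks = struct.unpack_from("<3I", buf, 0)
    if file_id != CHUNK_FILE_ID:
        raise ValueError("not a chunk file")
    if num_chunks > max_chunks:
        raise ValueError("numChunks exceeds maxChunks")
    # The whole header occupies 3 + 4*maxChunks words of 4 bytes each.
    header_bytes = 4 * (3 + 4 * max_chunks)
    return max_chunks, num_chunks, header_bytes
```

For a file created with room for 8 chunks, the header occupies
4 * (3 + 4*8) = 140 bytes, regardless of how many chunks are in use.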


Each entry in the header comprises four words in the following order:

chunkId     a two word field identifying what data the chunk contains;

fileOffset  a one word field defining the byte offset within the file of
            the chunk (which must be divisible by four); an entry of zero
            indicates that the corresponding chunk is unused;

size        a one word field defining the exact byte size of the chunk
            (which need not be a multiple of four).

The "chunkId" field provides a conventional way of identifying what type of
data a chunk contains. It is split into two parts. The first four characters
(in the first word) contain a universally unique name allocated by a central
authority (Acorn). The remaining four characters (in the second word) can
be used to identify component chunks within this universal domain. In each
part, the first character of the name is stored first in the file, and so on.

For library files, the first part of each chunk's name is "LIB_"; the second
parts are defined in the section entitled "Library File Format".
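
The header entries described above might be decoded as in the sketch below.
The function name "read_entries" is hypothetical, and returning None for an
unused entry is simply a convenient convention for the example. Because the
first character of each part of the chunkId is stored first, the eight bytes
can be read directly as a character string.

```python
import struct


def read_entries(buf: bytes, max_chunks: int):
    """Decode the header entries that follow the 3-word fixed header."""
    entries = []
    for i in range(max_chunks):
        # Each entry is 4 words: 8 bytes of chunkId, then fileOffset, then size.
        chunk_id, file_offset, size = struct.unpack_from("<8sII", buf, 12 + 16 * i)
        if file_offset == 0:
            entries.append(None)  # a zero offset marks an unused entry
            continue
        if file_offset % 4 != 0:
            raise ValueError("chunk offset is not divisible by four")
        entries.append((chunk_id.decode("latin-1"), file_offset, size))
    return entries
```

An entry whose chunkId reads "LIB_DIRY" then belongs to the "LIB_" domain,
with "DIRY" identifying the component chunk.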


Library File Format Types
=========================

There are two library file formats described here, termed "new-style" and
"old-style". The linker and the library management tools can all read both
formats, though none of these tools will generate an "old-style" library.

Currently, only the Fortran-77 compiler generates "old-style" libraries
(which it does instead of generating AOF object files). The linker handles
these libraries specially, including every member in the output image unless
explicitly instructed otherwise.

Old-style libraries are obsolescent and should no longer be generated.


Library File Format
===================

Each piece of a library file is stored in a separate, identifiable, chunk,
named as follows:

    Chunk          |  Chunk Name
    ---------------+------------
    Directory      |  LIB_DIRY
    Time-stamp     |  LIB_TIME
    Version        |  LIB_VSRN          -- new-style libraries only
    Data           |  LIB_DATA
    ---------------+------------
    Symbol table   |  OFL_SYMT          -- object code libraries only
    Time-stamp     |  OFL_TIME          -- object code libraries only

There may be many LIB_DATA chunks in a library, one for each library member.


LIB_DIRY
--------

The LIB_DIRY chunk contains a directory of all modules in the library, each
of which is stored in a LIB_DATA chunk. The directory size is fixed when the
library is created. The directory consists of a sequence of variable-length
entries, each an integral number of words long. The number of directory
entries is determined by the size of the LIB_DIRY chunk.
Pictorially:

                    +-> +--------------------+
                    |   |    ChunkIndex      |
                    |   +--------------------+
                    +------- EntryLength     |
                    |   +--------------------+
                 an |   |    DataLength ---------+
           integral |   +--------------------+ <-+
             number |   |                    |   | in an old-style library,
                 of |   .    Data            .   | may be an odd number of
              words |   .                    .   | bytes
                    |   .              /-----+ <-+
                    |   +-------------/      .
                    |   |    Padding         |
                    +-> +--------------------+


ChunkIndex
----------
The ChunkIndex is a zero-origin index within the chunk file header of the
corresponding LIB_DATA chunk. The LIB_DATA chunk entry gives the offset and
size of the library module in the library file. A ChunkIndex of 0 means the
directory entry is not in use.

EntryLength
-----------
The number of bytes in this LIB_DIRY entry, always a multiple of 4.

DataLength
----------
The number of bytes used in the Data section of this LIB_DIRY entry. This
need not be a multiple of 4, though it always is in new-style libraries.

Data
----
The data section consists of a 0 terminated string followed by any other
information relevant to the library module. Strings should contain only
ISO-8859 non-control characters (i.e. codes [0-31], 127 and 128+[0-31] are
excluded). The string is the name used by the library management tools to
identify this library module. Typically this is the name of the file from
which the library member was created.

In new-style libraries, an 8-byte, word-aligned time-stamp follows the
member name. The format of this time-stamp is described in the sub-section
entitled "LIB_TIME". Its value is (an encoded version of) the time-stamp
(i.e. the last modified time) of the file from which the library member
was created.

Applications which create libraries or library members should ensure that
the LIB_DIRY entries they create contain valid time-stamps. Applications
which read LIB_DIRY entries should not rely on any data being present beyond
the end of the name string unless the difference between the DataLength field
and the name-string length allows for it. Even then, the contents of a
time-stamp should be treated cautiously and not assumed to be sensible.
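
A cautious reader of LIB_DIRY entries might look like the sketch below. It
assumes, from the diagram, that EntryLength counts the whole entry including
its three leading words; the function name "parse_directory" and the tuple
layout of the result are hypothetical. The time-stamp is returned only when
DataLength leaves room for it, and it is left as raw bytes rather than
interpreted.

```python
import struct


def parse_directory(chunk: bytes):
    """Return (ChunkIndex, name, raw time-stamp or None) for each used entry."""
    entries = []
    pos = 0
    while pos + 12 <= len(chunk):
        chunk_index, entry_len, data_len = struct.unpack_from("<3I", chunk, pos)
        if (entry_len < 12 or entry_len % 4 != 0
                or pos + entry_len > len(chunk) or 12 + data_len > entry_len):
            raise ValueError("malformed directory entry")
        data = chunk[pos + 12:pos + 12 + data_len]
        if chunk_index != 0:  # a ChunkIndex of 0 means the entry is not in use
            nul = data.find(b"\0")
            name_end = nul if nul >= 0 else len(data)
            name = data[:name_end].decode("latin-1")
            # The time-stamp, if any, starts at the next word boundary
            # after the string-terminating NUL.
            ts_start = (name_end + 1 + 3) & ~3
            stamp = data[ts_start:ts_start + 8] if ts_start + 8 <= data_len else None
            entries.append((chunk_index, name, stamp))
        pos += entry_len
    return entries
```

An entry with a 3-byte Data section, as may occur in an old-style library,
yields a name and no time-stamp.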

Applications which write LIB_DIRY or OFL_SYMT entries should ensure that
padding is done with NUL (0) bytes; applications which read LIB_DIRY or
OFL_SYMT entries should make no assumptions about the values of padding
bytes beyond the first, string-terminating NUL byte.
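
On the writing side, an entry could be assembled as follows. This is a sketch
under stated assumptions: the function name "build_entry" is hypothetical,
EntryLength is taken to include the three leading words, and DataLength is
taken to include the NUL padding so that it is always a multiple of 4, as in
new-style libraries. Passing no time-stamp produces the string-only form used
by OFL_SYMT entries.

```python
import struct


def build_entry(chunk_index: int, name: str, time_stamp: bytes = None) -> bytes:
    """Build one directory entry, padding with NUL (0) bytes only."""
    raw = name.encode("latin-1")
    # Reject the control codes excluded by the format: 0-31, 127, 128-159.
    if any(c < 32 or c == 127 or 128 <= c < 160 for c in raw):
        raise ValueError("control character in member name")
    if time_stamp is not None and len(time_stamp) != 8:
        raise ValueError("time-stamp must be exactly 8 bytes")
    data = raw + b"\0"                # string-terminating NUL
    data += b"\0" * (-len(data) % 4)  # NUL padding up to a word boundary
    if time_stamp is not None:
        data += time_stamp            # 8-byte, word-aligned time-stamp
    return struct.pack("<3I", chunk_index, 12 + len(data), len(data)) + data
```

With this layout a name is always followed by between 1 and 4 NUL bytes,
counting the terminator.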


LIB_TIME
--------

The LIB_TIME chunk contains a 64-bit time-stamp recording when the library
was last modified, in the following format:

  High-address byte      low-address byte       
         |                    |
        +----------------+-----+
        |    TimeStamp   |     |
        +----------------+-----+
                .           .
               /|\         /|\
                |           +--- 2-byte microsecond count, usually 0
                |
                +--------------- 6 bytes of centi-seconds since
                                                   1/1/1900 00:00 GMT
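
One possible decoding of the time-stamp is sketched below. It reads the
diagram as a single 64-bit little-endian quantity, with the microsecond count
in the two low-address bytes and the centi-second count in the six
high-address bytes. That reading is an interpretation of the diagram, and
real tools may order the two words differently. The function name
"decode_time_stamp" is hypothetical.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)  # 1/1/1900 00:00 GMT


def decode_time_stamp(raw: bytes):
    """Return (UTC datetime, microsecond count) for an 8-byte time-stamp."""
    (value,) = struct.unpack("<Q", raw)  # assumed: one little-endian 64-bit value
    microseconds = value & 0xFFFF        # two low-address bytes
    centiseconds = value >> 16           # six high-address bytes
    when = EPOCH + timedelta(seconds=centiseconds // 100,
                             milliseconds=(centiseconds % 100) * 10)
    return when, microseconds
```

Since the contents of a time-stamp need not be sensible, a caller should be
prepared for an OverflowError when the count lies beyond the range that
"datetime" can represent.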


LIB_VSRN
--------

In new-style libraries, this chunk contains a 4-byte version number.
The current version number is 1. Old-style libraries do not contain
this chunk.
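
Since only new-style libraries carry this chunk, its presence can serve to
tell the two formats apart. The sketch below assumes the chunks have already
been collected into a dictionary keyed by chunk name; that dictionary and the
function name "library_style" are hypothetical, and rejecting versions other
than 1 is a design choice made for the example.

```python
import struct


def library_style(chunks: dict) -> str:
    """Classify a library as "new-style" or "old-style" from its chunks."""
    raw = chunks.get("LIB_VSRN")
    if raw is None:
        return "old-style"  # old-style libraries do not contain this chunk
    if len(raw) < 4:
        raise ValueError("LIB_VSRN chunk is shorter than 4 bytes")
    (version,) = struct.unpack_from("<I", raw, 0)
    if version != 1:
        raise ValueError("unsupported library version %d" % version)
    return "new-style"
```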


LIB_DATA
--------

A LIB_DATA chunk contains one of the library members indexed by the LIB_DIRY
chunk. No interpretation is placed on the contents of a member by the library
management tools. A member could itself be a file in chunk file format or even
another library.


Object Code Libraries
=====================

An object code library is a library file whose members are files in Acorn
Object Format (AOF - see document PLG-AOF for a description of this format).

Additional information is stored in two extra chunks, OFL_SYMT and OFL_TIME.

OFL_SYMT contains an entry for each external symbol defined by members of the
library, together with the index of the chunk containing the member defining
that symbol.

The OFL_SYMT chunk has exactly the same format as the LIB_DIRY chunk except
that the Data section of each entry contains only a string, the name of an
external symbol (and between 1 and 4 bytes of NUL padding). OFL_SYMT entries
do not contain time-stamps.
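
A linker-like tool might turn the OFL_SYMT chunk into a lookup table from
symbol name to chunk index, as in the following sketch. The function name
"parse_symbol_table" is hypothetical, and EntryLength is again assumed to
include the three leading words of each entry. Bytes after the first NUL are
ignored, since no assumption may be made about padding values.

```python
import struct


def parse_symbol_table(chunk: bytes) -> dict:
    """Map each external symbol name to the index of its defining chunk."""
    symbols = {}
    pos = 0
    while pos + 12 <= len(chunk):
        chunk_index, entry_len, data_len = struct.unpack_from("<3I", chunk, pos)
        if (entry_len < 12 or entry_len % 4 != 0
                or pos + entry_len > len(chunk) or 12 + data_len > entry_len):
            raise ValueError("malformed symbol table entry")
        data = chunk[pos + 12:pos + 12 + data_len]
        # Keep only the text before the string-terminating NUL.
        name = data.split(b"\0", 1)[0].decode("latin-1")
        if chunk_index != 0:  # skip entries that are not in use
            symbols[name] = chunk_index
        pos += entry_len
    return symbols
```

Looking up a symbol then gives the index of the header entry for the
LIB_DATA chunk holding the member that defines it.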

The OFL_TIME chunk records when the OFL_SYMT chunk was last modified and has
the same format as the LIB_TIME chunk (see above).