A PGI porthole -- generator or pipe -- operates within an operating system environment which allows it to communicate with other components of the PGI system. The definitions below are probably UNIX-specific. It is envisaged that these will be augmented by similar specifications in other environments.
A porthole may be in one of a number of modes. These are given below. In the following sections, the environment can be seen to operate differently depending on the current mode. Also described are the means of changing mode.
Initially we specify generators; modifications for pipes are given later.
In PGI-CGI mode the following sections, on standard input and standard output, do not apply. Instead the assembler operates as if it were a web server invoking a CGI script. A full CGI environment is created, with its variables modified as described in the Environment section below. The input from the request is placed on standard input, and the output on standard output is recorded, possibly filtered through the implicit pipe, and then used as output.
In modes other than PGI-CGI (the true modes), communication with a PGI script is by means of a line-oriented discourse carried out upon standard input and standard output. These lines occur in line blocks, as described in spec4. The exchange is half-duplex with respect to line blocks.
The first communication between the assembler and the PGI script is the establishing of a PGI mode. The PGI sends a PGI-Mode: header line alone in a block, and awaits the assembler's response. This response is a single line in a line block: the assembler responds with PGI-Mode-Status: followed by will or wont. If wont is specified, the porthole may try again with a different mode.
An assembler which is incapable of wide mode because of the manner in which it is run (e.g. as a CGI script) must respond wont to a PGI-wide mode request. Even though an assembler may invoke a PGI-wide porthole once per request, making it indistinguishable from PGI-normal mode, the wont signals to the script that the assembler cannot support wide mode, which may suggest optimisations to it. The PGI should simply follow the failed PGI-wide request with a PGI-normal one.
A PGI must have established a satisfactory mode before it continues.
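The mode negotiation described above can be sketched from the porthole's side as follows. This is a minimal illustration, not a reference implementation. It assumes that a line block is a run of lines terminated by a blank line (the real framing is defined in spec4) and that the mode is named in full after PGI-Mode:, for example "PGI-Mode: PGI-wide". The helper names write_block, read_block and negotiate_mode are hypothetical.

```python
import io

def write_block(out, lines):
    """Write one line block: the lines, then a blank terminator (assumed framing)."""
    for line in lines:
        out.write(line + "\n")
    out.write("\n")
    out.flush()

def read_block(inp):
    """Read lines up to the blank terminator; return them without line endings."""
    lines = []
    for raw in inp:
        line = raw.rstrip("\r\n")
        if line == "":
            break
        lines.append(line)
    return lines

def negotiate_mode(inp, out, preferred=("PGI-wide", "PGI-normal")):
    """Request each mode in turn; return the first one the assembler accepts."""
    for mode in preferred:
        write_block(out, ["PGI-Mode: " + mode])
        reply = read_block(inp)
        if len(reply) != 1:
            raise ValueError("expected a single-line response block")
        name, _, value = reply[0].partition(":")
        if name.strip().lower() != "pgi-mode-status":
            raise ValueError("unexpected response: " + reply[0])
        if value.strip().lower() == "will":
            return mode
        # Anything else is treated as "wont": fall through to the next mode.
    return None  # no satisfactory mode, so the PGI must not continue

# Simulated exchange: the assembler refuses wide mode, then accepts normal mode.
assembler_replies = io.StringIO("PGI-Mode-Status: wont\n\nPGI-Mode-Status: will\n\n")
sent = io.StringIO()
chosen = negotiate_mode(assembler_replies, sent)
```

In a real porthole the two streams would be standard input and standard output rather than in-memory buffers.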
A request is composed of an environment, which is a set of key value pairs. In general, this will refer to a non-empty set of fragments to render for a particular page request. The next task is requested by the PGI sending a Request: next line in a block of its own to the assembler. The assembler responds with a block containing the environment of the request (detailed later).
In the CGI specification, requests such as POST or PUT, which carry information back to the server, have their input presented on standard input. A PGI requests this data, if needed, by sending a Request: input line alone in a block. The response is a Request-Status: line which will contain either okay or an error message. If okay, this will be followed by a Content-Length: header in the same block, with a byte count, Content-Type:, and possibly further headers. Once the block is terminated, the number of bytes given in Content-Length: is transmitted to the script.
First a series of headers is sent, including a Content-Length: and a Content-Type: header. After a terminating blank header line, exactly content-length bytes are sent of type content-type. The input for this request is then complete, and the stream remains open for future requests. If content-length is missing, no data follows and the blank header line marks the request-end. The headers are described in more detail in the environment section. Note that this spooling occurs only once per page request, not once per fragment.
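A sketch of how a porthole might fetch the request input is given below. It assumes binary streams (because the body is counted in bytes), a blank line as the block terminator, and ASCII header lines. The function names read_header_block and request_input are hypothetical, and the handling of an error status is one possible choice rather than anything the specification mandates.

```python
import io

def read_header_block(inp):
    """Read header lines up to the blank line; keys are lower-cased names."""
    headers = {}
    while True:
        line = inp.readline().rstrip(b"\r\n").decode("ascii")
        if line == "":  # blank terminator, or end of stream
            return headers
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

def request_input(inp, out):
    """Send 'Request: input'; return (content_type, body) from the response."""
    out.write(b"Request: input\n\n")
    out.flush()
    headers = read_header_block(inp)
    status = headers.get("request-status", "")
    if status != "okay":
        raise RuntimeError("input refused: " + status)
    # A missing Content-Length means that no body follows the header block.
    length = int(headers.get("content-length", "0"))
    body = inp.read(length)
    if len(body) != length:
        raise EOFError("body shorter than Content-Length")
    return headers.get("content-type"), body

# Simulated assembler response: a header block, a 7-byte body, then more traffic.
stream = io.BytesIO(
    b"Request-Status: okay\nContent-Length: 7\nContent-Type: text/plain\n\n"
    b"a=1&b=2NEXT"
)
sent = io.BytesIO()
content_type, body = request_input(stream, sent)
remaining = stream.read()  # the stream stays usable after the body
```

Reading exactly Content-Length bytes, and no more, is what leaves the stream positioned correctly for later exchanges.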
For pipes, standard input is in the same format, except that the body is the output of the previous pipe or generator, including any header lines they generated.
When a PGI script is ready to produce output it sends a header line, Request: output. Also in that line block are a Content-Length: and a Content-Type: header, and possibly other headers as defined in HTTP or the CGI spec. In particular, the content-type, status and location headers, as specified by the CGI specification, may be added here. In PGI-normal and PGI-wide mode (optionally in others) a PGI-Id header must be specified, containing the ID of the fragment output being described, according to the pgi-id parameter in the PGI_REQUEST environment value. The headers burst and session are specified in spec2. The headers preferon-set, preferon-add, preferon-remove and sales-line are specified in spec3. There then follows the body of the output, exactly Content-Length bytes long.
The outermost porthole specifying a status is used as the final page status, or 200 if none is specified. Other statuses are assembled into successes and failures; failures are replaced by standard markup to indicate component failure to the user.
Similarly, for the location header, the outermost, if any, takes precedence. Inner location headers are ignored. Content-type headers are also valid only for the outermost specifying porthole. Inner content-types are auto-converted, if possible, by the assembler (for example text/plain -> text/html); otherwise, content is marked as erroneous as per inner status failures above. This is modified by bursting, described in a subsequent document.
In PGI-single and PGI-nonpersist modes, a content-length header is optional. If present it must be used. If absent, a closing of the stdout fd acts in its place. In PGI-normal and PGI-wide modes, the content-length header is mandatory. After that many bytes, the stream reverts to the communication stream for the next fragment.
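The output block described above might be produced along these lines. The sketch assumes a blank line terminates the header block and that the body is already encoded to bytes. The function name send_output and the fragment identifier "frag-1" are hypothetical.

```python
import io

def send_output(out, fragment_id, body, content_type="text/html", extra_headers=()):
    """Write a 'Request: output' block followed by exactly len(body) bytes."""
    lines = [
        "Request: output",
        "Content-Length: %d" % len(body),  # a byte count, not a character count
        "Content-Type: " + content_type,
        "PGI-Id: " + fragment_id,          # required in PGI-normal and PGI-wide
    ]
    lines.extend("%s: %s" % (name, value) for name, value in extra_headers)
    out.write(("\n".join(lines) + "\n\n").encode("ascii"))
    out.write(body)
    out.flush()

# Example: one HTML fragment with a non-ASCII character and a status header.
fragment = "<p>caf\u00e9</p>".encode("utf-8")
buffer = io.BytesIO()
send_output(buffer, "frag-1", fragment, extra_headers=[("Status", "200")])
header_text, _, sent_body = buffer.getvalue().partition(b"\n\n")
```

Computing Content-Length from the encoded bytes matters because the assembler resumes parsing header lines immediately after that many bytes.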
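The "outermost wins" rule for status and location headers can be illustrated with a small sketch. It assumes a single chain of nested portholes, listed from outermost to innermost, with None standing for "not specified". How the assembler actually represents nesting is not stated here, so both that representation and the function names are hypothetical.

```python
def outermost(values, default=None):
    """Return the first specified value, scanning from the outermost porthole inwards."""
    for value in values:
        if value is not None:
            return value
    return default

def page_status(statuses):
    """Final page status: the outermost specified status, or 200 if none is given."""
    return outermost(statuses, default=200)

def page_location(locations):
    """Final location header: the outermost specified one, if any."""
    return outermost(locations)

# Outer porthole is silent, the middle one reports 404, the inner one 200.
status = page_status([None, 404, 200])
# No porthole specifies a status at all.
fallback = page_status([None, None])
# Only the outermost location header counts; inner ones are ignored.
location = page_location([None, "/login", "/elsewhere"])
```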
A pipe has the same output format as a generator.
Standard error is connected to the assembler's error reporting mechanisms. The assembler must add the invocation context, and time, and then record the message in some manner. Messages are CRLF delimited.
The environment refers to the header lines supplied on stdin with each page request. The names are not case sensitive, and should be emitted in camel-caps for header lines. Similarly, there is no distinction between minus and underscore.
pgi-path contains the full path derived from pgi-name attributes (see later section). pgi-key contains the name of the key (porthole) to be rendered. Other attributes are the arguments to the porthole (which must be ignored in PGI-single mode), minus those with prepended paths. pgi-id contains the identifier to use when returning a fragment on stdout for PGI-normal and PGI-wide mode.
A PGI can further request various pieces of information through sending a line-block on stdout, and waiting for a response block on stdin. The principal use for these requests is in the telegraphics system specified elsewhere.
The byte sequence corresponding to the inclusion point varies in form between media types, though it contains the same semantic components. The syntax here is described for HTML (text/html); other syntaxes may be defined in time. Inclusion is disabled in PGI-CGI and PGI-single modes.
In HTML, the tag has name porthole and has a number of attributes. It can be closed explicitly, HTML style, or not at all. The close has no effect; inclusion is at the position of the open tag. Every inclusion must have a pgi-name attribute. This can be concatenated (using periods) to produce the path to the inclusion for a page. An inclusion must also include a pgi-key. The value of this attribute is a key which determines which porthole is to be invoked. Other attributes beginning pgi- are reserved.
Attributes not beginning pgi- are arguments. These are passed in the request to the subsequent PGI to render the inclusion. An attribute may be preceded by a path-part delimited by period. These attributes are not arguments to the current inclusion, but if a subpath matching the subpath of the argument begins at the present inclusion, then the attribute becomes an argument to that porthole. For example, the attribute a.b.c.d for a porthole with path m.n.p becomes the argument to m.n.p.a.b.c (if it exists) named d.
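Because names ignore case and treat minus and underscore alike, a porthole will probably want to fold them to one form before lookup. The sketch below shows one way to do that. The function names are hypothetical, and camel_caps is only one reading of "camel-caps" (each hyphen-separated part capitalised), so it yields Pgi-Id rather than PGI-Id, which is equivalent under the case-insensitivity rule.

```python
def canonical(name):
    """Fold case and treat '_' as '-' so that equivalent names compare equal."""
    return name.strip().lower().replace("_", "-")

def camel_caps(name):
    """Assumed emission form: capitalise each hyphen-separated part."""
    return "-".join(part.capitalize() for part in canonical(name).split("-"))

def parse_environment(lines):
    """Turn 'Name: value' header lines into a dict keyed by canonical name."""
    env = {}
    for line in lines:
        name, _, value = line.partition(":")
        env[canonical(name)] = value.strip()
    return env

env = parse_environment(["PGI_Key: headlines", "pgi-id: frag-1", "Content-Length: 7"])
key = env[canonical("Pgi-Key")]          # any spelling finds the same entry
emitted = camel_caps("content_length")   # form used when writing a header line
```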
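As an illustration of the inclusion syntax, the sketch below scans an HTML page for porthole tags using Python's standard html.parser module. The class name PortholeFinder, the sample page, and the decision to raise an error when a required attribute is missing are all assumptions; the specification does not say how an assembler should react to a malformed inclusion.

```python
from html.parser import HTMLParser

class PortholeFinder(HTMLParser):
    """Collect (pgi-name, pgi-key, arguments) for each porthole open tag."""

    def __init__(self):
        super().__init__()
        self.inclusions = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names before calling this.
        if tag != "porthole":
            return
        attributes = dict(attrs)
        if "pgi-name" not in attributes or "pgi-key" not in attributes:
            raise ValueError("porthole needs both pgi-name and pgi-key")
        # Attributes outside the reserved pgi- namespace are arguments.
        arguments = {k: v for k, v in attributes.items() if not k.startswith("pgi-")}
        self.inclusions.append((attributes["pgi-name"], attributes["pgi-key"], arguments))

    # Close tags have no effect on inclusion, so handle_endtag is not overridden.

page = (
    '<h1>Front page</h1>'
    '<porthole pgi-name="sidebar" pgi-key="headlines" count="5">'
    '<p>Footer</p>'
)
finder = PortholeFinder()
finder.feed(page)
finder.close()
found = finder.inclusions
```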
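The routing of path-prefixed attributes can be sketched as a small function. It reproduces the example above: a.b.c.d seen at m.n.p targets m.n.p.a.b.c with argument name d. The function name route_attribute is hypothetical, and checking whether the target path actually exists is left to the caller.

```python
def route_attribute(current_path, attribute):
    """Return (target_path, argument_name) for an attribute seen at current_path."""
    prefix, dot, name = attribute.rpartition(".")
    if not dot:
        # No path-part: the attribute is an argument to the current inclusion.
        return current_path, attribute
    # The path-part is relative to the inclusion that carries the attribute.
    return current_path + "." + prefix, name

routed = route_attribute("m.n.p", "a.b.c.d")
direct = route_attribute("m.n.p", "size")
```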
Information on line protocols, escaping, and so on, can be found in spec4.
This project was executed as an infrastructure component of an
EPSRC CTA award at CARET.