Hypertext Extensions to LATEX

This section describes how you can define hypertext entries in your HTML documents from within your LATEX source, as well as other effects available in HTML for which there need be no direct LATEX analog for a printed document. These are implemented as new LATEX commands which have special meaning during the translation by LATEX2HTML into HTML, but are mostly ignored when processed by LATEX.


The new commands described in the sections below are defined mainly in the html package, with LATEX definitions in the file html.sty, which is part of the LATEX2HTML distribution. It must be loaded as a package in any LATEX document that uses these features.

It is not sufficient to load the style file via an \input or \include command, such as \input html.sty . This will load the required definitions for LATEX, but will not load the html.perl package file for LATEX2HTML.
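
As a minimal sketch, assuming a LATEX2e document, the package can be loaded in the preamble with \usepackage, so that LATEX reads html.sty and LATEX2HTML loads html.perl. The article class is only an illustrative choice.

```latex
\documentclass{article}
\usepackage{html}   % LaTeX reads html.sty; LaTeX2HTML loads html.perl

\begin{document}
Text shared by the printed and the HTML versions.
\end{document}
```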

Warning: Some of these features, but not all, are also available with LATEX 2.09.
Users of LATEX2HTML are strongly advised to upgrade their LATEX installations to LATEX2e.





Several new environments are defined, in particular for specifying large (or small) sections of the text which are appropriate to only one version of the document—either the HTML or the LATEX typeset version.
* \begin{rawhtml}
for including raw HTML tags and SGML-like markup.
* \begin{htmlonly}
for material intended for the HTML pages only.
* \begin{latexonly}
for material intended for the LATEX version only.
Note that any macro-definitions or changes to counter-values are local to this environment.
* %begin{latexonly}
for material intended for the LATEX version only.
Macro-definitions and changes to counter-values are retained outside of this (pseudo-)environment.
* \begin{imagesonly}
for material intended to be used in the images.tex file only.
* \begin{comment}
for user-comments only, currently ignored in both the HTML and LATEX versions.
(To put HTML comments into the HTML files, use the rawhtml environment.)
* \begin{makeimage}
creates an image of its contents, as typeset by LATEX.
This is also used to prevent an image being made of the complete contents of a figure environment, allowing more natural processing.
* \begin{htmllist}
defined in htmllist.sty and htmllist.perl, this produces coloured balls tagging the items in a descriptive list, as used throughout this manual.

Warning: When using these environments it is important that the closing delimiter, \end{htmlonly} say, occurs on a line by itself with no preceding spaces, <tab>s or any other characters. (Otherwise LATEX will not recognise the intended end of the environment when processing for the .dvi version.) Similarly there should be nothing on the same line after the opening environment delimiter, \begin{htmlonly} say.
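
The layout rule in this warning can be illustrated as follows. This is only a sketch with invented sample text; the point is that each \begin{...} and \end{...} delimiter sits alone on its line, starting in the first column.

```latex
Text common to both versions.

\begin{htmlonly}
This paragraph appears only on the HTML pages.
\end{htmlonly}

\begin{latexonly}
This paragraph appears only in the typeset version.
\end{latexonly}
```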





The following commands are defined for LATEX in html.sty.
Corresponding Perl implementations are either in html.perl or in the latex2html script itself.

* \latextohtml
expands to the name LATEX2HTML, of this translator;
* \htmladdnormallink
creates a (perhaps named) textual hyperlink to a specified <URL>;
* \htmladdnormallinkfoot
same as \htmladdnormallink, but LATEX also prints the <URL> in a footnote;
* \htmladdimg
places an image (perhaps aligned) on the HTML page;
ignored by LATEX.
* \hyperref
creates a textual hyperlink to where a \label command occurred within the same document.
This is the recommended substitute for LATEX's \ref command.
* \htmlref
creates a textual hyperlink to the place where a \label command occurred; no reference is printed in the LATEX version.
* \hypercite
creates a textual hyperlink to the bibliography page where citation details are shown.
This is the recommended substitute for LATEX's \cite command.
* \htmlcite
creates a textual hyperlink to the bibliography page where citation details are shown; no citation marker is printed in the LATEX version.
* \externalref
creates a textual hyperlink to where a \label command occurred within a different document that has also been processed by LATEX2HTML;
ignored in LATEX.
* \externalcite
creates a textual hyperlink to where a reference occurs in a bibliography page from a different document that has also been processed by LATEX2HTML; ignored in LATEX.
* \externallabels
allows hypertext links to a different document; ignored in LATEX.





The following commands, also defined for LATEX in html.sty, are normally used only when creating segmented documents, see a later page.

* \segment
directs that an \input file <file> should be regarded as a separate “segment” of a larger LATEX2HTML document. In LATEX the file is input as usual, after counter values have first been written to a file, named <file>.ptr .
* \startdocument
tells LATEX2HTML where the end of the preamble occurs for a document segment; ignored in LATEX.
(A segment cannot have a \begin{document} command, unless it is shielded from LATEX within an htmlonly environment.)
* \internal
reads internal information from another document, so that symbolic references can be treated as if part of the current document; ignored in LATEX.
* \htmlhead
places a sectional heading on an HTML page; used mainly with the document segmentation feature.
It is ignored in LATEX.
* \htmlnohead
suppresses the section-heading for a document segment; ignored in LATEX.
* \segmentcolor
read from the .ptr file, this sets the text color for a document segment;
ignored in LATEX.
* \segmentpagecolor
read from the .ptr file, this sets the background color for a document segment;
ignored in LATEX.





The following commands are shorthand forms for some of the “conditional” environments listed above.

* \html
for putting small pieces of text into the HTML version only;
* \latex
for putting small pieces of text into the LATEX version only;
* \latexhtml
puts one piece of text into the LATEX version, another into the HTML version.





The following commands implement effects on the HTML pages for which there is no direct LATEX counterpart. Most of these commands are discussed in detail in a later section.

* \HTMLcode
a general command for placing raw HTML tags, with attributes and contents;
tags and attributes are ignored in LATEX, but not the contents.
* \htmlrule
places a (perhaps styled) horizontal line on the HTML page;
ignored in LATEX.
* \strikeout
places text between <STRIKE>...</STRIKE> tags; ignored in LATEX.
* \htmlimage
used for fine control over the size of individual images, and other graphics effects (e.g. making a `thumbnail' version);
ignored in LATEX.
* \htmlborder
places a border around the contents of an environment, by placing the environment as a cell inside a <TABLE>;
ignored in LATEX.
* \tableofchildlinks
determines where the table of childlinks should be placed on the HTML page;
ignored in LATEX.
* \htmlinfo
determines where the “About this document...” information should be placed;
ignored in LATEX.
* \htmladdtonavigation
appends a button to the navigation panels; ignored in LATEX.
* \bodytext
allows the contents of the <BODY ...> tag to be set explicitly for the current and subsequent HTML pages;
ignored in LATEX.
* \htmlbody
allows an attribute to be added or changed within the <BODY ...> tag of HTML;
ignored in LATEX.
* \htmlbase
allows a URL to be specified within the <BASE ...> tag for all the HTML pages produced;
ignored in LATEX.
* \htmltracing{<level>}
specifies that extra tracing messages be generated, according to the <level>;
ignored in LATEX.
* \htmltracenv{<level>}
same as \htmltracing except that this command is evaluated in sequence with environments;
ignored in LATEX.
* \HTMLset
programmer's device, allowing an arbitrary Perl variable to be set or changed dynamically during the LATEX2HTML processing;
ignored in LATEX.
* \HTMLsetenv
same as the preceding \HTMLset command, except that this one is processed in order, as if it were an environment;
ignored in LATEX.





Most of the new environments listed above can also be used with delimiter macros \<env-name>...\end<env-name>. This alternative style, which is common with AMS-TEX, is discouraged for general LATEX usage (even by the AMS itself) in favour of the usual \begin{<env-name>}...\end{<env-name>} markup notation. (Safety features that are available with the usual \begin...\end mechanism may not always work in the best way with this alternative style of environment delimiter. These comments apply to both the LATEX and LATEX2HTML processing.)

* \rawhtml...\endrawhtml
old AMS-style variant of rawhtml environment.
* \htmlonly...\endhtmlonly
old AMS-style variant of htmlonly environment.
* \latexonly...\endlatexonly
old AMS-style variant of latexonly environment.
* \imagesonly...\endimagesonly
old AMS-style variant of imagesonly environment.
* \comment...\endcomment
old AMS-style variant of comment environment.

Warning: These `pseudo'-environments are not as reliable as their LATEX counterparts. In particular, the \<env-name> and \end<env-name> commands should appear on lines by themselves, preferably with no preceding spaces or <tab> characters. This requirement is analogous to the warning for conditional environments.


Hyper-links in LATEX

Arbitrary hypertext references are created using the \htmladdnormallink and \htmladdimg commands. These have syntax:
\htmladdnormallink{<text>}{<URL>}
\htmladdnormallink[<name>]{<text>}{<URL>}

\htmladdimg{<URL>}
\htmladdimg[<align>]{<URL>}

\htmladdnormallinkfoot{<text>}{<URL>}
\htmladdnormallinkfoot[<name>]{<text>}{<URL>}


\htmladdnormallink

The \htmladdnormallink command expects some text as the first argument and a URL as the second argument. When processed by LATEX (i.e. in the .dvi or .ps output files), the URL will have no effect. But when processed by the translator, the URL will be used to provide an active hypertext link (to another file, picture, sound-file, movie, etc.) e.g.
\htmladdnormallink{<URL>}
{http://www.ncsa.uiuc.edu/demoweb/url-primer.html}

The optional argument to \htmladdnormallink allows a name to be specified for the place in the document where the hyperlink occurs. This is done via the NAME="<name>" attribute for the <A ...> anchor tag in HTML. Such a name can be used as the target for a hyperlink using the \htmlref command, described later.
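
A short sketch of both forms follows. The anchor name primer and the link texts are invented for illustration; the URL is the one used in the example above.

```latex
% Plain link: only the text is typeset by LaTeX.
\htmladdnormallink{URL primer}%
{http://www.ncsa.uiuc.edu/demoweb/url-primer.html}

% Named link: the anchor also carries NAME="primer" (hypothetical name).
\htmladdnormallink[primer]{URL primer}%
{http://www.ncsa.uiuc.edu/demoweb/url-primer.html}

% Elsewhere in the document, a link back to the named anchor.
\htmlref{the primer link given earlier}{primer}
```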


\htmladdimg

In a similar way, the argument of the \htmladdimg command should be a URL pointing to an image. This URL is ignored in the LATEX hard copy output. The optional argument to \htmladdimg allows an alignment for the image to be given: center, right or left. In the latter two cases, the image is bound to the specified side of the browser's window, and subsequent text paragraphs `flow around' the other side of the image.


In fact any valid set of “attributes” for the <IMG> tag in HTML can be specified as the optional <align> parameter. In particular the WIDTH, HEIGHT and BORDER attributes can be set, perhaps overriding the natural size of the image.
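
For illustration, here is a sketch of both uses of the optional argument. The image URL is hypothetical, and the exact spelling of the attribute list in the second call is an assumption based on the description above, not a verified form.

```latex
% Alignment only: image bound to the right-hand side of the window.
\htmladdimg[right]{http://www.example.org/images/logo.gif}

% A fuller attribute list for the <IMG> tag (assumed spelling).
\htmladdimg[ALIGN="left" WIDTH=120 HEIGHT=60 BORDER=0]%
{http://www.example.org/images/logo.gif}
```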


\htmladdnormallinkfoot

The \htmladdnormallinkfoot command takes the same arguments, and when generating HTML has the same effect, as \htmladdnormallink. However when processed by LATEX it places the URL as a footnote.
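
A minimal sketch, with invented surrounding text: on the HTML page the words "URL primer" become the link, while in the typeset version the URL is placed in a footnote.

```latex
Background material can be found in the
\htmladdnormallinkfoot{URL primer}%
{http://www.ncsa.uiuc.edu/demoweb/url-primer.html}.
```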





Warning:  The tilde (˜) character is commonly used within hyperlink URLs. It is a quirk of TEX and LATEX that it must be generated via \~{}, else the ˜ will be interpreted as an accent on the following character.
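
The following sketch shows the tilde written as \~{} inside a URL. The host and path are hypothetical.

```latex
% The tilde in ".../~user/..." is typed as \~{} in the LaTeX source.
\htmladdnormallink{a personal home page}%
{http://www.example.org/\~{}user/index.html}
```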


Including Arbitrary HTML Mark-up and Comments

LATEX2HTML provides the ability to include raw HTML tags and text within the HTML version of a document, without requiring corresponding material for the LATEX typeset version. The ways of doing this are described below.


\begin{rawhtml}

The simplest way to include raw HTML tags and/or text is by using the rawhtml environment. (An alternative way is to use the \HTMLcode command, which allows macros to be expanded to give the required tags, attributes and contents.)

Note the warning concerning how the environment delimiters should be used in the LATEX source code.



A particularly good use of the rawhtml environment is in the creation of interactive electronic forms from within a LATEX document. When producing the paper (.dvi) version of a document the rawhtml environment is ignored.



Here is an example:

\begin{rawhtml}
<HR>
<FORM ACTION="http://cbl.leeds.ac.uk/nikos/doc/error.html">
<OL>
<LI> <INPUT TYPE="checkbox" NAME="wp" VALUE="word"> Word for
Windows.
<LI> <INPUT TYPE="checkbox" NAME="wp" VALUE="wp"> Word Perfect.
<LI> <INPUT TYPE="checkbox" NAME="wp" VALUE="latex"> LaTeX.
<LI> Plain Text Editors (Please Specify): <INPUT TYPE="text" NAME="other_ed">
</OL>
So, what do think (comments please): <BR>
<INPUT TYPE="text" SIZE=45 NAME="other_wp">

<INPUT TYPE="submit" VALUE="submit this form but don't expect much!">
</FORM>
<HR>
\end{rawhtml}
The result is shown below.

Figure 5: An electronic form. In the online version the form would be active.
 

  1. Word for Windows.
  2. Word Perfect.
  3. LaTeX.
  4. Plain Text Editors (Please Specify):
So, what do think (comments please):


\rawhtml...\endrawhtml

This is an alternative way to specify a chunk of raw HTML code, using the old AMS-style of delimiting environments. Use of this style is discouraged; the rawhtml environment is preferred.


\begin{comment}

This environment is simply for the convenience of “commenting-out” large sections of source code. The contents of this environment are completely ignored, both in the LATEX and HTML versions. Such an environment is already used in AMS-LATEX, and perhaps with other packages. It is defined here for its general utility.

To insert SGML-style comments into the HTML files, use the rawhtml environment as follows.

\begin{rawhtml}
<!--  this text is treated as a comment
      perhaps extending over several lines 
-->
\end{rawhtml}

Note the warning concerning how the environment delimiters should be used in the LATEX source code.


\comment...\endcomment

This is an alternative way to specify a chunk of material intended to be ignored in both the LATEX and HTML versions, using the old AMS-style of delimiting environments. Use of this style (though convenient for typing) is discouraged, since it is not as reliable as using the comment environment.


Arbitrary Tags and Attributes

For version 97.1 of LATEX2HTML there is a new command which provides an extremely flexible way to include HTML 3.2 tags, along with any values for the “attributes” of that tag, if desired.
\HTMLcode[<attribs>]{<tag>}
\HTMLcode[<attribs>]{<tag>}{<contents>}
When the <tag> also needs a closing tag (e.g. <I>...</I>) the <contents> must be given, enclosed in braces. Both the opening and closing tags will then be placed correctly.

Warning: In version 97.1 this command was actually called \HTML. However style files may well define \HTML to mean something else, like a styled version of the HTML acronym. So in version 98.1 the name has been changed to \HTMLcode.

If no other definition of \HTML exists, then this command will be defined, to work the same as \HTMLcode.
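
As a small sketch of the two forms, the first call below has contents and so needs a closing tag, while the second does not. The sample text is invented; the <HR> call reuses the attribute list quoted later in this section.

```latex
% Tag with contents: the text is wrapped in <B>...</B> on the HTML page,
% while LaTeX typesets just the contents.
This step is \HTMLcode{B}{important}.

% Tag without contents: a horizontal rule with some attribute values.
\HTMLcode[50\% 3 noshade center]{HR}
```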

An important aspect of this is that any of the <tag>, <attribs> and <contents> may be given wholly by expanding a LATEX macro, or may contain arbitrary macros, perhaps including other \HTMLcode commands. The following table was constructed using this feature; its LATEX source follows.


Figure 6: Example use of macros for raw HTML code.
 

A listing of the different text styles available in HTML 3.2
  • a simple test of “bold-face”, using <B> .
  • a simple test of “italics”, using <I> .
  • a simple test of “teletype-text”, using <TT> .
  • a simple test of “underlining”, using <U> .
  • a simple test of “strikeout”, using <STRIKE> .
  • a simple test of “emphasis style”, using <EM> .
  • a simple test of “strong style”, using <STRONG> .
  • a simple test of “code style”, using <CODE> .
  • a simple test of “citation style”, using <CITE> .
  • a simple test of “definition style”, using <DFN> .
  • a simple test of “sample style”, using <SAMP> .
  • a simple test of “keyboard style”, using <KBD> .
  • a simple test of “variable style”, using <VAR> .


\newcommand{\myalign}{center}
\newcommand{\mylist}{UL}
\newcommand{\myitem}[2]{\HTMLcode[disc]{LI}{\simpletest{#1}{#2}}}
\newcommand{\simpletest}[2]{%
 \HTMLcode{#1}{ a simple test of ``#2'',} using \HTMLcode{CODE}{<#1>} .}
\newcommand{\tableopts}{10,border=5}

\newcommand{\tablelist}[4][left]{\HTMLcode[#1]{DIV}{
\HTMLcode[\tableopts]{TABLE}{
\HTMLcode[bottom]{CAPTION}{
#3
}\HTMLcode{TR}{\HTMLcode{TD}{
\HTMLcode{#2}{
#4
}}}
}}\HTMLcode[all]{BR}}

\tablelist[\myalign]{\mylist}{%
\textbf{A listing of the different text styles available in HTML 3.2}}{%
\myitem{B}{bold-face}
\myitem{I}{italics}
\myitem{TT}{teletype-text}
\myitem{U}{underlining}
\HTMLcode[circle]{LI}{\simpletest{STRIKE}{strikeout}}
\myitem{EM}{emphasis style}
\myitem{STRONG}{strong style}
\myitem{CODE}{code style}
\myitem{CITE}{citation style}
\myitem{DFN}{definition style}
\HTMLcode[square]{LI}{\simpletest{SAMP}{sample style}}
\HTMLcode[square]{LI}{\simpletest{KBD}{keyboard style}}
\myitem{VAR}{variable style}}


The above code demonstrates many aspects of the way \HTMLcode commands can be used.

* nesting:
\HTMLcode commands can be nested to arbitrary depth.
* macros:
Macros can be used to specify all or part of each argument.
* within macros:
\HTMLcode commands work correctly within the expansions of other macros.
* attribute values:
Information within <attribs> can be specified in a very loose way, as a comma-separated list of key/value pairs or as single values.
Not even the commas are necessary: space(s), <tab>s or newlines are equally effective. Indeed the horizontal rules preceding and following the table were specified by:
\HTMLcode[50\% 3 noshade center]{HR}
* attribute names:
Usually it is not necessary to know the names of the attributes to the tags that are to be used. It is sufficient just to give the values; these will be matched to the appropriate attribute, according to the type of data required. (If names are given, these are case-insensitive.)
* newlines:
Although LATEX ignores linebreaks within the source code, this is not so with LATEX2HTML. The unusual spreading-out of the definition of the \tablelist command above was done solely to make the code in the resulting HTML files more readable to a human. (As most browsers ignore those newlines anyway, more compact code would have rendered the same on-screen.)


Some further aspects of the use of the \HTMLcode command are not apparent from the above example.

* invalid <tag> :
If a <tag> is specified that is not part of the HTML 3.2 specifications, then it and its attributes are not placed into the HTML document created by LATEX2HTML. Any <contents> is included as ordinary data; i.e. as text in paragraphs, etc.

* required attributes:
Some tags have attributes which are required to have values, if that tag is to be included in an HTML document. Using the \HTMLcode command, if any such attribute is not given an appropriate value then the tag is ignored. Any <contents> are included in the document, as ordinary character data.

* valid HTML :
Currently there is no checking that the <contents> of a <tag> contains only data (perhaps including other tags) allowed by the DTD for HTML 3.2.
The requirement to produce valid HTML currently rests with the user.
This issue will be addressed in forthcoming revisions of LATEX2HTML.

* extra attributes and values:
The list of attributes for a <tag> can include key-value pairs whose keys do not match any valid attribute for the <tag>. Such key-value pairs are simply ignored. Similarly extra data values are ignored, as are values that do not match the requirements for any valid attribute.

* attributes with similar data-types:
Several attributes to a <tag> may use values having the same or similar data-types. First any key-value pairs are processed. Remaining values are allocated to those attributes which do not already have a value. An ordering of the attributes is used, based on a perceived likelihood of each attribute being required to be changed from its default setting.
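
The matching of values to attributes can be sketched with the <HR> tag. The first call is the one quoted earlier in this section. The second spells out attribute names and is intended to be equivalent, on the assumption that the bare values are allocated to WIDTH, SIZE, NOSHADE and ALIGN respectively; that allocation is an inference from the description above, not a documented guarantee.

```latex
% Values only: each value is matched to an attribute by its data type.
\HTMLcode[50\% 3 noshade center]{HR}

% Explicit key/value pairs (names are case-insensitive); assumed to
% produce the same rule as the call above.
\HTMLcode[width=50\%, size=3, noshade, align=center]{HR}
```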


Conditional Text



\begin{latexonly} and \begin{htmlonly}

Conditional text can be specified using the environments latexonly and htmlonly. These allow writing parts of a document which are intended only for electronic delivery or only for paper-based delivery.

This would be useful for example in adding a long description of a multi-media resource in the paper version of a document. Such a description would be redundant in the electronic version, as the user can have direct access to this resource.


Here is an example of the use of the latexonly environment, used earlier in this manual:

\begin{latexonly}
\begin{figure}
    \begin{center}
    \fbox{\includegraphics[width=4in]{psfiles/eform}}
    \end{center}
    \caption{An electronic form. Of course in the online version of this
     document the form above would be active.}
\end{figure}
\end{latexonly}

Note the warning concerning how the environment delimiters should be used in the LATEX source code.


\htmlonly...\endhtmlonly

This is an alternative way to specify a chunk of material intended for the HTML version only, using the old AMS-style of delimiting environments. Use of this style is discouraged; the htmlonly environment is preferred.


\latexonly...\endlatexonly

This is an alternative way to specify a chunk of material intended for the LATEX typeset version only, using the old AMS-style of delimiting environments. Use of this style is discouraged; the latexonly environment or the unscoped %begin{latexonly} construction is preferred.

Note the warning concerning how the environment delimiters should be used in the LATEX source code.



\latex, \html and \latexhtml

There are also shorthand notations to accomplish the same thing as in the latexonly environment and htmlonly environment, but with less typing. Warning: Only small pieces of text work reliably in this way. With whole paragraphs or contained sub-environments, the “conditional” environments should be used instead.
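
A brief sketch of the three shorthand commands, with invented sample text. It assumes that \latexhtml takes the LATEX text as its first argument and the HTML text as its second.

```latex
This manual is available \latexhtml{in print}{online}.
\html{Use the navigation panel to move between pages.}
\latex{Page numbers refer to the printed edition.}
```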


%begin{latexonly}

Another variant of the latexonly environment is available, in which everything between %begin{latexonly} and %end{latexonly} is ignored by LATEX2HTML. The difference is that the latexonly environment puts the contents into a group, in which all definitions are local. There is no such scoping with the %begin...%end variant, since LATEX sees the initial %s simply as starting comments.



The following example should clarify what happens:

\newcommand{\A}{The letter A.}
\newcommand{\B}{The letter B.}
\begin{latexonly}
\renewcommand{\A}{Not the letter A.}
\end{latexonly}
%begin{latexonly}
\renewcommand{\B}{Not the letter B.}
%end{latexonly}
\begin{document}
\A \B
\end{document}
If you process this with LATEX, the result is:
The letter A. Not the letter B.

Note the warning concerning how the environment delimiters should be used in the LATEX source code.



Warning: Be careful when using LATEX commands which alter the values of counters (e.g. numbered figures or equations) in conditional text, because this may cause the counter values in the electronic version to lose synchronisation with the values of the corresponding counters in the LATEX version.




\begin{imagesonly}

This environment is used to put LATEX code into the images.tex file, to be used when generating images. Typically this is used to add commands to the preamble of images.tex, such as setting the text or background color. However code can be added at any other point as well; e.g. to change the background color of all images after a certain point in the document.
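
As a sketch only, the following places a background-color command into images.tex at the corresponding point. It assumes that the color package (which provides \pagecolor) is loaded in the document preamble; the color chosen is arbitrary.

```latex
\begin{imagesonly}
\pagecolor{white}   % affects images generated from here on (assumes color package)
\end{imagesonly}
```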

Note the warning concerning how the environment delimiters should be used in the LATEX source code.


\begin{makeimage}

This is a special environment which forces an image to be made of its contents. That is, one gets effectively a snapshot of a portion of a page that has been typeset using LATEX. Within the normal LATEX typeset version of the document, this environment is completely transparent, adding its contents to the page as usual.


One further important use of the makeimage environment is as follows. If a makeimage environment occurs as a sub-environment within a figure environment, then an image will not be made of the figure's contents. Instead, the contents are treated as normal text, each part being handled as if there were no figure at all, except that everything is placed within a single cell of a <TABLE>...</TABLE> construction in HTML 3.2. The contents of any \caption commands are placed between <CAPTION>...</CAPTION> tags for the <TABLE>.


Normally an image of the entire contents of the figure would be placed within the single cell of the <TABLE>. Now images are made of any subparts of that figure's contents that really need it, in particular the makeimage sub-environments. An empty makeimage sub-environment does not generate an image of itself, yet still it inhibits an image being made of the whole figure. These comments apply also to table environments.
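
The behaviour described above can be sketched as follows. The table contents and caption are invented; the empty makeimage sub-environment is there only to stop an image being made of the whole figure.

```latex
\begin{figure}
\begin{makeimage}
\end{makeimage}
\begin{tabular}{ll}
Input  & \texttt{report.tex}  \\
Output & \texttt{report.html}
\end{tabular}
\caption{File names used in the example.}
\end{figure}
```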


Symbolic References shown as Hyperized Text

In printed documents cross-references are shown through a numeric or symbolic indirection e.g. “see Figure 1” (numeric indirection), or “see section `Changes' ” (symbolic indirection). LATEX2HTML can mirror this mechanism using the same numeric or symbolic references, or when these are not appropriate by using iconic references.


In a hypertext document however, cross-references can be shown without any indirection, just by highlighting a relevant piece of text. This can make a document more readable as it removes unnecessary information.


\hyperref

A single new LATEX command \hyperref can be used for specifying how a cross-reference should appear, both in the printed document and in the hypertext version. For example, assuming that the label {sec:cond} is defined somewhere within a document, the command \hyperref, taking 4 arguments, can be used in that document as follows:
\emph{Is the concept of
\hyperref
               % This will be highlighted in the hypertext version
{conditional text}                      % argument #1
               % This will be shown in the printed version 
               % followed by a numeric reference ...      
{conditional text (see Section }        % argument #2
               % ... followed by this text
{ for more information)}                % argument #3
               % This is the common label 
{sec:cond}                              % argument #4
a good idea? }

Here is how it will be shown:

Is the concept of conditional text a good idea?

In the printed version what would appear is:

Is the concept of conditional text (see Section 4.2 for more information) a good idea?




An extended syntax for \hyperref uses an optional argument, which determines what information is to be placed in the LATEX version of the document. The value of this optional argument can also affect the number of required arguments. These forms are recognised:

\hyperref[ref]{<HTML-text>}{<LaTeX-text>}{<post-LaTeX>}{<label>}
\hyperref{<HTML-text>}{<LaTeX-text>}{<post-LaTeX>}{<label>}


\hyperref[pageref]{<HTML-text>}{<LaTeX-text>}{<post-LaTeX>}{<label>}
\hyperref[page]{<HTML-text>}{<LaTeX-text>}{<post-LaTeX>}{<label>}


\hyperref[noref]{<HTML-text>}{<LaTeX-text>}{<label>}
\hyperref[no]{<HTML-text>}{<LaTeX-text>}{<label>}

The first two are the defaults, where LATEX uses \ref{<label>}. With the next two LATEX uses \pageref{<label>}, while with the final two LATEX completely ignores the <label>, setting just the <LaTeX-text>.
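
The three behaviours can be compared in a short sketch, reusing the label sec:cond from the example above. The wording of the LATEX text is invented.

```latex
% Default ([ref]): LaTeX inserts \ref{sec:cond} between arguments 2 and 3.
\hyperref{conditional text}{conditional text (see Section~}{ for details)}{sec:cond}

% [page]: LaTeX inserts \pageref{sec:cond} instead.
\hyperref[page]{conditional text}{conditional text, described on page~}{}{sec:cond}

% [no]: only three arguments; LaTeX sets the text and ignores the label.
\hyperref[no]{conditional text}{conditional text}{sec:cond}
```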



For creating hyperlinks to other documents using symbolic reference <label>s, see also the \externalref command.


The preceding paragraph is an example of the use of the \hyperref[page] option. Its source code is:

For creating hyperlinks to other documents
using symbolic reference \Meta{label}s, 
see also the \Lc{externalref} 
\hyperref[page]{command}{command, described on page~}{}{externref}.
which appears in the LATEX typeset version as:
For creating hyperlinks to other documents using symbolic reference <label>s, see also the \externalref command, described on page 31.

In fact both \hyperref and the \htmlref command, to be described next, permit textual hyperlinks based on symbolic <label>s from external files.


\htmlref

Another command also defined in html.sty is \htmlref which has the same effect as \hyperref during the conversion to HTML. It takes two arguments, some text and a label. In the HTML version the text will be “hyperized”, pointing to the label. In the paper version the text will be shown as it is and the label will be ignored; e.g.
With \verb|\htmlref| \htmlref{it's easy to make links}{fig:example}.
which produces:
With \htmlref it's easy to make links.
In the LATEX typeset version it will appear simply as:
With \htmlref it's easy to make links.

Hypertext Links in Bibliographic References (Citations)

If a report or a book that is cited (using the \cite command) is available (or there is information about it) on the World-Wide Web, then it is possible to add the appropriate hypertext links in your bibliographic database (the .bib file).



Here is an example of a bibliographic entry for the original LATEX [1] blue book:

@string{tugURL="\htmladdnormallink
{http://www.tug.org/}{http://www.tug.org}"}

@string{danteURL="\htmladdnormallink
{http://www.dante.de/}{http://www.dante.de}"}

@book{lamp:latex,
title = "LaTeX User's Guide \& Reference Manual, 2nd edition",
year = 1994 ,
author = "Leslie Lamport",
Publisher = "Addison--Wesley Publishing Company, Inc.",
note = "Online information on {\TeX} and {\LaTeX} is available at "
 # tugURL # " and " # danteURL }
See the bibliography for how this will appear.
No other modifications are required; LATEX and BibTEX should work as normal. Note that it would be sensible to put the @string commands into a separate file, urls.bib say, loaded with the main file via
\bibliography{urls,...}.

The natbib package, written for LATEX by Patrick Daly, provides even more flexibility in the way a reference may be cited. All the features of this package are implemented for LATEX2HTML via the natbib.perl file. (Indeed there is even a mode whereby natbib handles the Harvard style of citation. This requires loading also the nharvard package.)


Thanks... to Martin Wilck for the bulk of the work in producing this extension, and to Ross Moore for necessary adjustments to allow it to work correctly with the document segmentation strategy.


\hypercite

Analogous to \hyperref is the \hypercite command, which allows a free-form textual hyperlink to the bibliography, whereas the LATEX typeset version contains the usual citation code. The allowed syntax is as follows.
\hypercite[int]{<HTML-text>}{<LaTeX-text>}{<opt-LaTeX>}{<label>}
\hypercite[cite]{<HTML-text>}{<LaTeX-text>}{<opt-LaTeX>}{<label>}
\hypercite{<HTML-text>}{<LaTeX-text>}{<opt-LaTeX>}{<label>}


\hypercite[nocite]{<HTML-text>}{<LaTeX-text>}{<label>}
\hypercite[no]{<HTML-text>}{<LaTeX-text>}{<label>}
\hypercite[ext]{<HTML-text>}{<LaTeX-text>}{<label>}
The first three forms are equivalent; LATEX uses \cite[<opt-LaTeX>]{<label>} , after placing the <LaTeX-text>. Note that {<opt-LaTeX>} must be specified, even if empty `{}'.

Similarly the latter three forms are equivalent, with LATEX using \nocite{<label>} , to force the particular reference to appear on the bibliography page, even though no explicit marker is placed at this point. (Thus there is no need for an optional <opt-LaTeX> argument.)
Within the HTML version a hyperlink is produced when the <HTML-text> is not empty. External label files are also searched, in order to match the symbolic <label>, see also \externalcite.


Earlier in this manual the following source code was used:
commands described in the \LaTeX{} \htmlcite{blue book}{lamp:latex}, 
...
as well as many other \LaTeX{} constructions, such as are described in 
the \LaTeX{} \hypercite{\emph{Companion}}{\emph{Companion}}{}{goossens:latex} 
and \LaTeX{} \hypercite{\emph{Graphics Companion} (e.g. \Xy-pic)}%
{\emph{Graphics Companion}}{\Xy-pic}{goossens:latexGraphics};
which produces:
commands described in the LATEX blue book,
  ...
as well as many other LATEX constructions, such as are described in the LATEX Companion and LATEX Graphics Companion (e.g. XY-pic);
whereas in the LATEX typeset version one sees:
commands described in the LATEX blue book,
  ...
as well as many other LATEX constructions, such as are described in the LATEX Companion[2] and LATEX Graphics Companion[3, XY-pic];
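
The nocite form follows the same pattern but takes only three arguments. A minimal sketch, in which the sentence itself is invented for illustration and only the key goossens:latex comes from the example above:

```latex
% Hypothetical sentence; the key goossens:latex is the one used above.
% HTML: "the Companion" becomes a link to the bibliography entry.
% LaTeX: the text is typeset with no citation marker, and \nocite
%        places the entry on the bibliography page.
Page layout is treated at length in
\hypercite[nocite]{the \emph{Companion}}{the \emph{Companion}}{goossens:latex}.
```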


\htmlcite

Analogous to \htmlref is the \htmlcite command, which creates a textual hyperlink to a place on the document's bibliography page, but without displaying any reference marker in the LATEX typeset version. (See above for an example.)

The \externalcite command provides a similar facility when the bibliography page is “external”; that is, not part of the current document.


Symbolic References between Living Documents

The method of the previous section for generating symbolic hypertext links can easily be extended to external documents processed by LATEX2HTML. When LATEX2HTML processes a document, it generates a Perl file named <prefix>labels.pl which contains a list of all the symbolic labels that were defined, along with their locations. The <prefix> is empty unless otherwise specified; it exists to allow different document segments to share the same directory.


\externallabels

Links to an external document are then possible once a connection is established to that document's labels.pl file. This connection is established by the \externallabels command:
\externallabels{<URL to directory of external document>}
{<local copy of external document labels.pl file>}

The first argument to \externallabels should be a URL to the directory containing the external document. The second argument should be the full path-name to the labels.pl file belonging to the external document. Note that for remote external documents it is necessary to copy the labels.pl file locally so that it can be read when processing a local document that uses it. The command \externallabels can be used once for each external document in order to import the external labels into the current document. A warning is given if labels.pl cannot be found.

If a symbolic reference made in either of the commands described on the previous page is not defined within the document itself, LATEX2HTML will look for that reference in one of the external files. After any modifications in an external document (sections added/deleted, segmentation into different physical parts, etc.) a new labels.pl will be generated. If the \externallabels command in another document contains the correct address to an updated copy of the labels.pl file, then the cross-references will be re-aligned after running the local document through the translator.
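
Since \externallabels may be used once per external document, importing labels from two documents might look like the following sketch. The URLs and local paths are hypothetical placeholders, not real locations:

```latex
% Hypothetical URLs and local paths, for illustration only.
% Each local labels.pl is a copy taken from the corresponding remote document.
\externallabels{http://www.example.org/docs/userguide}%
               {/usr/local/doc/userguide/labels.pl}
\externallabels{http://www.example.org/docs/reference}%
               {/usr/local/doc/reference/labels.pl}
```

After either external document is rebuilt, its labels.pl would need to be copied again before rerunning the translator on the local document.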


There is also a mechanism analogous to the label–ref pairs of LATEX, which can be used only within a single document. These labels are called internal labels, as opposed to the external labels defined above. They are used extensively with the document segmentation strategy described later.

Either type of label is defined with a LATEX \label command. Labels can be referenced within a document using a \ref command. When processed by LATEX, each \ref command is replaced by the section number in which the corresponding \label occurred. When processed by the translator, each \ref is replaced by a hypertext link to the place where the corresponding \label occurred.
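
As a small illustration of the label and reference pair described above (the label name sec:install is invented for this sketch):

```latex
\section{Installation}\label{sec:install}
% ... intervening text ...
% LaTeX prints the section number here; the translator makes a hyperlink.
See Section~\ref{sec:install} for the installation steps.
```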


\externalref

This mechanism can be extended to external documents:
\externalref{<symbolic label in remote document>}
The argument to \externalref may be any symbolic label defined in the labels.pl file of any of the external documents. Such references to external symbolic labels are then translated into hyper-links pointing to the external document.


\externalcite

Analogous to \externalref, the \externalcite command is used to create a citation link, where the bibliography page is not part of the current document. As with \externalref, symbolic labels for the bibliography page must have been loaded using \externallabels.

A particularly important use for this is in allowing multiple documents to access information in a common bibliographic listing. For example: all of an author's publications; a comprehensive listing of publications in a particular field; the (perhaps yearly) output of publications from a particular organisation or institution.
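
A sketch of how a shared bibliography might be cited, assuming that \externalcite takes the citation key as its single argument, by analogy with \externalref. The URL, local path and key smith:1996 are hypothetical:

```latex
% Hypothetical URL, path and citation key.
% Assumption: \externalcite{<label>} mirrors \externalref{<label>}.
\externallabels{http://www.example.org/group/publications}%
               {/home/me/publications/labels.pl}

This result was first reported elsewhere \externalcite{smith:1996}.
```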


Thanks to Uffe Engberg for suggesting this feature.


Cross-Referencing Example

To understand this mechanism better consider how you would maintain a link to this section (of the hypertext version of this document) from one of your documents, without using labels. Sure enough, you can get the name of the physical file that this section is in. This however is quite likely to change, and any links to it would become invalid. To update your link, the name of the new file must be found and your link changed by hand. Also there is no general updating mechanism, so the only way to find out if your document is pointing to the right place is by actually following the link, then doing a manual update.


Next consider how it could be done with symbolic labels. First you have to import the labels used in this document by copying the file labels.pl, saving it in /tmp/labels.pl say, then adding anywhere in your document:

\externallabels{http://cbl.leeds.ac.uk/nikos/tex2html/doc/manual}%
               {/tmp/labels.pl}
After that you can use the label `crossrefs' defined at the beginning of this section as follows:
\externalref{crossrefs}
This will be translated into the appropriate hyper-link to this page. If there are any changes in this document and you would like to bring your document up to date, you have to copy labels.pl again and rerun the translator on your document. Of course if I move the directory containing the HTML files for this document somewhere else, then you would have to make a change in the argument of the \externallabels command to reflect this.

It is obvious that some level of collaboration is required between authors trying to maintain cross-references between different documents. Using symbolic labels makes this a lot easier (especially for documents written by the same author).


Miscellaneous commands for HTML effects

The html package, through the LATEX input file html.sty, and its Perl counterpart html.perl, implements several new commands that are intended entirely for effects within the produced HTML files. In LATEX these commands, their arguments, and any optional arguments are completely ignored.


\htmlrule and \htmlrule*

One such device provided by html.sty is the \htmlrule command. This puts a horizontal rule into the HTML file only; it is ignored in the .dvi version. It is useful for providing extra visual separation between paragraphs without creating a new HTML page, such as might warrant extra vertical space within the printed version.

Much variation can be obtained in the horizontal rule that is produced, using extended forms of the \htmlrule command:

\htmlrule
\htmlrule*
\htmlrule[<attribs>]
\htmlrule*[<attribs>]
Whereas a “break” tag <BR> normally precedes the <HR> generated by the \htmlrule command, this break is omitted when using the \htmlrule* variant.



Furthermore, the optional argument <attribs> can be used to specify attributes for both the <HR> and <BR> tags. More specifically, <attribs> should be a list of attribute-names and/or key-value pairs <key>=<value> separated by spaces or commas. This list is parsed to extract those attributes applicable to the <HR> tag, and those applicable to the <BR> (with the unstarred variant).




Using HTML 3.2, this allows variations in the appearance of the rule to be specified. Some examples of these effects appear on this page.
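
The four forms can be sketched as follows. The attribute names used here (width, size, align, noshade for <HR>, and clear for <BR>) are taken from HTML 3.2 itself; that they are accepted in exactly this spelling by the parser described above is an assumption. Pixel widths are used because a bare % would start a LATEX comment:

```latex
\htmlrule                          % <BR> followed by a default <HR>
\htmlrule*                         % <HR> only, no preceding <BR>
% Assumed attribute spellings; width and size are in pixels.
\htmlrule[width=300, align=center] % attributes passed to <HR>
\htmlrule[clear=all, noshade]      % clear is for <BR>, noshade for <HR>
\htmlrule*[size=5, noshade]        % thick, unshaded rule without a break
```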


\strikeout{<text>}

With this command the <text> is processed as normal in the HTML version, then placed between <STRIKE>...</STRIKE> tags. Thus a horizontal line should be drawn through the middle of the <text>.
Currently the command and the <text> are ignored in the LATEX version.
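
For instance, in the following sketch (the sentence is invented for illustration) the HTML version shows the word "Tuesday" struck through, while the LATEX version omits it:

```latex
% HTML:  The meeting is on <STRIKE>Tuesday</STRIKE> Wednesday.
% LaTeX: The meeting is on Wednesday.
The meeting is on \strikeout{Tuesday} Wednesday.
```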


\tableofchildlinks

As an extra aid to navigation within a long page, containing several (sub)subsections or deeper levels of sectioning, there is the \tableofchildlinks command. This does not generate anything new, for a table of the child links on or from a page is generated automatically by LATEX2HTML.

However if this command, or its variant \tableofchildlinks*, occurs within the source code to appear on a particular HTML page, then the child-links table will be placed at that point where the command occurs. Normally a break tag <BR> is inserted to separate the table of child-links from the surrounding text. The \tableofchildlinks* omits this extra break when it would result in too much space above the table.

For example throughout this section of the manual, all subsections in which several explicit commands have been discussed have their child-links table placed at the top of the page, using \tableofchildlinks*. This helps to quickly find the description of how the commands are used.
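
A sketch of that arrangement, with invented section titles:

```latex
\section{Miscellaneous commands}
\tableofchildlinks*   % child-links table goes here, without the extra <BR>
Introductory text for the section.

\subsection{First command}
Description of the first command.

\subsection{Second command}
Description of the second command.
```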


\htmlinfo

Normally an “About this document...” page is created at the end of the HTML document, containing technical information about how the document was created, by whom, or any other information contained in the $INFO variable. This information can be made to appear at any other place within the document by specifying \htmlinfo at the desired place in the source. For example, the information may be best suited for the title-page.

The variant \htmlinfo* places the information, but leaves out the standard “About this document...” header. Instead the \htmlhead command can be used to place an alternative heading, prior to the \htmlinfo* command. Neither this heading nor the $INFO contents appear in the LATEX typeset version.
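
A minimal sketch of this combination, using the \htmlhead{<sec-type>}{<heading>} form described in the section on document segmentation. The heading text is invented:

```latex
% Both lines affect the HTML version only.
\htmlhead{section}{How this document was produced}
\htmlinfo*   % the $INFO contents, without the standard header
```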


\bodytext{<options>}

The text and background colors, and colors for the text of hypertext links can be set on an HTML page by giving appropriate attributes with the <BODY ...> tag. This is particularly easy to do using the \bodytext command, which simply inserts the <options> as the desired list of attributes.



Warning: Any previous settings for the <BODY ...> tag are discarded. Furthermore no checking is done to verify whether the given <options> indeed contains a list of attributes and values valid for the <BODY ...> tag.
When using \bodytext you are assumed to know precisely what you are doing!


Other packages contain commands which alter the contents of the <BODY ...> tag; notably the color.perl implementation of LATEX's color package, and the (prototype) frames package, by Martin Wilck and Ross Moore. In both these packages the requested information is checked for validity as an attribute within the <BODY ...> tag.


\htmlbody{<options>}

This is similar to the \bodytext command, except that it adds the value of an attribute, or allows an existing value to be changed. Thus it can be used to alter just a single one of the text and background colors, colors for the text of hypertext links or add a background pattern. The <options> are given as key-value pairs; some checking is done to ensure the validity of the attributes whose values are being set.
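
The difference between the two commands might be sketched as follows. The attribute names and color names are those of HTML 3.2; the exact quoting and the key=value spelling accepted by \htmlbody are assumptions:

```latex
% Replace the whole attribute list of <BODY ...>; nothing is checked.
\bodytext{BGCOLOR="white" TEXT="black" LINK="navy"}

% Later: change only the background color, keeping the other settings.
% Assumed key=value spelling.
\htmlbody{bgcolor=silver}
```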


\htmlbase{<URL>}

This specifies that the given <URL> be included in the <HEAD> section of each HTML page via a tag: <BASE HREF="<URL>">.
Such a feature is particularly useful...


\HTMLset{<which>}{<value>} and \HTMLsetenv{<which>}{<value>}

The \HTMLset command provides a mechanism whereby an arbitrary Perl variable can be assigned a value dynamically, during the LATEX2HTML processing. A variable having name `$<which>' is assigned the specified <value>, overwriting any value that may exist already. The \HTMLsetenv command serves the same purpose, but it is processed in order, as if it were an environment rather than a command.


Warning: This is intended for Perl programmers only. Use this command at your own risk!
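
As a sketch only: the following would assign a string to a Perl variable named $MY_BANNER during translation. The name MY_BANNER is hypothetical and is not a variable known to LATEX2HTML:

```latex
% MY_BANNER is a hypothetical variable name, chosen for illustration.
% Effect in LaTeX2HTML: $MY_BANNER is given the value shown.
\HTMLset{MY_BANNER}{Draft of the user guide}
```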




\latextohtml

This command expands to the name of this translator, LATEX2HTML. Commands for parts of the names of important LATEX packages are also included with LATEX2HTML: e.g. TEX, LATEX, AMS, XY. (This makes it easy to refer to these products in a consistent way within the HTML pages; you may still need LATEX definitions for the typeset version.)



Active Image Maps

Image maps are images with active regions in which a Web-surfer can click, to be sent off to another sector of cyberspace. LATEX2HTML can designate either inline “figures” or external ones (with or without a thumbnail version) to be image-maps. However HTML requires a URL of an HTML map-file, which associates the coordinates of each active region in the map with a destination URL. Usually this map file is kept on the server machine; however, HTML 3.2 also allows it to reside on the client side for faster response. Both configurations are supported by LATEX2HTML through the \htmlimage options `map=' and `usemap=' respectively.


Keeping such a map file up to date manually can be tedious, especially with dynamic documents under revision. An experimental program makemap helps automate this process. This program (which is really a Perl script) takes one mandatory argument and an optional argument. The mandatory argument is the name of a user-map file, defined below. The optional argument is the name of the directory where the HTML map file(s) are to be placed.


The best way of describing how this works is by example. Suppose a document has two figures designated to become active image-maps. The first figure includes a statement like:

\begin{figure}
\htmlimage{map=/cgi-bin/imagemap/BlockDiagram.map,...}
. . .
\end{figure}
The second figure has a line like:

\begin{figure}
\htmlimage{map=/cgi-bin/imagemap/FlowChart.map,...}
. . .
\end{figure}





A typical user-map file, named report.map, might contain the following information:
#
#  Define the location(s) of the labels.pl file(s):
#
+report/ <URL>
#
#  Define map #1:
#
BlockDiagram.map:       
label1  rect    288,145 397,189
label2  rect    307,225 377,252
label2  default
#
#  Define map #2
#
FlowChart.map:
label3  circle  150,100 200,100
label4  default

In this file, comments are denoted by a #-sign in column 1. The line beginning with +report states that the symbolic labels are to be found in the labels.pl contained in the directory report/, and that its associated URL is as stated. Any number of external labels.pl files may be so specified. The block diagram image has two active regions. The first is a rectangle bounded by corners (288, 145) and (397, 189), while the second is a rectangle bounded by corners (307, 225) and (377, 252). These coordinates can be obtained with the aid of a program such as xv. If the user clicks in the first rectangle, it will cause a branch to the URL associated with symbolic label label1 defined in the labels.pl file found in directory report/. The single active region in the flow chart figure is a circle centred at (150, 100) and passing through point (200, 100). Clicking in this region will cause a branch to symbolic label label3. Labels label2 and label4 will be visited if the user clicks anywhere outside of the explicit regions. If any labels are not defined in any of the labels.pl files mentioned, they will be interpreted as URLs without translation.

The HTML image-maps are generated and placed in directory report/ by invoking the command: makemap report.map report .


Document Segmentation

One of the greatest appeals of the World-Wide Web is its high connectivity through hyper-links. As we have seen, the LATEX author can provide these links either manually or symbolically. Manual links are more tedious because a URL must be provided by the author for every link, and updated every time the target documents change. Symbolic links are more convenient, because the translator keeps track of the URLs. Earlier releases of LATEX2HTML required the entire document to be processed together if it was to be linked symbolically. However it was easy for large documents to overwhelm the memory capacities of moderate-sized computers. Furthermore, processing time could become prohibitively high, if even a small change required the entire document to be reprocessed.


For these reasons, program segmentation was developed. This feature enables the author to subdivide his document into multiple segments. Each segment can be processed independently by LATEX2HTML. Hypertext links between segments can be made symbolically, with references shared through auxiliary files. If a single segment changes, only that segment needs to be reprocessed (unless a label is changed that another segment requires). Furthermore, the entire document can be processed without modification by LATEX to obtain the printed version.


The top level segment that LATEX reads is called the parent segment.
The others are called child segments.


Document segmentation does require a little more work on the part of the author, who will now have to undertake some of the book-keeping formerly performed by LATEX2HTML. The following LATEX extensions carry out segmentation:

* \segment{<file>}{<sec-type>}{<heading>}
This command indicates the start of a new program segment. The segment resides in <file>.tex, represents the start of a new LATEX sectional unit of type <sec-type> (e.g., \section, \chapter, etc.) and has a heading of <heading>. (A variation of this command, \segment*, is provided for segments that are not to appear in the table of contents.)
These commands perform the following operations in LATEX:
  1. The specified sectioning command is executed.
  2. LATEX will write its section and equation counters into an auxiliary file, named <file>.ptr. It will also write an \htmlhead command to this file. This information will tell LATEX2HTML how to initialise itself for the new document segment.
  3. LATEX will then proceed to input and process the file <file>.tex.
The \segment and \segment* commands are ignored by LATEX2HTML.

* \internal[<type>]{<prefix>}
This command directs LATEX2HTML to load inter-segment information of type <type> from the file <prefix><type>.pl . Each program segment must be associated with a unique filename-prefix, specified either through a command-line option, or through the installation variable $AUTO_PREFIX . The information <type> must be one of the following:
* internals
This is the default type, which need not be given. It specifies that the internal labels from the designated segment are to be input and made available to the current segment.
* contents
The table of contents information from the designated segment is to be made available to the current segment.
* sections
Sectioning information is to be read in. Note that the segment containing the table of contents requires both contents and sections information from all other program segments.
* figure
Lists of figures from other segments are to be read.
* table
Lists of tables from other segments are to be read.
* index
Index information from other segments is to be read.
* images
Allows images generated in other segments to be reused with the current segment.

Note: If extensive indexing is to be used, then it is advisable to keep each <prefix> quite short. This is because the hyper-links in the index have text strings constructed from this <prefix>, when using the makeidx package. Having long names with multiply-indexed items results in an extremely inelegant, cumbersome index. See the section on indexing for more details.

* \startdocument
The \begin{document} and \end{document} statements are contained in the parent segment only. It follows that the child segments cannot be processed separately by LATEX without modification. However they can be processed separately by LATEX2HTML, provided it is told where the end of the LATEX preamble is; this is the function of the \startdocument directive. It substitutes for \begin{document} in child segments, but is otherwise ignored by both LATEX and LATEX2HTML.

* \htmlhead{<sec-type>}{<heading>}
This command is generated automatically by a \segment command. It is not normally placed in the document at all; instead it facilitates information being passed from parent to child via the <file>.ptr file.
It identifies to LATEX2HTML that the current segment is a LATEX sectional unit of type <sec-type>, with the specified heading.
This command is ignored by LATEX. From version V97.1, it is possible to use this command to insert extra section-headings, for use in the HTML version only.

* \htmlnohead
When placed at the top of the preamble of a document segment, the \htmlnohead command discards everything from the current page that has been placed already. Usually this will be just the section-head, from the \htmlhead command in the .ptr file. Numbering and color information is unaffected.
This allows an alternative heading to be specified, or no heading at all in special circumstances; e.g. the page contains a single large table with a caption.

* \segmentcolor{<model>}{<color>}
This command is generated automatically by a \segment command. It is not normally placed in the document at all; instead it facilitates information being passed from parent to child via the <file>.ptr file.
It specifies to LATEX2HTML that text in the document should have the color <color> .

* \segmentpagecolor{<model>}{<color>}
This command is generated automatically by a \segment command. It is not normally placed in the document at all; instead it facilitates information being passed from parent to child via the <file>.ptr file.
It specifies to LATEX2HTML that the background of the document should have the color <color> .

The use of the segmenting commands is best illustrated by the example below. You might want to check your segmented document for consistency using the -unsegment command line option.
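
Two of the less common commands, \segment* and \htmlnohead, can be sketched briefly before the full example. The file name bigtable.tex and its contents are hypothetical, and placing \htmlnohead directly after the \input of the .ptr file is an assumption about what "the top of the preamble" means here:

```latex
% --- In the parent: a segment kept out of the table of contents ---
\segment*{bigtable}{section}{Summary of results}

% --- bigtable.tex (hypothetical child segment) ---
\documentclass{article}
\usepackage{html}
\input{bigtable.ptr}  % counters and \htmlhead written by \segment*
\htmlnohead           % discard the heading supplied through bigtable.ptr
\startdocument
\begin{table}
  \caption{Summary of results}
  \begin{tabular}{ll}
    Run & Outcome \\
    1   & passed
  \end{tabular}
\end{table}
```

In the HTML version the page would then carry only the table and its caption; the LATEX version is unaffected, since the parent still executes the sectioning command.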


A Segmentation Example

The best way to illustrate document segmentation is through a simple example. Suppose that a document is to be segmented into one parent and two child segments. Let the parent segment be report.tex, and the two child segments be sec1.tex and sec2.tex. The latter are translated with filename prefixes of s1 and s2, respectively. This example is included with recent distributions of LATEX2HTML, with more extensive comments than are shown here.





The text of report.tex is as follows:
\documentclass{article}          % Must use LaTeX 2e
\usepackage{html,makeidx,color}

\internal[figure]{s1}            % Include internal information
\internal[figure]{s2}            % from children
\internal[sections]{s1}
\internal[sections]{s2}
\internal[contents]{s1}
\internal[contents]{s2}
\internal[index]{s1}
\internal[index]{s2}

\begin{document}                 % The start of the document
\title{A Segmentation Example}
\date{\today}
\maketitle
\tableofcontents
\listoffigures

% Process the child segments:

\segment{sec1}{section}{Section 1 title}
\segment{sec2}{section}{Section 2 title}
\printindex
\end{document}
This file obtains the information necessary to build an index, a table of contents and a list of figures from the child segments. It then proceeds to typeset these.





The first child segment sec1.tex is as follows:
\documentclass{article}
\usepackage{html,color,makeidx}
\input{sec1.ptr}
\internal{s2}
\startdocument
Here is some text.
\subsection{First subsection}
Here is subsection 1\label{first}.
\begin{figure}
\colorbox{red}{Some red text\index{Color text}}
\caption[List of figure caption]{Figure 1 caption}
\end{figure}
Reference\index{Reference} to \ref{second}.
The first thing this child segment does is establish the LATEX packages it requires, then loads the counter information that was written by the \segment command that invoked it. Since this segment contains a symbolic reference (second) to the second segment, it must load the internal labels from that segment.





The final segment sec2.tex is as follows:
\documentclass{article}
\usepackage{html,makeidx}
\input{sec2.ptr}
\internal{s1}
\startdocument
Here is another section\label{second}.
Plus another\index{Reference, another} reference\ref{first}.
\begin{figure}
\fbox{The figure}
\caption{The caption}
\end{figure}

This segment needs to load internal labels from the first one, because of the reference to `first'. These circular dependencies (two segments referencing each other) are either not allowed or handled incorrectly by the Unix utility make, without resorting to time stamps and some trickery. A time-stamp is a zero-length file whose only purpose is to record its creation time. Besides evaluating segment interdependence, another function of make is to provide inter-segment navigation information.





A sample Makefile is included in the distribution. This correctly generates the fully-linked document. The first time it is invoked, it runs:

Proper operation of make depends on the fact that LATEX2HTML updates its own internal label file only if something in its current program segment causes the labels to change from the previous run. This ensures that LATEX2HTML is not run unnecessarily. It is also usual for the information page to be suppressed by specifying -info 0 for all but the top-level document.

In the above example, all segments are built within the same sub-directory report/ of the directory containing the LATEX source files. This is achieved simply by using the option -dir report with each. All the images and <prefix><type>.pl files are created and stored within this directory.


Sometimes it is desirable to build one or more segments within separate sub-directories. This is especially so when a segment has a large number of images, or if it is required to be part of more than one combined document. In this case the -dir <dir> options can be different, or omitted entirely. For inter-segment referencing to work, a “relative path” must be included as part of the <prefix> with each \internal command; e.g.

\internal[figure]{../sect1/s1}