Special Features


This section describes the major features available for processing documents with LATEX2HTML. It first discusses how LATEX2HTML can be configured to produce output for the different versions of HTML, and then describes how to use languages other than English. The options available for creating and reusing images are presented next, for those situations where a textual representation is inadequate or undesirable.

There are several strategies available for the presentation of mathematics according to the desired version of HTML. These are discussed in some detail. Environments such as figure, table, tabular and minipage have special features. Other supported packages are listed.


Variation with HTML Versions


The Hypertext Mark-up Language (HTML) is an evolving standard, with different versions supporting different features. In order to make your documents viewable by the widest possible audience, you should use the most advanced HTML version with widely-accepted usage.


Sometimes the intended audience for a document is known to have limited browser capabilities; or special extended capabilities may be known to be available. The LATEX2HTML translation may be customised to suit the available functionality.


Other HTML versions and extensions supported by LATEX2HTML are described below. See the description of the -html_version command-line option switch.

* Version 2.0
This provides only the functionality of the HTML 2.0 standard. There is little provision for aligning headings, paragraphs or images nor for super/sub-scripts to be generated. Images are created for tables and other environments that use <TABLE> tags with HTML 3.2; e.g. eqnarray and equation with equation numbering.

* i18n (internationalised fonts) Version 2.1
This extension (formerly known as HTML version 2.1) provides extensions for internationalisation. Most importantly, the default character set is no longer ISO–8859–1 but ISO–10646 (Unicode). This is a 16-bit character set and can thus display a much larger set of characters. There are also provisions for bidirectional languages (e.g. in Arabic the text is written from right to left, but numerals from left to right), and provisions in HTML to determine the character set and the language used.

Not all of the symbols are available in TEX, LATEX2HTML, or any browser yet available. However the `i18n' extension to LATEX2HTML is in preparation for when such browsers do become available, and such characters will be required in Web-accessible documents.

* math (HTML3 model) Version 3.1
This extension (formerly referred to as HTML version 3.1) adds support for the HTML-Math model, originally part of the proposed HTML 3.0 standard, see above. The only available browser which can display this mark-up is Arena. Originally developed by the World Wide Web Consortium as a test-bed browser, it is no longer supported by them.

There has been a recent proposal for a Mathematical Markup Language (MathML) from the W3C Math Working Group. This would suggest that the HTML-Math model is unlikely ever to be adopted; better things being expected in the near future using MathML.

See also the section on Mathematics for a discussion of the mechanisms available with LATEX2HTML for handling mathematical equations and expressions.


Internationalisation


A special variable $LANGUAGE_TITLES in the initialisation or configuration files determines the language in which some section titles will appear. For example setting it to
$LANGUAGE_TITLES = 'french';
will cause LATEX2HTML to produce “Table des matières” instead of “Table of Contents”. Furthermore, the value of the \today command is presented in a format customary in that language.


Languages currently supported are finnish, french, english, german and spanish. It is trivial to add support for another language by creating a file in the styles/ subdirectory, or by adding to the file latex2html.config. As a guide, here is the entry for French titles:

sub french_titles {
    $toc_title = "Table des mati\\`eres";
    $lof_title = "Liste des figures";
    $lot_title = "Liste des tableaux";
    $idx_title = "Index";
    $ref_title = "R\\'ef\\'erences";
    $bib_title = "R\\'ef\\'erences";
    $abs_title = "R\\'esum\\'e";
    $app_title = "Annexe";
    $pre_title = "Pr\\'eface";
    $fig_name = "Figure";
    $tab_name = "Tableau";
    $part_name = "Partie";
    $prf_name = "Preuve";
    $child_name = "Sous-sections";
    $info_title = "\\`A propos de ce document...";
    @Month = ('', 'janvier', "f\\'evrier", 'mars', 'avril', 'mai',
              'juin', 'juillet', "ao\\^ut", 'septembre', 'octobre',
              'novembre', "d\\'ecembre");
    $GENERIC_WORDS = "a|au|aux|mais|ou|et|donc|or|ni|car|l|la|le|les"
        . "|c|ce|ces|un|une|d|de|du|des";
}
Notice how the backslash needs to be doubled when a macro is needed (for accented characters, say). Also, $GENERIC_WORDS is a list of short words to be excluded when filenames are specially requested to be created from section-headings. In order to provide full support for another language you may also replace the navigation buttons which come with LATEX2HTML (by default in English) with your own. As long as the new buttons have the same file-names as the old ones, there should not be a problem.


Alternate Character Encodings


By default, LATEX2HTML assumes that input files are Unicode, encoded as UTF-8, and produces Unicode UTF-8 output.

LATEX2HTML can handle input files in other encodings, indicated by including the inputenc package in the source:

 \usepackage[latin5]{inputenc}
In this case, LATEX2HTML will produce output in the same encoding, and will indicate the encoding in the HTML headers. The input encodings that are recognised are listed in the following table.


Table 1: Supported Font-encodings
extension   notes       encoding
unicode     (default)   ISO–10646 (Unicode)
latin1                  ISO–8859–1 (ISO-Latin-1)
latin2                  ISO–8859–2 (ISO-Latin-2)
latin3                  ISO–8859–3 (ISO-Latin-3)
latin4                  ISO–8859–4 (ISO-Latin-4)
latin5                  ISO–8859–9 (ISO-Latin-5)
latin6                  ISO–8859–10 (ISO-Latin-6)
koi8-r                  RFC 1489 (Russian)



Multi-lingual documents, using Images

Some multi-lingual documents can be constructed when all the languages can be presented using characters from a single font-encoding, as discussed in the previous section.

Another way to present multiple languages within a Web document is to create images of individual letters, words, sentences, paragraphs or even larger portions of text, which cannot be displayed within the chosen font-encoding. This is a technique that is used with IndicTEX/HTML, for presenting traditional Indic language scripts within Web pages. For these the LATEX source that is to be presented as an image needs special treatment using a “pre-processor”. For the special styles defined in IndicTEX/HTML, running the preprocessor is fully automated, so that it becomes just another step within the entire image-generation process.


The technique of using images can be applied to any font whose glyphs can be typeset using TEX or LATEX. Using TEX's \font command, a macro is defined to declare the special font required; e.g. for Cyrillic characters, using the University of Washington font:

 \font\wncyr = wncyr10

Now use this font switch immediately surrounded by braces:

 published by {\wncyr Rus\-ski\char26\ \char23zyk}.
to get the same sentence with the Cyrillic words shown as an inline image.


Mathematics

There are several ways in which LATEX2HTML can handle mathematical expressions and formulas. Which is most appropriate normally depends on the context, or on the importance of the mathematics within the entire document. What LATEX2HTML will produce depends upon
  1. the version of HTML requested;
  2. whether or not the special `math' extension has been loaded;
  3. whether the -no_math command-line option has been specified, or (equivalently) the $NO_SIMPLE_MATH variable has been set in an initialisation file.
The strategies used to translate math expressions are summarised in the table below for HTML 3.0+ and the subsequent table for HTML 2.0.


Table 2: Mathematics translation strategies, for HTML versions 3.0 and 3.2, using <SUP> and <SUB> tags and <TABLE>s
`math'       switch     strategy adopted
not loaded              textual representation where possible, else image of whole expressions
not loaded   -no_math   always generates an image of the whole expression/environment
loaded                  uses entities and <MATH> tags; e.g. for HTML-Math (or MathML in future)
loaded       -no_math   textual representation where possible, with images of sub-expressions

Using the -no_math switch is best for having a consistent style used for all mathematical expressions, whether inline or in displays. The images are of especially good quality when “anti-aliasing” is being used (see the section on anti-aliased images below), provided the browser is set to have a light background colour. (When set against a gray or dark background, these images can become rather faint and hard to read.)

The final strategy above, using -no_math with the `math' extension loaded, is the preferred method for good quality mathematics with HTML version 3.2. It combines the browser's built-in fonts with the best quality images, when needed. To obtain it use the command-line switches:

-no_math -html_version 3.2,math
This is what was used when creating this manual. Examples below show how to generate an image of a whole environment, even with these options in force.


Since the HTML 2.0 standard does not include superscripts and subscripts, via the <SUP> and <SUB> tags, the options are more limited. In this case creating images of sub-expressions is not so attractive, since virtually the whole expression would consist of images in all but the simplest of cases.


Table 3: Mathematics translation strategies, for HTML version 2.0
`math'       switch     strategy adopted
not loaded              textual representation where possible, else image of whole expressions
not loaded   -no_math   always generates an image of the whole expression or environment
loaded                  entities and <MATH> tags for HTML-Math
loaded       -no_math   always generates an image of the whole expression or environment






Here are some examples of mathematical expressions and environments processed by LATEX2HTML using different strategies. They are automatically numbered ...

\begin{equation}
\Phi_{l+1,m,n} = \Bigl(\Phi + h\frac{\partial\Phi}{\partial x}
    + \frac{1}{2}h^2\frac{\partial^2\Phi}{\partial x^2}
    + \frac{1}{6}h^3\frac{\partial^3\Phi}{\partial x^3} + \,\ldots\,\Bigr)_{l,m,n}
\end{equation} (1)
... with some gratuitously ácçënted text in-between ...
\begin{eqnarray}
\frac{\Phi_{l+1,m,n}-2\Phi_{l,m,n}+\Phi_{l-1,m,n}}{h^{2}} +
\frac{\Phi_{l,m+1,n}-2\Phi_{l,m,n}+\Phi_{l,m-1,n}}{h^{2}} \nonumber \\
+ \frac{\Phi_{l,m,n+1}-2\Phi_{l,m,n}+\Phi_{l,m,n-1}}{h^{2}} = -I_{l,m,n}(v)\;.
\end{eqnarray} (2)

The latter example uses an eqnarray environment and the \nonumber command to suppress the equation number on the upper line.


Notice how simple alphabetic characters that are not part of fractions appear in the (italicised) text-font selected using the browser's controls. This may appear slightly different from the same symbol being used within a fraction, or other mathematical construction requiring an image to be generated. This is most apparent with the letter `h' in the first equation and the subscripts at the end of the second equation.

By inserting an \htmlimage{} command into a math, equation or displaymath environment, a single image will be created for the whole environment. For an eqnarray environment, this will lead to having a single separate image for each of the aligned portions. The argument to \htmlimage need not be empty, but may contain information which is used to affect characteristics of the resulting image. An example of how this is used is given below, and a fuller discussion of the allowable options is given in the next section.


Scale-factors for Mathematics.

When an image is to be made of a mathematical formula or expression, it is generally made at a larger size than is normally required on a printed page. This is to compensate for the reduced resolution of a computer screen compared with laser-print. The amount of this scaling is given by the value of a configuration variable $MATH_SCALE_FACTOR, by default set to 1 in latex2html.config. A further variable $DISP_SCALE_FACTOR is used with `displayed math' equations and formulas. This value multiplies the $MATH_SCALE_FACTOR to give the actual scaling to be used. The main purpose of this extra scaling is to allow some clarity in super/subscripts etc.


Anti-aliased Images.

Here are the same equations as previously, this time as images of the complete contents of the equation environment, and complete aligned parts of rows in an eqnarray. For a comparison, the second group of images use anti-aliasing effects, whereas the first image does not; a 600 dpi printing is probably necessary to appreciate the difference in quality. Compare these images with those in a later section.

Note: To generate anti-aliased images using Ghostscript requires version 4.03 or later.

Figure 1: Images of equation displays, at normal screen resolution
 

[Images: the displayed equation (3), made without anti-aliasing, followed by the aligned rows of the eqnarray, made with anti-aliasing.]


These images of the whole environment were created using the \htmlimage command, to suppress the extended parsing that usually occurs when the `math' extension is loaded; viz.

\begin{equation}
\htmlimage{no_antialias}
\Phi_{l+1,m,n} = \Bigl(\Phi+h\frac{\partial\Phi}{\partial x} +
...
\end{equation}
%
\begin{eqnarray}
\htmlimage{}
\frac{\Phi_{l+1,m,n}-2\Phi_{l,m,n}+\Phi_{l-1,m,n}}{h^{2}} +
...
\end{eqnarray}
Further aspects of the options available when generating images are discussed in the next section, in particular with regard to the quality of printed images.

The \mbox command.

Another way to force an image to be created of a mathematical expression, when global settings are not such as to do this anyway, is via the \mbox command having math delimiters within its argument.

Normally \mbox is used to set a piece of ordinary text within a mathematics environment. It is not usual to have math delimiters $...$ or \(...\) within the argument of an \mbox. Whereas earlier versions of LATEX2HTML simply ignored the \mbox command (treating its argument as normal text), the presence of such delimiters now results in an image being generated of the entire contents of the \mbox. It is not necessary for there to be any actual mathematics inside the \mbox's contents;
e.g. \mbox{...some text...${}$} will cause an image to be created of the given text.
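
As a minimal sketch of the two uses just described (the sentence text and the fraction are illustrative placeholders, not taken from the manual), source such as the following should cause LATEX2HTML to make an image of each \mbox:

```latex
% A small expression forced into a single image by math delimiters inside \mbox
The ratio \mbox{$\frac{a+b}{c}$} appears as one inline image.

% No real mathematics is needed: the empty math group ${}$ triggers the image
\mbox{Text typeset by \LaTeX{} itself${}$}
```

In the printed version both lines behave as ordinary \mbox commands, so the source remains valid LATEX.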

The \parbox command.

The \parbox[<align>]{<width>}{<text>} command also generates an image of its contents, except when used within a tabular environment, or other similar table-making environment. Here the important aspect is the width specified for the given piece of text, and any special line-breaks or alignments that this may imply. Hence to get the best effect, LATEX is used to typeset the complete \parbox, with its specified width, alignment and contents, resulting in an image.
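
A short sketch of such a \parbox follows; the width, alignment and wording are arbitrary choices made only for illustration. Used outside a tabular or similar environment, the whole box would be expected to become a single image.

```latex
% A top-aligned box, 5cm wide, with an explicit line break.
% LaTeX typesets the box; the result is included as one image.
\parbox[t]{5cm}{\raggedright
  This note is set to a fixed width,\\
  so its line breaks survive in the Web page.}
```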


The heqn package.

If you need HTML 2.0 compatible Web pages, and have a document with a great many displayed equations, then you might try using the heqn package. Inclusion of the heqn.sty file has absolutely no effect on the printed version of the article, but it does change the way in which LATEX2HTML translates displayed equations and equation arrays. It causes the equation numbers of the equation environment to be moved outside of the images themselves, so that they become order-independent and hence recyclable. Images that result from the eqnarray environment are also recyclable, so long as their equation numbers remain unchanged from the previous run.


The \nonumber command is recognised in each line of the equation array, to suppress the equation number. A side-effect of this approach is that equation numbers will appear on the left side of the page. The heqn package requires the html package.
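
The following is a minimal sketch of a document prepared for heqn, assuming that html.sty and heqn.sty are on the TEX input path and that the translation is run with -html_version 2.0. The equations are placeholders chosen only to show one numbered display and one eqnarray with a suppressed number.

```latex
\documentclass{article}
\usepackage{html}   % required by heqn
\usepackage{heqn}   % no effect on the printed output
\begin{document}
\begin{equation}
  a = b + c
\end{equation}
\begin{eqnarray}
  x &=& y + z \nonumber \\   % equation number suppressed on this line
  u &=& v + w
\end{eqnarray}
\end{document}
```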

With HTML version 3.2 the heqn package is redundant, since equation numbers are placed in a separate <TABLE> cell from the mathematical expressions themselves. It is not required and should not be requested, since it will override some of the improved functionality already available.


Figures and Image Conversion


LATEX2HTML converts equations, special accents, external PostScript files, and LATEX environments it cannot directly translate into inlined images. This section describes how it is possible to control the final appearance of such images. For purposes of discussion ...


These parameters apply only to bitmapped image types, and have no effect with the default SVG image type. The size of all “small images” depends on a configuration variable $MATH_SCALE_FACTOR which specifies how much to enlarge or reduce them in relation to their original size in the PostScript version of the document. For example a scale-factor of 0.5 will make all images half as big, while a scale-factor of 2 will make them twice as big. Larger scale-factors result in longer processing times and larger intermediate image files. A scale-factor will only be effective if it is greater than 0. The configuration variable $FIGURE_SCALE_FACTOR performs a similar function for “figures”. Both of these variables are initially set to have value 1.

A further variable $DISP_SCALE_FACTOR is used with `displayed math' equations and formulas; this value multiplies the $MATH_SCALE_FACTOR to give the actual scaling used. Values greater than 1 can be used to counteract readability problems with bitmapped images. Accordingly this manual actually uses values of 1.4 and 1.2 respectively, for $MATH_SCALE_FACTOR and $DISP_SCALE_FACTOR. These go well with the browser's text-font set at 14 pt. The next larger size of 17 pt is then used for the <LARGE> tags in displayed equations.
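
As a worked instance of the multiplication described above, using the values quoted for this manual (the symbols are shorthand introduced here, not names used by LATEX2HTML):

```latex
% s_math and s_disp stand for $MATH_SCALE_FACTOR and $DISP_SCALE_FACTOR
\[
  s_{\mathrm{inline}} = s_{\mathrm{math}} = 1.4, \qquad
  s_{\mathrm{display}} = s_{\mathrm{math}} \times s_{\mathrm{disp}}
                       = 1.4 \times 1.2 = 1.68
\]
```

So inline mathematics images are enlarged by a factor of 1.4, and displayed mathematics by 1.68.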


A further variable $EXTRA_IMAGE_SCALE allows images to be created at a larger size than intended for display. The browser itself scales them down to the intended size, but has the extra information available for a better quality print. This feature is also available with single images. It is discussed, with examples, in the section on the quality of printed images below.


\htmlimage{<options>}

For finer control, several parameters affecting the conversion of a single image can be controlled with the command \htmlimage, which is defined in html.sty. With version V97.1 use of this command has been extended to allow it to control whether an image is generated or not for some environments, as well as specifying effects to be used when creating this image.

If an \htmlimage command appears within any environment for which creating an image is a possible strategy (though not usual, due to loading of extensions, say), then an image will indeed be created. Any effects requested in the <options> argument will be used. Having empty <options> still causes the image to be generated.

This ability has been used within this manual, for example with the mathematics images in the previous section.


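The <options> argument is a comma-separated list of options. The options that appear in the examples in this section include thumbnail=<factor>, scale=<factor>, extrascale=<factor> and no_antialias.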


In order to be effective the \htmlimage command and its options must be placed inside the environment on which it will operate. Environments for alignment and changing the font size do not generate images of their contents. Any \htmlimage command may affect the surrounding environment instead; e.g. within a table or figure environment, but does not apply to a minipage.

When the \htmlimage command occurs in an inappropriate place, the following message is printed among the warnings at the end of processing. The actual command is shown, with its argument; also the environment name and identifying number, if there is one.

The command "\htmlimage" is only effective inside an environment 
which may generate an image (e.g. "{figure}", "{equation}")
 center92: \htmlimage{ ... }


An Embedded Image Example


The effect of the LATEX commands below can be seen in the thumbnail sketch of Figure 2. A 5 pt border has also been added around the thumbnail, using the \htmlborder command; this gives a pseudo-3D effect in some browsers.
\begin{figure}
    \htmlimage{thumbnail=0.5}
    \htmlborder{5}
    \centering \includegraphics[width=5in]{psfiles/figure}
    \latex{\addtocounter{footnote}{-1}}
    \caption{A sample figure showing part of a page generated by
       \latextohtml{} containing a customised navigation panel 
       (from the 
        CSEP project).}\label{fig:example}
\end{figure}

Figure 2: A sample figure showing part of a page generated by LATEX2HTML containing a customised navigation panel (from the CSEP project).
[Image: thumbnail of the sample figure.]

The \htmlimage command is also often useful to cancel out the effect of the configuration variable $FIGURE_SCALE_FACTOR. For example, to avoid resizing a color screen snapshot despite the value of $FIGURE_SCALE_FACTOR, it is possible to use \htmlimage{scale=0}.
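
A sketch of how this might look in a source file is given here; the file name and caption are hypothetical, and the figure otherwise follows the pattern of the embedded image example above.

```latex
\begin{figure}
    \htmlimage{scale=0}   % ignore $FIGURE_SCALE_FACTOR for this image only
    \centering \includegraphics{psfiles/screenshot}   % hypothetical file name
    \caption{A screen snapshot shown at its original size.}
\end{figure}
```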


Image Sharing and Recycling

It is not hard to see how reasonably sized papers, especially scientific articles, can require the use of many hundreds of external images. For this reason, image sharing and recycling is of critical importance. In this context, “sharing” refers to the use of one image in more than one place in an article. “Recycling” refers to the use of an image left over from a previous run of LATEX2HTML. Without this ability, every instance of an image would have to be regenerated each time even the slightest change were made to the document.


All types of images can be shared. These include “small images” and figures with or without thumbnails and image-maps. Furthermore, most images can also be reused. The only exceptions are those which are order-sensitive, meaning that their content depends upon their location. Examples of order-sensitive images are equation and eqnarray environments, when -html_version 2.0 has been specified; this is because their equation numbers are part of the image.


Figures and tables with captions, on the other hand, are order-insensitive because the figure numbers are not part of the image itself. Similarly, when HTML 3.2 code is being produced, equation numbers are no longer part of the image. Instead they are placed in a separate cell of a <TABLE>. So most images of mathematical formulas can be reused also.


Quality of Printed Images

Since it is often desirable to get a good quality print on paper directly from the browser, here are the same equations as earlier. This time the `extrascale=' option has been used with a value of 1.5. More than twice the number of pixels are available, for a cost of approximately 1.7 times the disk-space.
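
A minimal sketch of requesting this extra scaling for a single environment is shown here, assuming the option is passed through \htmlimage in the same way as the other options in this section; the formula is a placeholder.

```latex
\begin{equation}
\htmlimage{extrascale=1.5}   % image made 1.5 times larger, scaled down by the browser
E = mc^2                     % placeholder formula
\end{equation}
```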

Figure 3: Displayed math environments with extra-scale of 1.5
 

[Images: the displayed equation (4) and the aligned rows of the eqnarray, made with extrascale=1.5.]


On-screen these images appear slightly blurred or indistinct. However there can be marked improvement in the print quality, when printed from some browsers; others may show no improvement at all. The “anti-aliasing” helps on-screen. In the printed version jagged edges are indeed softened, but leave an overall fuzziness.

Here are the same equations yet again; this time with `extrascale=2.0'. Now there are 4 times the pixels at a cost of roughly 2.45 times the disk space. Compared with the previous images (having 1.5 times extra-scaling), there is little difference in the on-screen images. Printing at 300 dpi shows only a marginal improvement; but at 600 dpi the results are most satisfying, especially when scaled to be comparable with normal 10 pt type.
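
The pixel counts quoted for the two extra-scale values can be checked with a line of arithmetic, on the assumption that the extra-scale factor is applied equally to the width and the height of the image (the symbols are introduced here for illustration only):

```latex
% e is the extra-scale factor; N_0 is the pixel count without extra scaling
\[
  N(e) = e^{2} N_0, \qquad
  N(1.5) = 2.25\,N_0, \qquad
  N(2.0) = 4\,N_0
\]
```

This agrees with “more than twice” the pixels at 1.5 and 4 times the pixels at 2.0; the disk-space figures are smaller than these ratios, presumably because the image files are compressed.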

Figure 4: Displayed math environments with extra-scale of 2.0
 

[Images: the displayed equation (5) and the aligned rows of the eqnarray, made with extrascale=2.0.]



Figures, Tables and Arbitrary Images


This section explains how the translator handles figures, tables and other environments. Compare the paper with the online version.

When the common version of HTML was only 2.0, then almost all complicated environments were represented using images. However with HTML 3.2, there is scope for sensible layout of tables, and proper facilities for associating a caption with a figure or table. To take advantage of this, the figure environment now has its contents placed within <TABLE> tags; any caption is placed as its <CAPTION>.

For consistency with former practice, the contents of the figure environment are usually represented by generating an image. This is frequently exactly what is required; but not always. The makeimage environment, defined in the html.sty package, can be used to determine just which parts (if any) of a figure environment's contents should be made into images, the remainder being treated as ordinary text, etc.; its use is described in another section.
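
As a rough sketch of the idea, assuming the html and graphicx packages are loaded, a figure might mix an image and ordinary text as follows. The file name, caption and wording are hypothetical, and the exact behaviour should be checked against the section describing makeimage.

```latex
\begin{figure}
  \begin{makeimage}
    % only this part is intended to be turned into an image
    \includegraphics[width=3in]{psfiles/diagram}   % hypothetical file name
  \end{makeimage}

  This sentence is intended to be translated as ordinary text,
  not as part of the image.
  \caption{A figure whose contents are only partly an image.}
\end{figure}
```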



table and tabular environments.

Similarly the makeimage environment can be used within a table, though usually this is used with a tabular or other table-making environment, such as tabbing or longtable or supertabular. Here is a simple example, from the LATEX `blue book'.


Table 4: A sample table taken from [1]
gnats       gram      $13.65
            each         .01
gnu         stuffed    92.50
emur                   33.33
armadillo   frozen      8.99


When using -html_version 2.0 to get code compatible with the HTML 2.0 standard, an image is made of the table, as follows:

Table 5: Alternate view of the table from [1]
[Image of the tabular environment shown in Table 4.]




minipage environments.

The special feature of minipage environments is in the way \footnote and \footnotemark commands are handled. These are numbered separately from the rest of the footnotes throughout the document, and the notes themselves are collected together to be displayed at the end of the minipage's contents.


Variable   Meaning
none       none
Jacobi     m-step Jacobi iteration¹
SSOR       m-step SSOR iteration¹
IC         Incomplete Cholesky factorization²
ILU        Incomplete LU factorization²
¹ one footnote
² another footnote


The code used for this example was as follows:

\begin{minipage}{.9\textwidth}
\renewcommand{\thempfootnote}{\alph{mpfootnote}}
\begin{tabular}{|l|l|} \hline
\textbf{Variable} & \textbf{Meaning} \\ \hline
none      & none                   \\
Jacobi    & $m$-step Jacobi iteration\footnote[1]{one footnote} \\
SSOR      & $m$-step SSOR iteration\footnotemark[1] \\
IC        & Incomplete Cholesky factorization\footnote[2]{another footnote} \\
ILU       & Incomplete LU factorization\footnotemark[2] \\ \hline
\end{tabular}
\end{minipage}


Warning: With some figures, especially when containing graphics imported using \includegraphics or other special macros, the background color may come out as a shade of grey, rather than white or transparent. This is due to a setting designed to enhance anti-aliasing of text within images; e.g. for mathematics. To alleviate this possible problem, the -white command-line option can be used, to ensure a white background for images of figure environments. Alternatively, set the $WHITE_BACKGROUND variable.


Document Classes and Options

In general the standard LATEX document-classes (article, report, book, letter, slides) are translated by LATEX2HTML in the same way. Currently the only real difference is with the display of section-numbering, when the -show_section_numbers switch is used, and when numbering of theorem-like environments is linked to section-numbering.

These differences are achieved using a mechanism that automatically loads a file: article.perl, report.perl, book.perl, letter.perl, slides.perl according to the requested document-class. These files contain Perl code and are located in the styles/ directory. If a file of the same name exists in the working directory, this will be loaded instead.

Typically such <class>.perl files contain code to define subroutines or set values for variables that will affect how certain translations are performed. There can be code that is executed only when specific class-options are specified along with the chosen document-class. For example, the foils.perl implementation of FoilTeX's foils class defines code to create a new sub-section for each `foil'. It also has code which allows LATEX2HTML to ignore those of FoilTeX's special formatting commands that have no relevance when constructing an HTML page.


Any options given on the \documentclass or \documentstyle line may also cause a file containing Perl code to be loaded. Such a file is named <option>.perl for the appropriate <option>. When such a file exists, in the local directory or in the styles/ directory, it typically contains Perl code to define subroutines or set values for variables that will affect how certain translations are performed. There can be code that is executed only for specific document-classes.

Since the files for class-options are loaded after those for the document-class, it is possible for the <option>.perl file to contain code that overrides settings made within the document-class file.


If a file named <class>_<option>.perl exists for a given combination of document-class <class> and class-option <option>, then this will be loaded. Its contents are read and executed in place of any <class>_<option> specific code that may be contained in <class>.perl or <option>.perl.


Currently there are no special option or class-option files provided with the LATEX2HTML distribution. It is hoped that users will identify ways that specific features can be improved or adapted to specific classes of documents, and will write such files themselves, perhaps submitting them for general distribution.


Note: This mechanism for handling code specific to different document classes and class-options is more general than that employed by LATEX2e. New options can be defined for document-classes generally, or for specific classes, without the need to have corresponding .sty or .clo files. LATEX simply notes the existence of unsupported options; processing is not interrupted.


Packages and Style-Files

Similar to the document-class mechanism described in the previous section, LATEX2HTML provides a mechanism whereby the code to translate specific packages and style-files is automatically loaded, if such code is available. For example, when use of a style such as german.sty is detected in a LATEX source document, the translator looks for a corresponding .perl file having the same file-name prefix; e.g. the file $LATEX2HTMLDIR/styles/german.perl. If such a .perl file is found, then its code will be incorporated with the main script, to be used as required.


This mechanism helps to keep the core script smaller, as well as making it easier for others to contribute and share solutions on how to translate specific style-files. The current distribution includes the files to support the styles listed in the table below. These provide good examples of how you can create further extensions to LATEX2HTML.

Table 6: Supported LATEX2HTML packages and style-files.
.perl file      Description
alltt           Supports the LATEX2e alltt package.
amsfonts        Provides recognition of the special AMS font symbols.
amsmath         Same as amstex.perl.
amssymb         Same as amsfonts.perl.
amstex          Supports much of the AMS-LATEX package (not yet complete).
babel           Interface to german.perl via the babel package.
changebar       Provides rudimentary change-bar support.
chemsym         Defines the standard atomic symbols.
color           Causes colored text to be processed as ordinary text by LATEX2HTML.
colordvi        Supports the Crayola colors.
enumerate       Supports structured labels for enumerate environments.
epsbox          Processes embedded figures not enclosed in a figure environment.
epsfig          Processes embedded figures not enclosed in a figure environment.
finnish         Support for the Finnish language.
floatfig        Processes floating figures.
floatflt        Processes floating figures and tables.
foils           Supports the FoilTeX system.
frames          Provides separate frames for navigation and footnotes.
francais        Support for the French language, same as french.perl.
french          Support for the French language.
german          Support for the German language.
germanb         Support for the German language, same as german.perl.
graphics        Supports commands in the graphics package.
graphicx        Supports the alternate syntax of graphics commands.
harvard         Supports the harvard style of citation (same as fnnharvard.perl).
heqn            Alters the way displayed equations are processed.
hthtml          Gives an alternative syntax for specifying hyperlinks, etc.
htmllist        Provides support for fancy lists.
justify         Supports paragraph alignment; no longer needed.
latexsym        Supports the LATEX symbol font.
lgrind          Macros for nice layout of computer program code.
longtable       Supports use of long tables, as a single table.
makeidx         Provides more sophisticated indexing.
multicol        Suppresses requests for multi-columns.
natbib          Supports many different styles for citations and bibliographies.
nharvard        Supports harvard-style citations, using natbib.
seminar         For creation of overhead-presentation slides.
spanish         Support for the Spanish language.
supertabular    Supports use of super-tables, as ordinary tables.
texdefs         Supports some raw TEX commands.
verbatim        Supports verbatim input of files.
verbatimfiles   Supports verbatim input of files, also with line-numbering.
wrapfig         Supports wrapped figures.
xspace          Supports use of the xspace package and \xspace command.
xy              Supports use of the XY-pic graphics package.

The problem, however, is that writing such extensions requires an understanding of Perl programming and of the way the processing in LATEX2HTML is organised. Interfaces that are more “user-friendly” are being investigated. Some of the techniques currently used are explained in a later section.


Fancy List-Markers

An optional style-file htmllist.sty has been provided which produces fancier lists in the electronic version of the document, such as this. This file defines a new LATEX environment htmllist, which causes a user-defined item-mark to be placed at each new item of the list, and which causes the optional description to be displayed in bold letters. The filename prefix for the item-mark image can be given as an optional parameter; see the example below. The images distributed with LATEX2HTML for this purpose are listed with the description of the \htmlitemmark command, which provides an alternative means of choosing the item-mark, and allows the image to be changed for different items in the list.


The mark is determined by the \htmlitemmark{<item-mark>} command. This command accepts either a mnemonic name for the <item-mark>, from a list of icons established at installation, or the URL of a mark not in the installation list. The command \htmlitemmark must be used inside the htmllist environment in order to be effective, and it may be used more than once to change the mark within the list. The item-marks supplied with LATEX2HTML are BlueBall, RedBall, OrangeBall, GreenBall, PinkBall, PurpleBall, WhiteBall and YellowBall. The htmllist environment is identical to the description environment in the printed version.

An example of its usage is:

\begin{htmllist}[WhiteBall]
\item[Item 1:] This will have a white ball.
\item[Item 2:] This will also have a white ball.
\htmlitemmark{RedBall}%
\item[Item 3:] This will have a red ball.
\end{htmllist}

This will produce:

* Item 1:
This will have a white ball.
* Item 2:
This will also have a white ball.
* Item 3:
This will have a red ball.
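
Since \htmlitemmark also accepts a URL, a list can mix the supplied icons with an image of one's own. The following is a sketch only: the URL is a placeholder rather than a real icon, and the item texts are invented for illustration.

```latex
\documentclass{article}
\usepackage{htmllist}
\begin{document}
\begin{htmllist}[GreenBall]
\item[Stable:] Marked with one of the supplied icons.
% The URL below is a placeholder, not a real image.
\htmlitemmark{http://www.example.org/icons/star.gif}%
\item[Experimental:] Marked with the image found at the URL.
\htmlitemmark{GreenBall}%
\item[Stable again:] Back to the supplied icon.
\end{htmllist}
\end{document}
```

In the printed version all three items appear as in an ordinary description environment, since the marks only affect the electronic version.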


One can also obtain LATEX2e style-files floatfig.sty and wrapfig.sty, which provide support for the floatingfigure and wrapfigure environments, respectively. These environments allow text to wrap around a figure in the printed version, but are treated exactly as ordinary figures in the electronic version. They are described in The LATEX Companion.
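
A minimal use of the wrapfigure environment might look like this. The sketch assumes that wrapfig.sty is installed; the framed box is just a stand-in for a real graphic.

```latex
\documentclass{article}
\usepackage{wrapfig}
\begin{document}
\begin{wrapfigure}{r}{0.4\textwidth}
  \centering
  % Placeholder box, 3cm wide and 2cm high, in place of a graphic.
  \fbox{\rule{0pt}{2cm}\hspace{3cm}}
  \caption{A wrapped figure.}
\end{wrapfigure}
In the printed version this paragraph flows around the figure, which is
set at the right-hand margin. In the electronic version the same source
is handled like an ordinary figure environment.
\end{document}
```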


Support for FoilTeX

The FoilTeX system presents some additional problems for LATEX2HTML. The package foils.perl deals with these problems. It treats foils as starred subsections and ignores FoilTeX-specific commands that have no meaning for HTML, like \LogoOn. The header \documentclass[+options]{foils} in the images.tex file is substituted by the header \documentclass[$FOILOPTIONS]{$FOILCLASS}, where the variables $FOILOPTIONS and $FOILCLASS can be set in the configuration file (by default they are '10pt' and 'article' respectively). A further variable $FOILHEADLEVEL holds the level of sectioning to which a `foil' is to correspond; the default level is 4 (sub-section).
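
To make the header substitution concrete, the sketch below shows a small FoilTeX source and, in comments, the header that would result in images.tex under the default settings stated above. The class-options and foil contents are illustrative choices only.

```latex
% Header as written in the FoilTeX source document:
\documentclass[20pt,landscape]{foils}
% With the defaults $FOILOPTIONS = '10pt' and $FOILCLASS = 'article',
% the header used in images.tex would instead read:
%   \documentclass[10pt]{article}
\begin{document}
\foilhead{First foil}
This foil is treated as a starred subsection.
\LogoOn  % FoilTeX-specific; has no meaning for HTML and is ignored
\end{document}
```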

The LATEX style file foilhtml.sty in the texinputs/ directory provides some additional features for FoilTeX. It implements structural markup commands like \section, \tableofcontents for foils. See the directory docs/foilhtml/ for the details.


Indicating Differences between Document Versions

LATEX2HTML supports the LATEX2e changebar.sty package, written by Johannes Braams <JLBraams@cistron.nl>, for inserting change-bars in a document in order to indicate differences from previous versions. This is a very primitive form of version control and there is much scope for improvement.

Within the LATEX version of this manual two thicknesses of change-bar have been used. Thicker bars indicate changes introduced with version V97.1, while thinner bars indicate earlier additions since V96.1.

Within the HTML version, the graphic icons representing the change-bars can be followed by some text indicating the new version, so that the different revisions are clearly indicated with explicit numbering. This is used repeatedly throughout this manual. It is achieved using the command \cbversion{<version>}, immediately following the \begin{changebar}. This sets a variable $cb_version to be used both at the beginning and end of the environment. The value of this variable is retained, to be used with other changebar environments, unless changed explicitly by another occurrence of \cbversion.

Warning: LATEX2HTML will not correctly process changebar environments that contain sectioning commands, even when the (sub)sections or (sub)paragraphs are to occur on the same HTML page. If this is required, use a separate changebar environment within each (sub)section or (sub)paragraph.
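
The following sketch shows \cbversion in use, with a separate changebar environment inside each section as the warning above requires. It assumes that changebar.sty is installed; the \providecommand line is a precaution added here, in case none of the loaded packages defines \cbversion for the printed version. Section titles and version number are illustrative.

```latex
\documentclass{article}
\usepackage{changebar}
% Precaution (an assumption of this sketch): make \cbversion a no-op
% in the printed version if it is not already defined.
\providecommand{\cbversion}[1]{}
\begin{document}
\section{Installation}
\begin{changebar}
\cbversion{97.1}
Text that is new in this section.
\end{changebar}

\section{Configuration}
\begin{changebar}
% No \cbversion here: the value set above is retained.
Text that is new in this second section.
\end{changebar}
\end{document}
```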


Indexing

LATEX2HTML automatically produces an Index consisting of the arguments to all \index commands encountered, if there are any. A hyperlink is created to that point in the text where the \index command occurred.

More sophisticated indexing is available by loading the makeidx package. Most of the features described in [1, Appendix A] become available. This includes:

* styled entries, using `@' :
Entries of the form \index{<sort-key>@<styled-text>} produce <styled-text> as the entry, but sorted according to <sort-key>.

* hierarchical entries, using `!' :
Entries of the form \index{<item>!<sub-item>} set the <sub-item> indented below the <item>. Unlimited levels of hierarchy are possible, even though LATEX is limited to only 3 levels. The <sort-key>@<styled-text> can be used at each level.

* explicit ranges, using `|(' and `|)' :
This is perhaps more useful in the LATEX version. In the HTML version these simply insert the words “from” and “to”, respectively, prior to the hyperlink to where the index-entry occurs.

* |see{<index-entry>} :
provides a textual reference to another indexed word or phrase, by inserting the word “see”. This can be used in conjunction with \htmlref to create a hyperlink to the <index-entry>; viz.
\index{latexe@\LaTeXe |see{\htmlref{\LaTeX}{IIIlatex}}}
where a \label has been specified in some other index-entry, as follows:
\index{latex@\LaTeX\label{IIIlatex}}

* |emph :
is handled correctly, by applying \emph to the text of the generated hyperlink.

* |<style> :
where <style> is the name of LATEX style-changing command, without the initial `\'; e.g. `emph', `textbf', `textit', etc. The corresponding LATEX command is applied to the text of the generated hyperlink.

* blank lines and alphabetization:
Having precisely a single space-character after the | (e.g. \index{A| }) places a blank line before the index entry and omits the hyperlink. This is used mainly for visual formatting; it allows a break before the entries starting with each letter, say. Using a printable-key, as in \index{Q@Q, R| }, is appropriate when there are no indexed words starting with `Q', say.

* quoted delimiters:
The three special delimiters can be used within the printable portion, if preceded by the double-quote character: "@, "|, "! and also "" for the quote character itself. Also \" produces an umlaut accent on the following character, when appropriate, else is ignored.
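
The forms listed above can be combined in a single document. The entries below are invented for illustration; the sketch assumes only the standard makeidx package.

```latex
\documentclass{article}
\usepackage{makeidx}
\makeindex
\begin{document}
Some text.%
\index{latex2html@\textsc{latex2html}}% styled entry, sorted as `latex2html'
\index{images!inlined}%                 sub-item `inlined' below `images'
\index{images!reuse@\emph{reuse}}%      sort-key@styled-text at a sub-level
\index{mathematics|(}%                  start of an explicit range
\index{translation|textbf}%             \textbf applied to the link text
\index{T| }%                            blank line before the `T' entries

More text.%
\index{mathematics|)}%                  end of the explicit range
\index{at-sign@the "@ character}%       quoted delimiter in the printable part

\printindex
\end{document}
```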


Furthermore, the printable part of an index entry can contain HTML anchors; that is, hyperlinks and/or \label{...}s. This allows index entries to contain cross-links to other entries, for example, as well as allowing index-entries to be the target of hyperlinks from elsewhere within the document.

The next section describes how this feature is used within this manual to create a Glossary, containing a short description of all file-names, configuration-variables and application software mentioned within the manual, integrated with the Index. All occurrences of the technical names can be easily found, starting from any other.

When a single item is indexed many times, it is sufficient to have a \label command appearing within the printable portion of the first instance of an \index{...} command for that item, within a single document segment.


If the index-entries are in different segments of a segmented document, it is sufficient for the \index{...@...\label{...}} to appear in one segment only: among the segments in which the item is indexed, the one whose indexing information is loaded earliest via an \internal[index]{...} command. When in doubt, include one \index{...@...\label{...}} per segment in which the item is indexed.

For cross-links to work effectively within segmented documents, the indexing command \index{...@...\label{...}} must occur earlier in the same segment than any use of \index{...@...\htmlref{...}{...}} intended to create a link to that label. If the \label occurs in a different segment, then an \internal[index]{...} command for that segment may be needed at the beginning of the segment with the \htmlref. When this is done incorrectly, the resulting link will be to the segment where the indexed item occurred, rather than staying within the Index.
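
Within a single segment, the ordering described above might be arranged as in the fragment below. It is a sketch only: the label name IIIperl and the entries themselves are hypothetical, and the fragment assumes that the makeidx and html packages are loaded in the preamble.

```latex
% First instance of the item within the segment: carries the \label.
\index{perl@Perl\label{IIIperl}}
% Later instances of the same item need no \label.
\index{perl@Perl}
% A cross-link, placed after the labelled entry so that the link
% stays within the Index.
\index{scripts@scripts, see also \htmlref{Perl}{IIIperl}}
```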



Since use of section-names, as the text for hyperlinks, can lead to a very long and cumbersome Index, especially when single items have been indexed many times, a further feature is provided to obtain a more compact Index.


Use of the command-line option -short_index causes a codified representation of the sectioning to be used, rather than the full section-name. The differences are as follows.

These features can also be obtained by setting the variable $SHORT_INDEX to have value `1', in a configuration or initialisation file; provided, of course, that the document loads the makeidx package.


Integrated Glossary and Index

A large number of different pieces of software are required to make LATEX2HTML work effectively, as well as many files containing data or code to work with parts of this software. For this reason, a Glossary is included with this manual. It contains the names of all files, configuration variables, application software and related technical terms, each with a short description of what it is or does, and perhaps a URL for further reference.


In the printed version each item in the Glossary is accompanied by the page-numbers on which the item is mentioned, much as in the Index. For the HTML version, each glossary-item contains a hyperlink to an index-entry, which then has links to each occurrence. These extra index-entries do not appear in the printed version; indeed they also contain a hyperlink back to the corresponding glossary-entry.

This feature is currently available only when using the makeidx package, and needs also the html and htmllist packages. It was developed for version 96.1f by Ross Moore, incorporating an extensive revision of makeidx.perl, as well as additions to LATEX2HTML so that all aspects of indexing work correctly with segmented documents.


Since LATEX provides no guidelines for how a Glossary should be constructed, the technique used here will be explained in detail, for both the printed and HTML versions.