User Manual


LATEX2HTML is a program for creating hyperlinked sets of HTML pages from a document marked-up using LATEX commands. Previous sections have discussed the results of specific LATEX commands. In this section we discuss instead the extensive range of command-line switches and options, and other aspects of Perl code, that affect the way the translation is performed.


To use LATEX2HTML to translate a file <file>.tex containing LATEX commands, type:

latex2html <file>.tex


This will create a new directory called <file> which will contain the generated HTML files, some log files and possibly some images. To view the result use an HTML browser, such as NCSA Mosaic or Netscape Navigator, on the main HTML file which is <file>/<file>.html . The file will contain navigation links to the other parts of the generated document. The .tex suffix is optional and will be supplied by the program if it is omitted by the user. Other suffixes are acceptable also, such as .doc .
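As a rough sketch of the naming rule just described, the following shell fragment derives the output directory and the main HTML file from the input name. The file name `report.tex` is a hypothetical example, and the fragment only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical input file; the .tex suffix is optional on the command line.
input="report.tex"

# Strip a trailing .tex to get the name used for the output directory.
base="${input%.tex}"

# Main HTML file, following the <file>/<file>.html rule described above.
main_html="$base/$base.html"

# Dry run: print the command instead of executing it.
echo "latex2html $input"
echo "main page: $main_html"
```

Only the .tex suffix is handled here; other suffixes, such as .doc, would need their own case.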



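It is possible to customise the output from LATEX2HTML using a number of command-line options. These control, among other things, titles and file-names, sectioning, image generation, navigation panels and links to other documents; they are described in the sections below.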



The LATEX2HTML script includes a short manual which can be viewed with the command:

  perldoc latex2html


Developing Documents using LATEX2HTML

Although any document containing LATEX commands can be translated by the LATEX2HTML translator, the best results are obtained when that document is itself a valid LATEX document. Indeed it is generally a good idea to develop documents so that they produce good readable results in both the LATEX typeset version as well as a set of HTML pages. This is not just a nicety; there are several good practical reasons for doing this.

* LATEX macros:
The macro commands that LATEX2HTML recognises are based upon corresponding commands for LATEX. If one tries to use syntax that is incorrect for LATEX then there is no reason why LATEX2HTML should be able to “get it right”, by somehow recognising the true intent.

* error checking:
Processing the document first using LATEX is the easiest, and quickest, way to check for valid syntax.
Whereas LATEX stops at each error (when run in interactive mode), allowing a fix to be made “on the spot” or a “stop-fix-restart”, LATEX2HTML does not stop when it detects an error in LATEX syntax. Useful messages are given concerning missing or unmatched braces, but other apparent anomalies generate only warning messages, which are saved to the end. (Some warnings are also shown immediately when the $VERBOSITY variable is set to at least 3.)
In practice it can be much quicker to test for invalid syntax using LATEX before attempting to use the LATEX2HTML translator.

Furthermore, LATEX warns of cross-reference labels that have not been defined. This is useful to help avoid having hyperlinks which point to nowhere.


The case of missing braces, or an unmatched opening brace, is an error that LATEX2HTML actually handles better than LATEX (or rather, the underlying TEX processor). Whereas TEX only detects an error when something else goes wrong later in the processing, LATEX2HTML shows where the unmatched brace itself occurs.

* auxiliary file:
Some information that LATEX2HTML might need is normally read from the .aux file for the document being processed; or perhaps from the .aux file of a different document, of which the current document is just a portion. Clearly a valid LATEX run is required to produce the .aux file.


Of course if no information in the .aux file is actually used, then this LATEX run can be neglected. Also, if the .aux file has already been created, and edits are made on the source which do not alter the information stored within the .aux file, then there is no need for a fresh LATEX run (except for the purposes of error-checking).

* bibliography:
Suppose the document requires a bibliography, or list of references, which is to be prepared using BibTEX. LATEX2HTML reads citation information from the .aux file, and can import the bibliography itself from the .bbl file. However these must first be created using LATEX.

* document segmentation:
With the document segmentation technique, discussed fully in a later section, it is vitally important that the full document processes correctly in LATEX. The desired effect is that of a single large document, whereas the pieces will actually be processed separately. To achieve this, LATEX writes vital information into special .ptr files. Like the .aux file, these files are read by LATEX2HTML to get section-numbering and other information to be used while processing each segment.


* print quality:
When a document contains automatically generated images, these images are usually bitmaps designed for viewing on-screen. In general the resolution of these is too poor to give a good result when printed on a high-resolution laser-printer. Thus if it is likely that the reader will want to obtain a printed version of your document, then it is best to include a hyperlink to the typeset .dvi version, or to a PostScript conversion of the .dvi file. (In either case, a link to a compressed version is even better.)
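The points above suggest a simple order of operations: run LATEX (and BibTEX, if used) until the document is valid, then translate. What follows is a minimal sketch, assuming a hypothetical document named `mydocument` that uses a BibTEX bibliography. The `run` helper and the `DRY_RUN` variable are illustrative inventions; with `DRY_RUN=1` each command is only printed.

```shell
#!/bin/sh
doc="mydocument"   # hypothetical document name
DRY_RUN=1          # assumption: set to 0 to execute the commands
log=""

# Record and print the command; execute it only when DRY_RUN=0.
run() {
  log="${log}$*;"
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@" || exit 1; fi
}

run latex "$doc.tex"        # checks syntax, writes the .aux file
run bibtex "$doc"           # creates the .bbl file from the citations
run latex "$doc.tex"        # picks up the bibliography
run latex "$doc.tex"        # resolves remaining cross-references
run latex2html "$doc.tex"   # translate once the LaTeX run is clean
```

If the document has no bibliography, the bibtex step and one of the latex runs can be dropped.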


Command-Line Options


The command-line options described here can be used to change the default behaviour of LATEX2HTML. Alternatively, often there is a corresponding environment variable whose value may be set or changed within a .latex2html-init initialisation file, in order to achieve the same result.
There are so many options that they are listed here in groups, according to the nature of the effects they control. When a large number of such options are required for the processing of a document, it is usual to store the switches and their desired settings within a Makefile, for use with the Unix make utility. Now a simple command such as:
  make mydocument
can initiate a call to latex2html that would otherwise take many lines of typing. Indeed it could instigate several such calls to LATEX2HTML, or to other programs such as LATEX and BibTEX, dvips and others. The document segmentation feature, discussed in another section, uses this technique extensively.
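Where make is not available, the same idea can be approximated with a small shell wrapper that keeps the switches in one place. This is a sketch of an alternative, not the Makefile approach itself; the document name and the chosen switch values are illustrative assumptions.

```shell
#!/bin/sh
doc="mydocument"   # hypothetical document name

# Keep all switches in one place, one per line for readability.
set -- \
  -split 4 \
  -link 2 \
  -toc_depth 4 \
  -show_section_numbers \
  -local_icons

cmd="latex2html $* $doc.tex"

# Dry run: show the full command line that would be issued.
echo "$cmd"
# To execute it instead (requires latex2html): latex2html "$@" "$doc.tex"
```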


Options controlling Titles, File-Names and Sectioning

* -t <top-page-title>
Same as setting: $TITLE = "<top-page-title>";
Name the document using this title.

* -short_extn
Same as setting: $SHORTEXTN = 1;
Use a filename extension of .htm for the produced HTML files. This is particularly useful for creating pages to be stored on CD-ROM or other media, to be used with operating systems that require a 3-character extension.

* -long_titles <num>
Same as setting: $LONG_TITLES = <num>;
Instead of the standard names: node1.html, node2.html,... the filenames for each HTML page are constructed from the first <num> words of the section heading for that page, separated by the `_' character.
Commas and common short words (a an to by of and for the) are omitted from both title and word-count.

Warning: Use this switch with great caution. Currently there are no checks for uniqueness of names or overall length. Very long names can easily result from using this feature.

* -custom_titles
Same as setting: $CUSTOM_TITLES = 1;
Instead of the standard names: node1.html, node2.html, ... the filenames for each HTML page are constructed using a Perl subroutine named custom_title_hook . The user may define his/her own version of this subroutine, within a .latex2html-init file say, to override the default (which uses the standard names). This subroutine takes the section-heading as a parameter and must return the required name, or the empty string (default).

* -dir <output-directory>
Same as setting: $DESTDIR = "<output-directory>";
Redirect the output to the specified directory.
The default behaviour is to create (or reuse) a directory having the same name as the prefix of the document being processed.

* -no_subdir
Same as setting: $NO_SUBDIR = 1;
Place the generated HTML files into the current directory. This overrides any $DESTDIR setting.

* -prefix <filename-prefix>
Same as setting: $PREFIX = "<filename-prefix>";
The <filename-prefix> will be prepended to all .gif, .pl and .html files produced, except for the top-level .html file; it may include a (relative) directory path. This will enable multiple products of LATEX2HTML to peacefully coexist in the same directory. However, do not attempt to simultaneously run multiple instances of LATEX2HTML using the same output directory, else various temporary files will overwrite each other.

* -auto_prefix
Same as setting: $AUTO_PREFIX = 1;
Constructs the prefix as `<title>-' to be prepended to all the files produced, where <title> is the name of the LATEX file being processed. (Note the `-' in this prefix.)
This overrides any $PREFIX setting.

* -no_auto_link
Same as setting: $NO_AUTO_LINK = 1;
If $NO_AUTO_LINK is empty and variables $LINKPOINT and $LINKNAME are defined appropriately (as is the default in the latex2html.config file), then a hard link to the main HTML page is produced, using the name supplied in $LINKNAME. Typically this is index.html; on many systems a file of this name will be used, if it exists, when a browser tries to view a URL which points to a directory. On other systems a different value for $LINKNAME may be appropriate. Typically $LINKPOINT has value $FILE.html, but this may also be changed to match whichever HTML page is to become the target of the automatic link.
Use of the -no_auto_link switch cancels this automatic linking facility, when not required for a particular document.

* -split <num>
Same as setting: $MAX_SPLIT_DEPTH = <num>; (default is 8)
Stop splitting sections into separate files at this depth. Specifying -split 0 will put the entire document into a single HTML file. See below for the different levels of sectioning. Also see the next item for how to set a “relative” depth for splitting.

* -split +<num>
Same as setting: $MAX_SPLIT_DEPTH = -<num>; (default is 8)
The level at which to stop splitting sections is calculated “relative to” the shallowest level of sectioning that occurs within the document. For example, if the document contains \section commands, but no \part or \chapter commands, then -split +1 will cause splitting at each \section but not at any deeper level; whereas -split +2 or -split +3 also split down to \subsection and \subsubsection commands respectively. Specifying -split +0 puts the entire document into a single HTML file.

* -link <num>
Same as setting: $MAX_LINK_DEPTH = <num>; (default is 4)
For each node, create links to child nodes down to this much deeper than the node's sectioning-level.
Specifying -link 0 will show no links to child nodes from that page,
-link 1 will show only the immediate descendants, etc.
A value at least as big as that of the -split <num> depth will produce a mini table-of-contents (when not empty) on each page, for the tree structure rooted at that node.

When the page has a sectioning-level less than the -split depth, so that the mini table-of-contents has links to other HTML pages, this table is located at the bottom of the page, unless placed elsewhere using the \tableofchildlinks command.

On pages having a sectioning-level just less than the -split depth the mini table-of-contents contains links to subsections etc. occurring on the same HTML page. Now the table is located at the top of this page, unless placed elsewhere using the \tableofchildlinks command.

* -toc_depth <num>
Same as setting: $TOC_DEPTH = <num>; (default is 4)
Sectioning levels down to <num> are to be included within the Table-of-Contents tree.

* -toc_stars
Same as setting: $TOC_STARS = 1;
Sections created using the starred-form of sectioning commands are included within the Table-of-Contents. As with LATEX, normally such sections are not listed.

* -show_section_numbers
Same as setting: $SHOW_SECTION_NUMBERS = 1;
Show section numbers. By default section numbers are not shown, so as to encourage the use of particular sections as stand-alone documents. In order to be shown, section titles must be unique and must not contain inlined graphics.

* -unsegment
Same as setting: $UNSEGMENT = 1;
Treat a segmented document (see the section about document segmentation) as if it were not segmented. This will cause the translator to concatenate all segments and process them as a whole. You might find this useful to check a segmented document for consistency.
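The filename rule described for -long_titles above can be approximated as follows. This is only an illustration of the documented behaviour (drop commas and the listed short words, keep the first <num> words, join them with `_`). The helper name `long_title` is hypothetical, and the translator's own Perl code may treat punctuation and letter-case differently.

```shell
#!/bin/sh
# long_title NUM HEADING...: approximate the -long_titles naming rule.
long_title() {
  num=$1; shift
  out=""; count=0
  # Remove commas, then walk the heading word by word.
  for w in $(printf '%s' "$*" | tr -d ','); do
    [ "$count" -ge "$num" ] && break
    # Skip the common short words listed above (compared in lower case).
    case $(printf '%s' "$w" | tr 'A-Z' 'a-z') in
      a|an|to|by|of|and|for|the) continue ;;
    esac
    out="${out:+${out}_}$w"
    count=$((count + 1))
  done
  printf '%s.html\n' "$out"
}

name=$(long_title 3 "Options controlling Titles, File-Names and Sectioning")
echo "$name"
```

As the warning for this switch notes, nothing here checks that the resulting names are unique or reasonably short.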


For all documents the sectioning levels referred to above are:

0 document
1 part
2 chapter
3 section
4 subsection
5 subsubsection
6 paragraph
7 subparagraph
8 subsubparagraph
These levels apply even when the document contains no sectioning for the shallower levels; for example, documents with no \part or \chapter commands are common, especially when using LATEX's article document-class.
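Because several of the switches above take numeric levels, it can help to look them up by name. The following sketch encodes the table above in a hypothetical helper, `level_of`, and uses it to build a -toc_depth argument. The document name is an assumption and the command is only printed.

```shell
#!/bin/sh
# level_of NAME: print the sectioning level from the table above.
level_of() {
  case $1 in
    document)        echo 0 ;;
    part)            echo 1 ;;
    chapter)         echo 2 ;;
    section)         echo 3 ;;
    subsection)      echo 4 ;;
    subsubsection)   echo 5 ;;
    paragraph)       echo 6 ;;
    subparagraph)    echo 7 ;;
    subsubparagraph) echo 8 ;;
    *) echo "unknown sectioning level: $1" >&2; return 1 ;;
  esac
}

# Include levels down to subsections in the table of contents.
toc=$(level_of subsection)
cmd="latex2html -toc_depth $toc mydocument.tex"
echo "$cmd"
```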


Options controlling Extensions and Special Features


The switches described here govern the type of HTML code that can be generated, and how to choose between the available options when there are alternative strategies for implementing portions of LATEX code.

* -html_version (2.0|3.2|4.0|5.0)[,(math|i18n)]*
 
Same as setting: $HTML_VERSION = "... ";
This specifies both the HTML version to generate, and any extra (non-standard) HTML features that may be required.
The version number corresponds to a published DTD for an HTML standard. A corresponding Perl file in the versions/ subdirectory is loaded; these files are named `html<num>.pl'.

Following the version number, a comma-separated list of extensions can be given. Each corresponds to a file `<name>.pl' also located in the versions/ subdirectory. When such a file is loaded the resulting HTML code can no longer be expected to validate with the specified DTD. An exception is math when the -no_math switch is also used, which should still validate.

Currently, versions 2.0, 3.2, 4.0 and 5.0 are available. The default version is usually set to be `5.0', within latex2html.config .

* -no_tex_defs
Same as setting: $TEXDEFS = 0; (default is 1)
When $TEXDEFS is set (default) the file texdefs.perl will be read. This provides code to allow common TEX commands like \def, \newbox, \newdimen and others, to be recognised, especially within the document preamble. In the case of \def, the definition may even be fully interpreted, but this requires the pattern-matching to be not too complicated.

If $TEXDEFS is `0' or empty, then texdefs.perl will not be loaded; the translator will make no attempt to interpret any raw TEX commands. This feature is intended to give sophisticated authors the ability to insert arbitrary TEX commands in environments that are destined to be processed by LATEX anyway; e.g. figures, theorems, pictures, etc. However this should rarely be needed, as there is now better support for these types of environment. There are also other methods to specify which chunks of code are to be passed to LATEX for explicit image-generation; see the discussion of the makeimage environment.

* -external_file <filename>
Same as setting: $EXTERNAL_FILE = "<filename>";
Specifies the prefix of the .aux file that this document should read. The .aux extension will be appended to this prefix to get the complete filename, with directory path if needed.
This file could contain necessary information regarding citations, figure, table and section numbers from LATEX and perhaps other information also. Use of this switch is vital for document segments, processed separately and linked to appear as if generated from a single LATEX document.

* -font_size <size>
Same as setting: $FONT_SIZE = "<size>";
This option provides better control over the font size of environments made into images using LATEX. <size> must be one of the font sizes that LATEX recognizes; i.e. `10pt', `11pt', `12pt', etc. Default is `10pt', or whatever option may have been specified on the \documentclass or \documentstyle line.
Whatever size is selected, it will be magnified by the installation variables $MATH_SCALE_FACTOR, $FIGURE_SCALE_FACTOR and $DISP_SCALE_FACTOR as appropriate.

Note: This switch provides no control over the size of text on the HTML pages. Such control is subject entirely to the user's choices of settings for the browser windows.

* -scalable_fonts
Same as setting: $SCALABLE_FONTS = 1;
This is used when scalable fonts, such as PostScript versions of the TEX fonts, are available for image-generation.
It has the effect of setting $PK_GENERATION to `1', and $DVIPS_MODE to be empty, overriding any previous settings for these variables.

* -no_math
Same as setting: $NO_SIMPLE_MATH = 1;
Ordinarily simple mathematical expressions are set using the ordinary text font, but italicised. When part of the expression cannot be represented this way, an image is made of the whole formula. This is called “simple math”. When $NO_SIMPLE_MATH is set, then all mathematics is made into images, whether simple or not.

However, if the math extension is loaded, using the -html_version switch described earlier, then specifying -no_math produces a quite different effect. Now it is the special <MATH> tags and entities which are cancelled. In their place a sophisticated scheme for parsing mathematical expressions is used. Images are made of those sub-parts of a formula which cannot be adequately expressed using (italicised) text characters and <SUB> and <SUP> tags. See the subsection on mathematics for more details.

* -local_icons
Same as setting: $LOCAL_ICONS = 1;
A copy of each of the icons actually used within the document is placed in the directory along with the HTML files and generated images. This allows the whole document to be fully self-contained, within this directory; otherwise the icons must be retrieved from a (perhaps remote) server. It is also the default behavior if $ICONSERVER is not set.

The icons are normally copied from a subdirectory of the $LATEX2HTMLDIR, set within latex2html.config. An alternative set of icons can be used by specifying a (relative) directory path in $ALTERNATIVE_ICONS to where the customised images can be found.

* -init_file <file>
Load the specified initialisation file. This Perl file will be loaded after loading $HOME/.latex2html-init , or .latex2html-init in the local directory, if either file exists. It is read at the time the switch is processed, so the contents of the file may change any of the values of any of the variables that were previously established, as well as any default options. More than one initialisation file can be read in this way.

* -no_fork
Same as setting: $NOFORK = 1;
When set, this disables a feature in the early part of the processing whereby some memory-intensive operations are performed by `forked' child processes. Some single-task operating systems, such as DOS, do not support this feature. Having $NOFORK set then ensures that file-handles needed only by the forked processes are not consumed unnecessarily, which could otherwise result in a fatal Perl error.

* -iso_language <type>
This enables you to specify a different language type than 'EN' to be used in the DTD entries of the HTML document, e.g. 'EN.US'.

* -short_index
Same as setting: $SHORT_INDEX = 1;
Creates shorter Index listings, using codified links; this is fully compatible with the makeidx package.

* -no_footnode
Same as setting: $NO_FOOTNODE = 1;
Suppresses use of a separate file for footnotes; instead these are placed at the bottom of the HTML pages where the references occur.

When this option is used, it is frequently desirable to change the style of the marker used to indicate the presence of a footnote. This is done as in LATEX, using code such as the following.

\renewcommand{\thefootnote}{\arabic{footnote}}
All the styles \arabic, \alph, \roman, \Alph and \Roman are available.

* -numbered_footnotes
Same as setting: $NUMBERED_FOOTNOTES = 1;
If this is set, every footnote is given a consecutive number, to ease readability.

* -address <author-address>
Same as setting: $ADDRESS = "<author-address>";
Sign each page with this address.
See latex2html.config for an example using Perl code to automatically include the date.

A user-defined Perl subroutine called &custom_address can be used instead, if defined; it takes the value of $ADDRESS as a parameter, which may be used or ignored as desired. At the time when this subroutine is called, variables named $depth, $title and $file hold the sectioning-level, title and filename of the HTML page being produced; $FILE holds the filename of the title-page of the whole document.

* -info <string>
Same as setting: $INFO = "<string>";
Generate a new section “About this document” containing information about the document being translated. The default is to generate such a section with information on the original document, the date, the user and the translator. An empty string (or the value `0') disables the creation of this extra section.
If a non-empty string is given, it will be placed as the contents of the “About this document” page instead of the default information.
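The argument syntax given above for -html_version can be checked before invoking the translator. The sketch below uses a hypothetical helper, `valid_html_version`, built on grep -E. It tests only the documented form (a version number followed by optional `math` or `i18n` extensions) and says nothing about which version files are actually installed.

```shell
#!/bin/sh
# valid_html_version ARG: succeed if ARG matches the documented syntax
# (2.0|3.2|4.0|5.0)[,(math|i18n)]*
valid_html_version() {
  printf '%s\n' "$1" |
    grep -Eq '^(2\.0|3\.2|4\.0|5\.0)(,(math|i18n))*$'
}

# Try a few candidate arguments; the last two are deliberately invalid.
for v in "4.0,math" "5.0" "3.0" "4.0,latin"; do
  if valid_html_version "$v"; then
    echo "ok:  -html_version $v"
  else
    echo "bad: -html_version $v"
  fi
done
```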

Switches controlling Image Generation

These switches affect whether images are created at all, whether old images are reused on subsequent runs or new ones created afresh, and whether anti-aliasing effects are used within the images themselves.

* -image_type
Specify the type of images to be generated. Depending on your setup, LaTeX2HTML can generate svg, png or gif images. Vector formats such as svg look better at high resolution, while bitmap formats such as png or gif are generally faster to download and to render.

* -use_dvipng
Use the dvipng program to generate png images, rather than dvips followed by gs. This method produces better alignment of math formulas which extend significantly above or below the line of text in which they are contained. An example of this behavior can be seen in the file tests/eq_line_spacing.tex. The dvipng method also eliminates the ugly crop marks that appear with 12pt documents. It does not respect the $MATH_SCALE_FACTOR option.

* -(no)use_pdftex
By default, pdflatex is used to process input files. Specify -nouse_pdftex for documents that rely on standard, dvi-producing latex.

The pdflatex method uses the pdflatex program followed by pdfcrop and gs to generate images, rather than latex followed by dvips. This method can be useful for pdfLaTeX documents which cannot be translated by latex. The pdflatex method generally produces better alignment of math formulas and eliminates ugly crop marks. It does not respect the $MATH_SCALE_FACTOR option.

The pdflatex method uses the pdfwrite GhostScript driver by default. If called together with the -use_dvipng option, it will use the png16m driver and produce slightly different math alignment.

* -use_luatex
Use the lualatex program followed by pdfcrop and gs to generate images, rather than latex followed by dvips. This method can be useful for LuaLaTeX documents which cannot be translated by latex or pdflatex. This method can sometimes produce slightly better alignment of math formulas and eliminate ugly crop marks. It does not respect the $MATH_SCALE_FACTOR option.

This method uses the pdfwrite GhostScript driver by default. If called together with the -use_dvipng option, it will use the png16m driver and produce slightly different math alignment.

An example of using the -use_luatex option can be seen on the file tests/polyglossia.tex.

* -use_luadvi
Use the dvilualatex program instead of latex to generate images. This method can be useful for dvilualatex documents which cannot be translated by latex. It can be combined with the -use_dvipng option as usual.

* -ascii_mode
Same as setting: $ASCII_MODE = $EXTERNAL_IMAGES = 1;
Use only ASCII characters and do not include any images in the final output. With -ascii_mode the output of the translator can be used on character-based browsers, such as lynx, which do not support inlined images (via the <IMG> tag).

* -nolatex
Same as setting: $NOLATEX = 1;
Disable the mechanism for passing unknown environments to LATEX for processing. This can be thought of as “draft mode” which allows faster translation of the basic document structure and text, without fancy figures, equations or tables.

(This option has been superseded by the -no_images option, see below.)

* -external_images
Same as setting: $EXTERNAL_IMAGES = 1;
Instead of including any generated images inside the document, leave them outside the document and provide hypertext links to them.

* -ps_images
Same as setting: $PS_IMAGES = $EXTERNAL_IMAGES = 1;
Use links to external PostScript files rather than inlined images in the chosen graphics format.

* -discard
Same as setting: $DISCARD_PS = 1;
The temporary PostScript files are discarded immediately after they have been used to create the image in the desired graphics format.

* -no_images
Same as setting: $NO_IMAGES = 1;
Do not attempt to produce any inlined images. The missing images can be generated “off-line” by restarting LATEX2HTML with the option -images_only .

* -images_only
Same as setting: $IMAGES_ONLY = 1;
Try to convert any inlined images that were left over from previous runs of LATEX2HTML.

* -reuse <reuse_option>
Same as setting: $REUSE = <reuse_option>;
This switch specifies the extent to which image files are to be shared or recycled.
There are three valid options:
* 0
Do not ever share or recycle image files.
This choice also invokes an interactive session prompting the user about what to do about a pre-existing HTML directory, if it exists.
* 1
Recycle image files from a previous run if they are available,
but do not share identical images that must be created in this run.
* 2
Recycle image files from a previous run and share identical images from this run.
This is the default.
A later section provides additional information about image-reuse.

* -no_reuse
Same as setting: $REUSE = 0;
Do not share or recycle images generated during previous translations. This is equivalent to -reuse 0 . (This will enable the initial interactive session during which the user is asked whether to reuse the old directory, delete its contents or quit.)

* -antialias
Same as setting: $ANTI_ALIAS = 1; (Default is 0.)
Generated images of figure environments and external PostScript files should use anti-aliasing. By default anti-aliasing is not used with these images, since this may interfere with the contents of the images themselves.

* -antialias_text
Same as setting: $ANTI_ALIAS_TEXT = 1; (Default is 1.)
Generated images of typeset material such as text, mathematical formulas, tables and the content of makeimage environments, should use anti-aliasing effects.
The default is normally to use anti-aliasing for text, since the resulting images are much clearer on-screen. However the default may have been changed locally.

* -no_antialias
Same as setting: $ANTI_ALIAS = 0; (Default is 0.)
Generated images of figure environments and external PostScript files should not use anti-aliasing with images, though the local default may have been changed to use it.

* -no_antialias_text
Same as setting: $ANTI_ALIAS_TEXT = 0; (Default is 1.)
Generated images of typeset material should not use anti-aliasing effects. Although on-screen images of text are definitely improved using anti-aliasing, printed images can be badly blurred, even at 300dpi. Higher resolution printers do a much better job with the resulting grey-scale images.

* -white
Same as setting: $WHITE_BACKGROUND = 1; (Default is 1.)
Ensures that images of figure environments have a white background. Otherwise transparency effects may not work correctly.

* -no_white
Same as setting: $WHITE_BACKGROUND = ''; (Default is 1.)
Cancels the requirement that figure environments have a white background.

* -ldump
Same as setting: $LATEX_DUMP = 1; (Default is 0.)
Use this if you want to speed up image processing during the 2nd and subsequent runs of LATEX2HTML on the same document. The translator now produces a LATEX format-dump of the preamble to images.tex which is used on subsequent runs. This significantly reduces the startup time when LATEX reads the images.tex file for image-generation.

This process actually consumes additional time on the first run, since LATEX is called twice — once to create the format-dump, then again to load and use it. The pay-off comes with the faster loading on subsequent runs. Approximately 1 Meg of disk space is consumed by the dump file.
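The -no_images and -images_only switches described above suggest a two-pass workflow: translate the text quickly first, then generate the images later. A minimal sketch, using a hypothetical document name; both commands are printed rather than run.

```shell
#!/bin/sh
doc="mydocument"   # hypothetical document name

# Pass 1: translate structure and text, skipping image generation.
pass1="latex2html -no_images $doc.tex"

# Pass 2: generate the images left over from the first pass.
pass2="latex2html -images_only $doc.tex"

# Dry run: print both commands in the order they would be issued.
printf '%s\n' "$pass1" "$pass2"
```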


Switches controlling Navigation Panels

The following switches govern whether to include one or more navigation panels on each HTML page, also which buttons to include within such a panel.

* -no_navigation
Same as setting: $NO_NAVIGATION = 1;
Disable the mechanism for putting navigation links in each page.
This overrides any settings of the $TOP_NAVIGATION, $BOTTOM_NAVIGATION and $AUTO_NAVIGATION variables.

* -top_navigation
Same as setting: $TOP_NAVIGATION = 1;
Put navigation links at the top of each page.

* -bottom_navigation
Same as setting: $BOTTOM_NAVIGATION = 1;
Put navigation links at the bottom of each page as well as the top.

* -auto_navigation
Same as setting: $AUTO_NAVIGATION = 1;
Put navigation links at the top of each page. Also put one at the bottom of the page, if the page exceeds $WORDS_IN_PAGE number of words (default = 450).

* -next_page_in_navigation
Same as setting: $NEXT_PAGE_IN_NAVIGATION = 1;
Put a link to the next logical page in the navigation panel.

* -previous_page_in_navigation
 
Same as setting: $PREVIOUS_PAGE_IN_NAVIGATION = 1;
Put a link to the previous logical page in the navigation panel.

* -contents_in_navigation
Same as setting: $CONTENTS_IN_NAVIGATION = 1;
Put a link to the table-of-contents in the navigation panel if there is one.

* -index_in_navigation
Same as setting: $INDEX_IN_NAVIGATION = 1;
Put a link to the index-page in the navigation panel if there is an index.


Switches for Linking to other documents

When processing a single stand-alone document, the switches described in this section should not be needed at all, since the automatically generated navigation panels, described in the previous section, should generate all the required navigation links. However if a document is to be regarded as part of a much larger document, then links from its first and final pages, to locations in other parts of the larger (virtual) document, need to be provided explicitly for some of the buttons in the navigation panel.

The following switches allow for such links to other documents, by providing the title and URL for navigation panel hyperlinks. In particular, the “Document Segmentation” feature necessarily makes great use of these switches. It is usual for the text and targets of these navigation hyperlinks to be recorded in a Makefile, to avoid tedious typing of long command-lines having many switches.

* -up_url <URL>
Same as setting: $EXTERNAL_UP_LINK = "<URL>";
Specifies a uniform resource locator (URL) to associate with the “UP” button in the navigation panel(s).

* -up_title <string>
Same as setting: $EXTERNAL_UP_TITLE = "<string>";
Specifies a title associated with this URL.

* -prev_url <URL>
Same as setting: $EXTERNAL_PREV_LINK = "<URL>";
Specifies a URL to associate with the “PREVIOUS” button in the navigation panel(s).

* -prev_title <string>
Same as setting: $EXTERNAL_PREV_TITLE = "<string>";
Specifies a title associated with this URL.

* -down_url <URL>
Same as setting: $EXTERNAL_DOWN_LINK = "<URL>";
Specifies a URL for the “NEXT” button in the navigation panel(s).

* -down_title <string>
Same as setting: $EXTERNAL_DOWN_TITLE = "<string>";
Specifies a title associated with this URL.

* -contents <URL>
Same as setting: $EXTERNAL_CONTENTS = "<URL>";
Specifies a URL for the “CONTENTS” button, for document segments that would not otherwise have one.

* -index <URL>
Same as setting: $EXTERNAL_INDEX = "<URL>";
Specifies a URL for the “INDEX” button, for document segments that otherwise would not have an index.

* -biblio <URL>
Same as setting: $EXTERNAL_BIBLIO = "<URL>";
Specifies the URL for the bibliography page to be used, when not explicitly part of the document itself.

Warning: On some systems it is difficult to give text-strings <string> containing space characters, on the command-line or via a Makefile. One way to overcome this is to use the corresponding variable. Another way is to replace the spaces with underscores (_).
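
The variable form mentioned in the warning can be sketched as follows, as lines placed in a local .latex2html-init file. Only the variable names are taken from the switch descriptions above; every URL and title shown is a hypothetical placeholder.

```perl
# Sketch of a .latex2html-init fragment for one segment of a larger document.
# All URLs and titles below are hypothetical examples.

# "UP" button: link to the parent document
$EXTERNAL_UP_LINK    = "../manual.html";
$EXTERNAL_UP_TITLE   = "Complete Reference Manual";  # spaces need no shell quoting here

# "PREVIOUS" and "NEXT" buttons: neighbouring segments
$EXTERNAL_PREV_LINK  = "../intro/intro.html";
$EXTERNAL_PREV_TITLE = "Introduction and Overview";
$EXTERNAL_DOWN_LINK  = "../install/install.html";
$EXTERNAL_DOWN_TITLE = "Installation Notes";

# "CONTENTS" and "INDEX" buttons, for segments that have neither of their own
$EXTERNAL_CONTENTS   = "../manual.html#contents";
$EXTERNAL_INDEX      = "../manual-index.html";

1;  # by convention the file ends with a true value, since it is read as Perl code
```

Because the strings are ordinary Perl values, titles containing spaces can be given without replacing the spaces by underscores.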

Switches for Help and Tracing

The -h(elp) and -v switches listed below are self-explanatory. When problems arise in processing a document, the switches -debug and -verbosity will each cause LATEX2HTML to generate more output to the screen. These extra messages should help to locate the cause of the problem.

* -tmp <path>
Define a temporary directory to use for image generation. If <path> is 0, the standard temporary directory /tmp is used.

* -h(elp)
Print out the list of all command-line options.

* -v
Print the current version of LATEX2HTML.

* -debug
Same as setting: $DEBUG = 1;
Run in debug-mode, displaying messages and/or diagnostic information about files read, and utilities called by LATEX2HTML. Shows any messages produced by these calls.

More extensive diagnostics, from the Perl debugger, can be obtained by appending the string `-w-' to the first line of the latex2html (and other) Perl script(s).

* -verbosity <num>
Same as setting: $VERBOSITY = <num>;
Display messages revealing certain aspects of the processing performed by LATEX2HTML on the provided input file(s). The <num> parameter can be an integer in the range 0 to 8. Each higher value adds to the messages produced.

0.  No special tracing; as for versions of LATEX2HTML prior to V97.1 .

1.  (This is the default.) Show section-headings and the corresponding HTML file names, and indicators that major stages in the processing have been completed.

2.  Print environment names and identifier numbers, and new theorem-types. Show warnings as they occur, and indicators for more stages of processing. Print names of files for storing auxiliary data arrays.

3.  Print command names as they are encountered and processed; also any unknown commands encountered while pre-processing. Show names of new commands, environments, theorems, counters and counter-dependencies, for each document partition.

4.  Indicate command-substitution during the pre-processing of math-environments. Print the contents of unknown environments for processing in LATEX, both before and after reverting to LATEX source. Show all operations affecting the values of counters. Also show links, labels and sectioning keys, at the stages of processing.

5.  Detail the processing in the document preamble. Show substitutions of new environments. Show the contents of all recognised environments, both before and after processing. Show the cached/encoded information for the image keys, allowing two images to be tested for equality.

6.  Show replacements of new commands, accents and wrapped commands.

7.  Trace the processing of commands in math mode; both before and after.

8.  Trace the processing of all commands, both before and after.

The command-line option sets an initial value only. During processing the value of $VERBOSITY can be set dynamically using the \htmltracing{...} command, whose argument is the desired value, or by using the more general \HTMLset command as follows: \HTMLset{VERBOSITY}{<num>} .
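
The same initial values can also be given through the corresponding variables. The following is a minimal sketch of a .latex2html-init fragment; the level chosen is illustrative only.

```perl
# Sketch of a .latex2html-init fragment enabling extra diagnostics.

$DEBUG     = 1;   # same as giving the -debug switch
$VERBOSITY = 3;   # same as -verbosity 3: command names are printed as they
                  # are encountered, in addition to the messages of levels 1 and 2

1;  # by convention the file ends with a true value
```

A value set this way is only a starting point; as noted above, it can still be changed during processing by \htmltracing or \HTMLset within the document.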

Other Configuration Variables, without switches

The configuration variables described here do not warrant having a command-line switch to assign values. Either they represent aspects of LATEX2HTML that are specific to the local site, or they govern properties that should apply to all documents, rather than something that typically would change for the different documents within a particular sub-directory.

Normally these variables have their values set within the latex2html.config file. In the following listing the defaults are shown, as the lines of Perl code used to establish these values. If a different value is required, it can be assigned from a local .latex2html-init initialisation file, without affecting the defaults for other users, or documents processed from other directories.

* $dd
Holds the string to be used in file-names to delimit directories; it is set internally to `/', unless the variable has already been given a value within latex2html.config.

Note: This value cannot be set within a .latex2html-init initialisation file, since its value needs to be known in order to find such a file.

* $LATEX2HTMLDIR
Read by the install-test script from latex2html.config, its value is inserted into the latex2html Perl script as part of the installation process.

* $LATEX2HTMLSTYLES = "$LATEX2HTMLDIR/styles";
Read from the latex2html.config file by install-test , its value is checked to locate the styles/ directory.

* $LATEX2HTMLVERSIONS = "$LATEX2HTMLDIR/versions";
The value of this variable should be set within latex2html.config to specify the directory path where the version and extension files can be found.

* $ALTERNATIVE_ICONS = '';
This may contain the (relative) directory path to a set of customised icons to be used in conjunction with the -local_icons switch.

* $TEXEXPAND = "$LATEX2HTMLDIR/texexpand";
Read by the install-test Perl script from latex2html.config, its value is used to locate the texexpand Perl script.

* $PSTOIMG = "$LATEX2HTMLDIR/pstoimg";
Read by the install-test Perl script from latex2html.config, its value is used to locate the pstoimg Perl script.

* $IMAGE_TYPE = '<image-type>';
Set in latex2html.config, the currently supported <image-type>s are: svg, png and gif.

* $DVIPS = 'dvips';
Read from latex2html.config by install-test, its value is checked to locate the dvips program or script.

There could be several reasons to change the value here:

If automatic generation of fonts is required, using Metafont, the following configuration variables are important.

* $PK_GENERATION = 1;
This variable must be set, to initiate font-generation; otherwise fonts will be scaled from existing resources on the local system.
In particular this variable must not be set, if one wishes to use PostScript fonts or other scalable font resources (see the -scalable_fonts switch).

* $DVIPS_MODE = 'toshiba';
The mode given here must be available in the modes.mf file, located with the Metafont resource files, perhaps in the misc/ subdirectory.

* $METAFONT_DPI = 180;
The required resolution, in dots-per-inch, should be listed specifically within the MakeTeXPK script, called by dvips to invoke Metafont with the correct parameters for the required fonts.

* $LATEX = 'latex';
Read from latex2html.config by install-test, its value is checked to locate the latex program or script.

If LATEX is having trouble finding style-files and/or packages, then the default command can be prepended with other commands to set environment variables intended to resolve these difficulties;
e.g. $LATEX = 'setenv TEXINPUTS <path to search> ; latex' .

There are several variables to help control exactly which files are read by LATEX2HTML and by LATEX when processing images:

* $TEXINPUTS
This is normally set from the environment variable of the same name. If difficulties occur so that styles and packages are not being found, then extra paths can be specified here, to resolve these difficulties.

* $DONT_INCLUDE
This provides a list of filenames and extensions to not include, even if requested to do so by an \input or \include command.
(Consult latex2html.config for the default list.)

* $DO_INCLUDE = '';
List of exceptions within the $DONT_INCLUDE list. These files are to be read if requested by an \input or \include command.

* $ICONSERVER = '<URL>';
This is used to specify a URL to find the standard icons, as used for the navigation buttons.
Names for the specific images, as well as size information, can be found in latex2html.config. The icons themselves can be replaced by customised versions, provided this information is correctly updated and the location of the customised images is specified as the value of $ICONSERVER.
When the -local_icons switch is used, so that a copy of the icons is placed with the HTML files and other generated images, the value of $ICONSERVER is not needed within the HTML files themselves.

* $NAV_BORDER = <num>;
The value given here results in a border, measured in points, around each icon.
A value of `0' is common, to maintain strict alignment of inactive and active buttons in the control panels.

* $LINKNAME = '"index.$EXTN"';
This is used when the $NO_AUTO_LINK variable is empty, to allow a URL to the working directory to be sufficient to reach the main page of the completed document. It specifies the name of the HTML file which will be automatically linked to the directory name.
The value of $EXTN is .html unless $SHORTEXTN is set, in which case it is .htm .

* $LINKPOINT = '"$FILE$EXTN"';
This specifies the name of the HTML file to be duplicated, or symbolically linked, with the name specified in $LINKNAME.
At the appropriate time the value of $FILE is the document name, which usually coincides with the name of the working directory.

* $CHARSET = 'iso_8859_1';
This specifies the character set used within the HTML pages produced by LATEX2HTML. If no value is set in a configuration or initialisation file, the default value will be assumed. The lowercase form $charset is also recognised, but this is overridden by the uppercase form.

* $ACCENT_IMAGES = 'large';
Accented characters that are not part of the ISO-Latin fonts can be generated by making an image using LATEX. This variable contains a (comma-separated) list of LATEX commands for setting the style to be used when these images are made. If the value of this variable is empty then the accent is simply ignored, using an un-accented font character (not an image) instead.


Within the color.perl package, the following variables are used to identify the names of files containing specifications for named colors. Files having these names are provided, in the $LATEX2HTMLSTYLES directory, but they could be moved elsewhere, or replaced by alternative files having different names. In such a case the values of these variables should be altered accordingly.
 $RGBCOLORFILE = 'rgb.txt';
 $CRAYOLAFILE = 'crayola.txt';


The following variables may well be altered from the system defaults, but this is best done using a local .latex2html-init initialisation file, for overall consistency of style within documents located at the same site, or sites in close proximity.

* $default_language = 'english';
This establishes which language code is to be placed within the <!DOCTYPE ... > tag that may appear at the beginning of the HTML pages produced. Loading a package for an alternative language can be expected to change the value of this variable.
See also the $TITLES_LANGUAGE variable, described next.

* $TITLES_LANGUAGE = 'english';
This variable is used to specify the actual strings used for standard document sections, such as “Contents”, “References”, “Table of Contents”, etc.
Support for French and German titles is available in corresponding packages. Loading such a package will normally alter the value of this variable, as well as the $default_language variable described above.

* $WORDS_IN_NAVIGATION_PANEL_TITLES = 4;
Specifies how many words to use from section titles, within the textual hyperlinks which accompany the navigation buttons.

* $WORDS_IN_PAGE = 450;
Specifies the minimum page length required before a navigation panel is placed at the bottom of a page, when the $AUTO_NAVIGATION variable is set.

* $CHILDLINE = "<BR><HR>\n";
This gives the HTML code to be placed between the child-links table and the ordinary contents of the page on which it occurs.

* $NETSCAPE_HTML = 0;
When set, this variable specifies that HTML code may be present which does not conform to any official standard. This restricts the contents of any <!DOCTYPE ... > tag which may be placed at the beginning of the HTML pages produced.

* $BODYTEXT = '';
The value of this variable is used within the <BODY ... > tag; e.g. to set text and/or background colors.
Its value is overridden by the \bodytext command, and can be added to, or have parts changed, using the \htmlbody command or \color and \pagecolor from the color package.

* $INTERLACE = 1;
When set, interlaced images should be produced.
This requires graphics utilities to be available to perform the interlacing operation.

* $TRANSPARENT_FIGURES = 1;
When set, the background of images should be made transparent; otherwise it is white.
This requires graphics utilities to be available which can specify the color to be made transparent.

* $FIGURE_SCALE_FACTOR = 1;
Scale factor applied to all images of figure and other environments, when being made into a bitmapped image.

Note that this does not apply to recognised mathematics environments, which instead use the contents of $MATH_SCALE_FACTOR and $DISP_SCALE_FACTOR to specify scaling.

* $MATH_SCALE_FACTOR = 1;
Scale factor applied to all images of mathematics, both inline and displayed. A value of 1.4 is a good alternative, with anti-aliased images.

* $DISP_SCALE_FACTOR = 1;
Extra scale factor applied to images of displayed math environments.
When set, this value multiplies $MATH_SCALE_FACTOR to give the total scaling. A value of `1.2' is a good choice to accompany $MATH_SCALE_FACTOR = 1.4; .

* $EXTRA_IMAGE_SCALE
This may hold an extra scale factor that can be applied to all generated images.
When set, it specifies that a scaling of $EXTRA_IMAGE_SCALE be applied when images are created, but to have their height and width recorded as the un-scaled size. This is to coax browsers into scaling the (usually larger) images to fit the desired size; when printed a better quality can be obtained. Values of `1.5' and `2' give good print quality at 600dpi.

* $PAPERSIZE = 'a5';
Specifies the size of a page for typesetting figures or displayed math, when an image is to be generated.
This affects the lengths of lines of text within images. Images of text or mathematics should use larger sizes than when printed, else clarity is lost at screen resolutions; hence a smaller paper-size is generally advisable. This is especially so if both the $MATH_SCALE_FACTOR and $DISP_SCALE_FACTOR scaling factors are being used; otherwise some images may become excessively large, including a lot of blank space.

* $LINE_WIDTH = 500;
Formerly specified the width of an image, when the contents were to be right- or center-justified. (No longer used.)
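
How the scale factors described above combine can be sketched with a local .latex2html-init fragment. The math factors are the alternatives suggested in the descriptions; the figure factor and the shortened title length are hypothetical choices made only for illustration.

```perl
# Sketch of .latex2html-init overrides for image scaling.

$MATH_SCALE_FACTOR = 1.4;   # applied to all images of mathematics
$DISP_SCALE_FACTOR = 1.2;   # extra factor, for displayed math only

# Resulting total scaling:
#   inline math:     1.4
#   displayed math:  1.4 * 1.2 = 1.68

$FIGURE_SCALE_FACTOR = 1.6; # figures and other environments (hypothetical value);
                            # independent of the two math factors above

$PAPERSIZE = 'a5';          # a small page limits blank space in the enlarged images

$WORDS_IN_NAVIGATION_PANEL_TITLES = 3;  # hypothetical: shorter textual links

1;  # by convention the file ends with a true value
```

Placing such settings in a local initialisation file, rather than in latex2html.config, leaves the defaults untouched for other users and other directories.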


The following variables are used to access the utilities required during image-generation. File and program locations on the local system are established by the configure-pstoimg Perl script and stored within $LATEX2HTMLDIR/local.pm as Perl code, to be read by pstoimg when required.

After running the configure-pstoimg Perl script it should not be necessary to alter the values obtained. Those shown below are the values found on the author's system; they are for illustration only and do not represent default values.

The following variables are no longer needed, having been replaced by the more specific information obtained using the Perl script configure-pstoimg.
 $USENETPBM = 1;
 $PBMPLUSDIR = '/usr/local/bin';


Extending the Translator

In an earlier section it was seen how the capabilities of LATEX2HTML can be extended to cope with LATEX commands defined in packages and style-files. This is in addition to defining simple macros and environments, using \newcommand and \newenvironment. The problem, however, is that writing such extensions for packages requires an understanding of Perl programming and of the way the processing in LATEX2HTML is organised.

The default behaviour for unrecognised commands is for their arguments to remain in the HTML text, whereas the commands themselves are passed on to LATEX, in an attempt to generate an image. This is far from ideal, for it is quite likely to lead to an error in LATEX due to not having appropriate arguments for the command.

Currently there are several related mechanisms whereby a user can ask for particular commands and their arguments to be either

The string beginning &.... is the name of the Perl subroutine that controls how the LATEX commands are to be subsequently treated during processing by LATEX2HTML. Towards the end of the latex2html script, one finds a list of LATEX commands to be handled by each of these subroutines. These lists even include some common TEX commands.

Analogous lists occur in most of the package extension files. In many cases the commands are for fine-tuning the layout on a printed page. They should simply be ignored; but any parameters must not be allowed to cause unwanted characters to appear on the HTML pages. Customised extensions using these mechanisms may be included in the $LATEX2HTMLDIR/latex2html.config file, or in a personal $HOME/.latex2html-init initialisation file, as described next.


Asking the Translator to Ignore Commands


Commands that should be ignored may be specified in the .latex2html-init file as input to the &ignore_commands subroutine. Each command which is to be ignored should be on a separate line, followed by compulsory or optional argument markers separated by #'s, e.g.:
<cmd_name1> # {} # [] # {} # [] ...
<cmd_name2> # «pattern» # [] ...
{}'s mark compulsory arguments and []'s optional ones, while «pattern» denotes matching everything up to the indicated string-pattern, given literally (e.g. \\endarray); spaces are ignored. Special characters such as $ , & , \ itself and perhaps some others, need to be “escaped” with a preceding \ .


Some commands may have arguments which should be left as text even though the command should be ignored (e.g. \hbox, \center, etc.). In these cases arguments should be left unspecified. Here is an example of how this mechanism may be used:

&ignore_commands( <<_IGNORED_CMDS_);
documentstyle # [] # {}
linebreak # []
mbox
<add your commands here>
_IGNORED_CMDS_


Asking the Translator to Pass Commands to LATEX


Commands that should be passed on to LATEX for processing because there is no direct translation to HTML may be specified in the .latex2html-init file as input to the &process_commands_in_tex subroutine. The format is the same as that for specifying commands to be ignored. Here is an example:
&process_commands_in_tex (<<_RAW_ARG_CMDS_);
fbox # {}
framebox # [] # [] # {}
<add your commands here>
_RAW_ARG_CMDS_


Handling “order-sensitive” Commands

Some commands need to be passed to LATEX, but using the &process_commands_in_tex subroutine gives incorrect results. This may occur when the command is “order-sensitive”, using information such as the value of a counter or a boolean expression (or perhaps requiring a box to have been constructed and saved). Try using the &process_commands_inline_in_tex subroutine instead. Commands declared this way are first “wrapped” within a dummy environment, which ensures that they are later processed in correct order with other environments and order-sensitive commands.


Other commands may need to be passed to LATEX, not to create an image themselves, but to affect the way subsequent images are created. For example a color command such as \color{red} should set the text-colour to `red' for all subsequent text and images. This must be sent to LATEX so that it is processed at exactly the right time; i.e. before the first image required to be `red' but following any images that are not intended to be affected by this colour-change.
The subroutine &process_commands_nowrap_in_tex is designed specifically to meet such requirements.


Commands can be order-sensitive without having to be passed to LATEX. Indeed even if a Perl subroutine has been carefully written to process the command, it may still give wrong results if it is order-sensitive, depending on the values of counters, say. To handle such cases there is the &process_commands_wrap_deferred subroutine. This also “wraps” the command within a dummy environment, but when that environment is processed the contents are not sent to LATEX, as in the previous case. All of the standard LATEX commands to change, set or read the values of counters are handled in this way.
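
The three subroutines just described might be used as in the following sketch of a .latex2html-init fragment. It assumes that each accepts the same argument-marker format as &process_commands_in_tex; all command names and here-document labels are hypothetical.

```perl
# Sketch only: \mycountedbox, \myshade and \mystepcounter are hypothetical
# commands, standing for macros defined in a user's own style-file.

# Needs LaTeX to make an image, but depends on counter values or saved boxes:
# wrapped in a dummy environment so that it is processed in the correct order.
&process_commands_inline_in_tex (<<_MY_INLINE_CMDS_);
mycountedbox # {}
_MY_INLINE_CMDS_

# Makes no image itself, but affects how subsequent images are created
# (compare \color{red} in the text above).
&process_commands_nowrap_in_tex (<<_MY_NOWRAP_CMDS_);
myshade # {}
_MY_NOWRAP_CMDS_

# Handled entirely in Perl, but order-sensitive: wrapped and deferred,
# with nothing sent to LaTeX. A Perl subroutine that processes
# \mystepcounter is assumed to be defined elsewhere.
&process_commands_wrap_deferred (<<_MY_DEFERRED_CMDS_);
mystepcounter # {}
_MY_DEFERRED_CMDS_

1;  # by convention the file ends with a true value
```

Which of the three declarations suits a given command depends on whether it must reach LATEX at all, and on whether it produces an image or only influences later ones.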


Customising the Navigation Panels


The navigation panels are the strips containing “buttons” and text that appear at the top, and perhaps at the bottom, of each generated page and provide hypertext links to other sections of a document. Some of the options and variables that control whether and where they should appear are presented in an earlier section.


A simple mechanism for appending customised buttons to the navigation panel is provided by the command \htmladdtonavigation. This takes one argument which LATEX2HTML appends to the navigation panel. For example,

\htmladdtonavigation{\htmladdnormallink
   {\htmladdimg{http://server/mybutton.gif}}{http://server/link}}
will add an active button mybutton.gif pointing to the specified location.


Apart from these facilities it is also possible to specify completely what appears in the navigation panels and in what order. As each section is processed, LATEX2HTML assigns relevant information to a number of global variables. These variables are used by the subroutines top_navigation_panel and bot_navigation_panel, where the navigation panel is constructed as a string consisting of these variables and some formatting information.


These subroutines can be redefined in a system- or user-configuration file (respectively, $LATEX2HTMLDIR/latex2html.config and $HOME/.latex2html-init). Any combination of text, HTML tags, and the variables mentioned below is acceptable.


The control-panel variables are:



Iconic links (buttons):

* $UP
Points up to the “parent” section;
* $NEXT
Points to the next section;
* $NEXT_GROUP
Points to the next “group” section;
* $PREVIOUS
Points to the previous section;
* $PREVIOUS_GROUP
Points to the previous “group” section;
* $CONTENTS
Points to the contents page if there is one;
* $INDEX
Points to the index page if there is one.


Textual links (section titles):

* $UP_TITLE
Points up to the “parent” section;
* $NEXT_TITLE
Points to the next section;
* $NEXT_GROUP_TITLE
Points to the next “group” section;
* $PREVIOUS_TITLE
Points to the previous section;
* $PREVIOUS_GROUP_TITLE
Points to the previous “group” section.

If the corresponding section exists, each iconic button will contain an active link to that section. If the corresponding section does not exist, the button will be inactive. If the section corresponding to a textual link does not exist then the link will be empty.

The “next group” and “previous group” are rarely used, since it is usually possible to determine which are the next/previous logical pages in a document. However these may be needed occasionally with segmented documents, when the segments have been created with different values for the $MAX_SPLIT_DEPTH variable. This is quite distinct from the segmented document effect in which the first page of one segment may have its `PREVIOUS' button artificially linked to the first page of the previous segment, rather than the last page.


The number of words that appears in each textual link is controlled by the variable $WORDS_IN_NAVIGATION_PANEL_TITLES which may also be changed in the configuration files.


Below is an example of a navigation panel for the bottom of HTML pages. (Note that the “.” is Perl's string-concatenation operator and “#” signifies a comment.)

sub bot_navigation_panel {
    # Start with a horizontal rule and descriptive comment
    "<HR>\n" . "<!--Navigation Panel-->" .
    # Now add a few buttons, with a space between them
    "$NEXT $UP $PREVIOUS $CONTENTS $INDEX $CUSTOM_BUTTONS" .
    # Line break
    "\n<BR>" .
    # If a "next" section exists, add its title to the navigation panel
    ($NEXT_TITLE ? "\n<B> Next:</B> $NEXT_TITLE" : '') .
    # Similarly with the "up" title ...
    ($UP_TITLE ? "\n<B>Up:</B> $UP_TITLE\n" : '') .
    # ... and the "previous" title (last term, so no trailing "." operator)
    ($PREVIOUS_TITLE ? "\n<B> Previous:</B> $PREVIOUS_TITLE\n" : '')
}
Note that extra buttons may be included by defining suitable code for the container $CUSTOM_BUTTONS . The use of explicit `newline' (\n) characters is not necessary for the on-screen appearance of the navigation panel within a browser window. However it maintains an orderly organisation within the .html files themselves, which is helpful if any hand-editing is later required, or simply to read their contents. The corresponding subroutine for a navigation-panel at the top of a page need not use the rule <HR>, and would require a break (<BR>) or two at the end, to give some visual separation from the following material.
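
A corresponding subroutine for the top of a page, following the description just given, might look like the sketch below. The layout is one possible choice rather than the definition shipped with LATEX2HTML; it uses only the control-panel variables listed above.

```perl
# Sketch of a top-of-page panel, for latex2html.config or .latex2html-init.
sub top_navigation_panel {
    # Descriptive comment only; no <HR> rule at the top of a page
    "<!--Navigation Panel-->" .
    # The buttons, with a space between them
    "$NEXT $UP $PREVIOUS $CONTENTS $INDEX $CUSTOM_BUTTONS" .
    # Line break before the textual links
    "\n<BR>" .
    # Titles are added only for sections that exist
    ($NEXT_TITLE     ? "\n<B>Next:</B> $NEXT_TITLE"         : '') .
    ($UP_TITLE       ? "\n<B>Up:</B> $UP_TITLE"             : '') .
    ($PREVIOUS_TITLE ? "\n<B>Previous:</B> $PREVIOUS_TITLE" : '') .
    # Two breaks, to separate the panel from the material that follows
    "\n<BR><BR>\n"
}
```

As in the bottom panel, the value of the last expression is the string returned, so the final term is not followed by a concatenation operator.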