
6 The iconv library

The recode library itself contains most code and tables from the portable iconv library, written by Bruno Haible. In fact, many capabilities of the recode library are duplicated because of this merging, as the older recode and iconv libraries share many charsets. We discuss, here, the issues related to this duplication, and other peculiarities specific to the iconv library. The plan is to remove duplications and better merge specificities, as recode evolves.

As implemented, if a recoding request can be satisfied by the recode library both with and without its iconv library part, it is likely that the iconv library will be used. To find out whether iconv is indeed used or not, just use the ‘-v’ or ‘--verbose’ option, see Recoding.
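The check described above can be sketched as a small shell helper. This is only an illustration: the function name show_steps is hypothetical, and it assumes that the verbose report is written to standard error, which may not hold for every recode version.

```shell
# Hypothetical helper: print only the verbose report for a request,
# discarding the recoded data itself.
# Assumption: recode writes its verbose report to standard error.
show_steps() {
    request=$1    # for example: l1..1250
    input=$2      # file to read from
    if ! command -v recode >/dev/null 2>&1; then
        echo "recode not installed" >&2
        return 2
    fi
    # Send stderr to our stdout, then drop the recoded output.
    recode -v "$request" < "$input" 2>&1 >/dev/null
}
```

For example, ‘show_steps l1..1250 input’ would print whatever recode reports for that request, which can then be inspected to see which steps were chosen.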

The :libiconv: charset represents a conceptual pivot charset within the iconv part of the recode library (in fact, this pivot exists, but is not directly reachable). This charset has a mere : (a colon) for an alias. It is not allowed to recode from or to this charset directly. But when this charset is selected as an intermediate, usually by automatic means, then the iconv part of the recode library is called to handle the transformations. By using an ‘--ignore=:libiconv:’ option on the recode call or equivalently, but more simply, ‘-x:’, recode is instructed to fully avoid this charset as an intermediate, with the consequence that the iconv part of the library is defeated. Consider these two calls:

recode l1..1250 < input > output
recode -x: l1..1250 < input > output

Both should transform input from ISO-8859-1 to CP1250 on output. The first call uses the iconv part of the library, while the second call avoids it. Whatever the path used, the results should normally be identical. However, there might be observable differences. Most of them might result from reversibility issues, as the iconv engine, which the recode library directly uses for the time being, does not address reversibility. Even if much less likely, some differences might result from slight errors in the tables used; such differences should then be reported as bugs.
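To see whether the two paths actually agree on a given file, the two calls above can be wrapped in a comparison. The following is a minimal sketch, not part of recode itself: the function name compare_paths and the temporary file handling are illustrative assumptions, and only the ‘-x:’ option and the ‘l1..1250’ request come from the text above.

```shell
# Hypothetical helper: recode the same input with and without the
# iconv part of the library, then compare the two results.
# Returns 0 if identical, 1 if different, 2 if the check could not run.
compare_paths() {
    request=$1    # for example: l1..1250
    input=$2      # file to read from
    if ! command -v recode >/dev/null 2>&1; then
        echo "recode not installed" >&2
        return 2
    fi
    with_iconv=$(mktemp) || return 2
    without_iconv=$(mktemp) || { rm -f "$with_iconv"; return 2; }
    status=0
    # First call: the iconv part may be used as an intermediate.
    recode "$request" < "$input" > "$with_iconv" || status=2
    # Second call: -x: keeps :libiconv: out of the recoding path.
    recode -x: "$request" < "$input" > "$without_iconv" || status=2
    if [ "$status" -eq 0 ]; then
        if cmp -s "$with_iconv" "$without_iconv"; then
            echo "identical"
        else
            echo "different"
            status=1
        fi
    fi
    rm -f "$with_iconv" "$without_iconv"
    return "$status"
}
```

Calling ‘compare_paths l1..1250 input’ prints either "identical" or "different". A "different" result is not necessarily a bug, given the reversibility issues mentioned above, but it marks a file worth inspecting before reporting a table error.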

Other irregularities might be seen in the area of error detection and recovery. The recode library usually tries to detect canonicity errors in input, and production of ambiguous output, but the iconv part of the library currently does not. Input is always validated, however. The recode library may not always react properly when its iconv part has no translation for a given character.

Within a collection of names for a single charset, the recode library distinguishes one of them as being the genuine charset name, while the others are said to be aliases. When recode lists all charsets, for example with the ‘-l’ or ‘--list’ option, the list integrates all iconv library charsets. The selection of one of the aliases as the genuine charset name is an artifact added by recode; it does not come from iconv. Moreover, the recode library dynamically resolves some conflicts when it initialises itself at runtime. This might explain some discrepancies in the table below, as for what is the genuine charset name.

