An important part of the tabular charset knowledge in recode
comes from RFC 1345 or, alternatively, from the chset tools,
both maintained by Keld Simonsen. The RFC 1345 document:
“Character Mnemonics & Character Sets”, K. Simonsen, Request for Comments no. 1345, Network Working Group, June 1992.
defines many character mnemonics and character sets. The recode
library implements most of RFC 1345, however:
- It does not recognise the charsets dk-us and us-dk. However, see Mixed.
- It does not recognise charsets that use combining characters:
  ANSI_X3.110-1983, ISO_6937-2-add, T.101-G2, T.61-8bit, iso-ir-90 and
  videotex-suppl.
- It does not recognise 16-bit charsets: GB_2312-80,
  JIS_C6226-1978, JIS_C6226-1983, JIS_X0212-1990 and KS_C_5601-1987.
- It interprets the charset isoir91 as NATS-DANO (alias
  iso-ir-9-1), not as JIS_C6229-1984-a (alias
  iso-ir-91). It also interprets the charset isoir92
  as NATS-DANO-ADD (alias iso-ir-9-2), not as
  JIS_C6229-1984-b (alias iso-ir-92). It is probably better
  to avoid these two alias names altogether.
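The clash behind these two alias names can be illustrated with a short sketch. It assumes that charset names are matched after ignoring case and punctuation, which is what the spellings isoir91 and isoir92 suggest; fold_name is a hypothetical helper written for this illustration, not a function of recode.

```python
def fold_name(name: str) -> str:
    """Hypothetical helper: keep only letters and digits, lower-cased.

    This models an assumed matching rule; it is not taken from recode.
    """
    return "".join(ch.lower() for ch in name if ch.isalnum())


names = ["iso-ir-9-1", "iso-ir-91", "iso-ir-9-2", "iso-ir-92"]
folded = {name: fold_name(name) for name in names}

# Under this rule, two distinct registry names fold to the same key,
# so only one of them can be honoured.
for name in names:
    print(f"{name:12} -> {folded[name]}")
```

Under that assumption, iso-ir-9-1 and iso-ir-91 become indistinguishable, and so do iso-ir-9-2 and iso-ir-92, which is why spelling out NATS-DANO or JIS_C6229-1984-a directly is the safer choice.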
Keld Simonsen <keld@dkuug.dk> did most of RFC 1345 himself, with some funding from Danish Standards and the Nordic standards (INSTA) project. He also did the character set design work, with substantial input from Olle Jaernefors. Keld typed in almost all of the tables; some were contributed by others. A number of people have checked the tables in various ways. The RFC lists a number of people who helped.
Keld and the recode maintainer have an arrangement by which any newly
discovered information about tabular charsets, submitted by recode
users, is forwarded to Keld, eventually merged into Keld’s work,
and only then reimported into recode. Neither the recode
program nor its library tries to compete, or even to establish itself as
an alternate or diverging reference: RFC 1345 and its new drafts stay the
genuine source for most tabular information conveyed by recode.
Keld has been more than cooperative so far, so there is no reason for
us to act otherwise. In short, recode should be perceived as an
application of external references, not as a reference in itself.
Internally, RFC 1345 associates with each character an unambiguous
mnemonic of a few characters, taken from ISO 646, which is a minimal
ASCII subset of 83 characters. The charset made up by these mnemonics
is available in recode under the name RFC1345. It has
mnemonic and 1345 for aliases. As implemented, this charset
exactly corresponds to mnemonic+ascii+38, using RFC 1345
nomenclature. Roughly said, ISO 646 characters represent themselves,
except for the ampersand (&) which appears doubled. A prefix of a
single ampersand introduces a mnemonic. For mnemonics using two characters,
the prefix is immediately followed by the mnemonic. For longer mnemonics,
the prefix is followed by an underline (_), the mnemonic, and another
underline. Conversions to this charset are usually reversible.
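The quoting rules just described can be sketched in a few lines. This is only an illustration of the rules as stated above, not recode's implementation: the function names are hypothetical, the input is assumed to be a sequence of mnemonics already looked up elsewhere, and the long mnemonic "abc" is a made-up placeholder.

```python
def quote_mnemonic(mnemonic: str) -> str:
    """Render one mnemonic in the mnemonic+ascii+38 style described above."""
    if len(mnemonic) == 1:
        # ISO 646 characters represent themselves; the ampersand is doubled.
        return "&&" if mnemonic == "&" else mnemonic
    if len(mnemonic) == 2:
        # Two-character mnemonics follow the ampersand prefix directly.
        return "&" + mnemonic
    # Longer mnemonics are wrapped in underlines after the prefix.
    return "&_" + mnemonic + "_"


def quote_text(mnemonics) -> str:
    """Join a sequence of mnemonics into one quoted string."""
    return "".join(quote_mnemonic(m) for m in mnemonics)


# "e'" is the RFC 1345 mnemonic for e with acute accent.
print(quote_text(["c", "a", "f", "e'"]))
print(quote_text(["R", "&", "D"]))
print(quote_mnemonic("abc"))  # "abc" is a placeholder, not a real mnemonic
```

Doubling the literal ampersand is what keeps the representation unambiguous: a single ampersand always starts a mnemonic, so a reader of the output never has to guess.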
Currently, recode does not offer any of the many other possible
variations of this family of representations. They will likely be
implemented in some future version, however.
ANSI_X3.4-1968
    367, ANSI_X3.4-1986, ASCII, CP367, IBM367, ISO646-US, ISO_646.irv:1991, US-ASCII, iso-ir-6 and us are aliases for this charset.
    Source: ISO 2375 registry.
ASMO_449
    ISO_9036, arabic7 and iso-ir-89 are aliases for this charset.
    Source: ISO 2375 registry.
BS_4730
    ISO646-GB, gb, iso-ir-4 and uk are aliases for this charset.
    Source: ISO 2375 registry.
BS_viewdata
    iso-ir-47 is an alias for this charset.
    Source: ISO 2375 registry.
CP1250
    1250, ms-ee and windows-1250 are aliases for this charset.
    Source: UNICODE 1.0.
CP1251
    1251, ms-cyrl and windows-1251 are aliases for this charset.
    Source: UNICODE 1.0.
CP1252
    1252, ms-ansi and windows-1252 are aliases for this charset.
    Source: UNICODE 1.0.
CP1253
    1253, ms-greek and windows-1253 are aliases for this charset.
    Source: UNICODE 1.0.
CP1254
    1254, ms-turk and windows-1254 are aliases for this charset.
    Source: UNICODE 1.0.
CP1255
    1255, ms-hebr and windows-1255 are aliases for this charset.
    Source: UNICODE 1.0.
CP1256
    1256, ms-arab and windows-1256 are aliases for this charset.
    Source: UNICODE 1.0.
CP1257
    1257, WinBaltRim and windows-1257 are aliases for this charset.
    Source: CEN/TC304 N283.
CSA_Z243.4-1985-1
    ISO646-CA, ca, csa7-1 and iso-ir-121 are aliases for this charset.
    Source: ISO 2375 registry.
CSA_Z243.4-1985-2
    ISO646-CA2, csa7-2 and iso-ir-122 are aliases for this charset.
    Source: ISO 2375 registry.
CSA_Z243.4-1985-gr
    iso-ir-123 is an alias for this charset.
    Source: ISO 2375 registry.
CSN_369103
    KOI-8_L2, iso-ir-139 and koi8l2 are aliases for this charset.
    Source: ISO 2375 registry.
CWI
    CWI-2 and cp-hu are aliases for this charset.
    Source: Computerworld Sza’mita’stechnika vol 3 issue 13 1988-06-29.
DEC-MCS
    dec is an alias for this charset.
    VAX/VMS User’s Manual, Order Number: AI-Y517A-TE, April 1986.
DIN_66003
    ISO646-DE, de and iso-ir-21 are aliases for this charset.
    Source: ISO 2375 registry.
DS_2089
    DS2089, ISO646-DK and dk are aliases for this charset.
    Source: Danish Standard, DS 2089, February 1974.
EBCDIC-AT-DE
    Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-AT-DE-A
    Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-CA-FR
    Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-DK-NO
    Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-DK-NO-A
    Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-ES
    Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-ES-A
    Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-ES-S
    Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-FI-SE
    Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-FI-SE-A
    Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-FR
    Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-IS-FRISS
    friss is an alias for this charset.
    Source: Skyrsuvelar Rikisins og Reykjavikurborgar, feb 1982.
EBCDIC-IT
    Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-PT
    Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-UK
    Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-US
    Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
ECMA-cyrillic
    ECMA-113, ECMA-113:1986 and iso-ir-111 are aliases for this charset.
    Source: ISO 2375 registry.
ES
    ISO646-ES and iso-ir-17 are aliases for this charset.
    Source: ISO 2375 registry.
ES2
    ISO646-ES2 and iso-ir-85 are aliases for this charset.
    Source: ISO 2375 registry.
GB_1988-80
    ISO646-CN, cn and iso-ir-57 are aliases for this charset.
    Source: ISO 2375 registry.
GOST_19768-87
    ST_SEV_358-88 and iso-ir-153 are aliases for this charset.
    Source: ISO 2375 registry.
IBM037
    037, CP037, ebcdic-cp-ca, ebcdic-cp-nl, ebcdic-cp-us and ebcdic-cp-wt are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM038
    038, CP038 and EBCDIC-INT are aliases for this charset.
    Source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990.
IBM1004
    1004, CP1004 and os2latin1 are aliases for this charset.
    Source: CEN/TC304 N283, 1994-02-04.
IBM1026
    1026 and CP1026 are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM1047
    1047 and CP1047 are aliases for this charset.
    Source: IBM Character Data Representation Architecture.
    Registry SC09-1391-00 p 150.
IBM256
    256, CP256 and EBCDIC-INT1 are aliases for this charset.
    Source: IBM Registry C-H 3-3220-050.
IBM273
    273 and CP273 are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM274
    274, CP274 and EBCDIC-BE are aliases for this charset.
    Source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990.
IBM275
    275, CP275 and EBCDIC-BR are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM277
    EBCDIC-CP-DK and EBCDIC-CP-NO are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM278
    278, CP278, ebcdic-cp-fi and ebcdic-cp-se are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM280
    280, CP280 and ebcdic-cp-it are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM281
    281, CP281 and EBCDIC-JP-E are aliases for this charset.
    Source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990.
IBM284
    284, CP284 and ebcdic-cp-es are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM285
    285, CP285 and ebcdic-cp-gb are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM290
    290, CP290 and EBCDIC-JP-kana are aliases for this charset.
    Source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990.
IBM297
    297, CP297 and ebcdic-cp-fr are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM420
    420, CP420 and ebcdic-cp-ar1 are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
    IBM NLS RM p 11-11.
IBM423
    423, CP423 and ebcdic-cp-gr are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM424
    424, CP424 and ebcdic-cp-he are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM437
    437 and CP437 are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM500
    500, 500V1, CP500, ebcdic-cp-be and ebcdic-cp-ch are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM850
    850 and CP850 are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
    Source: UNICODE 1.0.
IBM851
    851 and CP851 are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM852
    852, CP852, pcl2 and pclatin2 are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM855
    855 and CP855 are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM857
    857 and CP857 are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM860
    860 and CP860 are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM861
    861, CP861 and cp-is are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM862
    862 and CP862 are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM863
    863 and CP863 are aliases for this charset.
    Source: IBM Keyboard layouts and code pages, PN 07G4586 June 1991.
IBM864
    864 and CP864 are aliases for this charset.
    Source: IBM Keyboard layouts and code pages, PN 07G4586 June 1991.
IBM865
    865 and CP865 are aliases for this charset.
    Source: IBM DOS 3.3 Ref (Abridged), 94X9575 (Feb 1987).
IBM868
    868, CP868 and cp-ar are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM869
    869, CP869 and cp-gr are aliases for this charset.
    Source: IBM Keyboard layouts and code pages, PN 07G4586 June 1991.
IBM870
    870, CP870, ebcdic-cp-roece and ebcdic-cp-yu are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM871
    871, CP871 and ebcdic-cp-is are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM875
    875, CP875 and EBCDIC-Greek are aliases for this charset.
    Source: UNICODE 1.0.
IBM880
    880, CP880 and EBCDIC-Cyrillic are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM891
    891 and CP891 are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM903
    903 and CP903 are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM904
    904 and CP904 are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM905
    905, CP905 and ebcdic-cp-tr are aliases for this charset.
    Source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990.
IBM918
    918, CP918 and ebcdic-cp-ar2 are aliases for this charset.
    Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IEC_P27-1
    iso-ir-143 is an alias for this charset.
    Source: ISO 2375 registry.
INIS
    iso-ir-49 is an alias for this charset.
    Source: ISO 2375 registry.
INIS-8
    iso-ir-50 is an alias for this charset.
    Source: ISO 2375 registry.
INIS-cyrillic
    iso-ir-51 is an alias for this charset.
    Source: ISO 2375 registry.
INVARIANT
    iso-ir-170 is an alias for this charset.
ISO-8859-1
    819, CP819, IBM819, ISO8859-1, ISO_8859-1, ISO_8859-1:1987, iso-ir-100, l1 and latin1 are aliases for this charset.
    Source: ISO 2375 registry.
ISO-8859-10
    ISO8859-10, ISO_8859-10, ISO_8859-10:1993, L6, iso-ir-157 and latin6 are aliases for this charset.
    Source: ISO 2375 registry.
ISO-8859-13
    ISO8859-13, ISO_8859-13, ISO_8859-13:1998, iso-baltic, iso-ir-179a, l7 and latin7 are aliases for this charset.
    Source: ISO 2375 registry.
ISO-8859-14
    ISO8859-14, ISO_8859-14, ISO_8859-14:1998, iso-celtic, iso-ir-199, l8 and latin8 are aliases for this charset.
    Source: ISO 2375 registry.
ISO-8859-15
    ISO8859-15, ISO_8859-15, ISO_8859-15:1998, iso-ir-203, l9 and latin9 are aliases for this charset.
    Source: ISO 2375 registry.
ISO-8859-2
    912, CP912, IBM912, ISO8859-2, ISO_8859-2, ISO_8859-2:1987, iso-ir-101, l2 and latin2 are aliases for this charset.
    Source: ISO 2375 registry.
ISO-8859-3
    ISO8859-3, ISO_8859-3, ISO_8859-3:1988, iso-ir-109, l3 and latin3 are aliases for this charset.
    Source: ISO 2375 registry.
ISO-8859-4
    ISO8859-4, ISO_8859-4, ISO_8859-4:1988, iso-ir-110, l4 and latin4 are aliases for this charset.
    Source: ISO 2375 registry.
ISO-8859-5
    ISO8859-5, ISO_8859-5, ISO_8859-5:1988, cyrillic and iso-ir-144 are aliases for this charset.
    Source: ISO 2375 registry.
ISO-8859-6
    ASMO-708, ECMA-114, ISO8859-6, ISO_8859-6, ISO_8859-6:1987, arabic and iso-ir-127 are aliases for this charset.
    Source: ISO 2375 registry.
ISO-8859-7
    ECMA-118, ELOT_928, ISO8859-7, ISO_8859-7, ISO_8859-7:1987, greek, greek8 and iso-ir-126 are aliases for this charset.
    Source: ISO 2375 registry.
ISO-8859-8
    ISO8859-8, ISO_8859-8, ISO_8859-8:1988, hebrew and iso-ir-138 are aliases for this charset.
    Source: ISO 2375 registry.
ISO-8859-9
    ISO8859-9, ISO_8859-9, ISO_8859-9:1989, iso-ir-148, l5 and latin5 are aliases for this charset.
    Source: ISO 2375 registry.
ISO_10367-box
    iso-ir-155 is an alias for this charset.
    Source: ISO 2375 registry.
ISO_2033-1983
    e13b and iso-ir-98 are aliases for this charset.
    Source: ISO 2375 registry.
ISO_5427
    iso-ir-37 is an alias for this charset.
    Source: ISO 2375 registry.
ISO_5427-ext
    ISO_5427:1981 and iso-ir-54 are aliases for this charset.
    Source: ISO 2375 registry.
ISO_5428
    ISO_5428:1980 and iso-ir-55 are aliases for this charset.
    Source: ISO 2375 registry.
ISO_646.basic
    ISO_646.basic:1983 and ref are aliases for this charset.
    Source: ISO 2375 registry.
ISO_646.irv
    ISO_646.irv:1983, irv and iso-ir-2 are aliases for this charset.
    Source: ISO 2375 registry.
ISO_6937-2-25
    iso-ir-152 is an alias for this charset.
    Source: ISO 2375 registry.
ISO_8859-supp
    iso-ir-154 and latin1-2-5 are aliases for this charset.
    Source: ISO 2375 registry.
IT
    ISO646-IT and iso-ir-15 are aliases for this charset.
    Source: ISO 2375 registry.
JIS_C6220-1969-jp
    JIS_C6220-1969, iso-ir-13, katakana and x0201-7 are aliases for this charset.
    Source: ISO 2375 registry.
JIS_C6220-1969-ro
    ISO646-JP, iso-ir-14 and jp are aliases for this charset.
    Source: ISO 2375 registry.
JIS_C6229-1984-a
    jp-ocr-a is an alias for this charset.
    Source: ISO 2375 registry.
JIS_C6229-1984-b
    ISO646-JP-OCR-B and jp-ocr-b are aliases for this charset.
    Source: ISO 2375 registry.
JIS_C6229-1984-b-add
    iso-ir-93 and jp-ocr-b-add are aliases for this charset.
    Source: ISO 2375 registry.
JIS_C6229-1984-hand
    iso-ir-94 and jp-ocr-hand are aliases for this charset.
    Source: ISO 2375 registry.
JIS_C6229-1984-hand-add
    iso-ir-95 and jp-ocr-hand-add are aliases for this charset.
    Source: ISO 2375 registry.
JIS_C6229-1984-kana
    iso-ir-96 is an alias for this charset.
    Source: ISO 2375 registry.
JIS_X0201
    X0201 is an alias for this charset.
JUS_I.B1.002
    ISO646-YU, iso-ir-141, js and yu are aliases for this charset.
    Source: ISO 2375 registry.
JUS_I.B1.003-mac
    iso-ir-147 and macedonian are aliases for this charset.
    Source: ISO 2375 registry.
JUS_I.B1.003-serb
    iso-ir-146 and serbian are aliases for this charset.
    Source: ISO 2375 registry.
KOI-7
    Source: Andrey A. Chernov <ache@nagual.pp.ru>.
KOI-8
    GOST_19768-74 is an alias for this charset.
    Source: Andrey A. Chernov <ache@nagual.pp.ru>.
KOI8-R
    Source: RFC1489 via Gabor Kiss <kissg@sztaki.hu>. And Andrey A. Chernov <ache@nagual.pp.ru>.
KOI8-RU
    Source: http://cad.ntu-kpi.kiev.ua/multiling/koi8-ru/.
KOI8-U
    Source: RFC 2319. Mibenum: 2088. Source: http://www.net.ua/KOI8-U/.
KSC5636
    ISO646-KR is an alias for this charset.
Latin-greek-1
    iso-ir-27 is an alias for this charset.
    Source: ISO 2375 registry.
MSZ_7795.3
    ISO646-HU, hu and iso-ir-86 are aliases for this charset.
    Source: ISO 2375 registry.
NATS-DANO
    iso-ir-9-1 is an alias for this charset.
    Source: ISO 2375 registry.
NATS-DANO-ADD
    iso-ir-9-2 is an alias for this charset.
    Source: ISO 2375 registry.
NATS-SEFI
    iso-ir-8-1 is an alias for this charset.
    Source: ISO 2375 registry.
NATS-SEFI-ADD
    iso-ir-8-2 is an alias for this charset.
    Source: ISO 2375 registry.
NC_NC00-10
    ISO646-CU, NC_NC00-10:81, cuba and iso-ir-151 are aliases for this charset.
    Source: ISO 2375 registry.
NF_Z_62-010
    ISO646-FR, fr and iso-ir-69 are aliases for this charset.
    Source: ISO 2375 registry.
NF_Z_62-010_(1973)
    ISO646-FR1 and iso-ir-25 are aliases for this charset.
    Source: ISO 2375 registry.
NS_4551-1
    ISO646-NO, iso-ir-60 and no are aliases for this charset.
    Source: ISO 2375 registry.
NS_4551-2
    ISO646-NO2, iso-ir-61 and no2 are aliases for this charset.
    Source: ISO 2375 registry.
NeXTSTEP
    next is an alias for this charset.
    Source: Peter Svanberg - psv@nada.kth.se.
PT
    ISO646-PT and iso-ir-16 are aliases for this charset.
    Source: ISO 2375 registry.
PT2
    ISO646-PT2 and iso-ir-84 are aliases for this charset.
    Source: ISO 2375 registry.
SEN_850200_B
    FI, ISO646-FI, ISO646-SE, SS636127, iso-ir-10 and se are aliases for this charset.
    Source: ISO 2375 registry.
SEN_850200_C
    ISO646-SE2, iso-ir-11 and se2 are aliases for this charset.
    Source: ISO 2375 registry.
T.61-7bit
    iso-ir-102 is an alias for this charset.
    Source: ISO 2375 registry.
baltic
    iso-ir-179 is an alias for this charset.
    Source: ISO 2375 registry.
    &g1esc x2d56 &g2esc x2e56 &g3esc x2f56.
greek-ccitt
    iso-ir-150 is an alias for this charset.
    Source: ISO 2375 registry.
greek7
    iso-ir-88 is an alias for this charset.
    Source: ISO 2375 registry.
greek7-old
    iso-ir-18 is an alias for this charset.
    Source: ISO 2375 registry.
hp-roman8
    r8 and roman8 are aliases for this charset.
    Source: LaserJet IIP Printer User’s Manual,
    HP part no 33471-90901, Hewlett-Packard, June 1989.
latin-greek
    iso-ir-19 is an alias for this charset.
    Source: ISO 2375 registry.
mac-is
macintosh
    mac is an alias for this charset.
    Source: The Unicode Standard ver 1.0, ISBN 0-201-56788-1, Oct 1991.
macintosh_ce
    macce is an alias for this charset.
    Source: Macintosh CE fonts.
sami
    iso-ir-158, lap and latin-lap are aliases for this charset.
    Source: ISO 2375 registry.