An important part of the tabular charset knowledge in recode comes from RFC 1345 or, alternatively, from the chset tools, both maintained by Keld Simonsen. The RFC 1345 document:
“Character Mnemonics & Character Sets”, K. Simonsen, Request for Comments no. 1345, Network Working Group, June 1992.
defines many character mnemonics and character sets. The recode library implements most of RFC 1345, however:
- It does not recognise those charsets which overload character positions: dk-us and us-dk. However, see Mixed.
- It does not recognise those charsets which combine two characters for representing a third: ANSI_X3.110-1983, ISO_6937-2-add, T.101-G2, T.61-8bit, iso-ir-90 and videotex-suppl.
- It does not recognise 16-bit charsets: GB_2312-80, JIS_C6226-1978, JIS_C6226-1983, JIS_X0212-1990 and KS_C_5601-1987.
- It interprets the charset isoir91 as NATS-DANO (alias iso-ir-9-1), not as JIS_C6229-1984-a (alias iso-ir-91). It also interprets the charset isoir92 as NATS-DANO-ADD (alias iso-ir-9-2), not as JIS_C6229-1984-b (alias iso-ir-92). It is probably better to avoid these two alias names altogether.
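The clash between these alias names is easier to see once the names are reduced to letters and digits. The sketch below assumes, purely for illustration, a normalisation that lowercases a name and drops every other character; the helper `squash` is hypothetical and is not taken from recode, whose actual name matching is not described here.

```python
import re


def squash(name):
    """Assumed normalisation: lowercase, keep only letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


# Under this assumption, two distinct registry names collapse to one key.
for alias in ["iso-ir-9-1", "iso-ir-91", "iso-ir-9-2", "iso-ir-92"]:
    print(alias, "->", squash(alias))
```

Under that assumption, iso-ir-9-1 and iso-ir-91 become indistinguishable, as do iso-ir-9-2 and iso-ir-92, which is one way to read the advice to avoid isoir91 and isoir92.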
Keld Simonsen <keld@dkuug.dk> did most of RFC 1345 himself, with some funding from Danish Standards and the Nordic standards (INSTA) project. He also did the character set design work, with substantial input from Olle Jaernefors. Keld typed in almost all of the tables; some were contributed by others. A number of people have checked the tables in various ways. The RFC lists a number of people who helped.
Keld and the recode maintainer have an arrangement by which any newly discovered information about tabular charsets, submitted by recode users, is forwarded to Keld, eventually merged into Keld’s work, and only then reimported into recode. Neither the recode program nor its library tries to compete, or even to establish itself as an alternate or diverging reference: RFC 1345 and its new drafts remain the genuine source for most tabular information conveyed by recode.
Keld has been more than collaborative so far, so there is no reason for us to act otherwise. In a word, recode should be perceived as the application of external references, but not as a reference in itself.
Internally, RFC 1345 associates with each character an unambiguous mnemonic of a few characters, taken from ISO 646, which is a minimal ASCII subset of 83 characters. The charset made up by these mnemonics is available in recode under the name RFC1345. It has mnemonic and 1345 for aliases. As implemented, this charset exactly corresponds to mnemonic+ascii+38, using RFC 1345 nomenclature. Roughly said, ISO 646 characters represent themselves, except for the ampersand (&) which appears doubled. A prefix of a single ampersand introduces a mnemonic. For mnemonics using two characters, the prefix is immediately followed by the mnemonic. For longer mnemonics, the prefix is followed by an underline (_), the mnemonic, and another underline. Conversions to this charset are usually reversible.
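The escaping rules just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not recode's implementation: the mnemonic table is a toy one (the two-character entries are meant to follow RFC 1345, while `abc` is a made-up long mnemonic used only to exercise the underline syntax), and the 83-character ISO 646 subset is approximated by the full 7-bit ASCII range.

```python
# Toy table, for illustration only. "e'" and "a:" are intended to follow
# RFC 1345; "abc" is a hypothetical long mnemonic, not a real one.
MNEMONICS = {
    "\u00e9": "e'",   # LATIN SMALL LETTER E WITH ACUTE
    "\u00e4": "a:",   # LATIN SMALL LETTER A WITH DIAERESIS
    "\u2603": "abc",  # hypothetical mnemonic longer than two characters
}
REVERSE = {mnemonic: char for char, mnemonic in MNEMONICS.items()}


def encode(text):
    """Encode text using the ampersand-prefixed mnemonic syntax."""
    out = []
    for ch in text:
        if ch == "&":
            out.append("&&")            # the ampersand appears doubled
        elif ord(ch) < 128:
            out.append(ch)              # assumption: ASCII stands in for ISO 646
        else:
            m = MNEMONICS[ch]           # KeyError if absent from the toy table
            out.append("&" + m if len(m) == 2 else "&_" + m + "_")
    return "".join(out)


def decode(data):
    """Reverse encode(), reading one escape sequence at a time."""
    out = []
    i = 0
    while i < len(data):
        if data[i] != "&":
            out.append(data[i])
            i += 1
        elif data.startswith("&&", i):
            out.append("&")
            i += 2
        elif data.startswith("&_", i):
            end = data.index("_", i + 2)            # closing underline
            out.append(REVERSE[data[i + 2:end]])
            i = end + 1
        else:
            out.append(REVERSE[data[i + 1:i + 3]])  # two-character mnemonic
            i += 3
    return "".join(out)
```

With this toy table, `encode("caf\u00e9 & t\u00e4\u2603")` gives `caf&e' && t&a:&_abc_`, and `decode` maps that string back to the original, which illustrates why such conversions are usually reversible.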
Currently, recode does not offer any of the many other possible variations of this family of representations. They will likely be implemented in some future version, however.
ANSI_X3.4-1968
367, ANSI_X3.4-1986, ASCII, CP367, IBM367, ISO646-US, ISO_646.irv:1991, US-ASCII, iso-ir-6 and us are aliases for this charset.
Source: ISO 2375 registry.
ASMO_449
ISO_9036, arabic7 and iso-ir-89 are aliases for this charset.
Source: ISO 2375 registry.
BS_4730
ISO646-GB, gb, iso-ir-4 and uk are aliases for this charset.
Source: ISO 2375 registry.
BS_viewdata
iso-ir-47 is an alias for this charset.
Source: ISO 2375 registry.
CP1250
1250, ms-ee and windows-1250 are aliases for this charset.
Source: UNICODE 1.0.
CP1251
1251, ms-cyrl and windows-1251 are aliases for this charset.
Source: UNICODE 1.0.
CP1252
1252, ms-ansi and windows-1252 are aliases for this charset.
Source: UNICODE 1.0.
CP1253
1253, ms-greek and windows-1253 are aliases for this charset.
Source: UNICODE 1.0.
CP1254
1254, ms-turk and windows-1254 are aliases for this charset.
Source: UNICODE 1.0.
CP1255
1255, ms-hebr and windows-1255 are aliases for this charset.
Source: UNICODE 1.0.
CP1256
1256, ms-arab and windows-1256 are aliases for this charset.
Source: UNICODE 1.0.
CP1257
1257, WinBaltRim and windows-1257 are aliases for this charset.
Source: CEN/TC304 N283.
CSA_Z243.4-1985-1
ISO646-CA, ca, csa7-1 and iso-ir-121 are aliases for this charset.
Source: ISO 2375 registry.
CSA_Z243.4-1985-2
ISO646-CA2, csa7-2 and iso-ir-122 are aliases for this charset.
Source: ISO 2375 registry.
CSA_Z243.4-1985-gr
iso-ir-123 is an alias for this charset.
Source: ISO 2375 registry.
CSN_369103
KOI-8_L2, iso-ir-139 and koi8l2 are aliases for this charset.
Source: ISO 2375 registry.
CWI
CWI-2 and cp-hu are aliases for this charset.
Source: Computerworld Sza’mita’stechnika vol 3 issue 13 1988-06-29.
DEC-MCS
dec is an alias for this charset.
VAX/VMS User’s Manual, Order Number: AI-Y517A-TE, April 1986.
DIN_66003
ISO646-DE, de and iso-ir-21 are aliases for this charset.
Source: ISO 2375 registry.
DS_2089
DS2089, ISO646-DK and dk are aliases for this charset.
Source: Danish Standard, DS 2089, February 1974.
EBCDIC-AT-DE
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-AT-DE-A
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-CA-FR
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-DK-NO
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-DK-NO-A
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-ES
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-ES-A
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-ES-S
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-FI-SE
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-FI-SE-A
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-FR
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-IS-FRISS
friss is an alias for this charset.
Source: Skyrsuvelar Rikisins og Reykjavikurborgar, feb 1982.
EBCDIC-IT
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-PT
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-UK
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
EBCDIC-US
Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.
ECMA-cyrillic
ECMA-113, ECMA-113:1986 and iso-ir-111 are aliases for this charset.
Source: ISO 2375 registry.
ES
ISO646-ES and iso-ir-17 are aliases for this charset.
Source: ISO 2375 registry.
ES2
ISO646-ES2 and iso-ir-85 are aliases for this charset.
Source: ISO 2375 registry.
GB_1988-80
ISO646-CN, cn and iso-ir-57 are aliases for this charset.
Source: ISO 2375 registry.
GOST_19768-87
ST_SEV_358-88 and iso-ir-153 are aliases for this charset.
Source: ISO 2375 registry.
IBM037
037, CP037, ebcdic-cp-ca, ebcdic-cp-nl, ebcdic-cp-us and ebcdic-cp-wt are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM038
038, CP038 and EBCDIC-INT are aliases for this charset.
Source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990.
IBM1004
1004, CP1004 and os2latin1 are aliases for this charset.
Source: CEN/TC304 N283, 1994-02-04.
IBM1026
1026 and CP1026 are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM1047
1047 and CP1047 are aliases for this charset.
Source: IBM Character Data Representation Architecture. Registry SC09-1391-00 p 150.
IBM256
256, CP256 and EBCDIC-INT1 are aliases for this charset.
Source: IBM Registry C-H 3-3220-050.
IBM273
273 and CP273 are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM274
274, CP274 and EBCDIC-BE are aliases for this charset.
Source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990.
IBM275
275, CP275 and EBCDIC-BR are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM277
EBCDIC-CP-DK and EBCDIC-CP-NO are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM278
278, CP278, ebcdic-cp-fi and ebcdic-cp-se are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM280
280, CP280 and ebcdic-cp-it are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM281
281, CP281 and EBCDIC-JP-E are aliases for this charset.
Source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990.
IBM284
284, CP284 and ebcdic-cp-es are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM285
285, CP285 and ebcdic-cp-gb are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM290
290, CP290 and EBCDIC-JP-kana are aliases for this charset.
Source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990.
IBM297
297, CP297 and ebcdic-cp-fr are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM420
420, CP420 and ebcdic-cp-ar1 are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990. IBM NLS RM p 11-11.
IBM423
423, CP423 and ebcdic-cp-gr are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM424
424, CP424 and ebcdic-cp-he are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM437
437 and CP437 are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM500
500, 500V1, CP500, ebcdic-cp-be and ebcdic-cp-ch are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM850
850 and CP850 are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
Source: UNICODE 1.0.
IBM851
851 and CP851 are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM852
852, CP852, pcl2 and pclatin2 are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM855
855 and CP855 are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM857
857 and CP857 are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM860
860 and CP860 are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM861
861, CP861 and cp-is are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM862
862 and CP862 are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM863
863 and CP863 are aliases for this charset.
Source: IBM Keyboard layouts and code pages, PN 07G4586 June 1991.
IBM864
864 and CP864 are aliases for this charset.
Source: IBM Keyboard layouts and code pages, PN 07G4586 June 1991.
IBM865
865 and CP865 are aliases for this charset.
Source: IBM DOS 3.3 Ref (Abridged), 94X9575 (Feb 1987).
IBM868
868, CP868 and cp-ar are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM869
869, CP869 and cp-gr are aliases for this charset.
Source: IBM Keyboard layouts and code pages, PN 07G4586 June 1991.
IBM870
870, CP870, ebcdic-cp-roece and ebcdic-cp-yu are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM871
871, CP871 and ebcdic-cp-is are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM875
875, CP875 and EBCDIC-Greek are aliases for this charset.
Source: UNICODE 1.0.
IBM880
880, CP880 and EBCDIC-Cyrillic are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM891
891 and CP891 are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM903
903 and CP903 are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM904
904 and CP904 are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IBM905
905, CP905 and ebcdic-cp-tr are aliases for this charset.
Source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990.
IBM918
918, CP918 and ebcdic-cp-ar2 are aliases for this charset.
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.
IEC_P27-1
iso-ir-143 is an alias for this charset.
Source: ISO 2375 registry.
INIS
iso-ir-49 is an alias for this charset.
Source: ISO 2375 registry.
INIS-8
iso-ir-50 is an alias for this charset.
Source: ISO 2375 registry.
INIS-cyrillic
iso-ir-51 is an alias for this charset.
Source: ISO 2375 registry.
INVARIANT
iso-ir-170 is an alias for this charset.
ISO-8859-1
819, CP819, IBM819, ISO8859-1, ISO_8859-1, ISO_8859-1:1987, iso-ir-100, l1 and latin1 are aliases for this charset.
Source: ISO 2375 registry.
ISO-8859-10
ISO8859-10, ISO_8859-10, ISO_8859-10:1993, L6, iso-ir-157 and latin6 are aliases for this charset.
Source: ISO 2375 registry.
ISO-8859-13
ISO8859-13, ISO_8859-13, ISO_8859-13:1998, iso-baltic, iso-ir-179a, l7 and latin7 are aliases for this charset.
Source: ISO 2375 registry.
ISO-8859-14
ISO8859-14, ISO_8859-14, ISO_8859-14:1998, iso-celtic, iso-ir-199, l8 and latin8 are aliases for this charset.
Source: ISO 2375 registry.
ISO-8859-15
ISO8859-15, ISO_8859-15, ISO_8859-15:1998, iso-ir-203, l9 and latin9 are aliases for this charset.
Source: ISO 2375 registry.
ISO-8859-2
912, CP912, IBM912, ISO8859-2, ISO_8859-2, ISO_8859-2:1987, iso-ir-101, l2 and latin2 are aliases for this charset.
Source: ISO 2375 registry.
ISO-8859-3
ISO8859-3, ISO_8859-3, ISO_8859-3:1988, iso-ir-109, l3 and latin3 are aliases for this charset.
Source: ISO 2375 registry.
ISO-8859-4
ISO8859-4, ISO_8859-4, ISO_8859-4:1988, iso-ir-110, l4 and latin4 are aliases for this charset.
Source: ISO 2375 registry.
ISO-8859-5
ISO8859-5, ISO_8859-5, ISO_8859-5:1988, cyrillic and iso-ir-144 are aliases for this charset.
Source: ISO 2375 registry.
ISO-8859-6
ASMO-708, ECMA-114, ISO8859-6, ISO_8859-6, ISO_8859-6:1987, arabic and iso-ir-127 are aliases for this charset.
Source: ISO 2375 registry.
ISO-8859-7
ECMA-118, ELOT_928, ISO8859-7, ISO_8859-7, ISO_8859-7:1987, greek, greek8 and iso-ir-126 are aliases for this charset.
Source: ISO 2375 registry.
ISO-8859-8
ISO8859-8, ISO_8859-8, ISO_8859-8:1988, hebrew and iso-ir-138 are aliases for this charset.
Source: ISO 2375 registry.
ISO-8859-9
ISO8859-9, ISO_8859-9, ISO_8859-9:1989, iso-ir-148, l5 and latin5 are aliases for this charset.
Source: ISO 2375 registry.
ISO_10367-box
iso-ir-155 is an alias for this charset.
Source: ISO 2375 registry.
ISO_2033-1983
e13b and iso-ir-98 are aliases for this charset.
Source: ISO 2375 registry.
ISO_5427
iso-ir-37 is an alias for this charset.
Source: ISO 2375 registry.
ISO_5427-ext
ISO_5427:1981 and iso-ir-54 are aliases for this charset.
Source: ISO 2375 registry.
ISO_5428
ISO_5428:1980 and iso-ir-55 are aliases for this charset.
Source: ISO 2375 registry.
ISO_646.basic
ISO_646.basic:1983 and ref are aliases for this charset.
Source: ISO 2375 registry.
ISO_646.irv
ISO_646.irv:1983, irv and iso-ir-2 are aliases for this charset.
Source: ISO 2375 registry.
ISO_6937-2-25
iso-ir-152 is an alias for this charset.
Source: ISO 2375 registry.
ISO_8859-supp
iso-ir-154 and latin1-2-5 are aliases for this charset.
Source: ISO 2375 registry.
IT
ISO646-IT and iso-ir-15 are aliases for this charset.
Source: ISO 2375 registry.
JIS_C6220-1969-jp
JIS_C6220-1969, iso-ir-13, katakana and x0201-7 are aliases for this charset.
Source: ISO 2375 registry.
JIS_C6220-1969-ro
ISO646-JP, iso-ir-14 and jp are aliases for this charset.
Source: ISO 2375 registry.
JIS_C6229-1984-a
jp-ocr-a is an alias for this charset.
Source: ISO 2375 registry.
JIS_C6229-1984-b
ISO646-JP-OCR-B and jp-ocr-b are aliases for this charset.
Source: ISO 2375 registry.
JIS_C6229-1984-b-add
iso-ir-93 and jp-ocr-b-add are aliases for this charset.
Source: ISO 2375 registry.
JIS_C6229-1984-hand
iso-ir-94 and jp-ocr-hand are aliases for this charset.
Source: ISO 2375 registry.
JIS_C6229-1984-hand-add
iso-ir-95 and jp-ocr-hand-add are aliases for this charset.
Source: ISO 2375 registry.
JIS_C6229-1984-kana
iso-ir-96 is an alias for this charset.
Source: ISO 2375 registry.
JIS_X0201
X0201 is an alias for this charset.
JUS_I.B1.002
ISO646-YU, iso-ir-141, js and yu are aliases for this charset.
Source: ISO 2375 registry.
JUS_I.B1.003-mac
iso-ir-147 and macedonian are aliases for this charset.
Source: ISO 2375 registry.
JUS_I.B1.003-serb
iso-ir-146 and serbian are aliases for this charset.
Source: ISO 2375 registry.
KOI-7
Source: Andrey A. Chernov <ache@nagual.pp.ru>.
KOI-8
GOST_19768-74 is an alias for this charset.
Source: Andrey A. Chernov <ache@nagual.pp.ru>.
KOI8-R
Source: RFC 1489 via Gabor Kiss <kissg@sztaki.hu>, and Andrey A. Chernov <ache@nagual.pp.ru>.
KOI8-RU
Source: http://cad.ntu-kpi.kiev.ua/multiling/koi8-ru/.
KOI8-U
Source: RFC 2319. Mibenum: 2088. Source: http://www.net.ua/KOI8-U/.
KSC5636
ISO646-KR is an alias for this charset.
Latin-greek-1
iso-ir-27 is an alias for this charset.
Source: ISO 2375 registry.
MSZ_7795.3
ISO646-HU, hu and iso-ir-86 are aliases for this charset.
Source: ISO 2375 registry.
NATS-DANO
iso-ir-9-1 is an alias for this charset.
Source: ISO 2375 registry.
NATS-DANO-ADD
iso-ir-9-2 is an alias for this charset.
Source: ISO 2375 registry.
NATS-SEFI
iso-ir-8-1 is an alias for this charset.
Source: ISO 2375 registry.
NATS-SEFI-ADD
iso-ir-8-2 is an alias for this charset.
Source: ISO 2375 registry.
NC_NC00-10
ISO646-CU, NC_NC00-10:81, cuba and iso-ir-151 are aliases for this charset.
Source: ISO 2375 registry.
NF_Z_62-010
ISO646-FR, fr and iso-ir-69 are aliases for this charset.
Source: ISO 2375 registry.
NF_Z_62-010_(1973)
ISO646-FR1 and iso-ir-25 are aliases for this charset.
Source: ISO 2375 registry.
NS_4551-1
ISO646-NO, iso-ir-60 and no are aliases for this charset.
Source: ISO 2375 registry.
NS_4551-2
ISO646-NO2, iso-ir-61 and no2 are aliases for this charset.
Source: ISO 2375 registry.
NeXTSTEP
next is an alias for this charset.
Source: Peter Svanberg - psv@nada.kth.se.
PT
ISO646-PT and iso-ir-16 are aliases for this charset.
Source: ISO 2375 registry.
PT2
ISO646-PT2 and iso-ir-84 are aliases for this charset.
Source: ISO 2375 registry.
SEN_850200_B
FI, ISO646-FI, ISO646-SE, SS636127, iso-ir-10 and se are aliases for this charset.
Source: ISO 2375 registry.
SEN_850200_C
ISO646-SE2, iso-ir-11 and se2 are aliases for this charset.
Source: ISO 2375 registry.
T.61-7bit
iso-ir-102 is an alias for this charset.
Source: ISO 2375 registry.
baltic
iso-ir-179 is an alias for this charset.
Source: ISO 2375 registry.
&g1esc x2d56 &g2esc x2e56 &g3esc x2f56.
greek-ccitt
iso-ir-150 is an alias for this charset.
Source: ISO 2375 registry.
greek7
iso-ir-88 is an alias for this charset.
Source: ISO 2375 registry.
greek7-old
iso-ir-18 is an alias for this charset.
Source: ISO 2375 registry.
hp-roman8
r8 and roman8 are aliases for this charset.
Source: LaserJet IIP Printer User’s Manual, HP part no 33471-90901, Hewlett-Packard, June 1989.
latin-greek
iso-ir-19 is an alias for this charset.
Source: ISO 2375 registry.
mac-is
macintosh
mac is an alias for this charset.
Source: The Unicode Standard ver 1.0, ISBN 0-201-56788-1, Oct 1991.
macintosh_ce
macce is an alias for this charset.
Source: Macintosh CE fonts.
sami
iso-ir-158, lap and latin-lap are aliases for this charset.
Source: ISO 2375 registry.
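The list above is in effect a mapping from many alias names to one charset name each. As a minimal sketch of that structure (not recode's own lookup code), the Python below builds an index from a handful of entries copied from the list; the names `CHARSETS`, `build_index` and `canonical` are hypothetical, and matching is exact, which may differ from how recode itself compares names.

```python
# A few entries copied from the list: charset name -> its aliases.
CHARSETS = {
    "ISO-8859-1": ["819", "CP819", "IBM819", "ISO8859-1", "ISO_8859-1",
                   "ISO_8859-1:1987", "iso-ir-100", "l1", "latin1"],
    "CP1252": ["1252", "ms-ansi", "windows-1252"],
    "IBM850": ["850", "CP850"],
    "KOI8-R": [],  # listed without aliases
}


def build_index(charsets):
    """Map every name and alias to its charset, rejecting duplicates."""
    index = {}
    for name, aliases in charsets.items():
        for key in [name, *aliases]:
            if key in index:
                raise ValueError(f"duplicate name: {key}")
            index[key] = name
    return index


INDEX = build_index(CHARSETS)


def canonical(name):
    """Return the charset name for an exact alias match, or None."""
    return INDEX.get(name)
```

For example, `canonical("latin1")` returns `ISO-8859-1` and `canonical("850")` returns `IBM850`, while an unknown name yields `None`. The duplicate check matters because an alias that pointed at two charsets would make the lookup ambiguous.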