
7 Tabular sources (RFC 1345)

An important part of the tabular charset knowledge in recode comes from RFC 1345 or, alternatively, from the chset tools, both maintained by Keld Simonsen. The RFC 1345 document:

“Character Mnemonics & Character Sets”, K. Simonsen, Request for Comments no. 1345, Network Working Group, June 1992.

defines many character mnemonics and character sets. The recode library implements most of RFC 1345, however:

Keld Simonsen <keld@dkuug.dk> did most of RFC 1345 himself, with some funding from Danish Standards and the Nordic standards (INSTA) project. He also did the character set design work, with substantial input from Olle Jaernefors. Keld typed in almost all of the tables; some were contributed by others. A number of people have checked the tables in various ways. The RFC lists a number of people who helped.

Keld and the recode maintainer have an arrangement by which any newly discovered information about tabular charsets submitted by recode users is forwarded to Keld, eventually merged into Keld’s work, and only then reimported into recode. Neither the recode program nor its library tries to compete, or even to establish itself as an alternate or diverging reference: RFC 1345 and its new drafts stay the genuine source for most tabular information conveyed by recode. Keld has been more than collaborative so far, so there is no reason for us to act otherwise. In a word, recode should be perceived as an application of external references, not as a reference in itself.

Internally, RFC 1345 associates with each character an unambiguous mnemonic of a few characters, taken from ISO 646, which is a minimal ASCII subset of 83 characters. The charset made up of these mnemonics is available in recode under the name RFC1345, with mnemonic and 1345 as aliases. As implemented, this charset exactly corresponds to mnemonic+ascii+38, using RFC 1345 nomenclature. Roughly, ISO 646 characters represent themselves, except for the ampersand (&), which appears doubled. A prefix of a single ampersand introduces a mnemonic. For mnemonics using two characters, the prefix is immediately followed by the mnemonic. For longer mnemonics, the prefix is followed by an underline (_), the mnemonic, and another underline. Conversions to this charset are usually reversible.
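
The quoting rules just described can be illustrated with a minimal Python sketch. It is not how recode itself is implemented. The function names and the tiny mnemonic table are assumptions made for illustration: the table holds only the entry e' for é plus one made-up long mnemonic (snow), where the real table is much larger. As a simplification, every ASCII character is treated as representing itself.

```python
def encode_mnemonic(text, table):
    """Encode text using the '&' prefix rules described above.

    `table` maps a character to its mnemonic (hypothetical subset).
    """
    out = []
    for ch in text:
        if ch == "&":
            out.append("&&")                 # ampersand appears doubled
        elif ch in table:
            m = table[ch]
            if len(m) == 2:
                out.append("&" + m)          # two-character mnemonic
            else:
                out.append("&_" + m + "_")   # longer mnemonic, underline-delimited
        elif ord(ch) < 128:
            out.append(ch)                   # simplification: ASCII represents itself
        else:
            raise ValueError("no mnemonic known for %r" % ch)
    return "".join(out)


def decode_mnemonic(encoded, table):
    """Reverse encode_mnemonic, assuming the same table."""
    reverse = {m: ch for ch, m in table.items()}
    out = []
    i = 0
    while i < len(encoded):
        if encoded[i] != "&":
            out.append(encoded[i])
            i += 1
        elif encoded.startswith("&&", i):
            out.append("&")
            i += 2
        elif encoded.startswith("&_", i):
            end = encoded.index("_", i + 2)  # closing underline
            out.append(reverse[encoded[i + 2:end]])
            i = end + 1
        else:
            out.append(reverse[encoded[i + 1:i + 3]])
            i += 3
    return "".join(out)


# Illustrative table: "snow" is a made-up long mnemonic, not taken from RFC 1345.
TABLE = {"\u00e9": "e'", "\u2603": "snow"}
```

With this table, encode_mnemonic("café & thé", TABLE) gives caf&e' && th&e', and decoding that string returns the original text, which is the reversibility mentioned above.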

Currently, recode does not offer any of the many other possible variations of this family of representations. They will likely be implemented in some future version, however.

ANSI_X3.4-1968

367, ANSI_X3.4-1986, ASCII, CP367, IBM367, ISO646-US, ISO_646.irv:1991, US-ASCII, iso-ir-6 and us are aliases for this charset. Source: ISO 2375 registry.

ASMO_449

ISO_9036, arabic7 and iso-ir-89 are aliases for this charset. Source: ISO 2375 registry.

BS_4730

ISO646-GB, gb, iso-ir-4 and uk are aliases for this charset. Source: ISO 2375 registry.

BS_viewdata

iso-ir-47 is an alias for this charset. Source: ISO 2375 registry.

CP1250

1250, ms-ee and windows-1250 are aliases for this charset. Source: UNICODE 1.0.

CP1251

1251, ms-cyrl and windows-1251 are aliases for this charset. Source: UNICODE 1.0.

CP1252

1252, ms-ansi and windows-1252 are aliases for this charset. Source: UNICODE 1.0.

CP1253

1253, ms-greek and windows-1253 are aliases for this charset. Source: UNICODE 1.0.

CP1254

1254, ms-turk and windows-1254 are aliases for this charset. Source: UNICODE 1.0.

CP1255

1255, ms-hebr and windows-1255 are aliases for this charset. Source: UNICODE 1.0.

CP1256

1256, ms-arab and windows-1256 are aliases for this charset. Source: UNICODE 1.0.

CP1257

1257, WinBaltRim and windows-1257 are aliases for this charset. Source: CEN/TC304 N283.

CSA_Z243.4-1985-1

ISO646-CA, ca, csa7-1 and iso-ir-121 are aliases for this charset. Source: ISO 2375 registry.

CSA_Z243.4-1985-2

ISO646-CA2, csa7-2 and iso-ir-122 are aliases for this charset. Source: ISO 2375 registry.

CSA_Z243.4-1985-gr

iso-ir-123 is an alias for this charset. Source: ISO 2375 registry.

CSN_369103

KOI-8_L2, iso-ir-139 and koi8l2 are aliases for this charset. Source: ISO 2375 registry.

CWI

CWI-2 and cp-hu are aliases for this charset. Source: Computerworld Számítástechnika vol 3 issue 13 1988-06-29.

DEC-MCS

dec is an alias for this charset. Source: VAX/VMS User’s Manual, Order Number: AI-Y517A-TE, April 1986.

DIN_66003

ISO646-DE, de and iso-ir-21 are aliases for this charset. Source: ISO 2375 registry.

DS_2089

DS2089, ISO646-DK and dk are aliases for this charset. Source: Danish Standard, DS 2089, February 1974.

EBCDIC-AT-DE

Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-AT-DE-A

Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-CA-FR

Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-DK-NO

Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-DK-NO-A

Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-ES

Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-ES-A

Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-ES-S

Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-FI-SE

Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-FI-SE-A

Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-FR

Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-IS-FRISS

friss is an alias for this charset. Source: Skyrsuvelar Rikisins og Reykjavikurborgar, feb 1982.

EBCDIC-IT

Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-PT

Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-UK

Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

EBCDIC-US

Source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987.

ECMA-cyrillic

ECMA-113, ECMA-113:1986 and iso-ir-111 are aliases for this charset. Source: ISO 2375 registry.

ES

ISO646-ES and iso-ir-17 are aliases for this charset. Source: ISO 2375 registry.

ES2

ISO646-ES2 and iso-ir-85 are aliases for this charset. Source: ISO 2375 registry.

GB_1988-80

ISO646-CN, cn and iso-ir-57 are aliases for this charset. Source: ISO 2375 registry.

GOST_19768-87

ST_SEV_358-88 and iso-ir-153 are aliases for this charset. Source: ISO 2375 registry.

IBM037

037, CP037, ebcdic-cp-ca, ebcdic-cp-nl, ebcdic-cp-us and ebcdic-cp-wt are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM038

038, CP038 and EBCDIC-INT are aliases for this charset. Source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990.

IBM1004

1004, CP1004 and os2latin1 are aliases for this charset. Source: CEN/TC304 N283, 1994-02-04.

IBM1026

1026 and CP1026 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM1047

1047 and CP1047 are aliases for this charset. Source: IBM Character Data Representation Architecture. Registry SC09-1391-00 p 150.

IBM256

256, CP256 and EBCDIC-INT1 are aliases for this charset. Source: IBM Registry C-H 3-3220-050.

IBM273

273 and CP273 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM274

274, CP274 and EBCDIC-BE are aliases for this charset. Source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990.

IBM275

275, CP275 and EBCDIC-BR are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM277

EBCDIC-CP-DK and EBCDIC-CP-NO are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM278

278, CP278, ebcdic-cp-fi and ebcdic-cp-se are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM280

280, CP280 and ebcdic-cp-it are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM281

281, CP281 and EBCDIC-JP-E are aliases for this charset. Source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990.

IBM284

284, CP284 and ebcdic-cp-es are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM285

285, CP285 and ebcdic-cp-gb are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM290

290, CP290 and EBCDIC-JP-kana are aliases for this charset. Source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990.

IBM297

297, CP297 and ebcdic-cp-fr are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM420

420, CP420 and ebcdic-cp-ar1 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990. IBM NLS RM p 11-11.

IBM423

423, CP423 and ebcdic-cp-gr are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM424

424, CP424 and ebcdic-cp-he are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM437

437 and CP437 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM500

500, 500V1, CP500, ebcdic-cp-be and ebcdic-cp-ch are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM850

850 and CP850 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990. Source: UNICODE 1.0.

IBM851

851 and CP851 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM852

852, CP852, pcl2 and pclatin2 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM855

855 and CP855 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM857

857 and CP857 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM860

860 and CP860 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM861

861, CP861 and cp-is are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM862

862 and CP862 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM863

863 and CP863 are aliases for this charset. Source: IBM Keyboard layouts and code pages, PN 07G4586 June 1991.

IBM864

864 and CP864 are aliases for this charset. Source: IBM Keyboard layouts and code pages, PN 07G4586 June 1991.

IBM865

865 and CP865 are aliases for this charset. Source: IBM DOS 3.3 Ref (Abridged), 94X9575 (Feb 1987).

IBM868

868, CP868 and cp-ar are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM869

869, CP869 and cp-gr are aliases for this charset. Source: IBM Keyboard layouts and code pages, PN 07G4586 June 1991.

IBM870

870, CP870, ebcdic-cp-roece and ebcdic-cp-yu are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM871

871, CP871 and ebcdic-cp-is are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM875

875, CP875 and EBCDIC-Greek are aliases for this charset. Source: UNICODE 1.0.

IBM880

880, CP880 and EBCDIC-Cyrillic are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM891

891 and CP891 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM903

903 and CP903 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM904

904 and CP904 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IBM905

905, CP905 and ebcdic-cp-tr are aliases for this charset. Source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990.

IBM918

918, CP918 and ebcdic-cp-ar2 are aliases for this charset. Source: IBM NLS RM Vol2 SE09-8002-01, March 1990.

IEC_P27-1

iso-ir-143 is an alias for this charset. Source: ISO 2375 registry.

INIS

iso-ir-49 is an alias for this charset. Source: ISO 2375 registry.

INIS-8

iso-ir-50 is an alias for this charset. Source: ISO 2375 registry.

INIS-cyrillic

iso-ir-51 is an alias for this charset. Source: ISO 2375 registry.

INVARIANT

iso-ir-170 is an alias for this charset.

ISO-8859-1

819, CP819, IBM819, ISO8859-1, ISO_8859-1, ISO_8859-1:1987, iso-ir-100, l1 and latin1 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-10

ISO8859-10, ISO_8859-10, ISO_8859-10:1993, L6, iso-ir-157 and latin6 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-13

ISO8859-13, ISO_8859-13, ISO_8859-13:1998, iso-baltic, iso-ir-179a, l7 and latin7 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-14

ISO8859-14, ISO_8859-14, ISO_8859-14:1998, iso-celtic, iso-ir-199, l8 and latin8 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-15

ISO8859-15, ISO_8859-15, ISO_8859-15:1998, iso-ir-203, l9 and latin9 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-2

912, CP912, IBM912, ISO8859-2, ISO_8859-2, ISO_8859-2:1987, iso-ir-101, l2 and latin2 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-3

ISO8859-3, ISO_8859-3, ISO_8859-3:1988, iso-ir-109, l3 and latin3 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-4

ISO8859-4, ISO_8859-4, ISO_8859-4:1988, iso-ir-110, l4 and latin4 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-5

ISO8859-5, ISO_8859-5, ISO_8859-5:1988, cyrillic and iso-ir-144 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-6

ASMO-708, ECMA-114, ISO8859-6, ISO_8859-6, ISO_8859-6:1987, arabic and iso-ir-127 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-7

ECMA-118, ELOT_928, ISO8859-7, ISO_8859-7, ISO_8859-7:1987, greek, greek8 and iso-ir-126 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-8

ISO8859-8, ISO_8859-8, ISO_8859-8:1988, hebrew and iso-ir-138 are aliases for this charset. Source: ISO 2375 registry.

ISO-8859-9

ISO8859-9, ISO_8859-9, ISO_8859-9:1989, iso-ir-148, l5 and latin5 are aliases for this charset. Source: ISO 2375 registry.

ISO_10367-box

iso-ir-155 is an alias for this charset. Source: ISO 2375 registry.

ISO_2033-1983

e13b and iso-ir-98 are aliases for this charset. Source: ISO 2375 registry.

ISO_5427

iso-ir-37 is an alias for this charset. Source: ISO 2375 registry.

ISO_5427-ext

ISO_5427:1981 and iso-ir-54 are aliases for this charset. Source: ISO 2375 registry.

ISO_5428

ISO_5428:1980 and iso-ir-55 are aliases for this charset. Source: ISO 2375 registry.

ISO_646.basic

ISO_646.basic:1983 and ref are aliases for this charset. Source: ISO 2375 registry.

ISO_646.irv

ISO_646.irv:1983, irv and iso-ir-2 are aliases for this charset. Source: ISO 2375 registry.

ISO_6937-2-25

iso-ir-152 is an alias for this charset. Source: ISO 2375 registry.

ISO_8859-supp

iso-ir-154 and latin1-2-5 are aliases for this charset. Source: ISO 2375 registry.

IT

ISO646-IT and iso-ir-15 are aliases for this charset. Source: ISO 2375 registry.

JIS_C6220-1969-jp

JIS_C6220-1969, iso-ir-13, katakana and x0201-7 are aliases for this charset. Source: ISO 2375 registry.

JIS_C6220-1969-ro

ISO646-JP, iso-ir-14 and jp are aliases for this charset. Source: ISO 2375 registry.

JIS_C6229-1984-a

jp-ocr-a is an alias for this charset. Source: ISO 2375 registry.

JIS_C6229-1984-b

ISO646-JP-OCR-B and jp-ocr-b are aliases for this charset. Source: ISO 2375 registry.

JIS_C6229-1984-b-add

iso-ir-93 and jp-ocr-b-add are aliases for this charset. Source: ISO 2375 registry.

JIS_C6229-1984-hand

iso-ir-94 and jp-ocr-hand are aliases for this charset. Source: ISO 2375 registry.

JIS_C6229-1984-hand-add

iso-ir-95 and jp-ocr-hand-add are aliases for this charset. Source: ISO 2375 registry.

JIS_C6229-1984-kana

iso-ir-96 is an alias for this charset. Source: ISO 2375 registry.

JIS_X0201

X0201 is an alias for this charset.

JUS_I.B1.002

ISO646-YU, iso-ir-141, js and yu are aliases for this charset. Source: ISO 2375 registry.

JUS_I.B1.003-mac

iso-ir-147 and macedonian are aliases for this charset. Source: ISO 2375 registry.

JUS_I.B1.003-serb

iso-ir-146 and serbian are aliases for this charset. Source: ISO 2375 registry.

KOI-7

Source: Andrey A. Chernov <ache@nagual.pp.ru>.

KOI-8

GOST_19768-74 is an alias for this charset. Source: Andrey A. Chernov <ache@nagual.pp.ru>.

KOI8-R

Source: RFC 1489 via Gabor Kiss <kissg@sztaki.hu>, and Andrey A. Chernov <ache@nagual.pp.ru>.

KOI8-RU

Source: http://cad.ntu-kpi.kiev.ua/multiling/koi8-ru/.

KOI8-U

Source: RFC 2319. Mibenum: 2088. Source: http://www.net.ua/KOI8-U/.

KSC5636

ISO646-KR is an alias for this charset.

Latin-greek-1

iso-ir-27 is an alias for this charset. Source: ISO 2375 registry.

MSZ_7795.3

ISO646-HU, hu and iso-ir-86 are aliases for this charset. Source: ISO 2375 registry.

NATS-DANO

iso-ir-9-1 is an alias for this charset. Source: ISO 2375 registry.

NATS-DANO-ADD

iso-ir-9-2 is an alias for this charset. Source: ISO 2375 registry.

NATS-SEFI

iso-ir-8-1 is an alias for this charset. Source: ISO 2375 registry.

NATS-SEFI-ADD

iso-ir-8-2 is an alias for this charset. Source: ISO 2375 registry.

NC_NC00-10

ISO646-CU, NC_NC00-10:81, cuba and iso-ir-151 are aliases for this charset. Source: ISO 2375 registry.

NF_Z_62-010

ISO646-FR, fr and iso-ir-69 are aliases for this charset. Source: ISO 2375 registry.

NF_Z_62-010_(1973)

ISO646-FR1 and iso-ir-25 are aliases for this charset. Source: ISO 2375 registry.

NS_4551-1

ISO646-NO, iso-ir-60 and no are aliases for this charset. Source: ISO 2375 registry.

NS_4551-2

ISO646-NO2, iso-ir-61 and no2 are aliases for this charset. Source: ISO 2375 registry.

NeXTSTEP

next is an alias for this charset. Source: Peter Svanberg - psv@nada.kth.se.

PT

ISO646-PT and iso-ir-16 are aliases for this charset. Source: ISO 2375 registry.

PT2

ISO646-PT2 and iso-ir-84 are aliases for this charset. Source: ISO 2375 registry.

SEN_850200_B

FI, ISO646-FI, ISO646-SE, SS636127, iso-ir-10 and se are aliases for this charset. Source: ISO 2375 registry.

SEN_850200_C

ISO646-SE2, iso-ir-11 and se2 are aliases for this charset. Source: ISO 2375 registry.

T.61-7bit

iso-ir-102 is an alias for this charset. Source: ISO 2375 registry.

baltic

iso-ir-179 is an alias for this charset. Source: ISO 2375 registry. &g1esc x2d56 &g2esc x2e56 &g3esc x2f56.

greek-ccitt

iso-ir-150 is an alias for this charset. Source: ISO 2375 registry.

greek7

iso-ir-88 is an alias for this charset. Source: ISO 2375 registry.

greek7-old

iso-ir-18 is an alias for this charset. Source: ISO 2375 registry.

hp-roman8

r8 and roman8 are aliases for this charset. Source: LaserJet IIP Printer User’s Manual, HP part no 33471-90901, Hewlett-Packard, June 1989.

latin-greek

iso-ir-19 is an alias for this charset. Source: ISO 2375 registry.

mac-is
macintosh

mac is an alias for this charset. Source: The Unicode Standard ver 1.0, ISBN 0-201-56788-1, Oct 1991.

macintosh_ce

macce is an alias for this charset. Source: Macintosh CE fonts.

sami

iso-ir-158, lap and latin-lap are aliases for this charset. Source: ISO 2375 registry.

