The Library of Congress >> Especially for Librarians and Archivists >> Standards
MARC Standards
MARC 21 HOME >> Specifications >> Character Sets >> Part 5

MARC 21 Specifications for Record Structure, Character Sets, and Exchange Media

Code Table Extended Latin (ANSEL)

December 2007

The first column in this table contains the MARC-8 code (in hex) for the character as it comes from the G0 graphic set, and the second column contains the MARC-8 code (in hex) for the character as it comes from the G1 graphic set. The third column contains the UCS/Unicode 16-bit code (in hex), and the fourth column contains the UTF-8 code (in hex) for the UCS character. The fifth column contains a representation of the character (where possible). The sixth column (C?) contains a C when the character is a combining character. The seventh column contains the MARC character name, followed by the UCS name; if the MARC name is the same as or very similar to the UCS name, only the UCS name is given. For some tables, alternate encodings in Unicode and UTF-8 are given. When that occurs, the alternate Unicode and alternate UTF-8 columns follow the character name.
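The relationship between the UCS and UTF-8 columns can be checked mechanically. The following is a minimal Python sketch that uses only built-in string methods; the helper name `utf8_hex` is hypothetical and not part of the specification.

```python
def utf8_hex(ucs_hex: str) -> str:
    """Return the UTF-8 encoding of a UCS code point as uppercase hex."""
    return chr(int(ucs_hex, 16)).encode("utf-8").hex().upper()

# (UCS, UTF-8) pairs copied from rows of the table in this document.
rows = [("0141", "C581"), ("266D", "E299AD"), ("0361", "CDA1")]
for ucs, expected in rows:
    assert utf8_hex(ucs) == expected
```

The same check can be run over every row to catch transcription errors when the table is copied into another system.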

Revised June 2004 to add the Eszett (M+C7) and the Euro Sign (M+C8) to the MARC-8 set.

Revised September 2004 to change the mapping from MARC-8 to Unicode for the Ligature (M+EB and M+EC) from U+FE20 and U+FE21 to U+0361.

Revised September 2004 to change the mapping from MARC-8 to Unicode for the Double Tilde (M+FA and M+FB) from U+FE22 and U+FE23 to U+0360.

Revised March 2005 to change the mapping from MARC-8 to Unicode for the Alif (M+2E) from U+02BE to U+02BC.


Not all characters display in all browsers. We have attempted to allow for font families that show each character set, but you must have one of these fonts on your computer. See the W3C site for a discussion of fonts: http://www.w3.org/TR/REC-CSS2/fonts.html#generic-font-families.

| MARC-8 | MARC-8 as G1 | UCS | UTF-8 | CHAR | C? | NAME | ALT | ALT UTF-8 |
|---|---|---|---|---|---|---|---|---|
| 88 | | 0098 | C298 | | | NON-SORT BEGIN / START OF STRING | | |
| 89 | | 009C | C29C | | | NON-SORT END / STRING TERMINATOR | | |
| 8D | | 200D | E2808D | | | JOINER / ZERO WIDTH JOINER | | |
| 8E | | 200C | E2808C | | | NON-JOINER / ZERO WIDTH NON-JOINER | | |
| 21 | A1 | 0141 | C581 | Ł | | UPPERCASE POLISH L / LATIN CAPITAL LETTER L WITH STROKE | | |
| 22 | A2 | 00D8 | C398 | Ø | | UPPERCASE SCANDINAVIAN O / LATIN CAPITAL LETTER O WITH STROKE | | |
| 23 | A3 | 0110 | C490 | Đ | | UPPERCASE D WITH CROSSBAR / LATIN CAPITAL LETTER D WITH STROKE | | |
| 24 | A4 | 00DE | C39E | Þ | | UPPERCASE ICELANDIC THORN / LATIN CAPITAL LETTER THORN (Icelandic) | | |
| 25 | A5 | 00C6 | C386 | Æ | | UPPERCASE DIGRAPH AE / LATIN CAPITAL LIGATURE AE | | |
| 26 | A6 | 0152 | C592 | Œ | | UPPERCASE DIGRAPH OE / LATIN CAPITAL LIGATURE OE | | |
| 27 | A7 | 02B9 | CAB9 | ʹ | | SOFT SIGN, PRIME / MODIFIER LETTER PRIME | | |
| 28 | A8 | 00B7 | C2B7 | · | | MIDDLE DOT | | |
| 29 | A9 | 266D | E299AD | | | MUSIC FLAT SIGN | | |
| 2A | AA | 00AE | C2AE | ® | | PATENT MARK / REGISTERED SIGN | | |
| 2B | AB | 00B1 | C2B1 | ± | | PLUS OR MINUS / PLUS-MINUS SIGN | | |
| 2C | AC | 01A0 | C6A0 | Ơ | | UPPERCASE O-HOOK / LATIN CAPITAL LETTER O WITH HORN | | |
| 2D | AD | 01AF | C6AF | Ư | | UPPERCASE U-HOOK / LATIN CAPITAL LETTER U WITH HORN | | |
| 2E | AE | 02BC | CABC | ʼ | | ALIF / MODIFIER LETTER APOSTROPHE | | |
| 30 | B0 | 02BB | CABB | ʻ | | AYN / MODIFIER LETTER TURNED COMMA | | |
| 31 | B1 | 0142 | C582 | ł | | LOWERCASE POLISH L / LATIN SMALL LETTER L WITH STROKE | | |
| 32 | B2 | 00F8 | C3B8 | ø | | LOWERCASE SCANDINAVIAN O / LATIN SMALL LETTER O WITH STROKE | | |
| 33 | B3 | 0111 | C491 | đ | | LOWERCASE D WITH CROSSBAR / LATIN SMALL LETTER D WITH STROKE | | |
| 34 | B4 | 00FE | C3BE | þ | | LOWERCASE ICELANDIC THORN / LATIN SMALL LETTER THORN (Icelandic) | | |
| 35 | B5 | 00E6 | C3A6 | æ | | LOWERCASE DIGRAPH AE / LATIN SMALL LIGATURE AE | | |
| 36 | B6 | 0153 | C593 | œ | | LOWERCASE DIGRAPH OE / LATIN SMALL LIGATURE OE | | |
| 37 | B7 | 02BA | CABA | ʺ | | HARD SIGN, DOUBLE PRIME / MODIFIER LETTER DOUBLE PRIME | | |
| 38 | B8 | 0131 | C4B1 | ı | | LOWERCASE TURKISH I / LATIN SMALL LETTER DOTLESS I | | |
| 39 | B9 | 00A3 | C2A3 | £ | | BRITISH POUND / POUND SIGN | | |
| 3A | BA | 00F0 | C3B0 | ð | | LOWERCASE ETH / LATIN SMALL LETTER ETH (Icelandic) | | |
| 3C | BC | 01A1 | C6A1 | ơ | | LOWERCASE O-HOOK / LATIN SMALL LETTER O WITH HORN | | |
| 3D | BD | 01B0 | C6B0 | ư | | LOWERCASE U-HOOK / LATIN SMALL LETTER U WITH HORN | | |
| 40 | C0 | 00B0 | C2B0 | ° | | DEGREE SIGN | | |
| 41 | C1 | 2113 | E28493 | | | SCRIPT SMALL L | | |
| 42 | C2 | 2117 | E28497 | | | SOUND RECORDING COPYRIGHT | | |
| 43 | C3 | 00A9 | C2A9 | © | | COPYRIGHT SIGN | | |
| 44 | C4 | 266F | E299AF | | | MUSIC SHARP SIGN | | |
| 45 | C5 | 00BF | C2BF | ¿ | | INVERTED QUESTION MARK | | |
| 46 | C6 | 00A1 | C2A1 | ¡ | | INVERTED EXCLAMATION MARK | | |
| 47 | C7 | 00DF | C39F | ß | | ESZETT SYMBOL | | |
| 48 | C8 | 20AC | E282AC | | | EURO SIGN | | |
| 60 | E0 | 0309 | CC89 |  ̉ | C | PSEUDO QUESTION MARK / COMBINING HOOK ABOVE | | |
| 61 | E1 | 0300 | CC80 |  ̀ | C | GRAVE / COMBINING GRAVE ACCENT (Varia) | | |
| 62 | E2 | 0301 | CC81 |  ́ | C | ACUTE / COMBINING ACUTE ACCENT (Oxia) | | |
| 63 | E3 | 0302 | CC82 |  ̂ | C | CIRCUMFLEX / COMBINING CIRCUMFLEX ACCENT | | |
| 64 | E4 | 0303 | CC83 |  ̃ | C | TILDE / COMBINING TILDE | | |
| 65 | E5 | 0304 | CC84 |  ̄ | C | MACRON / COMBINING MACRON | | |
| 66 | E6 | 0306 | CC86 |  ̆ | C | BREVE / COMBINING BREVE (Vrachy) | | |
| 67 | E7 | 0307 | CC87 |  ̇ | C | SUPERIOR DOT / COMBINING DOT ABOVE | | |
| 68 | E8 | 0308 | CC88 |  ̈ | C | UMLAUT, DIAERESIS / COMBINING DIAERESIS (Dialytika) | | |
| 69 | E9 | 030C | CC8C |  ̌ | C | HACEK / COMBINING CARON | | |
| 6A | EA | 030A | CC8A |  ̊ | C | CIRCLE ABOVE, ANGSTROM / COMBINING RING ABOVE | | |
| 6B | EB | 0361 | CDA1 |  ͡ | C | LIGATURE, FIRST HALF / COMBINING DOUBLE INVERTED BREVE | FE20 | EFB8A0 |
| 6C | EC | Note 1 | | | C | LIGATURE, SECOND HALF / COMBINING LIGATURE RIGHT HALF | FE21 | EFB8A1 |
| 6D | ED | 0315 | CC95 |  ̕ | C | HIGH COMMA, OFF CENTER / COMBINING COMMA ABOVE RIGHT | | |
| 6E | EE | 030B | CC8B |  ̋ | C | DOUBLE ACUTE / COMBINING DOUBLE ACUTE ACCENT | | |
| 6F | EF | 0310 | CC90 |  ̐ | C | CANDRABINDU / COMBINING CANDRABINDU | | |
| 70 | F0 | 0327 | CCA7 |  ̧ | C | CEDILLA / COMBINING CEDILLA | | |
| 71 | F1 | 0328 | CCA8 |  ̨ | C | RIGHT HOOK, OGONEK / COMBINING OGONEK | | |
| 72 | F2 | 0323 | CCA3 |  ̣ | C | DOT BELOW / COMBINING DOT BELOW | | |
| 73 | F3 | 0324 | CCA4 |  ̤ | C | DOUBLE DOT BELOW / COMBINING DIAERESIS BELOW | | |
| 74 | F4 | 0325 | CCA5 |  ̥ | C | CIRCLE BELOW / COMBINING RING BELOW | | |
| 75 | F5 | 0333 | CCB3 |  ̳ | C | DOUBLE UNDERSCORE / COMBINING DOUBLE LOW LINE | | |
| 76 | F6 | 0332 | CCB2 |  ̲ | C | UNDERSCORE / COMBINING LOW LINE | | |
| 77 | F7 | 0326 | CCA6 |  ̦ | C | LEFT HOOK (COMMA BELOW) / COMBINING COMMA BELOW | | |
| 78 | F8 | 031C | CC9C |  ̜ | C | RIGHT CEDILLA / COMBINING LEFT HALF RING BELOW | | |
| 79 | F9 | 032E | CCAE |  ̮ | C | UPADHMANIYA / COMBINING BREVE BELOW | | |
| 7A | FA | 0360 | CDA0 |  ͠ | C | DOUBLE TILDE, FIRST HALF / COMBINING DOUBLE TILDE | FE22 | EFB8A2 |
| 7B | FB | Note 2 | | | C | DOUBLE TILDE, SECOND HALF / COMBINING DOUBLE TILDE RIGHT HALF | FE23 | EFB8A3 |
| 7E | FE | 0313 | CC93 |  ̓ | C | HIGH COMMA, CENTERED / COMBINING COMMA ABOVE (Psili) | | |
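In every row that lists both codes, the G1 code is the G0 code plus 0x80 (for example, 21 and A1). The following Python sketch shows a lookup that accepts either form. It assumes a small excerpt of the table held in a dictionary; the names `ANSEL_G0_TO_UCS` and `ansel_to_unicode` are hypothetical and do not belong to any library.

```python
# Excerpt of the table, keyed by the G0 code (illustrative subset only).
ANSEL_G0_TO_UCS = {
    0x21: 0x0141,  # LATIN CAPITAL LETTER L WITH STROKE
    0x2E: 0x02BC,  # ALIF / MODIFIER LETTER APOSTROPHE
    0x47: 0x00DF,  # ESZETT SYMBOL
    0x48: 0x20AC,  # EURO SIGN
    0x62: 0x0301,  # COMBINING ACUTE ACCENT
}

def ansel_to_unicode(code: int) -> str:
    """Map one MARC-8 Extended Latin code (G0 or G1 form) to a character."""
    if 0xA1 <= code <= 0xFE:
        code -= 0x80  # G1 form: same character, 0x80 higher than G0
    return chr(ANSEL_G0_TO_UCS[code])

# Both forms of the same row resolve to the same character.
assert ansel_to_unicode(0x21) == ansel_to_unicode(0xA1) == "\u0141"
```

Keying the dictionary on one form and normalizing the other avoids storing every row twice. A complete converter would also need the control function codes (88, 89, 8D, 8E), which have no G0/G1 pair.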

Note 1: The Ligature that spans two characters is constructed of two halves in MARC-8: EB (Ligature, first half) and EC (Ligature, second half). The preferred Unicode/UTF-8 mapping is to the single character Ligature that spans two characters, U+0361. The single character Ligature is encoded between the two characters to be spanned. The two half Ligatures in Unicode, to which the Ligature has been mapped since 1996, are indicated in the mapping as alternatives, but their use is not recommended. It is expected that font support for the single character Ligature mark will be more easily obtained than for the two halves.

Note 2: The Double Tilde that spans two characters is constructed of two halves in MARC-8: FA (Double Tilde, first half) and FB (Double Tilde, second half). The preferred Unicode/UTF-8 mapping is to the single character Double Tilde that spans two characters, U+0360. The single character Double Tilde is encoded between the two characters to be spanned. The two half Double Tildes in Unicode, to which the MARC-8 Double Tilde has been mapped since 1996, are indicated in the mapping as alternatives, but their use is not recommended. It is expected that font support for the single character Double Tilde mark will be more easily obtained than for the two halves.
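The conversion described in Notes 1 and 2 can be sketched in Python as follows. The sketch rests on two assumptions that are not stated in this document: each half mark precedes the letter it modifies in the MARC-8 input (the usual MARC-8 placement of diacritics), and the spanned letters are plain ASCII. The function name `convert_spanning_marks` is hypothetical.

```python
# Preferred single-character mappings for the first halves (Notes 1 and 2).
FIRST_HALF = {0xEB: "\u0361", 0xFA: "\u0360"}
# Second halves have no counterpart in the preferred mapping and are dropped.
SECOND_HALF = {0xEC, 0xFB}

def convert_spanning_marks(marc8: bytes) -> str:
    """Convert ASCII letters plus two-half spanning marks to Unicode."""
    out = []
    pending = None  # spanning mark waiting for the first spanned letter
    for byte in marc8:
        if byte in FIRST_HALF:
            pending = FIRST_HALF[byte]
        elif byte in SECOND_HALF:
            continue
        elif byte < 0x80:
            out.append(chr(byte))
            if pending is not None:
                out.append(pending)  # lands between the two spanned letters
                pending = None
        else:
            raise ValueError(f"byte {byte:#04x} is outside this sketch")
    return "".join(out)

# EB t EC s becomes t, U+0361, s.
assert convert_spanning_marks(b"\xebt\xecs") == "t\u0361s"
```

Emitting the mark immediately after the first spanned letter places it between the two characters, as the notes require. Handling of other diacritics and of non-ASCII base letters is deliberately left out.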
