Character encodings

Detailed allocation of Unicode blocks. See charsets to compare specific encodings.

Unicode planes
000 080 100 180 200 280 300 380 400 480 500 580 600 680 700 780 800 880 900 980 A00 A80 B00 B80 C00 C80 D00 D80 E00 E80 F00 F80
0 ascii latin diac grk cyr arm heb arabic rtl brahmic s-br tibet
1 mm geor jamo ethiopic aboriginal ger brahm mon can brahmic extensions greek
2 ·… symbols maths technical () draw symbols braille arr maths misc ancient ext ·+ radicals
3 japanese cjk+ compat
4 cjk ideographs A
5 cjk unified ideographs
6
7
8
9
A yi lisu vai cyr bam lat-D brahmic ext
B hangeul syllables
C
D surrogates
E private use
F corporate use cjk compat presentation width
10 linear B a num ltr linear A ltr rtl
11 brahmic
12 cuneiform proto-cuneiform indus
13 egyptian hieroglyphs
14 anatolian egyptian
15 bra mandombe american hieroglyphs
16 recent
17 tangut
18
19 khitan jurchen
1A southeast asian
1B kana nushu shuishu proto-elamite shorthands
1C micmac hieroglyphs rongorongo large scripts
1D notational systems math alphanumeric sutton signs notational
1E ltr rtl arabic math
1F game enclosed pictographic arrows pict unassigned
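
Reading the overview above as a grid, each row appears to cover 0x1000 code points (the row label being the leading hex digits of the code point) and each column 0x80 code points. The following is a minimal Python sketch of that lookup. The function name `planes_cell` is hypothetical, and the cell geometry is inferred from the column header rather than stated by the page.

```python
def planes_cell(cp: int) -> tuple[str, str]:
    """Return the (row label, column label) of a code point in the overview grid.

    Assumption: rows span 0x1000 code points and columns span 0x80,
    as suggested by the 000..F80 column header.
    """
    if not 0 <= cp <= 0x1FFFF:
        raise ValueError("the overview only covers U+0000..U+1FFFF")
    row = format(cp >> 12, "X")      # leading hex digits, e.g. "0" or "1F"
    col = format(cp & 0xF80, "03X")  # offset within the row, rounded down to 0x80
    return row, col


if __name__ == "__main__":
    for ch in ("A", "\u03b1", "\U0001F600"):
        print(f"U+{ord(ch):04X}", planes_cell(ord(ch)))
```

Under these assumptions, U+03B1 (Greek small alpha) lands in row 0 at column 380, and U+1F600 lands in row 1F at column 600.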
Unicode BMP
00 10 20 30 40 50 60 70 80 90 A0 B0 C0 D0 E0 F0
0 control comn basic latin control comn latin1
1 latin extended-A latin extended-B
2 " IPA spacing modifier
3 diacritics greek
4 cyrillic
5 cyrillic+ armenian hebrew
6 arabic
7 syriac arabic+ thaana n'ko
8 samaritan manda syr reserved arabic ext-A
9 devanāgarī bengali
A gurmukhi gujarati
B oriya tamil
C telugu kannada
D malayālam sinhala
E thai lao
F tibetan
10 myanmar georgian
11 hangeul jamo
12 ethiopic
13 eth+ cherokee
14 unified canadian aboriginal syllabics
15
16 ogham runic
17 tagalog hanun buhid tagb khmer
18 mongolian canadian+
19 limbu tai le new tai lü khmer
1A lontara tai tham diacritics+
1B balinese sundanese batak
1C lepcha ol chiki cyr georg+ sn vedic
1D phonetic phonetic+ diacritics+
1E latin extended additional
1F greek+
20 general punctuation super/subscript currency overlay
21 letterlike number arrows
22 mathematical symbols
23 miscellaneous technical
24 control OCR enclosed alphanumerics
25 box drawing blocks geometric shapes
26 miscellaneous symbols
27 dingbats maths-A arr
28 braille
29 supplemental arrows-B mathematical symbols-B
2A supplemental mathematical operators
2B miscellaneous symbols and arrows
2C glagolitic latin-C coptic
2D georgian+ tifinagh ethiopic+ cyrl-A
2E punctuation+ cjk radicals
2F kangxi radicals res idc
30 cjk misc hiragana katakana
31 bopomofo hangeul compat kbn bpmf strokes k+
32 enclosed cjk characters
33 cjk compatibility
34 cjk unified ideographs extension A
4C
4D hexagrams
4E cjk unified ideographs
9F
A0 yi
A3
A4 yi radicals lisu
A5 vai
A6 cyrillic ext-B bamum
A7 tones latin extended-D
A8 sylheti in phags-pa saurashtra deva+
A9 kayah li rejang jamo-A javanese mm-B
AA cham mm-A tai viet mtei+
AB ethiopic-A latin ext-E cherokee+ meithei
AC hangeul syllables
D6
D7 hangeul jamo-B
D8 high surrogates
DB
DC low surrogates
DF
E0 private use
F8
F9 cjk compatibility ideographs
FA
FB presentation
FC arabic presentation forms A
FD ?
FE vs ver ½ comp small arabic presentation B
FF halfwidth & fullwidth forms sp
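
The surrogate rows in the table above (D8 to DB high, DC to DF low) exist so that UTF-16 can address code points beyond the BMP. As an illustration, here is a short Python sketch of the standard UTF-16 pairing arithmetic. The function names `to_surrogates` and `from_surrogates` are hypothetical helpers, not part of any library.

```python
def to_surrogates(cp: int) -> tuple[int, int]:
    """Split a supplementary code point into a (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points above the BMP need surrogates")
    v = cp - 0x10000                 # 20 significant bits remain
    high = 0xD800 + (v >> 10)        # top 10 bits -> D800..DBFF
    low = 0xDC00 + (v & 0x3FF)       # bottom 10 bits -> DC00..DFFF
    return high, low


def from_surrogates(high: int, low: int) -> int:
    """Recombine a (high, low) surrogate pair into a code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


if __name__ == "__main__":
    hi, lo = to_surrogates(0x1F600)
    print(f"U+1F600 -> {hi:04X} {lo:04X}")
```

The result can be cross-checked against Python's own codec: `"\U0001F600".encode("utf-16-be")` yields the same two 16-bit units.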
Unicode SMP
00 10 20 30 40 50 60 70 80 90 A0 B0 C0 D0 E0 F0
100 linear B syllabary linear B ideograms
101 aegean num greek numbers ancient sym phaistos
102 iberian reserved lycian carian coptic
103 italic gothic permic ugarit old persian sh.qs
104 deseret shavian osmanya osage
105 elbasan c albanian vithkuqi todhri
106 linear A
107 cypro-minoan
108 cypriot aram palmr nabataean res numid hatr
109 phoen lydian reserved mer h meroitic cursive
10A kharoshthi s arab n arab balti manichaean
10B avestan parth pahlav psalt pahl book pahl babur
10C old turkic reserved old hungarian
10D rohingya garay byblos
10E reserved rumi reserved elym khwar
10F old sogd sogdian res uyghur
110 brahmi kaithi sora som
111 chakma mahajani sharada sinhal
112 khojki landa multani khudabadi
113 grantha tigalari
114 newar tirhuta tani
115 ranjana siddham
116 modi mong takri jenticha
117 ahom zou pyu
118 dogra sirmauri res warang citi
119 dives akuru vatteluttu nandinagari
11A zanabazar square soyombo res pau cin hau
11B devanāgarī ext-A shar+ res tolong siki khambu rai
11C bhaiksuki marchen balti B
11D masaram gondi gunjala gondi kawi
11E tocharian khotanese res makas
11F leke res chola tamil+
120 cuneiform
123
124 cuneiform numbers early dynastic cuneiform
125 " reserved proto-cuneiform
126 proto-cun numb
127 reserved
12D
12E indus
12F reserved
130 egyptian hieroglyphs
133
134 eg.c
135 egyptian hieroglyphs extended-A
143
144 anatolian hieroglyphs
145
146
147 egyptian hieroglyphs extended-B
14F
150 lampung kerinci res
151 mandombe
154
155 maya hieroglyphs
159
15A reserved
15B
15C aztec pictograms
15F
160 cirth tengwar
161 khema khe prih res moon
162 blissymbols
166
167 bagam iban
168 bamum supplement
169
16A mro mossang tangsa bassa vah
16B pahawh hmong woleai
16C kpelle afaka lk tangsa
16D tikamuli kirat rai reserved kulitan
16E mwangwego medefaidrin lontara+
16F miao lontara b-b ideo
170 tangut ideographs
187
188 tangut components
18A
Unicode SMP (continued)
00 10 20 30 40 50 60 70 80 90 A0 B0 C0 D0 E0 F0
18B khitan small
18C
18D khitan ideographs
195
196 jurchen
19A
19B jurchen rad reserved
19C reserved
19D
19E pau cin hau syllabary
1A2
1A3 eskaya
1A6
1A7 res kaidā
1A8 naxi dongba
1AC
1AD naxi geba
1AF
1B0 kana supplement
1B1 kana+A small kana+
1B2 nüshu
1B3 shuishu
1B4
1B5
1B6 proto-elamite
1BB
1BC duployan sh pitman
1BD shorthands?
1BF
1C0 micmac hieroglyphs
1CA
1CB rongorongo
1CD
1CE reserved
1CF
1D0 byzantine musical
1D1 musical symbols
1D2 anc greek music reserved lute flute res mayan
1D3 tai xuan jing rod math alphanumeric+
1D4 mathematical alphanumeric
1D7
1D8 sutton
1D9
1DA
1DB reserved
1DF
1E0 glagol+ pallava chalukya res
1E1 chervang hmong eebee hmong
1E2 western cham beria reserved wancho
1E3 loma
1E4 reserved
1E5 pungchen pungchuŋ marchung brusha
1E6 reserved
1E7
1E8 mende kikakui res
1E9 adlam
1EA reserved
1EB
1EC persian siyaq indic siyaq diwani siyaq
1ED ottoman siyaq reserved
1EE arabic mathematical alphabetic
1EF reserved
1F0 mahjong domino tiles playing cards
1F1 enclosed alphanumeric supplement
1F2 enclosed ideographic supplement
1F3 miscellaneous symbols and pictographs
1F5
1F6 emoticons ornament transport
1F7 alchemical geometric shapes ext
1F8 supplemental arrows-C
1F9 supplemental symbols and pictographs
1FA chess res flag identification res
1FB legacy computing graphics
1FC reserved
1FF
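
Going the other way, a row label and a column label from the detailed BMP and SMP tables can be turned back into a code point range. The sketch below assumes each row covers 0x100 code points and each column 0x10, which is inferred from the 00..F0 header. The name `cell_range` is hypothetical.

```python
def cell_range(row: str, col: str) -> tuple[int, int]:
    """Return the first and last code point of one detailed-table cell.

    Assumption: the row label is the code point with its last two hex
    digits dropped, and the column label is one of 00, 10, ..., F0.
    """
    base = int(row, 16) << 8         # row label -> start of its 0x100 block
    offset = int(col, 16)
    if offset % 0x10 or not 0 <= offset <= 0xF0:
        raise ValueError("column must be one of 00, 10, ..., F0")
    first = base + offset
    return first, first + 0x0F


if __name__ == "__main__":
    first, last = cell_range("1F6", "00")
    print(f"U+{first:04X}..U+{last:04X}")
```

For example, row 1F6 at column 00 would correspond to U+1F600 through U+1F60F.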

Legend
control whitespace diacritic
letter
punctuation
quote
symbol
math currency
numeric greek
latin cyrillic
aramaic
brahmic arabic
syllabic
african japanese cjk chinese
alphabetic
unicode 10.0 proposed deprecated unassigned invalid
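
Several of the classes listed above resemble Unicode general categories, so a rough classification can be sketched with Python's standard `unicodedata` module. The mapping below is an assumption made for illustration (the page does not say how it assigns classes), and `legend_class` is a hypothetical name.

```python
import unicodedata


def legend_class(ch: str) -> str:
    """Guess a legend class from the Unicode general category of one character.

    The category-to-class mapping is an illustrative assumption,
    not the page's actual rule.
    """
    cat = unicodedata.category(ch)
    if cat == "Cc":
        return "control"
    if cat == "Cn":
        return "unassigned"
    if cat.startswith("Z"):
        return "whitespace"
    if cat.startswith("M"):
        return "diacritic"
    if cat.startswith("L"):
        return "letter"
    if cat in ("Pi", "Pf"):
        return "quote"
    if cat.startswith("P"):
        return "punctuation"
    if cat == "Sm":
        return "math"
    if cat == "Sc":
        return "currency"
    if cat.startswith("S"):
        return "symbol"
    if cat.startswith("N"):
        return "numeric"
    return "other"  # format, surrogate, and private-use characters


if __name__ == "__main__":
    for ch in ("A", " ", "\u0301", "+", "$", "7", "\u00ab"):
        print(f"U+{ord(ch):04X}", legend_class(ch))
```

Script-based classes in the legend (greek, cyrillic, brahmic, and so on) cannot be derived from the general category alone and are not covered by this sketch.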