Character encodings

Overview of Unicode allocation and common Latin code pages. Compare alternate charsets: ISO · Windows · DOS · Apple · EBCDIC · legacy · symbols; West · Central · North European · Turkish · Greek · Cyrillic · Hebrew.

Unicode BMP
00 10 20 30 40 50 60 70 80 90 A0 B0 C0 D0 E0 F0
0 control comn basic latin control comn latin1
1 latin extended-A latin extended-B
2 " IPA spacing modifier
3 diacritics greek
4 cyrillic
5 cyrillic+ armenian hebrew
6 arabic
7 syriac arabic+ thaana n'ko
8 samaritan manda syr reserved arabic ext-A
9 devanāgarī bengali
A gurmukhi gujarati
B oriya tamil
C telugu kannada
D malayālam sinhala
E thai lao
F tibetan
10 myanmar georgian
11 hangeul jamo
12 ethiopic
13 eth+ cherokee
14 unified canadian aboriginal syllabics
15
16 ogham runic
17 tagalog hanun buhid tagb khmer
18 mongolian canadian+
19 limbu tai le new tai lü khmer
1A lontara tai tham diacritics+
1B balinese sundanese batak
1C lepcha ol chiki cyr georg+ sn vedic
1D phonetic phonetic+ diacritics+
1E latin extended additional
1F greek+
20 general punctuation super/subscript currency overlay
21 letterlike number arrows
22 mathematical symbols
23 miscellaneous technical
24 control OCR enclosed alphanumerics
25 box drawing blocks geometric shapes
26 miscellaneous symbols
27 dingbats maths-A arr
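To find a character in the BMP table above, the row label is the code point divided by 0x100 and the column header is its low byte rounded down to a multiple of 0x10. A minimal sketch, assuming that layout; `bmp_cell` is a hypothetical helper name, not part of any library.

```python
def bmp_cell(char: str) -> tuple[str, str]:
    """Return (row label, column header) of a BMP character in the table layout."""
    code_point = ord(char)
    if code_point > 0xFFFF:
        raise ValueError("character is outside the Basic Multilingual Plane")
    row = f"{code_point >> 8:X}"        # one row per 256 code points
    column = f"{code_point & 0xF0:02X}"  # one column per 16 code points
    return row, column


print(bmp_cell("Ω"))  # U+03A9 → ('3', 'A0'), in the greek part of row 3
print(bmp_cell("א"))  # U+05D0 → ('5', 'D0'), in the hebrew part of row 5
```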
UTF-8
0 1 2 3 4 5 6 7 8 9 A B C D E F
0 single byte ASCII
1
2
3
4
5
6
7
8 multi-byte continuation
9
A
B
C (overl.) 2-byte sequence start
D
E 3-byte sequence start
F 4-byte sequence (overflow) 5-byte 6-byte invalid
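The byte roles in the UTF-8 table can be expressed as a chain of range checks. This is a sketch only: `classify_utf8_byte` is a hypothetical name, and the boundaries assume the current definition of UTF-8 (RFC 3629), where C0, C1 and F5 to FF never occur in valid text.

```python
def classify_utf8_byte(byte: int) -> str:
    """Classify one byte value (0..255) by its role in UTF-8."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a byte value in range 0..255")
    if byte <= 0x7F:
        return "single byte ASCII"
    if byte <= 0xBF:
        return "multi-byte continuation"
    if byte <= 0xC1:
        return "overlong 2-byte start (invalid)"
    if byte <= 0xDF:
        return "2-byte sequence start"
    if byte <= 0xEF:
        return "3-byte sequence start"
    if byte <= 0xF4:
        return "4-byte sequence start"
    if byte <= 0xF7:
        return "4-byte start beyond U+10FFFF (invalid)"
    if byte <= 0xFB:
        return "5-byte start (invalid)"
    if byte <= 0xFD:
        return "6-byte start (invalid)"
    return "invalid"


# "é" encodes as C3 A9 and "€" as E2 82 AC.
for value in "é€".encode("utf-8"):
    print(f"{value:02X} {classify_utf8_byte(value)}")
```

The loop labels C3 as a 2-byte start, E2 as a 3-byte start, and A9, 82, AC as continuation bytes.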
iso-8859-1
0 1 2 3 4 5 6 7 8 9 A B C D E F
0
1
2 ! " # $ % & ' ( ) * + , - . /
3 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4 @ A B C D E F G H I J K L M N O
5 P Q R S T U V W X Y Z [ \ ] ^ _
6 ` a b c d e f g h i j k l m n o
7 p q r s t u v w x y z { | } ~
8
9
A   ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ - ® ¯
B ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿
C À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï
D Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß
E à á â ã ä å æ ç è é ê ë ì í î ï
F ð ñ ò ó ô õ ö ÷ ø ù ú û ü ý þ ÿ
iso-8859-15
0 1 2 3 4 5 6 7 8 9 A B C D E F
A   ¡ ¢ £ € ¥ Š § š © ª « ¬ - ® ¯
B ° ± ² ³ Ž µ ¶ · ž ¹ º » Œ œ Ÿ ¿
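Only rows A and B are listed for iso-8859-15 because the other rows match iso-8859-1. The differing cells can be recomputed from Python's bundled codecs; a minimal sketch, where `charset_diff` is a hypothetical helper and undefined bytes are shown as U+FFFD.

```python
def charset_diff(first: str, second: str, start: int = 0x80, stop: int = 0x100) -> dict:
    """Return {byte: (char in first, char in second)} for bytes that decode differently."""
    diff = {}
    for value in range(start, stop):
        raw = bytes([value])
        # errors="replace" turns undefined bytes into U+FFFD instead of raising
        a = raw.decode(first, errors="replace")
        b = raw.decode(second, errors="replace")
        if a != b:
            diff[value] = (a, b)
    return diff


for value, (old, new) in charset_diff("iso-8859-1", "iso-8859-15").items():
    print(f"{value:02X} {old} -> {new}")
```

With these two charsets the loop reports eight positions: A4, A6, A8, B4, B8, BC, BD and BE.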
cp1252
0 1 2 3 4 5 6 7 8 9 A B C D E F
8 ƒ ˆ Š Œ Ž
9 ˜ š œ ž Ÿ
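cp1252 differs from iso-8859-1 only in rows 8 and 9, where iso-8859-1 has C1 control codes. The sketch below lists what each byte in that range decodes to, assuming Python's `cp1252` codec, which leaves five positions undefined; `decode_c1_range` is a hypothetical helper name.

```python
import unicodedata


def decode_c1_range(encoding: str) -> dict:
    """Map each byte 0x80..0x9F to its decoded character, or None if undefined."""
    result = {}
    for value in range(0x80, 0xA0):
        try:
            result[value] = bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            result[value] = None
    return result


latin1 = decode_c1_range("iso-8859-1")
cp1252 = decode_c1_range("cp1252")

# In iso-8859-1 every byte of rows 8 and 9 is a control character (category Cc).
print(all(unicodedata.category(ch) == "Cc" for ch in latin1.values()))  # True

# Positions that Python's cp1252 codec leaves undefined.
print([f"{value:02X}" for value, ch in cp1252.items() if ch is None])
# ['81', '8D', '8F', '90', '9D']
```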
cp437
0 1 2 3 4 5 6 7 8 9 A B C D E F
0
1 §
8 Ç ü é â ä à å ç ê ë è ï î ì Ä Å
9 É æ Æ ô ö ò û ù ÿ Ö Ü ¢ £ ¥ ₧ ƒ
A á í ó ú ñ Ñ ª º ¿ ⌐ ¬ ½ ¼ ¡ « »
B
C
D
E α ß Γ π Σ σ µ τ Φ Θ Ω δ ϕ ε
F ± ÷ ° · ²  
cp850
0 1 2 3 4 5 6 7 8 9 A B C D E F
9 É æ Æ ô ö ò û ù ÿ Ö Ü ø £ Ø × ƒ
A á í ó ú ñ Ñ ª º ¿ ® ¬ ½ ¼ ¡ « »
B Á Â À © ¢ ¥
C ã Ã ¤
D ð Ð Ê Ë È ı Í Î Ï ¦ Ì
E Ó ß Ô Ò õ Õ µ þ Þ Ú Û Ù ý Ý ¯ ´
F - ± ¾ § ÷ ¸ ° ¨ · ¹ ³ ²  
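cp850 reuses many cp437 positions for extra Latin letters, so text written in one and read in the other is garbled only at those bytes. A small sketch using Python's built-in `cp437` and `cp850` codecs; `reinterpret` is a hypothetical helper name.

```python
def reinterpret(text: str, written_as: str, read_as: str) -> str:
    """Encode text in one charset, then decode the same bytes as another."""
    return text.encode(written_as).decode(read_as)


# 9B and 9D hold "ø" and "Ø" in cp850 but "¢" and "¥" in cp437.
print(reinterpret("ø Ø", "cp850", "cp437"))  # ¢ ¥

# Row 8 is shared, so "é" (byte 82) survives the mix-up.
print(reinterpret("é", "cp850", "cp437"))  # é
```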

control whitespace diacritic
letter
punctuation
quote
symbol
math currency
numeric greek
latin cyrillic
aramaic
brahmic arabic
syllabic
african japanese cjk chinese
alphabetic
unicode 10.0 proposed deprecated unassigned invalid