
HTML Character Sets

This article discusses HTML character sets (charsets). A web browser must know which character set an HTML page uses in order to display the page correctly.

ASCII to UTF-8

Character encoding on the web began with ASCII. ASCII defines 128 characters that could be used on the internet, including digits (0-9), English letters (A-Z and a-z), and some special characters such as ! $ + - ( ) @ < >.
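
As a small illustration of that 128-character limit, the following Python sketch checks whether a string fits in ASCII. The helper name `is_ascii` is an assumption made for this example and does not come from any particular library.

```python
def is_ascii(text: str) -> bool:
    """Return True if every character in text has an ASCII code (0-127)."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        # At least one character falls outside the 128 ASCII codes.
        return False
    return True


print(is_ascii("Hello! $ + - ( ) @ < >"))  # True
print(is_ascii("caf\u00e9"))               # False (contains e with acute)
```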

ISO-8859-1 was the default character set for HTML 4. It supports 256 character codes. HTML 4 also supported UTF-8. ANSI (Windows-1252) was the original character set for Windows. It is identical to ISO-8859-1 except that it assigns printable characters to most of the 32 codes from 128 to 159, which have no printable characters in ISO-8859-1.
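
The difference between these character sets shows up with the euro sign. The sketch below assumes Python's built-in `latin-1` (ISO-8859-1), `cp1252` (Windows-1252), and `utf-8` codecs and tries to encode the sign with each one. `encode_or_none` is a hypothetical helper written for this example.

```python
from typing import Optional


def encode_or_none(text: str, codec: str) -> Optional[bytes]:
    """Encode text with codec, or return None if the codec cannot represent it."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None


euro = "\u20ac"  # the euro sign
print(encode_or_none(euro, "cp1252"))   # b'\x80'
print(encode_or_none(euro, "latin-1"))  # None
print(encode_or_none(euro, "utf-8"))    # b'\xe2\x82\xac'
```

Windows-1252 stores the sign as the single byte 128, ISO-8859-1 has no code for it, and UTF-8 needs three bytes.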

UTF-8, the encoding recommended for HTML5 documents, covers almost all characters and symbols in use worldwide.
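
UTF-8 reaches that coverage by using between one and four bytes per character. A minimal Python sketch, with `utf8_length` as an illustrative helper name:

```python
def utf8_length(char: str) -> int:
    """Return how many bytes UTF-8 needs for a single character."""
    return len(char.encode("utf-8"))


# A, e with acute, the euro sign, and an emoji
samples = ["A", "\u00e9", "\u20ac", "\U0001f600"]
for ch in samples:
    print(f"U+{ord(ch):04X} -> {utf8_length(ch)} byte(s)")
```

The four samples need 1, 2, 3, and 4 bytes respectively.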



The HTML charset Attribute

To display an HTML page correctly, a web browser must know which character set the page uses.

The character set is declared in the <meta> tag:

<meta charset="UTF-8">
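
To see what a tool reading the page might do with that declaration, here is a rough Python sketch that pulls the declared charset out of HTML text with the standard library's `html.parser`. The class and function names are assumptions for illustration. Real browsers follow a more involved detection process that also considers HTTP headers and byte-order marks.

```python
from html.parser import HTMLParser
from typing import Optional


class CharsetFinder(HTMLParser):
    """Remember the value of the first <meta charset="..."> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.charset: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag == "meta" and self.charset is None:
            value = dict(attrs).get("charset")
            if value:
                self.charset = value.strip().lower()


def declared_charset(html: str) -> Optional[str]:
    """Return the charset declared in a <meta> tag, or None if there is none."""
    finder = CharsetFinder()
    finder.feed(html)
    return finder.charset


page = '<!DOCTYPE html><html><head><meta charset="UTF-8"></head><body></body></html>'
print(declared_charset(page))  # utf-8
```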

Differences Between Character Sets

The table below compares the character sets described above:

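| Number | ASCII | ANSI | 8859-1 | UTF-8 | Description |
|---|---|---|---|---|---|
| 32 |   |   |   |   | Space |
| 33 | ! | ! | ! | ! | Exclamation mark |
| 34 | " | " | " | " | Quotation mark |
| 35 | # | # | # | # | Number sign |
| 36 | $ | $ | $ | $ | Dollar sign |
| 37 | % | % | % | % | Percent sign |
| 38 | & | & | & | & | Ampersand |
| 39 | ' | ' | ' | ' | Apostrophe |
| 40 | ( | ( | ( | ( | Left parenthesis |
| 41 | ) | ) | ) | ) | Right parenthesis |
| 42 | * | * | * | * | Asterisk |
| 43 | + | + | + | + | Plus sign |
| 44 | , | , | , | , | Comma |
| 45 | - | - | - | - | Hyphen-minus |
| 46 | . | . | . | . | Full stop |
| 47 | / | / | / | / | Solidus |
| 48 | 0 | 0 | 0 | 0 | Digit zero |
| 49 | 1 | 1 | 1 | 1 | Digit one |
| 50 | 2 | 2 | 2 | 2 | Digit two |
| 51 | 3 | 3 | 3 | 3 | Digit three |
| 52 | 4 | 4 | 4 | 4 | Digit four |
| 53 | 5 | 5 | 5 | 5 | Digit five |
| 54 | 6 | 6 | 6 | 6 | Digit six |
| 55 | 7 | 7 | 7 | 7 | Digit seven |
| 56 | 8 | 8 | 8 | 8 | Digit eight |
| 57 | 9 | 9 | 9 | 9 | Digit nine |
| 58 | : | : | : | : | Colon |
| 59 | ; | ; | ; | ; | Semicolon |
| 60 | < | < | < | < | Less-than sign |
| 61 | = | = | = | = | Equals sign |
| 62 | > | > | > | > | Greater-than sign |
| 63 | ? | ? | ? | ? | Question mark |
| 64 | @ | @ | @ | @ | Commercial at |
| 65 | A | A | A | A | Latin capital letter A |
| 66 | B | B | B | B | Latin capital letter B |
| 67 | C | C | C | C | Latin capital letter C |
| 68 | D | D | D | D | Latin capital letter D |
| 69 | E | E | E | E | Latin capital letter E |
| 70 | F | F | F | F | Latin capital letter F |
| 71 | G | G | G | G | Latin capital letter G |
| 72 | H | H | H | H | Latin capital letter H |
| 73 | I | I | I | I | Latin capital letter I |
| 74 | J | J | J | J | Latin capital letter J |
| 75 | K | K | K | K | Latin capital letter K |
| 76 | L | L | L | L | Latin capital letter L |
| 77 | M | M | M | M | Latin capital letter M |
| 78 | N | N | N | N | Latin capital letter N |
| 79 | O | O | O | O | Latin capital letter O |
| 80 | P | P | P | P | Latin capital letter P |
| 81 | Q | Q | Q | Q | Latin capital letter Q |
| 82 | R | R | R | R | Latin capital letter R |
| 83 | S | S | S | S | Latin capital letter S |
| 84 | T | T | T | T | Latin capital letter T |
| 85 | U | U | U | U | Latin capital letter U |
| 86 | V | V | V | V | Latin capital letter V |
| 87 | W | W | W | W | Latin capital letter W |
| 88 | X | X | X | X | Latin capital letter X |
| 89 | Y | Y | Y | Y | Latin capital letter Y |
| 90 | Z | Z | Z | Z | Latin capital letter Z |
| 91 | [ | [ | [ | [ | Left square bracket |
| 92 | \ | \ | \ | \ | Reverse solidus |
| 93 | ] | ] | ] | ] | Right square bracket |
| 94 | ^ | ^ | ^ | ^ | Circumflex accent |
| 95 | _ | _ | _ | _ | Low line |
| 96 | \` | \` | \` | \` | Grave accent |
| 97 | a | a | a | a | Latin small letter a |
| 98 | b | b | b | b | Latin small letter b |
| 99 | c | c | c | c | Latin small letter c |
| 100 | d | d | d | d | Latin small letter d |
| 101 | e | e | e | e | Latin small letter e |
| 102 | f | f | f | f | Latin small letter f |
| 103 | g | g | g | g | Latin small letter g |
| 104 | h | h | h | h | Latin small letter h |
| 105 | i | i | i | i | Latin small letter i |
| 106 | j | j | j | j | Latin small letter j |
| 107 | k | k | k | k | Latin small letter k |
| 108 | l | l | l | l | Latin small letter l |
| 109 | m | m | m | m | Latin small letter m |
| 110 | n | n | n | n | Latin small letter n |
| 111 | o | o | o | o | Latin small letter o |
| 112 | p | p | p | p | Latin small letter p |
| 113 | q | q | q | q | Latin small letter q |
| 114 | r | r | r | r | Latin small letter r |
| 115 | s | s | s | s | Latin small letter s |
| 116 | t | t | t | t | Latin small letter t |
| 117 | u | u | u | u | Latin small letter u |
| 118 | v | v | v | v | Latin small letter v |
| 119 | w | w | w | w | Latin small letter w |
| 120 | x | x | x | x | Latin small letter x |
| 121 | y | y | y | y | Latin small letter y |
| 122 | z | z | z | z | Latin small letter z |
| 123 | { | { | { | { | Left curly bracket |
| 124 | \| | \| | \| | \| | Vertical line |
| 125 | } | } | } | } | Right curly bracket |
| 126 | ~ | ~ | ~ | ~ | Tilde |
| 127 | DEL |  |  |  | Delete |
| 128 |  | € |  |  | Euro sign |
| 129 |  |  |  |  | Not used |
| 130 |  | ‚ |  |  | Single low-9 quotation mark |
| 131 |  | ƒ |  |  | Latin small letter f with hook |
| 132 |  | „ |  |  | Double low-9 quotation mark |
| 133 |  | … |  |  | Horizontal ellipsis |
| 134 |  | † |  |  | Dagger |
| 135 |  | ‡ |  |  | Double dagger |
| 136 |  | ˆ |  |  | Modifier letter circumflex accent |
| 137 |  | ‰ |  |  | Per mille sign |
| 138 |  | Š |  |  | Latin capital letter S with caron |
| 139 |  | ‹ |  |  | Single left-pointing angle quotation mark |
| 140 |  | Œ |  |  | Latin capital ligature OE |
| 141 |  |  |  |  | Not used |
| 142 |  | Ž |  |  | Latin capital letter Z with caron |
| 143 |  |  |  |  | Not used |
| 144 |  |  |  |  | Not used |
| 145 |  | ‘ |  |  | Left single quotation mark |
| 146 |  | ’ |  |  | Right single quotation mark |
| 147 |  | “ |  |  | Left double quotation mark |
| 148 |  | ” |  |  | Right double quotation mark |
| 149 |  | • |  |  | Bullet |
| 150 |  | – |  |  | En dash |
| 151 |  | — |  |  | Em dash |
| 152 |  | ˜ |  |  | Small tilde |
| 153 |  | ™ |  |  | Trade mark sign |
| 154 |  | š |  |  | Latin small letter s with caron |
| 155 |  | › |  |  | Single right-pointing angle quotation mark |
| 156 |  | œ |  |  | Latin small ligature oe |
| 157 |  |  |  |  | Not used |
| 158 |  | ž |  |  | Latin small letter z with caron |
| 159 |  | Ÿ |  |  | Latin capital letter Y with diaeresis |
| 160 |  |  |  |  | No-break space |
| 161 |  | ¡ | ¡ | ¡ | Inverted exclamation mark |
| 162 |  | ¢ | ¢ | ¢ | Cent sign |
| 163 |  | £ | £ | £ | Pound sign |
| 164 |  | ¤ | ¤ | ¤ | Currency sign |
| 165 |  | ¥ | ¥ | ¥ | Yen sign |
| 166 |  | ¦ | ¦ | ¦ | Broken bar |
| 167 |  | § | § | § | Section sign |
| 168 |  | ¨ | ¨ | ¨ | Diaeresis |
| 169 |  | © | © | © | Copyright sign |
| 170 |  | ª | ª | ª | Feminine ordinal indicator |
| 171 |  | « | « | « | Left-pointing double angle quotation mark |
| 172 |  | ¬ | ¬ | ¬ | Not sign |
| 173 |  |  |  |  | Soft hyphen |
| 174 |  | ® | ® | ® | Registered sign |
| 175 |  | ¯ | ¯ | ¯ | Macron |
| 176 |  | ° | ° | ° | Degree sign |
| 177 |  | ± | ± | ± | Plus-minus sign |
| 178 |  | ² | ² | ² | Superscript two |
| 179 |  | ³ | ³ | ³ | Superscript three |
| 180 |  | ´ | ´ | ´ | Acute accent |
| 181 |  | µ | µ | µ | Micro sign |
| 182 |  | ¶ | ¶ | ¶ | Pilcrow sign |
| 183 |  | · | · | · | Middle dot |
| 184 |  | ¸ | ¸ | ¸ | Cedilla |
| 185 |  | ¹ | ¹ | ¹ | Superscript one |
| 186 |  | º | º | º | Masculine ordinal indicator |
| 187 |  | » | » | » | Right-pointing double angle quotation mark |
| 188 |  | ¼ | ¼ | ¼ | Vulgar fraction one quarter |
| 189 |  | ½ | ½ | ½ | Vulgar fraction one half |
| 190 |  | ¾ | ¾ | ¾ | Vulgar fraction three quarters |
| 191 |  | ¿ | ¿ | ¿ | Inverted question mark |
| 192 |  | À | À | À | Latin capital letter A with grave |
| 193 |  | Á | Á | Á | Latin capital letter A with acute |
| 194 |  | Â | Â | Â | Latin capital letter A with circumflex |
| 195 |  | Ã | Ã | Ã | Latin capital letter A with tilde |
| 196 |  | Ä | Ä | Ä | Latin capital letter A with diaeresis |
| 197 |  | Å | Å | Å | Latin capital letter A with ring above |
| 198 |  | Æ | Æ | Æ | Latin capital letter AE |
| 199 |  | Ç | Ç | Ç | Latin capital letter C with cedilla |
| 200 |  | È | È | È | Latin capital letter E with grave |
| 201 |  | É | É | É | Latin capital letter E with acute |
| 202 |  | Ê | Ê | Ê | Latin capital letter E with circumflex |
| 203 |  | Ë | Ë | Ë | Latin capital letter E with diaeresis |
| 204 |  | Ì | Ì | Ì | Latin capital letter I with grave |
| 205 |  | Í | Í | Í | Latin capital letter I with acute |
| 206 |  | Î | Î | Î | Latin capital letter I with circumflex |
| 207 |  | Ï | Ï | Ï | Latin capital letter I with diaeresis |
| 208 |  | Ð | Ð | Ð | Latin capital letter Eth |
| 209 |  | Ñ | Ñ | Ñ | Latin capital letter N with tilde |
| 210 |  | Ò | Ò | Ò | Latin capital letter O with grave |
| 211 |  | Ó | Ó | Ó | Latin capital letter O with acute |
| 212 |  | Ô | Ô | Ô | Latin capital letter O with circumflex |
| 213 |  | Õ | Õ | Õ | Latin capital letter O with tilde |
| 214 |  | Ö | Ö | Ö | Latin capital letter O with diaeresis |
| 215 |  | × | × | × | Multiplication sign |
| 216 |  | Ø | Ø | Ø | Latin capital letter O with stroke |
| 217 |  | Ù | Ù | Ù | Latin capital letter U with grave |
| 218 |  | Ú | Ú | Ú | Latin capital letter U with acute |
| 219 |  | Û | Û | Û | Latin capital letter U with circumflex |
| 220 |  | Ü | Ü | Ü | Latin capital letter U with diaeresis |
| 221 |  | Ý | Ý | Ý | Latin capital letter Y with acute |
| 222 |  | Þ | Þ | Þ | Latin capital letter Thorn |
| 223 |  | ß | ß | ß | Latin small letter sharp s |
| 224 |  | à | à | à | Latin small letter a with grave |
| 225 |  | á | á | á | Latin small letter a with acute |
| 226 |  | â | â | â | Latin small letter a with circumflex |
| 227 |  | ã | ã | ã | Latin small letter a with tilde |
| 228 |  | ä | ä | ä | Latin small letter a with diaeresis |
| 229 |  | å | å | å | Latin small letter a with ring above |
| 230 |  | æ | æ | æ | Latin small letter ae |
| 231 |  | ç | ç | ç | Latin small letter c with cedilla |
| 232 |  | è | è | è | Latin small letter e with grave |
| 233 |  | é | é | é | Latin small letter e with acute |
| 234 |  | ê | ê | ê | Latin small letter e with circumflex |
| 235 |  | ë | ë | ë | Latin small letter e with diaeresis |
| 236 |  | ì | ì | ì | Latin small letter i with grave |
| 237 |  | í | í | í | Latin small letter i with acute |
| 238 |  | î | î | î | Latin small letter i with circumflex |
| 239 |  | ï | ï | ï | Latin small letter i with diaeresis |
| 240 |  | ð | ð | ð | Latin small letter eth |
| 241 |  | ñ | ñ | ñ | Latin small letter n with tilde |
| 242 |  | ò | ò | ò | Latin small letter o with grave |
| 243 |  | ó | ó | ó | Latin small letter o with acute |
| 244 |  | ô | ô | ô | Latin small letter o with circumflex |
| 245 |  | õ | õ | õ | Latin small letter o with tilde |
| 246 |  | ö | ö | ö | Latin small letter o with diaeresis |
| 247 |  | ÷ | ÷ | ÷ | Division sign |
| 248 |  | ø | ø | ø | Latin small letter o with stroke |
| 249 |  | ù | ù | ù | Latin small letter u with grave |
| 250 |  | ú | ú | ú | Latin small letter u with acute |
| 251 |  | û | û | û | Latin small letter u with circumflex |
| 252 |  | ü | ü | ü | Latin small letter u with diaeresis |
| 253 |  | ý | ý | ý | Latin small letter y with acute |
| 254 |  | þ | þ | þ | Latin small letter thorn |
| 255 |  | ÿ | ÿ | ÿ | Latin small letter y with diaeresis |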

ASCII characters:

ASCII uses the values 0 to 31 (and 127) for control characters.

ASCII uses the values 32 to 126 for letters, digits, and symbols. It does not use the values 128 to 255.
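
These ranges translate directly into a small classifier. The function name and the returned labels in this Python sketch are chosen for illustration only.

```python
def ascii_class(code: int) -> str:
    """Classify a numeric value according to the ASCII ranges."""
    if 0 <= code <= 31 or code == 127:
        return "control"
    if 32 <= code <= 126:
        return "printable"
    return "not ASCII"


for code in (9, 65, 127, 200):
    print(code, ascii_class(code))
```

The loop reports a tab (9) and DEL (127) as control characters, the letter A (65) as printable, and 200 as outside ASCII.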


Character set ISO-8859-1:

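ISO-8859-1 is identical to ASCII for the values 0 to 127. It defines no printable characters for the values 128 to 159.

For the values 160 to 255, the characters of ISO-8859-1 match the Unicode code points with the same numbers. UTF-8 can represent all of these characters, although it stores each of them as two bytes rather than one.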
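
The relationship between ISO-8859-1 and Unicode can be checked in Python, assuming its built-in `latin-1` codec. Every byte value decodes to the character with the same code point number, while UTF-8 stores the characters above 127 as two bytes.

```python
# Every ISO-8859-1 byte value equals the Unicode code point it decodes to.
assert all(ord(bytes([n]).decode("latin-1")) == n for n in range(256))

e_acute = bytes([233]).decode("latin-1")  # value 233 in the table
print(ord(e_acute))                       # 233
print(e_acute.encode("latin-1"))          # b'\xe9'     (one byte, value 233)
print(e_acute.encode("utf-8"))            # b'\xc3\xa9' (two bytes)
```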


ANSI characters – Windows 1252:

ANSI (Windows-1252) is identical to ASCII for the values 0 to 127. The characters it assigns to the values 128 to 159 are specific to ANSI.

For the values 160 to 255, ANSI is identical to ISO-8859-1, so its characters also match the Unicode code points with the same numbers.
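
To list what Windows-1252 places in the 128 to 159 range, the sketch below decodes each byte with Python's `cp1252` codec, which is assumed here to mirror the ANSI column of the table. Bytes that the codec leaves undefined raise an error and are collected separately.

```python
defined = {}    # byte value -> character
undefined = []  # byte values with no character assigned

for value in range(128, 160):
    try:
        defined[value] = bytes([value]).decode("cp1252")
    except UnicodeDecodeError:
        undefined.append(value)

print(len(defined))  # 27
print(undefined)     # [129, 141, 143, 144, 157]
```

The five undefined values are the rows marked as not used in the table.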


UTF-8 Character Set:

UTF-8 is identical to ASCII for the values 0 to 127.

Unicode, the character set that UTF-8 encodes, has no printable characters at the values 128 to 159. For the values 160 to 255, its characters are the same as those of ANSI and ISO-8859-1.

From the value 256 onward, UTF-8 continues with well over 100,000 additional characters.
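
One caveat when reading "identical" here: it refers to the character numbers, not to the bytes stored in a file. A short sketch using Python's built-in codecs shows that plain ASCII bytes decode the same way under all four character sets, while a lone ISO-8859-1 byte above 127 is not valid UTF-8.

```python
ascii_bytes = b"Hello"
decoded = {codec: ascii_bytes.decode(codec)
           for codec in ("ascii", "latin-1", "cp1252", "utf-8")}
print(set(decoded.values()))  # {'Hello'}

latin1_bytes = "\u00e9".encode("latin-1")  # b'\xe9', e with acute as one byte
try:
    latin1_bytes.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    # 0xE9 announces a three-byte sequence, but no continuation bytes follow.
    valid_utf8 = False
print(valid_utf8)  # False
```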

Benefits of HTML Character Sets

Declaring and using an appropriate character set improves the appearance, accessibility, and clarity of content in an HTML document. Some of the benefits are:

  1. Language Support: Different languages use different characters. A character set that includes them ensures that content in each language is displayed correctly.
  2. Symbol Support: Character sets include symbols and special characters, such as currency signs and mathematical operators, that help convey meaning and make content more readable.
  3. Compatibility: A standardized character set such as Unicode helps content display consistently across devices and operating systems, because modern operating systems and web browsers support Unicode.
  4. Accessibility: Correctly encoded characters and symbols can provide additional cues and context, which helps people with visual impairments or other disabilities read and understand the content.
  5. Clarity: Appropriate characters and symbols make content easier to scan and understand, which leads to a better user experience.
  6. Consistency: Using one character set throughout an HTML document ensures that all characters are displayed properly and helps keep the visual style of the content consistent.
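
The language-support and consistency points can be made concrete. When text saved as UTF-8 is read as Windows-1252, each accented letter turns into two unrelated characters. The Python sketch below reproduces that garbling and reverses it. The function names are illustrative, and the repair only works when no bytes were lost along the way.

```python
def garble(text: str) -> str:
    """Encode as UTF-8, then wrongly decode the bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair(garbled: str) -> str:
    """Undo garble() by reversing the two steps."""
    return garbled.encode("cp1252").decode("utf-8")


original = "caf\u00e9"       # "cafe" with an acute accent on the e
broken = garble(original)
print(len(original), len(broken))  # 4 5
print(repair(broken) == original)  # True
```

Declaring the same character set that the file was actually saved in avoids this mismatch in the first place.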