Character (computing)

From Simple English Wikipedia, the free encyclopedia

In computing, a character is a letter, digit, punctuation mark or other symbol that can be shown on a screen or printed. There are also some unseen characters, called control characters. Examples of control characters are the carriage return and the line feed, which tell the software to start a new line. A glyph is the shape of a character as it is seen on screen or on paper.
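As a small illustration of control characters, here is a minimal Python sketch. The sample text and the variable names are made up for this example. It shows that the line feed is part of the text but has no glyph of its own.

```python
# A line feed ("\n") is a control character: it is stored in the text,
# but it has no visible glyph. It only tells software to start a new line.
text = "first line\nsecond line"

# Splitting at line breaks gives back the two visible lines.
lines = text.splitlines()

# str.isprintable() is False for control characters such as "\n" and "\r".
line_feed_is_printable = "\n".isprintable()
carriage_return_is_printable = "\r".isprintable()
letter_is_printable = "A".isprintable()

print(lines)
print(line_feed_is_printable, carriage_return_is_printable, letter_is_printable)
```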

Since computers only work with numbers, they use number codes to represent characters. For example, in ASCII, the number 65 represents the letter 'A'. When working in ASCII, the computer puts the glyph for 'A' on the screen when it sees the number 65. The glyph can change shape slightly depending on the font that is used, or on whether it is bold or in italics. But the character is still stored as 65 in the computer. That does not change.
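The link between a character and its number code can be tried out in Python with the built-in functions ord() and chr(). This is only a sketch. The helper name to_codes is made up for this example.

```python
# ord() gives the number code of a character, chr() goes the other way.
code_for_a = ord("A")      # 65
char_for_65 = chr(65)      # "A"

def to_codes(text):
    """Return the number code of every character in text (hypothetical helper)."""
    return [ord(ch) for ch in text]

codes = to_codes("Cat")    # [67, 97, 116]

# Encoding as ASCII stores the same number, whatever font is used to show it.
stored_bytes = list("A".encode("ascii"))   # [65]

print(code_for_a, char_for_65, codes, stored_bytes)
```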

ASCII and Unicode

One of the early standards for storing characters was ASCII. It uses 7 bits to store each character. This allows 128 characters: enough for the digits 0-9, the upper case letters A-Z, the lower case letters a-z, the space, punctuation marks such as parentheses and braces, and some control characters. But it is not enough for letters used in other languages, such as the letters with umlauts (ä, ö, ü) used in German and some Scandinavian languages. Some systems added an 8th bit to ASCII, for another 128 characters, but the extra characters vary from one system to another. And this still does not cover all languages. There are thousands of Chinese characters alone. To store all of these characters, Unicode was developed. The first version of Unicode used 16 bits, which allows 65,536 characters. Unicode was later extended and now has room for more than one million characters (1,114,112 code points). These are stored using encodings such as UTF-8 and UTF-16.
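The 7-bit limit of ASCII can be checked with a short Python sketch. The helper name fits_in_ascii is made up for this example, and the byte counts assume UTF-8, which is one common way to store Unicode characters (other encodings give different sizes).

```python
def fits_in_ascii(ch):
    """True if the character's code fits in 7 bits (0 to 127). Hypothetical helper."""
    return ord(ch) < 128

samples = ["A", "ü", "中"]

# Only "A" has a code below 128, so only "A" is an ASCII character.
ascii_check = {ch: fits_in_ascii(ch) for ch in samples}

# Assumption: UTF-8 is used to store the text.
# It uses 1 byte for ASCII characters and more bytes for the others.
utf8_sizes = {ch: len(ch.encode("utf-8")) for ch in samples}

# Trying to store a non-ASCII character as ASCII fails.
try:
    "ü".encode("ascii")
    ascii_encoding_failed = False
except UnicodeEncodeError:
    ascii_encoding_failed = True

print(ascii_check)
print(utf8_sizes)
print(ascii_encoding_failed)
```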

ASCII is still widely used, especially in English-speaking areas. The Internet was started in English, and for a long time domain names could only contain certain ASCII characters. In a URL, characters that are not allowed are replaced by a '%' followed by the character's number code written as two hexadecimal digits. The space character, for instance, has the code 32, which is 20 in hexadecimal, so it is changed to "%20" when it shows up in a URL. In 2010, the first top-level domains in non-Latin scripts were allowed, so people who spoke Russian, Greek, Chinese or other languages could see a whole domain name in their own language.
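The "%20" replacement can be reproduced with Python's standard urllib.parse module. This is a minimal sketch. The file name and the domain name "bücher.example" are made-up examples, and the last step assumes Python's built-in "idna" codec as one way to turn a non-ASCII domain name into an ASCII-only form.

```python
from urllib.parse import quote, unquote

# The space has code 32, which is written as 20 in hexadecimal.
space_code = ord(" ")
space_hex = format(space_code, "02X")

# quote() replaces characters that are not allowed in a URL by "%" plus hex digits.
encoded = quote("my file.txt")
decoded = unquote(encoded)

# Assumption: the built-in "idna" codec is used to get an ASCII-only form
# of a domain name that contains a non-ASCII letter.
ascii_domain = "bücher.example".encode("idna")

print(space_code, space_hex)
print(encoded, decoded)
print(ascii_domain)
```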

Related pages