Mojibake

This is what a website can look like if the wrong character encoding is used.
The Japanese Wikipedia article for Mojibake uses UTF-8 encoding. This screenshot shows what it looks like when it is decoded using the standard Windows CP1252 encoding.

Mojibake (文字化け, pronounced /modʑibake/) is the name for incorrect, unreadable characters shown when computer software fails to display text correctly. On computers, text is encoded using a character encoding. For storage or transfer, each character is replaced by its position (or number) in the encoding. To display the text again, each number is replaced by the character at that position. When the original encoding is not specified, the software may use a different encoding, and a number is then replaced by a different character than the one intended. Unicode was introduced to solve this problem by giving every character its own number. Its most common encoding, UTF-8, uses 1 to 4 bytes per character: 1 byte for the characters commonly used in English, 2 bytes for most other Latin, Greek, Cyrillic, Hebrew and Arabic letters, and 3 bytes for most Chinese, Japanese and Korean characters.
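
The following is a minimal sketch of how such a mismatch can arise, not a description of any particular program. The sample word "café" and the variable names are assumptions chosen for illustration; `str.encode` and `bytes.decode` are standard Python. The text is encoded as UTF-8 and then decoded once with UTF-8 and once with Windows CP1252.

```python
# Minimal sketch: the same bytes read with the right and the wrong encoding.
# The sample text and variable names are illustrative assumptions.
original = "café"

# Encoding replaces each character with its byte sequence in UTF-8.
# The letter "é" becomes two bytes here.
data = original.encode("utf-8")

# Decoding with the encoding that was actually used restores the text.
correct = data.decode("utf-8")

# Decoding with a different encoding (Windows CP1252) produces mojibake:
# each of the two bytes of "é" is shown as a separate character.
garbled = data.decode("cp1252")

print(data)     # the raw bytes
print(correct)  # the original text
print(garbled)  # the mojibake version

# If no bytes were lost, undoing the wrong step can repair the text.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired)
```

The repair in the last step only works when every byte has a character in the wrongly chosen encoding. CP1252 leaves a few byte values undefined, so some UTF-8 text cannot be decoded with it at all without replacing or dropping bytes.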

Before Unicode was introduced, other character encodings were used. As an example, ISO 8859 is a family of 15 different encodings. All of them use the same numbers for the characters commonly used in English. The remaining positions form several "blocks" of "special characters", which are filled differently in each encoding.
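
As a rough illustration of this, the sketch below decodes the same byte with three members of the ISO 8859 family. The choice of parts (1, 5 and 7), the byte value 0xE4 and the helper name `decode_all` are assumptions made for the example. The codec names are the ones Python ships with.

```python
# Sketch: one byte, three ISO 8859 encodings.
# PARTS and decode_all are hypothetical names chosen for this example.
PARTS = ["iso-8859-1", "iso-8859-5", "iso-8859-7"]  # Latin, Cyrillic, Greek


def decode_all(data: bytes) -> dict[str, str]:
    """Decode the same bytes with every encoding listed in PARTS."""
    return {name: data.decode(name) for name in PARTS}


# A byte from the range used for English letters decodes identically.
ascii_view = decode_all(b"A")

# A byte from the "special characters" range decodes differently each time.
high_view = decode_all(b"\xe4")

for name in PARTS:
    char = high_view[name]
    print(f"{name}: 0x41 -> {ascii_view[name]!r}, 0xE4 -> {char!r} (U+{ord(char):04X})")
```

A program that receives the byte 0xE4 without being told which part was used cannot know which of these characters was meant, which is how mojibake appeared between these encodings.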

Etymology

Mojibake is a Japanese word. The word 文字化け ([modʑibake]) is composed of two parts. 文字 (moji) means "letter" or "character". 化け (bake), from the verb 化ける (bakeru), means "to appear in disguise", "to take the form of" or "to change for the worse". Literally, the word means "character mutation".