From Wikipedia, the free encyclopedia

A compiler is a computer program that translates source code written in a high-level programming language into a lower-level language. The lower-level language is usually assembly language or machine code. The act of translating the code is called compiling. The translated output is called compiled code.

For programmers, using a compiler is easier and faster than writing machine code by hand. Each kind of processor has its own machine code, and machine code is difficult for people to read because it is made of numbers and short abbreviations. For example, GCC (the GNU Compiler Collection, originally called the GNU C Compiler) compiles C into machine code.

Compiling the language

As shown in the diagram, the source code of a computer program is first read by a lexical analyzer, which splits the text into words and symbols known as "tokens". A parser then analyzes the tokens, looking for grammatical patterns in how they are used. From the structure the parser finds, an intermediate-code generator produces intermediate code. An optimizer reads the intermediate code and simplifies it or leaves out unneeded code, which gives a more efficient version. Finally, the optimized code is changed into the target computer's machine code.
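The first stage, lexical analysis, can be sketched in a few lines of Python. This is only an illustration for a very small made-up language of numbers, names and a few symbols. The token categories (`NUMBER`, `NAME`, `SYMBOL`) and the function name `tokenize` are assumptions chosen for this example, not part of any real compiler.

```python
import re

# Hypothetical token categories for a tiny language (an assumption of this sketch).
TOKEN_PATTERN = re.compile(
    r"\s*(?:(?P<NUMBER>\d+)|(?P<NAME>[A-Za-z_]\w*)|(?P<SYMBOL>[=+\-*/()]))"
)

def tokenize(source):
    """Split source text into a list of (kind, text) tokens."""
    source = source.rstrip()  # ignore trailing whitespace
    tokens = []
    position = 0
    while position < len(source):
        match = TOKEN_PATTERN.match(source, position)
        if match is None:
            raise SyntaxError(f"unexpected character at position {position}")
        kind = match.lastgroup  # name of the group that matched
        tokens.append((kind, match.group(kind)))
        position = match.end()
    return tokens

for token in tokenize("x = 5*10"):
    print(token)
```

A real lexical analyzer would know many more kinds of tokens (strings, keywords, comments), but the idea is the same: the parser receives a list of labelled pieces instead of raw text.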

Variants

At the end of each compilation step, the partly finished product can be stored and processed later. A language like Java uses this successfully: its compiler leaves out the final step of translating to instructions the processor understands, and stores intermediate code (called bytecode) instead. The final translation step is only done once the Java program is running on a computer. Depending on the technique used, this is called "interpreting" or "just-in-time (JIT) compiling".
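The same idea can be tried out in Python, which also compiles source code to bytecode and runs it later. The sketch below is an analogy and not how Java itself works. It uses Python's built-in `compile` and `exec` functions and the standard `marshal` module. The function names `compile_to_bytecode` and `run_bytecode` are made up for this example.

```python
import marshal

def compile_to_bytecode(source):
    """Compile now: turn source text into bytes that can be stored."""
    code_object = compile(source, "<example>", "exec")
    return marshal.dumps(code_object)

def run_bytecode(stored):
    """Run later: load the stored bytecode and let the interpreter execute it."""
    code_object = marshal.loads(stored)
    namespace = {}
    exec(code_object, namespace)
    return namespace

stored = compile_to_bytecode("x = 5*10 + 6 + 1")
print(type(stored))               # the stored product is plain bytes
print(run_bytecode(stored)["x"])  # prints: 57
```

The bytes in `stored` could be written to a file and run at another time. They are not machine code, so they only work with a program that knows how to carry out the final step.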

Example

For example, the source code might contain a statement such as "x = 5*10 + 6 + 1". The lexical analyzer would split it into separate tokens: the name "x", each number, and each symbol (such as "=", "*" or "+"). The parser would recognize the pattern of tokens as an assignment of a calculated value to a variable. The intermediate-code generator would write code that defines a storage variable named "x" and assigns to it the product of 5 and 10, plus 6, plus 1. The optimizer would work out the calculation 5*10+6+1 ahead of time and replace it with just 57. The target machine-code generator would then produce instructions that put the value 57 into the place in the computer's memory set aside for "x", using the instructions of whichever processor is being used.
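The optimizer step in this example is often called constant folding. A minimal sketch of it is given here, assuming Python 3.9 or later and using Python's own `ast` module as a stand-in for a compiler's parser. The class name `ConstantFolder` and the function name `optimize` are made up for this example, and only whole-number addition, subtraction and multiplication are handled.

```python
import ast
import operator

# Arithmetic operators this sketch knows how to work out ahead of time.
FOLDABLE = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

class ConstantFolder(ast.NodeTransformer):
    """Replace arithmetic on two whole-number constants with its result."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the inner calculations first
        fold = FOLDABLE.get(type(node.op))
        both_constant = (
            isinstance(node.left, ast.Constant)
            and isinstance(node.right, ast.Constant)
            and isinstance(node.left.value, int)
            and isinstance(node.right.value, int)
        )
        if fold is not None and both_constant:
            result = fold(node.left.value, node.right.value)
            return ast.copy_location(ast.Constant(value=result), node)
        return node  # leave anything else unchanged

def optimize(source):
    """Parse the source, fold constants, and return the simplified source text."""
    tree = ConstantFolder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)

print(optimize("x = 5*10 + 6 + 1"))  # prints: x = 57
```

Calculations that contain a variable cannot be fully worked out ahead of time, so `optimize("y = a + 2*3")` only simplifies the constant part and gives `y = a + 6`.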