Database normalisation

From Wikipedia, the free encyclopedia
Jump to: navigation, search

Database normalisation is an approach to designing databases which was introduced by Edgar F. Codd in the 1970s. Certain databases, known as relational databases, allow data to be stored in separate groups. Each group is commonly called a table. To provide useful information, these groups are connected to each other. For example, students could be stored in one group, and classes in another group. To show that a student is enrolled in a class, a "relationship" is established from one group to the other. A student could have a relationship to many classes, each of which he or she would be enrolled in, while a class would have a relationship to many students.

A traditional alternative is the "flat file database", where all of the data is grouped together like in a spreadsheet. The problem with flat file databases is that they can have a lot of blank spaces and there is a lot of information that has to be repeated for each entry. This means that the database is bigger than it has to be, and it makes it more likely that the database will contain mistakes. Relational databases, by breaking the data into groups, reduce the chance of mistakes happening and does not take up any more space than necessary. But for it to work it needs to be well designed.

Database normalization is a method of designing good relational databases. There are several "normal forms", each of which have rules which the database should be designed to meet. Codd originally specified three sets of criteria that different databases must meet: first, second and third normal form.

If a relation (or "database table") meets a certain normal form, it is not vulnerable to certain modifications, that will affect data integrity. The drawback of meeting such a set of criteria is usually that querying certain data from the database will become more difficult.