A database is a system for organizing data. It is a collection of raw data that can be changed, sorted, and searched with questions (called queries) to produce information. The data can be stored in many ways. Before computers, people used card files, printed books and other methods. Now most data is kept in computer files.
The data in a database is organized in some way. Before there were computers, employee data was often kept in file cabinets. There was usually one card for each employee. The card held information such as the name of the employee and their date of birth. A database also has such "cards". To the user, the card looks the same as it did in the past, only now it is on the screen. To the computer, the information on the card can be stored in different ways. Each of these ways is known as a database model. The most commonly used database model is the relational model; it uses relations and sets to store the data. Most users do not talk about relations; they talk about database tables.
Uses for Database Systems
Database systems do several things:
- They store data
- They store special information used to manage the data. This information is called metadata and it is not shown to all the people looking at the data.
- They handle cases where many users want to read (and possibly change) the same entries of data at the same time.
- They manage access rights (who is allowed to see the data, who can change it).
- When many users ask the database questions at the same time, every question must still be answered quickly, so that even the last person to ask gets an answer in a reasonable time.
- Certain attributes are more important than others because they are used to find other data. The database keeps these attributes in an index. An index holds the values of the important attributes and points to where the rest of the data is stored, so the data can be found without reading everything. This is called indexing.
- They ensure that the data always makes sense. There are certain rules that can be added to tell the database system if the data makes sense. One of the rules might say November has 30 days. This means if someone wants to enter November 31 as a date, this change will be rejected.
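The November rule from the last point can be tried out with a small sketch. It uses Python's built-in `sqlite3` module and a `CHECK` rule; the table `appointments`, its columns, and the helper `try_insert` are made up for this example and are only one possible way to write such a rule.

```python
import sqlite3

# In-memory database; table and column names are invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE appointments (
        month INTEGER NOT NULL CHECK (month BETWEEN 1 AND 12),
        day   INTEGER NOT NULL CHECK (day BETWEEN 1 AND 31),
        -- Rule: April, June, September and November have only 30 days.
        CHECK (NOT (month IN (4, 6, 9, 11) AND day > 30))
    )
    """
)


def try_insert(month, day):
    """Return True if the database accepts the date, False if it rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO appointments (month, day) VALUES (?, ?)",
                (month, day),
            )
        return True
    except sqlite3.IntegrityError:
        # The database refused the row because it breaks a CHECK rule.
        return False


accepted = try_insert(11, 30)  # November 30 follows the rule
rejected = try_insert(11, 31)  # November 31 breaks the rule
row_count = conn.execute("SELECT COUNT(*) FROM appointments").fetchone()[0]
```

Here the rule lives in the database itself, so every program that writes to the table is checked in the same way.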
Changing data
Data in a database changes from time to time. Something can go wrong while data is being changed, and the error might make the data useless. To guard against this, the database system checks that the data fulfills certain requirements, and it groups the changes into a transaction. A transaction has two points in time: the time before the data was changed, and the time after the data was changed. If something goes wrong while the data is being changed, the database system simply puts the database back into the state it was in before the change started. This is called a rollback. When all the changes have been made successfully, they are committed. This means that the data makes sense again; committed changes can no longer be rolled back.
In order to be able to do this, databases follow the ACID principle:
- All. Either all tasks of a given set (called a transaction) are done, or none of them are. This is known as Atomicity.
- Complete. The data in the database always makes sense. There is no half-done (invalid) data. This is known as Consistency.
- Independent. If many people work on the same data at the same time, they do not see (or affect) each other's unfinished changes. Each of them has their own view of the database, which is independent of the others. This is known as Isolation.
- Done. Transactions must be committed when they are done. Once they are committed, they cannot be undone, and the changes are kept even if the computer fails afterwards. This is known as Durability.
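Commit and rollback can be seen in a short sketch using Python's built-in `sqlite3` module. The `accounts` table, the names `alice` and `bob`, and the helper functions are assumptions made up for this example; a real system would be more careful about errors.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # no negative balances
)
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )


def transfer(amount):
    """Move money from alice to bob as one transaction (all or nothing)."""
    try:
        with conn:  # commits if both updates work, rolls back if one fails
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = 'bob'",
                (amount,),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = 'alice'",
                (amount,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    """Return the current balances as a dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok_first = transfer(30)    # alice has enough money: committed
after_first = balances()
ok_second = transfer(500)  # alice would go below zero: rolled back
after_second = balances()
```

In the second transfer, the first update (adding money to bob) has already run when the second update fails. The rollback removes it again, so `after_second` is the same as `after_first`.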
Database model
There are different ways to represent the data:
- Simple files (called flat files): The data is simply written into one plain file, such as a spreadsheet or a text file with one record per line.
- Hierarchical model: The data is organized like a tree structure. The interesting data is at the leaves of the tree.
- Network model: This uses records and sets to store the data. Unlike in a tree, a record can be linked to more than one parent record.
- Relational model: This uses set theory and predicate logic. It is widely used. Data looks like it is organized in tables. These tables can be joined together, so that one query can combine data from several tables.
- Object-relational model: This extends the relational model so that the database can use the same data types as the (object-oriented) application.
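As an illustration of the relational model, the sketch below stores data in two tables and joins them in one query. It uses Python's built-in `sqlite3` module; the tables `employees` and `departments` and their contents are invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE departments (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE employees (
        id            INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        department_id INTEGER NOT NULL REFERENCES departments(id)
    );
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Research');
    INSERT INTO employees VALUES (1, 'Ann', 1), (2, 'Ben', 2), (3, 'Cleo', 2);
    """
)

# Join the two tables: each employee row is matched with its department row.
rows = conn.execute(
    """
    SELECT employees.name, departments.name
    FROM employees
    JOIN departments ON departments.id = employees.department_id
    ORDER BY employees.id
    """
).fetchall()
```

Each department name is stored only once, and the join puts it next to every employee who works there.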
Ways to organize the data
As in real life, the same data can be looked at from different perspectives, and it can be organized in different ways. There are several things to consider when organizing the data:
- Each item of data should be stored as few times as possible. Imagine that an unmarried woman is listed in the county records and in the records of the state motor vehicle department, the federal social security department and the international passport department. If she marries and decides to change her name, all of these departments have to be notified. If all the departments were linked, and her name was stored in only one place, then updating would be easy.
- If the data is stored in several different databases, it may contradict itself.
- Storing one piece of data in many places can make finding data slower. If there is a lot of data, it also takes up a lot of space. In our example there were 4 databases for one person, so one name change means 4 updates. If a second person changes their name too, that makes 8 updates.
- A method called database normalisation was developed to solve this problem. It defines a series of normal forms, of which the first five are numbered from first to fifth. Each normal form is a set of rules that removes repeated data, so the data takes less space and is easier to keep correct.
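The name-change example can be sketched with the name stored in only one place. The code uses Python's built-in `sqlite3` module; the tables `people` and `department_records`, the names in them, and the helper function are assumptions made up for this example, not a full normalised design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- The name is stored once, in the people table.
    CREATE TABLE people (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- Each department only stores a reference to the person.
    CREATE TABLE department_records (
        department TEXT NOT NULL,
        person_id  INTEGER NOT NULL REFERENCES people(id)
    );
    INSERT INTO people VALUES (1, 'Mary Smith');
    INSERT INTO department_records VALUES
        ('County Records', 1),
        ('Motor Vehicles', 1),
        ('Social Security', 1),
        ('Passports', 1);
    """
)


def names_by_department():
    """Return the name each department sees, looked up through the reference."""
    rows = conn.execute(
        """
        SELECT department_records.department, people.name
        FROM department_records
        JOIN people ON people.id = department_records.person_id
        """
    )
    return dict(rows)


# One update in one place instead of four updates in four databases.
with conn:
    changed = conn.execute(
        "UPDATE people SET name = 'Mary Jones' WHERE id = 1"
    ).rowcount

after = names_by_department()
```

Only one row is changed, but all four departments see the new name, and the records cannot contradict each other.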