6.27.2008

DATABASE

1)What is database?
A database is a structured collection of records or data. A computer database relies upon software to organize the storage of data. The software models the database structure in what are known as database models. The model in most common use today is the relational model. Other models such as the hierarchical model and the network model use a more explicit representation of relationships (see below for explanation of the various database models).

The definition of a database is a structured collection of records or data that is stored in a computer system. In order for a database to be truly functional, it must not only store large amounts of records well, but be accessed easily. In addition, new information and changes should also be fairly easy to input. In order to have a highly efficient database system, you need to incorporate a program that manages the queries and information stored on the system. This is usually referred to as DBMS or a Database Management System. Besides these features, all databases that are created should be built with high data integrity and the ability to recover data if hardware fails.

Database management systems (DBMS) are the software used to organize and maintain the database. These are categorized according to the database model that they support. The model tends to determine the query languages that are available to access the database. A great deal of the internal engineering of a DBMS, however, is independent of the data model, and is concerned with managing factors such as performance, concurrency, integrity, and recovery from hardware failures. In these areas there are large differences between products.

2)What are Database Models?

Various techniques are used to model data structure. Most database systems are built around one particular data model, although it is increasingly common for products to offer support for more than one model. For any one logical model various physical implementations may be possible, and most products will offer the user some level of control in tuning the physical implementation, since the choices that are made have a significant effect on performance. Here are three examples:

Hierarchical model

In a hierarchical model, data is organized into an inverted tree-like structure, implying a multiple downward link in each node to describe the nesting, and a sort field to keep the records in a particular order in each same-level list. This structure arranges the various data elements in a hierarchy and helps to establish logical relationships among data elements of multiple files. Each unit in the model is a record which is also known as a node. In such a model, each record on one level can be related to multiple records on the next lower level. A record that has subsidiary records is called a parent and the subsidiary records are called children. Data elements in this model are well suited for one-to-many relationships with other data elements in the database.

This model is advantageous when the data elements are inherently hierarchical. The disadvantage is that in order to prepare the database it becomes necessary to identify the requisite groups of files that are to be logically integrated. Hence, a hierarchical data model may not always be flexible enough to accommodate the dynamic needs of an organization.

Network model

The network model tends to store records with links to other records. Each record in the database can have multiple parents, i.e., the relationships among data elements can have a many to many relationship. Associations are tracked via "pointers". These pointers can be node numbers or disk addresses. Most network databases tend to also include some form of hierarchical model. Databases can be translated from hierarchical model to network and vice versa. The main difference between the network model and hierarchical model is that in a network model, a child can have a number of parents whereas in a hierarchical model, a child can have only one parent.

The network model provides greater advantage than the hierarchical model in that it promotes greater flexibility and data accessibility, since records at a lower level can be accessed without accessing the records above them. This model is more efficient than hierarchical model, easier to understand and can be applied to many real world problems that require routine transactions. The disadvantages are that: It is a complex process to design and develop a network database; It has to be refined frequently; It requires that the relationships among all the records be defined before development starts, and changes often demand major programming efforts; Operation and maintenance of the network model is expensive and time consuming.

Examples of database engines that have network model capabilities are RDM Embedded and RDM Server.

Relational model

The basic data structure of the relational model is a table where information about a particular entity (say, an employee) is represented in columns and rows. The columns enumerate the various attributes of an entity (e.g. employee_name, address, phone_number). Rows (also called records) represent instances of an entity (e.g. specific employees).

The "relation" in "relational database" comes from the mathematical notion of relations from the field of set theory. A relation is a set of tuples, so rows are sometimes called tuples. All tables in a relational database adhere to three basic rules.

The ordering of columns is immaterial

Identical rows are not allowed in a table

Each row has a single (separate) value for each of its columns (each tuple has an atomic value).

If the same value occurs in two different records (from the same table or different tables) it can imply a relationship between those records. Relationships between records are often categorized by their cardinality (1:1, (0), 1:M, M:M).

Tables can have a designated column or set of columns that act as a "key" to select rows from that table with the same or similar key values. A "primary key" is a key that has a unique value for each row in the table. Keys are commonly used to join or combine data from two or more tables. For example, an employee table may contain a column named address which contains a value that matches the key of a address table. Keys are also critical in the creation of indexes, which facilitate fast retrieval of data from large tables. It is not necessary to define all the keys in advance; a column can be used as a key even if it was not originally intended to be one.

Relational operations

Users (or programs) request data from a relational database by sending it a query that is written in a special language, usually a dialect of SQL. Although SQL was originally intended for end-users, it is much more common for SQL queries to be embedded into software that provides an easier user interface. Many web applications, such as Wikipedia, perform SQL queries when generating pages.

In response to a query, the database returns a result set, which is the list of rows constituting the answer. The simplest query is just to return all the rows from a table, but more often, the rows are filtered in some way to return just the answer wanted. Often, data from multiple tables are combined into one, by doing a join. There are a number of relational operations in addition to join.

Normal forms

Relations are classified based upon the types of anomalies to which they're vulnerable. A database that's in the first normal form is vulnerable to all types of anomalies, while a database that's in the domain/key normal form has no modification anomalies. Normal forms are hierarchical in nature. That is, the lowest level is the first normal form, and the database cannot meet the requirements for higher level normal forms without first having met all the requirements of the lesser normal form.

3)What is database management systems?

Relational database management systems

An RDBMS implements the features of the relational model outlined above. In this context, Date's Information Principle states:

The entire information content of the database is represented in one and only one way. Namely as explicit values in column positions (attributes) and rows in relations (tuples) Therefore, there are no explicit pointers between related tables.

Post-relational database models

Several products have been identified as post-relational because the data model incorporates relations but is not constrained by the Information Principle, requiring that all information is represented by data values in relations. Products using a post-relational data model typically employ a model that actually pre-dates the relational model. These might be identified as a directed graph with trees on the nodes.

Examples of models that could be classified as post-relational are PICK aka MultiValue, and MUMPS.

Object database models

In recent years, the object-oriented paradigm has been applied to database technology, creating a new programming model known as object databases. These databases attempt to bring the database world and the application programming world closer together, in particular by ensuring that the database uses the same type system as the application program. This aims to avoid the overhead (sometimes referred to as the impedance mismatch) of converting information between its representation in the database (for example as rows in tables) and its representation in the application program (typically as objects). At the same time, object databases attempt to introduce the key ideas of object programming, such as encapsulation and polymorphism, into the world of databases.

A variety of these ways have been tried for storing objects in a database. Some products have approached the problem from the application programming end, by making the objects manipulated by the program persistent. This also typically requires the addition of some kind of query language, since conventional programming languages do not have the ability to find objects based on their information content. Others have attacked the problem from the database end, by defining an object-oriented data model for the database, and defining a database programming language that allows full programming capabilities as well as traditional query facilities.

4)How to access information using Database?

While storing data is a great feature of databases, for many database users the most important feature is quick and simple retrieval of information. In a relational database, it is extremely easy to pull up information regarding an employee, but relational databases also add the power of running queries. Queries are requests to pull specific types of information and either show them in their natural state or create a report using the data. For instance, if you had a database of employees and it included tables such as salary and job description, you can easily run a query of which jobs pay over a certain amount. No matter what kind of information you store on your database, queries can be created using SQL to help answer important questions.

5)How to Store a Database?

Databases can be very small (less than 1 MB) or extremely large and complicated (terabytes as in many government databases), however all databases are usually stored and located on hard disk or other types of storage devices and are accessed via computer. Large databases may require separate servers and locations, however many small databases can fit easily as files located on your computer's hard drive.

6)How to secure a database?

Obviously, many databases store confidential and important information that should not be easily accessed by just anyone. Many databases require passwords and other security features in order to access the information. While some databases can be accessed via the internet through a network, other databases are closed systems and can only be accessed on site.

No comments: