A distributed database management system (DDBMS) is a centralized software system that manages a distributed database as if it were all stored in a single location.
A distributed database can also be described as a database that consists of two or more files located at different sites, either on the same network or on entirely different networks. Portions of the database are stored in multiple physical locations, and processing is distributed among multiple database nodes.
Distributed databases offer some key advantages over centralized databases. Many companies are switching to distributed databases (in which the database, as its name implies, is distributed across an array of servers in various locations) for a variety of reasons. Let’s look at the main advantages of distributed databases, a typical scenario in which they are used, and the different formats in which data is distributed throughout the system.
Reliability – Building an infrastructure is similar to investing: diversify to reduce your chances of loss. Specifically, if a failure occurs in one area of the distribution, the database as a whole does not go down.
Security – You can grant permissions on individual sections of the overall database, for better internal and external protection.
Cost-effective – Bandwidth costs go down because users access remote data less frequently.
Local access – As with reliability above, if there is a failure in the wider network, you can still access your local portion of the database.
Growth – If you add a new location to your business, it is simple to create an additional node within the database, which makes distribution highly scalable.
Speed & resource efficiency – Most requests and other interactions with the database are performed at a local level, which also decreases remote traffic.
Responsibility & containment – Because glitches or failures occur locally, the problem is contained and can potentially be handled by the IT staff responsible for that part of the company.
·A DDBMS is used to create, retrieve, update and delete distributed databases.
·It synchronizes the database periodically and provides access mechanisms that make the distribution transparent to users.
·It ensures that data modified at any site is updated everywhere.
·It is used in application areas where large volumes of data are processed and accessed by numerous users simultaneously.
·It is designed for heterogeneous database platforms.
·It maintains the confidentiality and data integrity of the databases.
Factors Encouraging DDBMS
The following factors encourage moving over to DDBMS −
·Distributed Nature of Organizational Units − Most organizations today are subdivided into multiple units that are physically distributed across the globe. Each unit requires its own set of local data. Thus, the overall database of the organization becomes distributed.
·Need for Sharing of Data − The multiple organizational units often need to communicate with each other and share their data and resources. This demands common databases or replicated databases that should be used in a synchronized manner.
·Support for Both OLTP and OLAP − Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) work on different systems that may have data in common. Distributed database systems support both kinds of processing by providing synchronized data.
·Database Recovery − One of the common techniques used in a DDBMS is replication of data across different sites. Replication helps in data recovery if the database at any site is damaged. Users can access data from other sites while the damaged site is being reconstructed. Thus, a database failure may go almost unnoticed by users.
·Support for Multiple Application Software − Most organizations use a variety of application software, each with its own specific database support. A DDBMS provides uniform functionality for using the same data across different platforms.
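The database recovery point above can be illustrated with a small sketch. It is not how any particular DDBMS implements failover; the `Site` class, the `read_with_failover` helper, and the site names are hypothetical and exist only to show a read being served by another replica while one site is down.

```python
class SiteDown(Exception):
    """Raised when a site cannot be reached (hypothetical error type)."""


class Site:
    """A toy site holding one full replica of the data."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = dict(rows)  # this site's own copy of the data
        self.online = True

    def read(self, key):
        if not self.online:
            raise SiteDown(self.name)
        return self.rows[key]


def read_with_failover(sites, key):
    """Return (site name, value) from the first replica that answers."""
    for site in sites:
        try:
            return site.name, site.read(key)
        except SiteDown:
            continue  # this replica is unavailable, try the next one
    raise SiteDown("no replica available")


replica = {"cust-1": "Alice"}
sites = [Site("london", replica), Site("tokyo", replica)]
sites[0].online = False  # simulate a damaged site
served_by, value = read_with_failover(sites, "cust-1")
print(served_by, value)  # tokyo Alice
```

The caller never learns that the first site was unavailable, which is the sense in which a failure can go almost unnoticed by users.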
1. Homogeneous Database:
In a homogeneous database, all sites store the database identically. The operating system, database management system, and data structures used are the same at all sites. Hence, they are easy to manage.
2. Heterogeneous Database:
In a heterogeneous distributed database, different sites can use different schemas and software, which can cause problems in query processing and transactions. Also, a particular site may be completely unaware of the other sites. Different computers may use different operating systems and different database applications. They may even use different data models for the database. Hence, translations are required for the different sites to communicate.
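As a rough illustration of the translation step, the sketch below assumes two sites that store the same kind of record in different layouts and maps both into one common form. The field names, the two layouts, and the functions `from_site_a`, `from_site_b` and `global_view` are all invented for this example.

```python
# Site A stores customers as dicts with its own field names (assumed layout).
site_a_rows = [{"cust_id": 1, "full_name": "Alice"}]

# Site B stores the same information as (name, id) tuples (assumed layout).
site_b_rows = [("Bob", 2)]


def from_site_a(row):
    """Translate a site A record into the common schema."""
    return {"id": row["cust_id"], "name": row["full_name"]}


def from_site_b(row):
    """Translate a site B record into the common schema."""
    name, ident = row
    return {"id": ident, "name": name}


def global_view():
    """Combine both sites into one list of records in the common schema."""
    rows = [from_site_a(r) for r in site_a_rows]
    rows += [from_site_b(r) for r in site_b_rows]
    return sorted(rows, key=lambda r: r["id"])


customers = global_view()
print(customers)
```

Each site keeps its own representation, and only the translation functions need to know about the differences.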
Distributed Data Storage
There are 2 ways in which data can be stored on different sites. These are:
1. Replication:
In this approach, the whole relation is stored redundantly at 2 or more sites. If the whole database is available at all sites, it is a fully redundant database. Hence, in replication, systems maintain copies of data.
This is advantageous as it increases the availability of data at different sites. Also, query requests can now be processed in parallel.
However, it has certain disadvantages as well. Data needs to be constantly updated. Any change made at one site must be recorded at every site where that relation is stored; otherwise it may lead to inconsistency. This is a lot of overhead. Also, concurrency control becomes much more complex, as concurrent access now needs to be checked across a number of sites.
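The update overhead and the risk of inconsistency can be seen in a minimal sketch. It assumes the simplest possible scheme, in which every write is applied to every copy; the `ReplicatedRelation` class and the site names are hypothetical, and real systems add locking, ordering and failure handling that are left out here.

```python
class ReplicatedRelation:
    """A toy relation stored in full at every site."""

    def __init__(self, site_names):
        self.replicas = {name: {} for name in site_names}
        self.writes_performed = 0  # counts physical writes, to show the overhead

    def write(self, key, value):
        """Apply the change at every site that stores the relation."""
        for rows in self.replicas.values():
            rows[key] = value
            self.writes_performed += 1

    def read(self, site, key):
        """Read from a single site; any replica can answer."""
        return self.replicas[site][key]

    def is_consistent(self):
        """True when all copies hold exactly the same data."""
        copies = list(self.replicas.values())
        return all(copy == copies[0] for copy in copies)


accounts = ReplicatedRelation(["site_a", "site_b", "site_c"])
accounts.write("acc-1", 100)  # one logical write, three physical writes
consistent_after_write = accounts.is_consistent()

# Changing only one copy, bypassing write(), leaves the replicas inconsistent.
accounts.replicas["site_a"]["acc-1"] = 50
consistent_after_partial = accounts.is_consistent()
print(accounts.writes_performed, consistent_after_write, consistent_after_partial)
```

One logical update costs as many physical writes as there are sites, and skipping any site leaves the copies disagreeing with each other.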
2. Fragmentation:
In this approach, the relations are fragmented (i.e., they’re divided into smaller parts) and each of the fragments is stored at the sites where it is required. The fragments must be chosen so that they can be used to reconstruct the original relation (i.e., there is no loss of data).
Fragmentation is advantageous because it does not create copies of data, so consistency is not a problem.
Fragmentation of relations can be done in two ways:
·Horizontal fragmentation – Splitting by rows – The relation is fragmented into groups of tuples so that each tuple is assigned to at least one fragment.
·Vertical fragmentation – Splitting by columns – The schema of the relation is divided into smaller schemas. Each fragment must contain a common candidate key so as to ensure a lossless join.
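Both kinds of fragmentation, and the requirement that the original relation can be reconstructed, can be sketched with plain Python lists of dictionaries. The `employees` relation, its columns, and the helper functions are made up for illustration. Horizontal fragments are recombined by a union of rows, and vertical fragments by a join on the shared key `id`.

```python
employees = [
    {"id": 1, "name": "Alice", "city": "London", "salary": 5000},
    {"id": 2, "name": "Bob", "city": "Tokyo", "salary": 4000},
    {"id": 3, "name": "Carol", "city": "London", "salary": 4500},
]


def horizontal_fragments(rows, column):
    """Split by rows: group tuples by the value of one column."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[column], []).append(row)
    return fragments


def vertical_fragment(rows, key, columns):
    """Split by columns: keep the key plus the chosen columns."""
    return [{c: row[c] for c in (key, *columns)} for row in rows]


def join_on_key(left, right, key):
    """Rebuild rows by matching the two fragments on the common key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


# Horizontal: one fragment per city; the union of fragments restores the relation.
by_city = horizontal_fragments(employees, "city")
rebuilt_from_rows = sorted(
    (row for fragment in by_city.values() for row in fragment),
    key=lambda row: row["id"],
)

# Vertical: both fragments carry "id", so a join restores the relation.
personal = vertical_fragment(employees, "id", ["name", "city"])
payroll = vertical_fragment(employees, "id", ["salary"])
rebuilt_from_columns = join_on_key(personal, payroll, "id")

print(rebuilt_from_rows == employees, rebuilt_from_columns == employees)
```

If the `payroll` fragment did not include `id`, there would be no way to tell which salary belongs to which employee, which is why every vertical fragment has to carry the key.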