Reply: Discussion Thread: Business Information
The student must then post a reply of at least 400 words by 11:59 p.m. (ET) on Sunday of the assigned Module. Each reply must incorporate at least 1 scholarly citation in APA format. Any sources cited must have been published within the last five years.
The reply can respond to any one of the paragraphs listed below, but it must be at least 400 words with 1 scholarly citation.
Reply to: Saeng Su Kwon
Today, I’ll briefly discuss distributed databases. I’ve divided the discussion into sections.
1. An overview of distributed databases
“A distributed database appears to a user as a single database but is, in fact, a set of databases stored on multiple computers. The data on several computers can be simultaneously accessed and modified using a network. Each database server in the distributed database is controlled by its local DBMS, and each cooperates to maintain the consistency of the global database.” (Distributed Databases) In the 1990s, distributing and storing a database across sites while using it as a single database was regarded as an advanced and very difficult technology. Around 2000, many companies wanted to research and adopt distributed databases, much as companies today look at cloud computing and SOA (Service Oriented Architecture). As Database Management System (DBMS) functions became stronger and network speeds increased, distributed databases did not spread as widely as initially expected, but many databases still operate in a distributed fashion by sharing data with other databases over the network.
Many companies use distributed systems today. Six well-known examples are Netflix, Uber, eBay, Zalando, Amazon, and SoundCloud (6 companies pioneering the use of distributed systems, 2020). For example, Werner Vogels, the CTO of Amazon, said: “The giant, monolithic “bookstore” application and giant database that we used to power Amazon.com limited our speed and agility. Whenever we wanted to add a new feature or product for our customers, like video streaming, we had to edit and rewrite vast amounts of code on an application that we’d designed specifically for our first product—the bookstore. This was a long, unwieldy process requiring complicated coordination, and it limited our ability to innovate fast and at scale” (Werner Vogels, Amazon CTO)
A distributed database can be defined as follows.
· A database that is distributed across several places but can be used as if it were a single virtual system.
· A collection of data that logically belongs to the same system but is physically distributed over a computer network: the sites are physically distributed, while users see the data as logically integrated and shared. In other words, a distributed database can be defined as a database that maximizes usability and performance by placing data on multiple nodes in multiple regions, connected by a fast network.
2. Distributed Database Transparency
“Distribution transparency is the property of distributed databases by the virtue of which the internal details of the distribution are hidden from the users. The DDBMS designer may choose to fragment tables, replicate the fragments and store them at different sites. However, since users are oblivious of these details, they find the distributed database easy to use like any centralized database.” (DDBMS)
To qualify as a distributed database, a system must satisfy six transparency requirements.
1. Fragmentation (split) transparency: One logical relation is split into multiple fragments, and copies of each fragment are stored at multiple sites, without the user needing to know how the relation was split.
2. Location transparency: The user does not need to specify where the data is stored. Location information must be maintained in the system catalog.
3. Local mapping transparency: The mapping between each local DBMS and its physical database is guaranteed, so names that are independent of each local system's naming can be used.
4. Replication (duplicate) transparency: The user does not need to know whether a database object is duplicated at multiple sites.
5. Failure transparency: Transaction atomicity is maintained regardless of component (DBMS, computer) failures.
6. Concurrency (parallel) transparency: The consistency of results is maintained when multiple transactions run simultaneously.
It is rare these days to build a database in the traditional distributed manner, satisfying all of the characteristics above. Recently, consolidating databases has become more common than building them in a distributed environment. Nevertheless, a distributed database that is applied appropriately to the characteristics of the business and the region offers a variety of advantages, and it remains in practical use.
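Location transparency from the list above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the catalog, the site names, the table contents, and the read_table helper are all hypothetical, and plain Python dictionaries stand in for real local DBMSs.

```python
# Hypothetical system catalog: maps each table name to the site that stores it.
SYSTEM_CATALOG = {
    "customers": "seoul",
    "orders": "busan",
}

# Hypothetical sites; each dictionary stands in for a local DBMS and its tables.
SITES = {
    "seoul": {"customers": [{"id": 1, "name": "Kim"}]},
    "busan": {"orders": [{"id": 10, "customer_id": 1, "amount": 250}]},
}


def read_table(table_name):
    """Return the rows of a table without the caller naming a site."""
    site = SYSTEM_CATALOG[table_name]   # location is looked up in the catalog
    return SITES[site][table_name]      # the request is routed to the owning site


# The caller asks only for a table name, never for a location.
customers = read_table("customers")
orders = read_table("orders")
print(customers)
print(orders)
```

If a table later moves to another site, only the catalog entry changes, and callers of read_table stay the same.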
3. Distributed database application method and advantages and disadvantages
“Database systems have evolved from data processing in which each application to define and maintain their own data, to one in which the data are defined and managed centrally. This new orientation leads to independent data, such applications become immune to changes in physical or logical data organization and vice versa.” (Distributed Databases and Distributed Database Management Systems) In this section, I’ll talk about how to apply a distributed database and about its pros and cons.
· How to apply a distributed database
· To get good performance and real value from a database in a distributed environment, look at the flow of work and configure the database to match the architecture of that work. The goal is not simply to build a database in a distributed environment, but to be able to design the distribution structure selectively according to the characteristics of the business. Seen this way, the task is better understood as database structure design (architecture) than as distributed database design.
· Distributed database pros and cons
· Pros
· Local autonomy, incremental system capacity expansion
· Reliability and Availability
· Efficacy and flexibility
· Fast response speed and reduced communication cost
· Increased availability and reliability of data
· Proper sizing of the system
· Better accommodation of the needs of users in each region
· Cons
· Software development cost
· Increased potential for errors
· Increase in processing cost
· Complexity and cost of design and management
· Inconsistent response times
· Difficulty in control
· Threats to data integrity
4. Direction of use of distributed database
The distributed database is an advanced technology suited to today's database environment, where business functions are diversifying and the amount of data is growing exponentially. How a distributed database should be used depends on the characteristics of the business. The section ‘Can Distributed Computing Enhance my Performance?’ in Why Use Distributed Computing explains how distributed computing works across different machines. “Distributed computing allows different users or computers to share information. Distributed computing can allow an application on one machine to leverage processing power, memory, or storage on another machine. It is possible that distributed computing could enhance performance of a stand-alone application, but this is often not the reason to distribute an application.” (Why Use Distributed Computing, 1996-99)
5. The value of distributed database configuration
The most important value of configuring data in a distributed environment is that it provides fast performance that an integrated database cannot. A distributed database environment reduces the network load and transaction concentration caused by accessing and processing data on remote or other servers, and this is the reason such an environment is built. Data is very important in our lives now, and every piece of data has value. The following article describes the benefits of a distributed database architecture. “With data becoming an essential aspect of our lives, distributed databases lie at the heart of every organization's data infrastructure. In most cases, end-users interacting with a web service or a mobile application might not see a distributed database in action” (The Why and How of Distributed Databases, 2020)
6. Distributed database application technique
Database distribution strategies include table location distribution, table fragmentation (partitioning) distribution, table replication distribution, and table summarization distribution. Of these, table replication and fragmentation are the most widely used, and they are the most useful techniques for the many databases that suffer from poor performance. To design a database for a distributed environment, perform integrated data modeling first, and then distribute or replicate tables by region or server according to the business characteristics of each table. Transparency is another technique available in a distributed database system. “With minimal effort, you can develop applications that make an Oracle Database distributed database system transparent to users that work with the system. The goal of transparency is to make a distributed database system appear as though it is a single Oracle Database.” (Distributed Database Application Development)
6.1 Table Location Distribution
· Table location distribution does not change the table structure, and tables are not duplicated in other databases. Instead, each designed table is placed at a different location. This approach is used when each location uses information in a different way. Because the tables are placed in different locations, a diagrammed database document showing which tables live at each location is required.
6.2 Table Fragmentation Distribution
· Table fragmentation (partitioning) distribution splits each table and distributes the pieces, rather than simply placing the whole table in a different location. There are two types, according to how the table is divided. The first is horizontal fragmentation, which divides the table by rows, and the second is vertical fragmentation, which divides the table by columns.
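The two fragmentation types can be sketched in a few lines of Python. This is a minimal illustration, not a DBMS feature: the employees table, its columns, and the helper functions are hypothetical, and lists of dictionaries stand in for tables.

```python
# Hypothetical table: each dictionary is one row.
employees = [
    {"id": 1, "name": "Ann", "region": "east", "salary": 5000},
    {"id": 2, "name": "Bob", "region": "west", "salary": 4200},
    {"id": 3, "name": "Cho", "region": "east", "salary": 6100},
]


def horizontal_fragments(rows, column):
    """Horizontal fragmentation: group whole rows by the value of one column."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[column], []).append(row)
    return fragments


def vertical_fragment(rows, key, columns):
    """Vertical fragmentation: keep the key plus a subset of the columns."""
    return [{name: row[name] for name in (key, *columns)} for row in rows]


def rejoin(left, right, key):
    """Rebuild full rows from two vertical fragments that share a key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


by_region = horizontal_fragments(employees, "region")            # rows split by site
public_part = vertical_fragment(employees, "id", ["name", "region"])
private_part = vertical_fragment(employees, "id", ["salary"])    # columns split by site

print(by_region["east"])
print(private_part)
print(rejoin(public_part, private_part, "id") == employees)
```

Each vertical fragment repeats the key column so that the original rows can be rebuilt with a join, while the horizontal fragments can be recombined by a simple union.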
6.3 Table Replication Distribution
· Table replication distribution creates and manages the same table in different regions or servers at the same time. There are two forms. In segment replication, only part of the contents of a table in the master database exists in each of the other regions or servers. In broadcast replication, the full contents of a table in the master database exist in each region or server.
· Segment Replication: A unified table is kept in one place (the head office), and each branch holds only the rows that belong to that branch. Any data that exists at a branch must also exist at the head office. In other words, the head office data is the sum of the branch office data. Each branch can process its own data easily, and the integrated table at the head office supports processing across all data, so such tasks can be performed quickly without JOINs across multiple tables.
· Broadcast Replication: A unified table is kept in one place (the head office), and each branch holds all of the same data as the head office. Any data that exists at a branch must also exist at the head office. The amount of data at every branch equals the amount of data at the head office. Because the head office and the branches hold the same information, there are no special restrictions on data processing at either the head office or the branches.
· Broadcast replication is also a database distribution technique that is often used in real projects. With segment replication, data is usually entered, modified, and deleted at the branch offices and then used by the head office, whereas with broadcast replication, data is entered, modified, and deleted at the head office and used by the branch offices. As with segment replication, replicating the data takes considerable time and puts load on the database and server, so replication is usually done in batch rather than through online processing.
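The difference between the two replication styles described above can be sketched as follows. This is only a sketch under assumptions: the branch names, the order and product-code rows, and the two helper functions are hypothetical, and each function stands in for one batch replication run.

```python
def merge_segments(branch_tables):
    """Segment replication: the head office table is the sum of the branch rows."""
    merged = []
    for branch, rows in branch_tables.items():
        merged.extend({"branch": branch, **row} for row in rows)
    return merged


def broadcast(master_rows, branches):
    """Broadcast replication: every branch receives a full copy of the master table."""
    return {branch: [dict(row) for row in master_rows] for branch in branches}


# Hypothetical branch data: rows are entered at the branches (segment replication).
branch_orders = {
    "seoul": [{"order_id": 1, "amount": 300}, {"order_id": 2, "amount": 120}],
    "busan": [{"order_id": 3, "amount": 80}],
}
head_office_orders = merge_segments(branch_orders)

# Hypothetical master data: rows are maintained at the head office (broadcast replication).
product_codes = [{"code": "A", "label": "Apparel"}, {"code": "B", "label": "Books"}]
branch_codes = broadcast(product_codes, ["seoul", "busan"])

print(len(head_office_orders))
print(branch_codes["busan"] == product_codes)
```

In the segment case the head office ends up with every branch row, while in the broadcast case each branch ends up with every head office row.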
6.4 Table Summarization Distribution
· Table summarization distribution is used when similar data exists in different forms across regions or servers. There are two methods. Rollup summarization produces integrated data from distributed data that has the same table structure and the same kind of content. Consolidation summarization produces integrated data from distributed data with different contents.
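One possible reading of the two summarization methods is sketched below. The branch names, product rows, and helper functions are hypothetical, and the interpretation is an assumption: rollup is treated as summing the same items across branches, and consolidation as gathering branch-specific items into one table.

```python
from collections import Counter


def rollup(branch_tables):
    """Rollup: branches hold the same kind of rows, so totals are summed per product."""
    totals = Counter()
    for rows in branch_tables.values():
        for row in rows:
            totals[row["product"]] += row["qty"]
    return dict(totals)


def consolidate(branch_tables):
    """Consolidation: branches hold different contents, gathered into one table."""
    return [
        {"branch": branch, **row}
        for branch, rows in branch_tables.items()
        for row in rows
    ]


# Hypothetical data: both branches sell the same products (rollup case).
same_products = {
    "seoul": [{"product": "pen", "qty": 3}, {"product": "ink", "qty": 2}],
    "busan": [{"product": "pen", "qty": 4}, {"product": "ink", "qty": 1}],
}

# Hypothetical data: each branch sells a different product (consolidation case).
different_products = {
    "seoul": [{"product": "pen", "qty": 3}],
    "busan": [{"product": "ink", "qty": 1}],
}

print(rollup(same_products))
print(consolidate(different_products))
```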
7. Examples of improved performance by applying a distributed database
There are many ways to improve performance in a distributed database, and GeeksforGeeks describes one of them: “We can achieve interquery and intraquery parallelism by executing multiple queries at different sites by breaking up a query into a number of subqueries that basically executes in parallel which basically leads to improvement in performance.” (Advantages of distributed database, 2019) In projects, performance often suffers because the database was designed without an understanding of the basic principles of a distributed environment. Simply applying the principle of replication distribution can improve performance in many business situations. Database distribution design is effective when applied in the following cases.
· Apply it at sites where performance is important.
· Performance improves when common codes, reference information, master data, and similar tables are configured in a distributed environment.
· It works well when real-time synchronization is not required. A distributed environment can also be configured for work that needs only near-real-time data.
· It helps spread the load when load is concentrated on a specific server.
· A backup site (Disaster Recovery Site) can be configured simply by applying the distribution function.
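The intraquery parallelism mentioned in the GeeksforGeeks quotation can be sketched with Python's standard concurrent.futures module. This is an illustration only: the site names, the stored amounts, and the helper functions are hypothetical, and in-memory lists stand in for the data held at each site.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sites, each holding part of one logical "sales" table.
SITE_ROWS = {
    "site_a": [120, 80, 45],
    "site_b": [300, 15],
    "site_c": [60, 60, 60, 20],
}


def subquery_total(site):
    """Subquery run at one site: sum only the rows stored there."""
    return sum(SITE_ROWS[site])


def parallel_total(sites):
    """Break one total query into per-site subqueries and run them in parallel."""
    with ThreadPoolExecutor(max_workers=len(sites)) as pool:
        partial_totals = list(pool.map(subquery_total, sites))
    return sum(partial_totals)  # combine the partial results into the final answer


total = parallel_total(list(SITE_ROWS))
print(total)
```

Each site computes a partial result over its own rows, and only those small partial results are combined, which is the same idea as sending subqueries to the sites instead of pulling all rows to one server.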
References
Distributed Databases. Distributed databases. (n.d.). Retrieved September 28, 2021, from https://docs.oracle.com/cd/A57673_01/DOC/server/doc/SCN73/ch21.htm
DDBMS - Distribution Transparency. (n.d.). Retrieved September 28, 2021, from https://www.tutorialspoint.com/distributed_dbms/distributed_dbms_distribution_transparency.htm.
Distributed Database and Distributed Database Management Systems. Advantages and disadvantages of distributed database systems. (n.d.). Retrieved September 28, 2021, from https://www.ipl.org/essay/Advantages-And-Disadvantages-Of-Distributed-Database-Systems-PJGJECNGSWU.
Jim Watson, [email protected] and R. A. (n.d.). [8] why use distributed COMPUTING (part of the CORBA faq, Copyright © 1996-99). Why Use Distributed Computing [UPDATED!]. Retrieved September 28, 2021, from https://www4.cs.fau.de/~geier/corba-faq/why-distrib-computing.html.
The why and how of distributed databases. Fauna. (n.d.). Retrieved September 28, 2021, from https://fauna.com/blog/the-why-and-how-of-distributed-databases.
Oracle® Database Administrator's Guide. (n.d.). Distributed Database Application Development. Distributed database application development. Retrieved September 28, 2021, from https://docs.oracle.com/cd/E18283_01/server.112/e17120/ds_concepts005.htm.
Advantages of distributed database. GeeksforGeeks. (2019, April 30). Retrieved September 28, 2021, from https://www.geeksforgeeks.org/advantages-of-distributed-database/.
6 companies pioneering the use of distributed systems. Scalac. (2020, August 26). Retrieved September 28, 2021, from https://scalac.io/blog/6-companies-using-distributed-systems/.