Technical Proposal

harleyteam34

Data architecture

Data architecture refers to the rules, policies, standards, and models that govern how data is stored, arranged, and used within an organization.

It includes the following:

The architecture defense

The architecture design

The Database Management System

The data architecture design consists of three levels:

1. Conceptual: represents all business entities.

2. Logical: represents the logic of how entities are related.

3. Physical: the realization of the data mechanisms for a specific type of functionality.

Data architecture defense is mainly about reducing downside effects. A Database Management System (DBMS) is software used for creating and managing databases. It provides a platform for programmers and other users to work with data.


IMPLEMENTATION PLAN

Introduction

Consolidation of a data warehouse is not a simple process.

Much needs to be done for the process to bear fruit.

Consolidation is harder still when more than one organization is involved (Rainardi, 2013).

Factors such as the timeline and resources must be considered in the implementation plan.


Timeline

The time required for the implementation will depend on the warehouse plan that is created.

The deliverables may include user and system documentation, an interface document, system test sign-off, staffing costs, a test plan document, design documents, and data warehouse reports.

Each deliverable serves a different function and is vital.

The end-user experience can also be considered a deliverable.

It is the most vital one, because it shows whether the process was effective (Ariyachandra & Watson, 2014).


Resources

Sybase

A completely new site would need to be designed, built, and implemented to handle the new disaster recovery design.

It must be able to process large amounts of data and improve performance to benefit the end user.

IBM

These architectural objectives address fundamental challenges concerning database extensibility, flexibility, scalability, and portability.

Computacenter

Organizations today depend on constant access to corporate data assets for effective business decision making.

However, for the IT function, effectively managing the sheer volume of data that the organization continually produces is proving difficult.

Money

The budget depends on the size of the business, which determines the approach used for business intelligence (BI). Large companies can afford to spend more on a BI initiative than small or mid-sized ones.

The resources are Sybase, IBM, Computacenter, and money. They are fundamental to any data warehouse consolidation.


Training

Employees must be trained before the consolidated warehouse goes into operation.

They should be trained on SQL servers. They also need training in data warehousing and BI techniques covering key concepts such as data visualization, data modeling, DW architecture, ETL fundamentals, and the Erwin data modeling tool (Silverston et al., 2014).

This will ensure smooth daily operations and put employees in a better position to deliver on the set goals.

Failing to provide the required training may cause complications and considerable inefficiency.


Security Policy

Access to the warehouse is restricted to the data warehouse managers and other key staff in the field.

Other employees should not have access to records, because this could constitute a breach of privacy.

I will work with the project team to determine the access rights for users.

Under no circumstances should any employee have access to the servers.
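
The access rule above can be sketched as a deny-by-default permission check. This is only an illustration: the role names and actions are hypothetical, since the actual rights are still to be agreed with the project team.

```python
# Hypothetical roles and actions; the real rights are not yet defined.
PERMISSIONS = {
    "warehouse_manager": {"read_records", "write_records"},
    "key_staff": {"read_records"},
}


def is_allowed(role, action):
    """Deny by default: unknown roles and unlisted actions get no access."""
    return action in PERMISSIONS.get(role, set())


print(is_allowed("warehouse_manager", "read_records"))
print(is_allowed("employee", "read_records"))
```

Any role that is not listed gets no access, which matches the policy that access is the exception rather than the rule.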


Conclusion

A warehouse consolidation plan can be of great help to a company, depending on how it is implemented.

The process can lead to reduced costs.

However, access to other companies’ data can increase security risks.


Data integrity and scrubbing

Introduction.

Brief overview of the topic of discussion.


Data issues to be addressed: heterogeneity of data

Data integration involves coordination of different databases to ensure that they work together smoothly. Some data issues arise when trying to homogenize the data.

One problem encountered while homogenizing data is varying data formats across large quantities of data. This may result from combining hierarchical databases with relational databases.

Data integration mainly involves coordination of different databases with the aim of ensuring that they work together smoothly.

Varying data formats often cause some problems while fusing data from different databases.

One database could be hierarchical while the other is relational.
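
As a rough illustration of the hierarchical versus relational mismatch, the sketch below flattens one parent/child record into two flat row sets. The record layout and field names are hypothetical and are not taken from the databases in this proposal.

```python
# Hypothetical hierarchical record: one department that owns its members.
department = {
    "dept_id": 10,
    "name": "Claims",
    "members": [
        {"member_id": 1, "name": "Ann"},
        {"member_id": 2, "name": "Raj"},
    ],
}


def flatten(dept):
    """Split one parent/child record into two flat, relational-style row sets."""
    dept_row = (dept["dept_id"], dept["name"])
    # Each child row carries the parent's key so the relationship is preserved.
    member_rows = [
        (m["member_id"], m["name"], dept["dept_id"]) for m in dept["members"]
    ]
    return dept_row, member_rows


dept_row, member_rows = flatten(department)
print(dept_row)
print(member_rows)
```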


How to fuse data with different formats

To move and fuse highly incompatible databases, one can use software that provides a set of data access routines.

This enables structured query languages to access DBM data files (Federal Highway Administration, 2017).

These data files can be either relational or non-relational.

Combining date fields that use varying date formats is one of the major issues associated with data integration.

One approach is to use the DATEADD() function (Jonathan Drummey, 2017).

Software that provides a number of data access routines can be used to fuse highly incompatible databases.
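
The date-format problem can also be handled outside the database. The sketch below uses Python's standard datetime module instead of DATEADD() and converts several input formats to a single ISO format. The list of formats is an assumption; a real integration would list the formats actually found in the source databases, and would need a rule for ambiguous day/month orderings.

```python
from datetime import datetime

# Assumed input formats, tried in order; not taken from the source databases.
KNOWN_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")


def normalize_date(text):
    """Return the date as an ISO 8601 string, or None if no format matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


for raw in ("2017-03-05", "05/03/2017", "20170305", "unknown"):
    print(raw, "->", normalize_date(raw))
```

Values that match none of the formats come back as None so they can be reviewed rather than silently guessed.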


Bad data

The quality of the data being transferred is a common issue during data integration.

Failing to clean up data before integration may result in serious data issues afterwards.

To tackle the issue of bad data, it is essential to identify the bad data and fix it at its original source (Todd Hinton, 2016).

According to Todd Hinton (2016), integrating bad data corrupts the database and produces low-quality results.

Data should be cleaned before integration to avoid serious data issues.

The bad data should be identified before it is transferred and corrected in its source.

Integrating bad data results in a corrupted database.
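
A minimal sketch of identifying bad data before it is transferred, assuming that "bad" means a required field is missing or blank. The field names and sample rows are hypothetical; real checks would depend on the source systems.

```python
def find_bad_rows(rows, required=("member_id", "name")):
    """Return (good, bad); a row is bad if a required field is missing or blank."""
    good, bad = [], []
    for row in rows:
        values = [row.get(field) for field in required]
        if any(v is None or str(v).strip() == "" for v in values):
            bad.append(row)
        else:
            good.append(row)
    return good, bad


# Hypothetical source rows: the second and third have a required field missing.
source_rows = [
    {"member_id": 1, "name": "Ann"},
    {"member_id": None, "name": "Raj"},
    {"member_id": 3, "name": "  "},
]
good, bad = find_bad_rows(source_rows)
print(len(good), "rows ready;", len(bad), "rows to fix at the source")
```

Only the rows in good would be integrated; the rows in bad go back to the owning system for correction.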


Primary key and Index key

A primary key is an integral part of the database that uniquely identifies a row in a relational table.

The preferred primary key for this database is the identity card number, since everyone has a different one. Another suitable candidate is the customer’s member number.

An index key, on the other hand, helps a query access data in a database quickly and efficiently.

The index for this database could be on the sex of the members, although a column with so few distinct values narrows a search only slightly.

The purpose of a primary key is to uniquely identify a row in a relational table.

The primary key for this database could be the members’ identity card number.

An index key makes data access in a database efficient.

The index key could be the sex of the members.
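
The two kinds of key can be sketched with Python's built-in sqlite3 module. The table and column names are assumptions made for illustration, and SQLite stands in for whichever DBMS the consolidated warehouse uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: the identity card number is the primary key.
conn.execute(
    "CREATE TABLE members ("
    " id_card_number TEXT PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " sex TEXT)"
)
# Secondary index on the column named in the text.
conn.execute("CREATE INDEX idx_members_sex ON members (sex)")
conn.execute("INSERT INTO members VALUES ('A100', 'Ann', 'F')")

# A second row with the same primary key value is rejected.
try:
    conn.execute("INSERT INTO members VALUES ('A100', 'Raj', 'M')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

index_names = [row[1] for row in conn.execute("PRAGMA index_list('members')")]
print(duplicate_rejected, index_names)
```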


Foreign Key

The foreign key is used to refer to a primary key in another table (Ben, 2017).

The appropriate foreign key is the member’s work ID, because the member’s identity card number is related to their work ID.

This key refers to a primary key which exists in another table.

The members’ work id is the appropriate foreign key.
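
A minimal sketch of the foreign key relationship, again using sqlite3. The work_records table and its columns are hypothetical; the text only states that the work ID refers to a primary key in another table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE work_records (work_id INTEGER PRIMARY KEY, employer TEXT)"
)
conn.execute(
    "CREATE TABLE members ("
    " id_card_number TEXT PRIMARY KEY,"
    " work_id INTEGER REFERENCES work_records (work_id))"
)
conn.execute("INSERT INTO work_records VALUES (7, 'Acme')")
conn.execute("INSERT INTO members VALUES ('A100', 7)")

# A member pointing at a work ID that does not exist is rejected.
try:
    conn.execute("INSERT INTO members VALUES ('B200', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```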


Duplicate data

After merging the databases, it is essential to eliminate the duplicate data. Duplicates can harm a business by wasting its marketing budget. For example, if four mail packs are sent to the same person, the business pays for duplicate print and postage costs.

Duplicates can be eliminated using software that identifies the duplicate data and deletes it (Towerdata, 2013). Dedupe software is one such tool.

Eliminating duplicate data is essential because it saves the company money that would otherwise be lost.

Dedupe software can identify and delete duplicate data.
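
The idea behind deduplication can be sketched in a few lines. This is not the Dedupe software mentioned above; it is a simple exact match on normalized fields, and the field names and sample records are hypothetical.

```python
def dedupe(records, key_fields=("name", "email")):
    """Keep the first record for each normalized key and drop later repeats."""
    seen = set()
    unique = []
    for rec in records:
        # Normalize case and surrounding whitespace before comparing.
        key = tuple(str(rec.get(f, "")).strip().lower() for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Hypothetical mailing list in which the second entry repeats the first.
mailing_list = [
    {"name": "Ann Lee", "email": "ann@example.com"},
    {"name": "ann lee ", "email": "ANN@example.com"},
    {"name": "Raj Patel", "email": "raj@example.com"},
]
unique_members = dedupe(mailing_list)
print(len(unique_members))
```

Real duplicates often differ by more than case or spacing (misspellings, old addresses), which is where dedicated matching software earns its place.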


References

Ben (2017). What are the Difference Between a Primary Key vs. Foreign Key? Retrieved from https://www.databasestar.com/primary-key-vs-foreign-key/

Federal Highway Administration (2017). Challenges to Data Integration. U.S. Department of Transportation.

Jonathan Drummey (2017). Combine Two Data Fields into Date Format. Retrieved from https://community.tableau.com/message/288410#288410

Todd Hinton (2016). 4 Ways to Solve Data Quality Issues. Retrieved from https://www.redpointglobal.com/blog/4-ways-solve-data-quality-issues/

Towerdata (2013). Top Reasons for Duplicate Data (And 3 Techniques to fix it). Retrieved from https://www.towerdata.com/blog/bid/114711/Top-Reasons-for-Duplicate-And-3-Techniques-to-Fix-It


Ariyachandra, T., & Watson, H. J. (2014). Which data warehouse architecture is most successful? Business Intelligence Journal.

Kimball, R., & Ross, M. (2011). The data warehouse toolkit: the complete guide to dimensional modeling. John Wiley & Sons.

Mannino, M. V., & Walter, Z. (2013). A framework for data warehouse refreshes policies. Decision Support Systems.

Rainardi, V. (2013). Data warehouse architecture. Building a Data Warehouse: With Examples in SQL Server.

Silverston, L., Inmon, W. H., & Graziano, K. (2014). The data model resource book: a library of logical data models and data warehouse designs. John Wiley & Sons.