Discussion 2

Data Management

After studying this week’s assigned readings, discuss the following:

1.  What are the business costs or risks of poor data quality? Support your discussion with at least 3 references.

2.  What is data mining? Support your discussion with at least 3 references.

3.  What is text mining? Support your discussion with at least 3 references.

Please use APA throughout. 

Post your initial response no later than Friday of week 3. Please note that an initial post not completed by the due date will receive a grade of zero. See the class syllabus for late assignment policies. Review the posting/discussion requirements.

Read and respond to at least two (2) of your classmates no later than the last day of week 3. In your response to your classmates, consider comparing your articles to those of your classmates. Below are additional suggestions on how to respond to your classmates’ discussions:

· Ask a probing question, substantiated with additional background information, evidence or research.

· Share an insight from having read your colleagues’ postings, synthesizing the information to provide new perspectives.

· Offer and support an alternative perspective using readings from the classroom or from your own research.

· Validate an idea with your own experience and additional research.

· Make a suggestion based on additional evidence drawn from readings or after synthesizing multiple postings.

· Expand on your colleagues’ postings by providing additional insights or contrasting perspectives based on readings and evidence.

1) Respond to the discussion below in 150 words

1.     What are the business costs or risks of poor data quality?

The benefits a business realizes depend on the quality of the data it gathers, and poor data cripples the performance of an organization. Inaccurate information puts business intelligence analysis at risk and leads to poor decisions that can be disastrous for the organization. For example, in a marketing effort, if an agent mistypes the phone number or email address of a prospective customer, that data is of no use, which reduces the ROI of the marketing effort and can cost the business the customer or the revenue (Miner, 2012). Poor data quality also poses a broader risk for organizations, because it is associated with a wide range of negative outcomes at the organizational level. In particular, data that is not complete and accurate can have seriously negative financial and social consequences for an organization. Data that depends heavily on manual entry is especially vulnerable to errors. If you are trying to identify your organization's ideal customers for a marketing study, for example, you cannot do it if your database does not contain that data.
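The mistyped phone number or email in the example above is the kind of error a simple check at entry time can catch. The following is a minimal sketch, not a production validator: the patterns `EMAIL_RE` and `PHONE_RE`, the function `find_bad_contacts`, and the sample records are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical, deliberately simple patterns; real validation rules vary by business.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
PHONE_RE = re.compile(r"^\+?\d{10,15}$")


def find_bad_contacts(records):
    """Return (record id, list of failed fields) for each record that fails a check."""
    problems = []
    for rec in records:
        issues = []
        if not EMAIL_RE.match(rec.get("email", "")):
            issues.append("email")
        # Strip common separators before checking the digit count.
        phone = re.sub(r"[\s\-().]", "", rec.get("phone", ""))
        if not PHONE_RE.match(phone):
            issues.append("phone")
        if issues:
            problems.append((rec["id"], issues))
    return problems


# Made-up sample data: record 2 has a truncated email domain and a short phone number.
contacts = [
    {"id": 1, "email": "ana@example.com", "phone": "(555) 010-2345"},
    {"id": 2, "email": "bob@example", "phone": "555-0102"},
]
bad = find_bad_contacts(contacts)
```

Flagging such records before a campaign runs is cheaper than discovering them afterwards as undelivered messages.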

2.     What is data mining?

Data mining is the process of sorting through large data sets to identify patterns and establish relationships in order to solve problems through data analysis. Data mining tools enable enterprises to predict future trends. Mining large stores of data in oil and gas operations involves adopting key processes and technologies and embracing better approaches to problem solving. To extract value from huge data stores and change the way decisions are made, many executives have turned to advanced data mining techniques alongside real-time analytics and data processing capabilities.
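One way to picture "identifying patterns and establishing relationships" is to count which items tend to appear together in transactions. The sketch below is only an illustration of that idea under simple assumptions: the function `frequent_pairs`, the `min_support` threshold, and the shopping baskets are hypothetical and do not come from the post or its references.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Count item pairs that occur together and keep those seen at least min_support times."""
    counts = Counter()
    for basket in transactions:
        # Sort the unique items so each pair is counted under one canonical ordering.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Made-up transactions for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "jam"],
    ["bread", "milk", "jam"],
]
pairs = frequent_pairs(baskets, min_support=2)
```

Here only bread and milk appear together often enough to pass the threshold, which is the kind of relationship a retailer might act on.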

3.     What is text mining?

Text mining is the process of deriving insights from unstructured text data as well as from structured data. There are a few techniques for text mining, including summarization, where the text is condensed into an abstract, and information extraction, where a specific topic is extracted and analyzed (Strong, 1996). It is a constantly evolving process that involves a great deal of trial and error to achieve a good result. Text mining has become especially popular with the growth of online social networking. Text mining usually deals with texts whose function is the communication of factual information or opinions, and the motivation for trying to extract information from such text automatically is compelling.

References:

 Rossi, B. (2017, February 16). Will poor data quality jeopardise GDPR compliance? Information Age.

 Miner, G. (2012). Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications. Amsterdam: Academic Press.

 Yin, S., Qiu, Y., & Zhong, C. (2007). Web information extraction and classification method. IEEE.

 Scholtes, J. C. (2009). Text-mining: The next step in search technology. DESI-III Workshop, Barcelona.

 Brill, E. (1992). A simple rule-based part of speech tagger. Proceedings of the Conference on Applied Natural Language Processing (ANLP-92), Trento, Italy.

 Ellram, L. M., & Siferd, S. P. (1993). Purchasing: The cornerstone of the Total Cost of Ownership concept. Journal of Business Logistics

 Chakraborty, G., Garla, S., & Pagolu, M. (2013). Text Mining and Analysis: Practical Methods, Examples, and Case Studies Using SAS. Cary, N.C.: SAS Institute.

2) Respond to the discussion below in 150 words

1) Business costs or risks of poor data quality:

Poor data quality affects the whole business and every model that depends on its data. The points below are brief and appear in most articles on the topic, but they are the most significant:

· Poor data may lead to misallocation of resources, which can send a whole approach or system in the wrong direction.

· Poor data may lead to flawed strategies, which set the business up for failure.

· Poor data quality may lead decision makers to make poor decisions.

· Poor data may cause orders to go astray.

· Poor data may lead to incorrect inventory levels.

· Poor data may lead to lost sales and other missed opportunities.

· Poor data may frustrate customers and drive them away.

· It damages the reputation of the firm when data consistency is not maintained.

· It can hamper decision making and cause delays due to wrong data.

· There may be a loss of revenue, as data is the backbone of all forms of analysis and improvement.

· There may be legal issues due to poor data quality, as data must be properly stored and tracked in the required format.

· It reduces the morale and motivation of employees, who lose trust in the data shared by the company.

2) Data Mining:

Data mining is the process by which an organization takes raw data and converts it into a more useful form from which information can be extracted. Many software tools are available to convert raw data into a proper structure and then provide meaningful information. Data mining supports efficient decision making in the organization, and it has become increasingly important as more data becomes available. It is used heavily in many sectors, such as:

1. Fraud Detection

2. Market Analysis

3. Customer Retention

4. Production Control

5. Science Exploration

Data Mining Applications:

· Market Analysis and Management.

· Corporate Analysis & Risk Management.

· Fraud Detection.

3) Text Mining:

Text mining processes unstructured information, extracts meaningful numeric indices from the text, and thereby makes the information contained in the text accessible to various data mining algorithms. Text mining is also referred to as text analytics, in which high-quality information is derived from raw text. As in data mining, information is extracted, but here the source is text. It is derived mainly by studying trends and patterns in the text. Statistical pattern learning is one such method for text mining.
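A common way to obtain "numeric indices" from text is to count how often each word occurs in each document, giving one row of numbers per document. The sketch below assumes simple word counts as the index; the function `term_counts` and the two sample sentences are hypothetical, and real text mining tools typically add steps such as stop-word removal and weighting.

```python
import re
from collections import Counter


def term_counts(documents):
    """Build a sorted vocabulary and one row of word counts per document."""
    # Lowercase each document and keep runs of letters as tokens.
    tokenized = [re.findall(r"[a-z]+", doc.lower()) for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # A Counter returns 0 for words that do not occur in this document.
        rows.append([counts[word] for word in vocabulary])
    return vocabulary, rows


# Made-up documents for illustration only.
docs = ["Claim denied, claim reopened", "Warranty claim approved"]
vocab, matrix = term_counts(docs)
```

Once the text is in this numeric form, the rows can be passed to ordinary data mining methods such as clustering or classification.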

It also includes the following areas:

1. Automatic processing of messages, emails, etc.

2. Applications for Text Mining

3. Analyzing open-ended survey responses

4. Analyzing warranty or insurance claims, diagnostic interviews, etc.

5. Investigating competitors by crawling their web sites

References:

Marlman, K. S. (2002). A treatise on data quality and a structured approach for improving data quality for an air carrier surveillance program (Order No. 1409067). Available from ProQuest Dissertations & Theses Global. (230821333). Retrieved from https://0-search-proquest-com.library.acaweb.org/docview/230821333?accountid=9864

Liu, J. S. (2011). Study on the method of reducing cost of poor quality in x company (Order No. 10490273). Available from ProQuest Dissertations & Theses Global. (1874383920). Retrieved from https://0-search-proquest-com.library.acaweb.org/docview/1874383920?accountid=9864

Pollock, S. E. (2012). Data quality rules in the analytic health repository (Order No. 1516288). Available from ProQuest Dissertations & Theses Global. (1036597866). Retrieved from https://0-search-proquest-com.library.acaweb.org/docview/1036597866?accountid=9864

Shi, H. J. (2009). The research and implementation on visual data mining technology (Order No. 10525992). Available from ProQuest Dissertations & Theses Global. (1874976864). Retrieved from https://0-search-proquest-com.library.acaweb.org/docview/1874976864?accountid=9864

Li, H. (2010). Design and implementation of vocational college library knowledge services data mining (Order No. 10372376). Available from ProQuest Dissertations & Theses Global. (1868359099). Retrieved from https://0-search-proquest-com.library.acaweb.org/docview/1868359099?accountid=9864

Yang, F. (2011). The research of privacy protection in data mining (Order No. 10555140). Available from ProQuest Dissertations & Theses Global. (1875425103). Retrieved from https://0-search-proquest-com.library.acaweb.org/docview/1875425103?accountid=9864

Chen, Y. (2009). Application research of clustering algorithms in web text mining (Order No. 10399125). Available from ProQuest Dissertations & Theses Global. (1870003838). Retrieved from https://0-search-proquest-com.library.acaweb.org/docview/1870003838?accountid=9864

Hu, F. (2011). Research of semantic text mining based on ontology (Order No. 10500690). Available from ProQuest Dissertations & Theses Global. (1870405991). Retrieved from https://0-search-proquest-com.library.acaweb.org/docview/1870405991?accountid=9864

Mei, Q. (2009). Contextual text mining (Order No. 3406786). Available from ProQuest Dissertations & Theses Global. (288235375). Retrieved from https://0-search-proquest-com.library.acaweb.org/docview/288235375?accountid=9864
