
Module 4: Milestone

Option #1: Credit Data Mining

Prinu Pappachan

Colorado State University Global

Course Code: MIS510-1

Eric Straw

Summary

I chose the credit data mining option because of its complexity. The credit card industry is growing very rapidly, and the volume of data collected from consumers to establish their credit score or standing is enormous. Many models are used to evaluate an applicant's credit score accurately, such as the credit classification model. Because this is a topic I have been learning about in this course, I have chosen this option.

Project Outline

I will follow the textbook process for data mining, which includes 4-7 steps:

1. Business Understanding - First, I will establish the objective of the exercise by listing the components that are crucial to this project, with a clear view of the end goal. I will do this by providing background on the business domain and a detailed explanation of the business need. This phase will also determine the data mining goal and what success would look like.

2. Obtaining the data and understanding it - I will then explore all the data at my disposal and evaluate its quality. I will do this by first gathering the data and then defining each attribute. This is where the semantic layer is built. Once the data is staged, I will explore and verify it.

3. Data Preparation - This phase requires data curation, integration, and formatting. Here I will determine whether any new data attributes are needed, and I will also reduce the dimension of the data.

4. Modeling - In this phase, I will determine the best approach for the analysis and the model to use, so that by the end of the phase I know which algorithm will be applied. The data needs to be partitioned into training, validation, and test sets. After the partitioning comes the model-building phase. Once the model is built, it will be assessed.

5. Evaluation - Here I will evaluate the results, review the process, summarize the findings, and make adjustments to ensure that the model meets the business success criteria. This is also where I will determine the next steps.
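As a minimal sketch of the partitioning in step 4 and the assessment in step 5, the Python example below shuffles a set of records into training, validation, and test sets and then summarizes predictions for a two-class outcome. The 50/30/20 split, the "good"/"bad" labels, the made-up applicant records, and the function names are all assumptions for illustration only. They are not taken from the textbook or from the data set that will eventually be used.

```python
import random


def partition(records, train_frac=0.5, valid_frac=0.3, seed=1):
    """Shuffle records and split them into training, validation, and test sets.

    The fractions are illustrative assumptions; whatever remains after the
    training and validation shares becomes the test set.
    """
    if not 0 < train_frac + valid_frac < 1:
        raise ValueError("train_frac + valid_frac must leave room for a test set")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_train = int(len(shuffled) * train_frac)
    n_valid = int(len(shuffled) * valid_frac)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


def confusion_counts(actual, predicted, positive="good"):
    """Count true/false positives and negatives for a two-class outcome."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def accuracy(counts):
    """Share of records classified correctly."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else 0.0


# Hypothetical applicants: (applicant_id, credit_rating)
applicants = [(i, "good" if i % 3 else "bad") for i in range(100)]
train, valid, test = partition(applicants)
print(len(train), len(valid), len(test))
```

The model would be fitted on the training set, tuned or compared on the validation set, and assessed once on the test set, so that the final accuracy figure comes from records the model has not seen.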

Question:

Could you please confirm which data set I can use for this project?

Reference material to be used

Galit Shmueli & Peter C. Bruce & Inbal Yahav & Nitin R. Patel & Kenneth C. Lichtendahl Jr. (2018). Data Mining for Business Analytics. Chapter 2, Section 2.3: Overview of the Data Mining Process, pp. 18-21.

David L. Olson & Bongsug (Kevin) Chae. (2020). Direct marketing decision support through predictive customer response modeling. http://cbafiles.unl.edu/public/cbainternal/facStaffUploads/DSS20121.pdf

T. I. Lytvynenko. (2016). Problem of Data Analysis and Forecasting Using Decision Trees Method. http://ceur-ws.org/Vol-1631/220-226.pdf

Ali Aouad & Adam N. Elmachtoub & Kris J. Ferreira & Ryan McNellis. (2020). Market Segmentation Trees. https://www.hbs.edu/faculty/Publication%20Files/Market%20Segmentation%20Trees%20(003)_9d76802b-5ca5-4039-a35b-e1df3a23b3bc.pdf

Sohrabi, Babak; Raeesi Vanani, Iman; Nasiri, Narges; Ghassemi Rudd, Armin. In Tourism Management Perspectives. July 2020

Olson, David L. & Chae, Bongsug (Kevin). In Decision Support Systems. December 2012

Wang, Chong; Zhao, Shuai; Kalra, Achir; Borcea, Cristian; Chen, Yi. Journal of the Association for Information Science & Technology. Aug 2018, Vol. 69 Issue 8, p1007-1022. 16p. 1 Diagram, 6 Charts, 12 Graphs.

Eric T. Bradlow; Alan M. Zaslavsky. In: Journal of the American Statistical Association. 94(445):43-52; American Statistical Association, 1999

Dean Abbott.(2014). Applied Predictive Analytics. Principles and techniques for the professional data analyst.