research paper
7/25/2020 Originality Report
https://ucumberlands.blackboard.com/webapps/mdb-sa-BB5a31b16bb2c48/originalityReport/ultra?course_id=_116145_1&includeDeleted=true&attem… 1/2
SafeAssign Originality Report Summer 2020 - Intro to Data Mining (ITS-632-06) - Second Bi-Term • Midterm Project • Submitted on Sat, Jul 25, 2020, 5:46 PM
Deepak Kedasi
INCLUDED SOURCES
Sources
Institutional database (2) %100
Student paper
Student paper
Internet (1) %0
Top sources
Attachment 2 midterm project.docx
%100
Analysis Tools
Data Analytic Tools
Midterm project
Deepak Kedasi
ITS-632-06
University of The Cumberlands
Introduction: Analytical tools are most commonly used to describe a data set through point estimation, decision trees, regression, and many other methods. The objective of any statistical or business analytical tool is to describe and analyze a data set and to extract meaningful information from the raw data. Based on this information, an organization can improve its efficiency and performance and draw logical conclusions. This paper works with one such data set: a comparison among attributes of the HCAHPS survey. The data set contains a list of state averages for HCAHPS responses. HCAHPS is a national, standardized survey of hospital patients about their experiences during a recent inpatient hospital stay.
History of Tool
There are many analytical tools for presenting and describing data. It is difficult to shortlist the most important ones because each tool has its own specifications and demand. I will mention 10 well-known tools that are in common use:
i) R programming
ii) Excel
iii) Tableau Public
iv) Python
v) SAS
vi) Apache Spark
vii) RapidMiner
viii) KNIME
ix) QlikView
x) Splunk
There are many more, but the tools listed above are very commonly used. I will use Excel for the analysis in this paper because I am very familiar with it and have a strong command of it for interpreting data. From its first version, Excel has supported end-user programming of macros and user-defined functions. In early versions of Excel, these programs were written in a macro language whose statements had formula syntax and resided in the cells of special-purpose macro sheets. Excel is one of the most fundamental and widely used analytical tools in nearly all businesses and organizations. Even if you are an expert in the other analytical tools listed above, you will still need to use Excel.
Review of the Data
The data, retrieved from https://healthdata.gov/dataset/patient-survey-pch-hcahps-pps-exempt-cancer-hospital-%E2%80%93-state, is the list of state averages for HCAHPS responses. The data set has seven columns (variables):
i) State
ii) HCAHPS Measure ID
iii) HCAHPS Question
iv) HCAHPS Answer Description
v) HCAHPS Answer Percent
vi) Measure Start Date
vii) Measure End Date
All of the variables in the data set are qualitative except HCAHPS Answer Percent. I will therefore analyze HCAHPS Answer Percent in this paper, because the rest of the variables are qualitative and subjective. A description of each variable is given in the Excel sheet. By analyzing and examining the raw data, we can draw logical conclusions or even compare, contrast, or rank hospitals or establishments on a specified attribute. Evaluating patient experience through the attributes that shape the hospital environment is very important for the growth and development of any hospital or organization. Descriptive statistical measures are among the most effective ways to examine these attributes properly. To name some, one needs to employ measures of central tendency, measures of variability and position, estimation, and even correlation. Once the data are gathered and analyzed, one will know which attributes patients give the most importance and which they give the least. This paper will focus on estimation methods and a histogram of the variable HCAHPS Answer Percent. There are two types of estimation: point estimation and interval estimation (Walpole, 1982).
Exploring the Data with the tool
The descriptive summary of the variable HCAHPS Answer Percent is given below:
HCAHPS Answer Percent
Mean 34.48
Standard Error 0.73
Median 20
Mode 5
Standard Deviation 28.80
Sample Variance 829.49
Kurtosis -1.30
Skewness 0.58
Range 89
Minimum 2
Maximum 91
Sum 53000
Count 1537
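The figures in the summary table can be cross-checked against one another. The sketch below is only a consistency check on the reported values, not the Excel procedure used for the paper; the variable names are illustrative, and the standard deviation is entered as the rounded value 28.80, so the recomputed variance is approximate.

```python
import math

# Values copied from the descriptive summary table.
total = 53000          # Sum
count = 1537           # Count
std_dev = 28.80        # Standard Deviation (already rounded to two decimals)
minimum, maximum = 2, 91

mean = total / count                     # Mean = Sum / Count
std_error = std_dev / math.sqrt(count)   # Standard Error = SD / sqrt(n)
variance = std_dev ** 2                  # approximate, because SD is rounded
value_range = maximum - minimum          # Range = Maximum - Minimum

print(f"mean           = {mean:.2f}")
print(f"standard error = {std_error:.2f}")
print(f"variance       = {variance:.2f}")
print(f"range          = {value_range}")
```

The recomputed mean, standard error, and range agree with the table to two decimals; the variance differs slightly from 829.49 only because it is derived from the rounded standard deviation.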
The table shows important information about hospital patients' experiences during a recent inpatient hospital stay. The average value of HCAHPS Answer Percent is 34.48, while the median is 20. The skewness is 0.58, which indicates a moderate right skew, consistent with the mean exceeding the median. I treat the data as approximately normally distributed, and for normally distributed data the mean is the best point estimate, so the mean is used as the point estimate for this variable. The standard deviation is 28.80, which shows that the data vary widely around the average. The minimum value is 2 and the maximum is 91, so the range, the difference between the maximum and minimum values, is 91 - 2 = 89.
Graphical representation of the variable HCAHPS Answer Percent. Before making the histogram, I have to calculate the number of classes and the class width. There are 1537 observations in total. The number of classes is calculated using a formula in which n represents the number of classes. I will create 10 classes. The class width is calculated next.
I will choose the width = 10
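The choice of width = 10 can be illustrated with a short sketch. It assumes the common rule of dividing the range by the planned number of classes and rounding up to a convenient value; the source does not show the formula actually used, so this rule and the names below are assumptions.

```python
import math

minimum, maximum = 2, 91    # from the descriptive summary
planned_classes = 10        # number of classes chosen in the text

# Assumed rule: raw width = range / number of classes.
raw_width = (maximum - minimum) / planned_classes      # 8.9

# Round up to the next multiple of 10 for convenient class limits.
width = math.ceil(raw_width / 10) * 10

# Class boundaries starting at the minimum value.
n_bins = math.ceil((maximum - minimum) / width)
edges = [minimum + i * width for i in range(n_bins + 1)]

print("width:", width)
print("edges:", edges)
```

With a width of 10 starting at the minimum of 2, nine classes (2-12 through 82-92) are enough to cover the maximum of 91, which matches the classes reported in the frequency data for the histogram.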
The histogram shows that the data are skewed to the right. Most HCAHPS Answer Percent values lie in the range 2 to 22, and the fewest lie in the range 32 to 52. Confidence interval: the 95% confidence interval for HCAHPS Answer Percent is (33.04, 35.92).
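The 95% confidence interval can be reproduced from the summary statistics. The sketch below assumes the large-sample normal approximation, mean ± z × (standard deviation / √n); the paper does not show its calculation, and Excel's own confidence output may use the t distribution, which gives nearly the same result for n = 1537.

```python
import math
from statistics import NormalDist

# Summary statistics reported for HCAHPS Answer Percent.
mean = 34.48
std_dev = 28.80
n = 1537
confidence = 0.95

# Two-sided critical value of the standard normal distribution (about 1.96).
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

std_error = std_dev / math.sqrt(n)   # standard error of the mean
margin = z * std_error               # margin of error

lower, upper = mean - margin, mean + margin
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Rounded to two decimals, this agrees with the interval (33.04, 35.92) reported in the conclusion.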
Conclusion: The point estimate of the mean of HCAHPS Answer Percent is 34.48, the median is 20, and the standard deviation is 28.80. The histogram shows that the data are right-skewed, and the 95% confidence interval for HCAHPS Answer Percent is (33.04, 35.92).
References
Walpole, R. (1982). Introduction to Statistics (3rd ed.). Prentice Hall.
Downie, N. M., & Heath, R. W. (1965). Basic Statistical Methods (2nd ed.). Harper & Row.
Reid, H. (2013, August). Introduction to Statistics. SAGE Publications.
https://healthdata.gov/dataset/patient-survey-pch-hcahps-pps-exempt-cancer-hospital-%E2%80%93-state
HCAHPS Answer Percent: histogram data (horizontal axis: Classes; vertical axis: Frequency)
Class    Frequency    Cumulative %
2-12     447          29.08%
12-22    389          54.39%
22-32    109          61.48%
32-42    43           64.28%
42-52    46           67.27%
52-62    76           72.22%
62-72    163          82.82%
72-82    181          94.60%
82-92    83           100.00%
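The cumulative percentages in the histogram data follow directly from the class frequencies. A minimal sketch, using the frequencies reported for the nine classes (variable names are illustrative):

```python
from itertools import accumulate

classes = ["2-12", "12-22", "22-32", "32-42", "42-52",
           "52-62", "62-72", "72-82", "82-92"]
frequencies = [447, 389, 109, 43, 46, 76, 163, 181, 83]

total = sum(frequencies)   # should equal the observation count, 1537

# Running total of frequencies divided by the overall total.
cumulative = [running / total for running in accumulate(frequencies)]

for label, freq, cum in zip(classes, frequencies, cumulative):
    print(f"{label:>6}  {freq:>4}  {cum:7.2%}")
```

The frequencies sum to 1537, the number of observations in the descriptive summary, and the first two classes together hold a little over half of all values.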
Word Count: 1,076
Submitted on: 07/25/20
Submission UUID: c6b3f91f-8364-5a12-4765-d6e5855763a2
Attachment UUID: d0b45159-f1cf-cc70-8fb7-c1ed27cc499b