Unit VII Research Paper Research Methods
Sun Coast Data
Jovan J Maires
Columbia Southern University
Unit IV Scholarly Activity
Dr. Renee Norris-Jones
15 September 2022
Data Analysis: Descriptive Statistics and Assumption Testing
This paper uses descriptive statistics to describe the Sun Coast Remediation data. The purpose is to determine whether the assumptions required for parametric statistical procedures are met.
Correlation: Descriptive Statistics and Assumption Testing
Frequency Distribution Table
| PM size | Frequency |
|---|---|
| 0-1 | 8 |
| 2-4 | 24 |
| 5-7 | 37 |
| 8-10 | 34 |

| Sick days | Frequency |
|---|---|
| 0-2 | 1 |
| 4-7 | 61 |
| 8-9 | 30 |
| 10-12 | 11 |
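A frequency distribution like the ones above can be built by tallying raw values into class intervals. The sketch below assumes inclusive integer intervals taken from the PM size table; the function name `frequency_table` and the sample readings are hypothetical and are not the Sun Coast data.

```python
from collections import Counter

# Class intervals copied from the PM size table (inclusive bounds).
PM_BINS = [(0, 1), (2, 4), (5, 7), (8, 10)]


def frequency_table(values, bins):
    """Count how many values fall in each inclusive (low, high) interval."""
    counts = Counter()
    for v in values:
        for low, high in bins:
            if low <= v <= high:
                counts[(low, high)] += 1
                break
    # Report every interval, including empty ones, in the original order.
    return [(f"{low}-{high}", counts[(low, high)]) for low, high in bins]


# Hypothetical readings for illustration only; NOT the Sun Coast data.
sample_pm = [0, 1, 3, 3, 4, 6, 6, 7, 9, 10]
table = frequency_table(sample_pm, PM_BINS)
for label, freq in table:
    print(f"{label:>5}  {freq}")
```

Values that fall outside every interval are simply not counted in this sketch, so the frequencies should be checked against the sample size.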
Histogram
[Figure: histograms of PM size (bins 1, 4, 7, 10, More) and annual sick days (bins 2, 7, 9, 12, More); vertical axis: Frequency.]
Descriptive Statistics Table
Measurement Scale
Measurement scales are the levels at which investigators measure the variables they analyze. They are crucial in research and statistics because the level of measurement determines which data analysis technique can be used.
In this particular instance, the measurement scale is ordinal.
The ordinal scale reports the order and rank of data without establishing the size of the differences between them; "ordinal" denotes "order." Ordinal data are a type of categorical (qualitative) data whose categories can be named, grouped, and ranked.
Measure of Central Tendency
A measure of central tendency is a single statistic that represents the center, or typical value, of a dataset. Examples include the mode, median, arithmetic mean, and geometric mean, among others.
The mean is a measure of central tendency that takes the average of all values in a data set. The median is better than the mean for data from skewed distributions because it is not influenced by extremely large values (Mondal et al., 2022).
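The effect of an extreme value on the mean and the median can be illustrated with a small example. The numbers below are hypothetical and chosen only to show the contrast; they are not taken from the study data.

```python
import statistics

# Hypothetical sample (illustration only, not the study data).
typical = [4, 5, 5, 6, 7]
with_outlier = typical + [60]  # add one extremely large value

mean_before = statistics.mean(typical)
median_before = statistics.median(typical)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(f"mean:   {mean_before} -> {mean_after}")
print(f"median: {median_before} -> {median_after}")
```

In this sketch the single large value moves the mean far more than it moves the median, which is the behavior described above.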
A test of normality for continuous data is an important step in choosing measures of central tendency and statistical methods for data analysis. When the data are normally distributed, parametric tests are used to compare groups; otherwise, nonparametric methods are used.
Skewness and Kurtosis
The skewness of the distribution is near zero, which implies that the values are approximately symmetric about the mean rather than skewed to one side. The kurtosis is less than three, implying a flatter peak and thinner tails than a normal distribution.
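As a minimal sketch of how these two statistics can be computed, the functions below use the bias-adjusted sample formulas that spreadsheet descriptive-statistics tools commonly report. Whether the statistics in this paper were produced with these exact formulas is an assumption. Under this convention kurtosis is reported as excess kurtosis, so a normal distribution scores near 0 rather than 3. The function names and sample values are hypothetical.

```python
import math


def _mean_and_sd(xs):
    """Return the mean and the sample standard deviation (n - 1 divisor)."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return mean, sd


def sample_skewness(xs):
    """Bias-adjusted sample skewness; requires at least 3 values."""
    n = len(xs)
    mean, sd = _mean_and_sd(xs)
    cubed = sum(((x - mean) / sd) ** 3 for x in xs)
    return n / ((n - 1) * (n - 2)) * cubed


def sample_excess_kurtosis(xs):
    """Bias-adjusted excess kurtosis (normal is about 0); requires at least 4 values."""
    n = len(xs)
    mean, sd = _mean_and_sd(xs)
    fourth = sum(((x - mean) / sd) ** 4 for x in xs)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction


# Hypothetical samples for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 10]

print(sample_skewness(symmetric), sample_excess_kurtosis(symmetric))
print(sample_skewness(right_skewed))
```

A skewness near zero points to a roughly symmetric distribution, a positive value to a longer right tail, and a negative value to a longer left tail.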
Evaluation
Considering the descriptive statistics above, the data appear to be normally distributed, indicating that the assumptions for parametric statistical testing have been met.
Simple Regression: Descriptive Statistics and Assumption Testing
Frequency Distribution Table
| Expenditure | Frequency |
|---|---|
| 20-500 | 108 |
| 501-1000 | 76 |
| 1001-1500 | 27 |
| 1501-2000 | 11 |
| 2001-2500 | 1 |

| Time | Frequency |
|---|---|
| 0-50 | 6 |
| 51-100 | 26 |
| 101-200 | 98 |
| 201-300 | 85 |
| 301-400 | 8 |
Histogram
[Figure: histograms of Training Expenditure (bins 500, 1000, 1500, 2000, 2275, More) and Time (bins 50, 100, 200, 300, 400, More); vertical axis: Frequency.]
Descriptive Statistics Table
| Statistic | Safety training expenditure | Lost time hours |
|---|---|---|
| Mean | 595.9843812 | 188.0045 |
| Standard Error | 31.4770075 | 4.803089 |
| Median | 507.772 | 190 |
| Mode | 234 | 190 |
| Standard Deviation | 470.0519613 | 71.72542 |
| Sample Variance | 220948.8463 | 5144.536 |
| Kurtosis | 0.444080195 | -0.50122 |
| Skewness | 0.951331922 | -0.08198 |
| Range | 2251.404 | 350 |
| Minimum | 20.456 | 10 |
| Maximum | 2271.86 | 360 |
| Sum | 132904.517 | 41925 |
| Count | 223 | 223 |
| Largest (1) | 2271.86 | 360 |
| Smallest (1) | 20.456 | 10 |
| Confidence Level (95.0%) | 62.03197147 | 9.465484 |
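The entries in the descriptive statistics above are related to one another: the standard error is the standard deviation divided by the square root of the count, and the sample variance is the square of the standard deviation. The short check below recomputes both from the reported values; the variable names are illustrative only.

```python
import math

# Values copied from the descriptive statistics table above.
n = 223
sd = {
    "safety training expenditure": 470.0519613,
    "lost time hours": 71.72542,
}
reported_se = {
    "safety training expenditure": 31.4770075,
    "lost time hours": 4.803089,
}
reported_var = {
    "safety training expenditure": 220948.8463,
    "lost time hours": 5144.536,
}

computed_se = {name: s / math.sqrt(n) for name, s in sd.items()}
computed_var = {name: s ** 2 for name, s in sd.items()}

for name in sd:
    print(f"{name}: SE {computed_se[name]:.6f} (reported {reported_se[name]}), "
          f"variance {computed_var[name]:.4f} (reported {reported_var[name]})")
```

Agreement between the recomputed and reported figures is a quick way to confirm that a table was transcribed correctly.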
Measurement Scale
A Nominal Scale is a measurement scale for which figures are only used as labels to distinguish or categorize an object.
Measure of Central Tendency
In a skewed distribution the mean is pulled toward the longer tail and no longer sits at the center, so the median is frequently chosen as the preferable measure of central tendency. When the tail on the right side of the distribution is longer than the tail on the left, the distribution is said to be positively, or right, skewed.
Skewness and Kurtosis
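Considering the skewness and kurtosis values, the safety training expenditure distribution (skewness 0.95, kurtosis 0.44) has a long tail on the right side, indicating that it is right skewed (Braden & Matis, 2022). The lost time hours distribution (skewness -0.08, kurtosis -0.50) is close to symmetric.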
Evaluation
The data analysis indicates that the data have a similar extent of variance, and hence the assumption for parametric tests is met.
Multiple Regression: Descriptive Statistics and Assumption Testing
Frequency Distribution Table
| Decibel | Frequency |
|---|---|
| 107-111 | 51 |
| 112-116 | 126 |
| 117-121 | 249 |
| 122-131 | 786 |
| 132-141 | 287 |
Histogram
[Figure: histogram "Sound Level"; horizontal axis: Decibels (bins 106, 111, 116, 121, 131, 141, More); vertical axis: Frequency (0 to 1000).]
Descriptive Statistics Table
| Statistic | Decibel |
|---|---|
| Mean | 124.8359 |
| Standard Error | 0.177945 |
| Median | 125.721 |
| Mode | 127.315 |
| Standard Deviation | 6.898657 |
| Sample Variance | 47.59146 |
| Kurtosis | -0.31419 |
| Skewness | -0.41895 |
| Range | 37.607 |
| Minimum | 103.38 |
| Maximum | 140.987 |
| Sum | 187628.4 |
| Count | 1503 |
Measurement Scale
Interval
The interval scale is a quantitative measurement scale with order, meaningful and equal differences between values, and an arbitrary zero point. It measures variables at regular intervals along a common scale.
Measure of Central Tendency
Mean
The mean is a summary metric that describes an entire set of data with a single value corresponding to the middle or center of the distribution.
Testing for normality is a crucial step in choosing the statistical techniques and measures of central tendency for data analysis. Parametric tests are employed when the data have a normal distribution; otherwise, nonparametric techniques are used to compare the groups (Owuor et al., 2022).
Skewness and Kurtosis
The negative skewness (-0.42) indicates that the data are skewed to the left: the left tail is long relative to the right tail. The negative kurtosis (-0.31) indicates that the distribution is slightly flatter than a normal curve with the same mean and standard deviation.
Evaluation
The assumptions are met since the mean is the measure of central tendency implying that the distribution is not affected by extreme values.
Independent Samples t Test: Descriptive Statistics and Assumption Testing
Frequency Distribution Table
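| Prior training | Frequency |
|---|---|
| 49-60 | 12 |
| 61-70 | 20 |
| 71-80 | 21 |
| 81-90 | 8 |
| 91-100 | 1 |

| Revised training | Frequency |
|---|---|
| 74-80 | 14 |
| 81-85 | 21 |
| 86-90 | 19 |
| 91-95 | 6 |
| 96-100 | 2 |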
Histogram
[Figure: histogram of Training (bins 80, 85, 90, 95, 100, More); vertical axis: Frequency.]
Descriptive Statistics Table
| Statistic | Prior training | Revised training |
|---|---|---|
| Mean | 69.79032 | 84.77419 |
| Standard Error | 1.402788 | 0.659479 |
| Median | 70 | 85 |
| Mode | 80 | 85 |
| Standard Deviation | 11.04556 | 5.192742 |
| Sample Variance | 122.0045 | 26.96457 |
| Kurtosis | -0.77668 | -0.35254 |
| Skewness | -0.0868 | 0.144085 |
| Range | 41 | 22 |
| Minimum | 50 | 75 |
| Maximum | 91 | 97 |
| Sum | 4327 | 5256 |
| Count | 62 | 62 |
| Largest (1) | 91 | 97 |
| Smallest (1) | 50 | 75 |
| Confidence Level (95.0%) | 2.805048 | 1.31871 |
Measurement Scale
Interval: the distance between any two values on the scale is meaningful.
Measure of Central Tendency
Mean. This measure represents the center of the distribution of the data, indicating that the assumption for parametric testing is met (Eini & Khaloozadeh, 2022).
Skewness and Kurtosis
The values of skewness (-0.09 and 0.14) and kurtosis (-0.78 and -0.35) are close to zero, indicating that the data are approximately normally distributed.
Evaluation
The measure of central tendency is the mean, which represents the center of the distribution of the data, indicating that the assumption for parametric testing is met.
Dependent Samples (Paired-Samples) t Test: Descriptive Statistics and Assumption Testing
Frequency Distribution Table
| Pre-exposure | Frequency |
|---|---|
| 5-15 | 5 |
| 16-25 | 8 |
| 26-35 | 12 |
| 36-45 | 16 |
| 46-56 | 8 |

| Post-exposure | Frequency |
|---|---|
| 5-15 | 5 |
| 16-25 | 8 |
| 26-35 | 11 |
| 36-45 | 17 |
| 46-56 | 8 |
Histogram
[Figure: histogram of exposure (bins 15, 25, 35, 45, 56, More); vertical axis: Frequency.]
Descriptive Statistics Table
| Statistic | Pre-exposure μg/dL | Post-exposure μg/dL |
|---|---|---|
| Mean | 32.8571429 | 33.28571 |
| Standard Error | 1.75230655 | 1.781423 |
| Median | 35 | 36 |
| Mode | 36 | 38 |
| Standard Deviation | 12.2661458 | 12.46996 |
| Sample Variance | 150.458333 | 155.5 |
| Kurtosis | 0.57603713 | -0.65421 |
| Skewness | -0.42510965 | -0.48363 |
| Range | 50 | 50 |
| Minimum | 6 | 6 |
| Maximum | 56 | 56 |
| Sum | 1610 | 1631 |
| Count | 49 | 49 |
| Largest (1) | 56 | 56 |
| Smallest (1) | 6 | 6 |
| Confidence Level (95.0%) | 3.52324845 | 3.581792 |
Measurement Scale
In this case, the interval scale is applied as the measurement scale because it makes it possible to assign a numerical value to any given assessment.
Measure of Central Tendency
The mean is used as the measure of central tendency.
Skewness and Kurtosis
The pre-exposure distribution is somewhat peaked, with a positive reported kurtosis (about 0.58), while the post-exposure distribution is flatter, with a negative kurtosis (about -0.65). Both skewness values are negative (about -0.43 and -0.48), and in both samples the mean lies below the median, so the longer tail and any outliers fall to the left of the mean.
Evaluation
Because the mean reflects the center of the distribution of the data, the assumption for parametric testing is taken to be met.
ANOVA: Descriptive Statistics and Assumption Testing
Frequency Distribution Table
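| Air | Frequency |
|---|---|
| 1-3 | 1 |
| 4-6 | 4 |
| 7-9 | 6 |
| 10-12 | 7 |
| 12-15 | 2 |

| Soil | Frequency |
|---|---|
| 5-7 | 3 |
| 8-10 | 13 |
| 10-13 | 4 |

| Water | Frequency |
|---|---|
| 1-3 | 1 |
| 4-6 | 10 |
| 7-9 | 5 |
| 10-12 | 4 |

| Training | Frequency |
|---|---|
| 1-3 | 1 |
| 4-6 | 16 |
| 7-9 | 3 |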
Histogram
[Figure: four histograms, each with Frequency on the vertical axis. Histogram Air (bins 3, 6, 9, 12, 15, More); Histogram Soil (bins 7, 10, 13, More); Histogram Water (bins 3, 6, 9, 12, More); Histogram Training (bins 3, 6, 9, More).]
Descriptive Statistics Table
| Statistic | A = Air | B = Soil | C = Water | D = Training |
|---|---|---|---|---|
| Mean | 8.9 | 9.1 | 7 | 5.4 |
| Standard Error | 0.684028 | 0.390007 | 0.575829 | 0.265568 |
| Median | 9 | 9 | 6 | 5 |
| Mode | 11 | 8 | 6 | 5 |
| Standard Deviation | 3.059068 | 1.744163 | 2.575185 | 1.187656 |
| Sample Variance | 9.357895 | 3.042105 | 6.631579 | 1.410526 |
| Kurtosis | -0.6283 | 0.11923 | -0.23752 | 0.253747 |
| Skewness | -0.36085 | 0.4922 | 0.76026 | 0.159183 |
| Range | 11 | 7 | 9 | 5 |
| Minimum | 3 | 6 | 3 | 3 |
| Maximum | 14 | 13 | 12 | 8 |
| Sum | 178 | 182 | 140 | 108 |
| Count | 20 | 20 | 20 | 20 |
| Largest (1) | 14 | 13 | 12 | 8 |
| Smallest (1) | 3 | 6 | 3 | 3 |
| Confidence Level (95.0%) | 1.431688 | 0.816294 | 1.205224 | 0.55584 |
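The group means in the descriptive statistics above follow directly from each group's sum and count, and the standard deviations give a rough view of how spread differs across the four groups. The sketch below recomputes both from the reported values. The ratio of the largest to the smallest group variance is shown only as one informal way of looking at the equal-variance assumption; no cutoff is implied, and the variable names are illustrative.

```python
# Summary values copied from the ANOVA descriptive statistics above.
groups = {
    "Air": {"sum": 178, "count": 20, "sd": 3.059068},
    "Soil": {"sum": 182, "count": 20, "sd": 1.744163},
    "Water": {"sum": 140, "count": 20, "sd": 2.575185},
    "Training": {"sum": 108, "count": 20, "sd": 1.187656},
}

# Mean = sum / count for each group.
means = {name: g["sum"] / g["count"] for name, g in groups.items()}

# Sample variance = standard deviation squared.
variances = {name: g["sd"] ** 2 for name, g in groups.items()}

# Informal spread comparison: largest variance over smallest variance.
variance_ratio = max(variances.values()) / min(variances.values())

for name in groups:
    print(f"{name:<9} mean {means[name]:.1f}  variance {variances[name]:.4f}")
print(f"largest / smallest variance: {variance_ratio:.2f}")
```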
Measurement Scale
Ratio: the measurement scale is quantitative in nature and has a true zero, so it enables comparison of both the intervals and the ratios between values.
Measure of Central Tendency
Mean: the measure of central tendency is the mean, since the analysis requires numerical scores on a ratio scale.
Skewness and Kurtosis
The skewness and kurtosis values in this case are within the acceptable range, and hence they represent an approximately normal distribution.
Evaluation
The assumptions for parametric testing have been met since the measure of central tendency is the mean (Kashlak et al., 2022).
References
Braden, P., & Matis, T. (2022). Cornish–Fisher-Based Control Charts Inclusive of Skewness and Kurtosis Measures for Monitoring the Mean of a Process. Symmetry, 14(6), 1176.
Eini, E. J., & Khaloozadeh, H. (2022). Tail variance for generalized skew-elliptical distributions. Communications in Statistics-Theory and Methods, 51(2), 519-536.
Kashlak, A. B., Myroshnychenko, S., & Spektor, S. (2022). Analytic Permutation Testing for functional data ANOVA. Journal of Computational and Graphical Statistics, 1-10.
Mondal, H., Swain, S. M., & Mondal, S. (2022). How to conduct descriptive statistics online: A brief hands-on guide for biomedical researchers. Indian Journal of Vascular and Endovascular Surgery, 9(1), 70.
Owuor, O. S., Benedict, T. J., & Kevin, O. O. (2022). Outlier Detection Technique for Univariate Normal Datasets. American Journal of Theoretical and Applied Statistics, 11(1), 1-12.