| | | MAJOR LEAGUE BASEBALL |
| NAME | TEAM | POSITION | YEARS IN THE LEAGUE | AGE | HOME RUNS | HEIGHT (m) |
| Manny Machado | Baltimore Orioles | 1 | 6 | 25 | 153 | 1.91 |
| Eduardo Nunez | Boston Red Sox | 2 | 8 | 30 | 49 | 1.83 |
| Gleyber Torres | New York Yankees | 2 | 1 | 21 | 9 | 1.85 |
| Daniel Robertson | Tampa Bay Rays | 1 | 1 | 24 | 10 | 1.72 |
| Troy Tulowitzki | Toronto Blue Jays | 1 | 12 | 33 | 224 | 1.91 |
| Ozhaino Albies | Atlanta Braves | 2 | 1 | 21 | 19 | 1.73 |
| Starlin Castro | Miami Marlins | 2 | 8 | 28 | 101 | 1.84 |
| Gavin Cecchini | New York Mets | 1 | 2 | 24 | 1 | 1.88 |
| Aaron Altherr | Philadelphia Phillies | 1 | 4 | 27 | 32 | 1.95 |
| Anthony Rendon | Washington Nationals | 1 | 5 | 27 | 81 | 1.83 |
| Yolmer Sanchez | Chicago White Sox | 2 | 4 | 25 | 23 | 1.8 |
| Francisco Lindor | Cleveland Indians | 1 | 3 | 27 | 24 | 1.8 |
| Ian Kinsler | Detroit Tigers | 2 | 12 | 35 | 236 | 1.83 |
| Alcides Escobar | Kansas City Royals | 1 | 10 | 31 | 38 | 1.85 |
| Nick Gordon | Minnesota Twins | 1 | 4 | 22 | 2 | 1.74 |
| Ben Zobrist | Chicago Cubs | 2 | 12 | 37 | 159 | 1.91 |
| Eugenio Suarez | Cincinnati Reds | 1 | 4 | 26 | 71 | 1.8 |
| Josh Harrison | Pittsburgh Pirates | 1 | 7 | 30 | 45 | 1.83 |
| Jedd Gyorko | St. Louis Cardinals | 2 | 5 | 29 | 103 | 1.78 |
| Carlos Correa | Houston Astros | 1 | 6 | 23 | 74 | 1.93 |
| Jed Lowrie | Oakland Athletics | 2 | 10 | 34 | 90 | 1.83 |
| Mitch Haniger | Seattle Mariners | 1 | 2 | 27 | 32 | 1.88 |
| Rougned Odor | Texas Rangers | 2 | 4 | 24 | 89 | 1.81 |
| Nolan Arenado | Colorado Rockies | 1 | 5 | 27 | 156 | 1.88 |
| Alexi Amarista | San Diego Padres | 2 | 7 | 29 | 21 | 1.7 |
| | "1" represents "shortstop" |
| | "2" represents "second baseman" |
| A) Measures of central tendency based on the variable "age" |
| | Age |
| | Mean | 27.44 |
| | Median | 27 |
| | Mode | 27 |
| B) Measures of spread based on age |
| | Standard Deviation | 4.26 |
| | Sample Variance | 18.17 |
| | Kurtosis | -0.19 |
| | Skewness | 0.53 |
| | Range | 16 |
| | Minimum | 21 |
| | Maximum | 37 |
| | Sum | 686 |
| | Count | 25 |
| C) Summary of each position |
| | NAME | POSITION |
| | Manny Machado | 1 |
| | Eduardo Nunez | 2 |
| | Gleyber Torres | 2 |
| | Daniel Robertson | 1 |
| | Troy Tulowitzki | 1 |
| | Ozhaino Albies | 2 |
| | Starlin Castro | 2 |
| | Gavin Cecchini | 1 |
| | Aaron Altherr | 1 |
| | Anthony Rendon | 1 |
| | Yolmer Sanchez | 2 |
| | Francisco Lindor | 1 |
| | Ian Kinsler | 2 |
| | Alcides Escobar | 1 |
| | Nick Gordon | 1 |
| | Ben Zobrist | 2 |
| | Eugenio Suarez | 1 |
| | Josh Harrison | 1 |
| | Jedd Gyorko | 2 |
| | Carlos Correa | 1 |
| | Jed Lowrie | 2 |
| | Mitch Haniger | 1 |
| | Rougned Odor | 2 |
| | Nolan Arenado | 1 |
| | Alexi Amarista | 2 |
| Comparison of two positions |
| | POSITION | HOME RUNS |
| | 1 | 153 |
| | 1 | 10 |
| | 1 | 224 |
| | 1 | 1 |
| | 1 | 32 |
| | 1 | 81 |
| | 1 | 24 |
| | 1 | 38 |
| | 1 | 2 |
| | 1 | 71 |
| | 1 | 45 |
| | 1 | 74 |
| | 1 | 32 |
| | 1 | 156 |
| | 2 | 49 |
| | 2 | 9 |
| | 2 | 19 |
| | 2 | 101 |
| | 2 | 23 |
| | 2 | 236 |
| | 2 | 159 |
| | 2 | 103 |
| | 2 | 90 |
| | 2 | 89 |
| | 2 | 21 |
| D) Inferential statistics |
| | | Column 1 | Column 2 | Column 3 |
| | Column 1 | 11.48 |
| | Column 2 | 12.44 | 17.45 |
| | Column 3 | 157.59 | 169.10 | 4283.98 |
| | Column 1 - Years in the league |
| | Column 2 - Age |
| | Column 3 - Home runs |
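| The diagonal entries of the matrix above match the population variances of the three variables, so it reads as a population covariance matrix. |
| That reading is an assumption, since the table is not labelled. Under it, a minimal Python sketch that rebuilds the lower triangle from the data table is: |

```python
def population_covariance(xs, ys):
    """Average product of deviations from the two means (divides by n)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Columns copied from the data table, in the same player order.
years = [6, 8, 1, 1, 12, 1, 8, 2, 4, 5, 4, 3, 12,
         10, 4, 12, 4, 7, 5, 6, 10, 2, 4, 5, 7]
ages = [25, 30, 21, 24, 33, 21, 28, 24, 27, 27, 25, 27, 35,
        31, 22, 37, 26, 30, 29, 23, 34, 27, 24, 27, 29]
home_runs = [153, 49, 9, 10, 224, 19, 101, 1, 32, 81, 23, 24, 236,
             38, 2, 159, 71, 45, 103, 74, 90, 32, 89, 156, 21]

columns = {"years": years, "age": ages, "home_runs": home_runs}
names = list(columns)

# Print the lower triangle only, matching the layout of the table.
for i, row in enumerate(names):
    cells = [round(population_covariance(columns[row], columns[col]), 2)
             for col in names[: i + 1]]
    print(row, cells)
```

| All three covariances are positive: in this sample, older players tend to have more years in the league and more home runs. |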
| t-test based on age and years in the league |
| | t-Test: Paired Two Sample for Means |
| | | Variable 1 (Age) | Variable 2 (Years in the league) |
| | Mean | 27.44 | 5.72 |
| | Variance | 18.17 | 11.96 |
| | Observations | 25.00 | 25 |
| | Hypothesized Mean Difference | 0.00 |
| | df | 24.00 |
| | t Stat | 52.93 |
| | P(T<=t) one-tail | 0.00 |
| | t Critical one-tail | 1.71 |
| | P(T<=t) two-tail | 0.00 |
| | t Critical two-tail | 2.06 |
| E) Charts, Graphs and Tables |
| | AGE | HEIGHT (m) |
| | 21 | 1.85 |
| | 21 | 1.73 |
| | 22 | 1.74 |
| | 23 | 1.93 |
| | 24 | 1.72 |
| | 24 | 1.88 |
| | 24 | 1.81 |
| | 25 | 1.91 |
| | 25 | 1.8 |
| | 26 | 1.8 |
| | 27 | 1.95 |
| | 27 | 1.83 |
| | 27 | 1.8 |
| | 27 | 1.88 |
| | 27 | 1.88 |
| | 28 | 1.84 |
| | 29 | 1.78 |
| | 29 | 1.7 |
| | 30 | 1.83 |
| | 30 | 1.83 |
| | 31 | 1.85 |
| | 33 | 1.91 |
| | 34 | 1.83 |
| | 35 | 1.83 |
| | 37 | 1.91 |
| THE SUMMARY ANALYSIS OF THE FINDINGS |
| The analysis starts with the measures of central tendency: the mean, the median and the mode. |
| The purpose of these measures is to identify where the centre of the |
| distribution is located. Each gives a single value that is typical of the distribution. |
| They are sometimes referred to as measures of central location. They provide |
| a summary of the whole data set through its average, its middle value or its most frequent value. |
| For our data, the centre values are: |
| | Mean | 27.44 |
| | Median | 27 |
| | Mode | 27 |
| The variable used to determine these measures of central tendency was AGE. |
| The mean age is 27.44 years, |
| so the average player in the sample is about 27 years old. |
| The median age is also 27 years. This is the middle value when the ages are sorted. |
| The most frequent age, the one shared by the largest number of |
| players, is 27 years. |
| The three measures of central tendency therefore agree closely and place |
| the centre of the age data at about 27 years. |
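| As a check on the three values, a minimal Python sketch using the standard statistics module is given here. |
| The list name ages is chosen for illustration; the values are copied from the AGE column of the data table. |

```python
from statistics import mean, median, mode

# Ages of the 25 players, in the order of the data table.
ages = [25, 30, 21, 24, 33, 21, 28, 24, 27, 27, 25, 27, 35,
        31, 22, 37, 26, 30, 29, 23, 34, 27, 24, 27, 29]

mean_age = mean(ages)      # arithmetic average: sum of ages / number of players
median_age = median(ages)  # middle value of the sorted ages (13th of 25)
mode_age = mode(ages)      # age that occurs most often

print(round(mean_age, 2), median_age, mode_age)
```

| The sum of the ages is 686 and the count is 25, which gives the mean of 27.44 reported in the table. |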
| A measure of spread describes how similar or how varied the data set is with respect to a given variable. |
| Measures of spread include the range, quartiles, variance and standard deviation. |
| It is also called a measure of dispersion, as it shows how the data are dispersed for a specific |
| variable. The main reason for measuring spread is to see how it |
| relates to the measures of central tendency discussed above: |
| a measure of spread gives an idea of how well the mean |
| represents the data. For our data the variable of interest was age. Several measures |
| of spread were calculated for age. The results were as follows. |
| | Standard Deviation | 4.26 |
| | Sample Variance | 18.17 |
| | Kurtosis | -0.19 |
| | Skewness | 0.53 |
| Kurtosis measures how heavy the tails of a distribution are. It describes the shape of the distribution. A normal distribution has an excess kurtosis of zero. |
| Positive kurtosis indicates heavy tails. It is related to skewness, which is discussed below. |
| A data set with a high kurtosis value tends to have more outliers, |
| while a data set with a low kurtosis value tends to lack outliers. For our data, the value |
| of kurtosis is -0.19, which is close to zero. This suggests that the tails of the age data are |
| close to those of a normal distribution, with no pronounced outliers. |
| Skewness is a measure of symmetry or, more precisely, of the lack of symmetry. |
| If the spread of the data looks the same to the left and to the right of the centre, the data set is symmetrical. |
| Some data sets have a longer tail on one side than on the other and are therefore not symmetrical. |
| Skewness can be quantified to describe the extent to which the distribution |
| differs from a symmetrical one such as the normal distribution. |
| For our data the skewness is 0.53, a mild positive skew: the right tail, towards the older players, is somewhat longer. |
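| The reported skewness and kurtosis can be checked with a short Python sketch. It assumes the figures were produced with the |
| bias-adjusted sample formulas that spreadsheet tools commonly use; the report does not state which formulas were applied. |

```python
from statistics import mean, stdev

# Ages of the 25 players, in the order of the data table.
ages = [25, 30, 21, 24, 33, 21, 28, 24, 27, 27, 25, 27, 35,
        31, 22, 37, 26, 30, 29, 23, 34, 27, 24, 27, 29]

n = len(ages)
m = mean(ages)
s = stdev(ages)  # sample standard deviation (divides by n - 1)

# Standardised deviations from the mean.
z = [(x - m) / s for x in ages]

# Adjusted sample skewness (assumed formula).
skewness = n / ((n - 1) * (n - 2)) * sum(v ** 3 for v in z)

# Adjusted sample excess kurtosis (assumed formula); zero for a normal distribution.
kurtosis = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(v ** 4 for v in z)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

print(round(skewness, 2), round(kurtosis, 2))
```

| Under that assumption the sketch gives 0.53 and -0.19 to two decimals, the values in the table. |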
| The sample variance is the sum of the squared differences from the mean divided by one less than the number of observations. It therefore measures the |
| variation of the sample around its mean. |
| It shows how far the data values lie from the mean. For age the sample variance is 18.17, and its square root, the standard deviation, is 4.26 years. |
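| The variance, standard deviation and range for age can be recomputed with the sketch below. It writes the n - 1 formula out by hand |
| and compares it with the standard library functions; the variable names are illustrative. |

```python
from statistics import mean, stdev, variance

# Ages of the 25 players, in the order of the data table.
ages = [25, 30, 21, 24, 33, 21, 28, 24, 27, 27, 25, 27, 35,
        31, 22, 37, 26, 30, 29, 23, 34, 27, 24, 27, 29]

n = len(ages)
m = mean(ages)

# Sum of squared differences from the mean, divided by n - 1.
manual_variance = sum((x - m) ** 2 for x in ages) / (n - 1)

sample_variance = variance(ages)   # library version of the same formula
sample_sd = stdev(ages)            # square root of the sample variance
age_range = max(ages) - min(ages)  # maximum minus minimum

print(round(sample_variance, 2), round(sample_sd, 2), age_range)
```

| A standard deviation of about 4.26 years around a mean of 27.44 means most of the players fall roughly between 23 and 32 years of age. |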
| Inferential statistics draws conclusions about the entire population based on |
| a sample from that population. It allows predictions to be made from the data. |
| It also helps explain the variation in the data. Using the |
| sample, one can come up with a hypothesis and test it against the data. The t-test |
| can be used for inference, as was done with our data set. When the absolute value of the t statistic is greater than the critical t value, or equivalently when the p-value is below the chosen significance level, we |
| reject the null hypothesis. For our paired test of age against years in the league, the t statistic of 52.93 far exceeds the two-tail critical value of 2.06, so the null hypothesis of zero mean difference is rejected. |
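| The paired t statistic can be reproduced by hand, as in the sketch below. The critical value 2.06 is taken from the t-test table |
| and is not recomputed here; the variable names are illustrative. |

```python
from math import sqrt
from statistics import mean, stdev

# Columns copied from the data table, in the same player order.
ages = [25, 30, 21, 24, 33, 21, 28, 24, 27, 27, 25, 27, 35,
        31, 22, 37, 26, 30, 29, 23, 34, 27, 24, 27, 29]
years = [6, 8, 1, 1, 12, 1, 8, 2, 4, 5, 4, 3, 12,
         10, 4, 12, 4, 7, 5, 6, 10, 2, 4, 5, 7]

# A paired test works on the per-player differences.
diffs = [a - y for a, y in zip(ages, years)]
n = len(diffs)
degrees_of_freedom = n - 1

# t = mean difference / standard error of the mean difference.
t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))

# Two-tail critical value at the 5% level, read from the t-test table.
t_critical_two_tail = 2.06
reject_null = abs(t_stat) > t_critical_two_tail

print(round(t_stat, 2), degrees_of_freedom, reject_null)
```

| The decision compares the t statistic with the critical value, not with the p-value; the p-value is compared with the significance level instead. |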
| The summary tables sort the data so that it can be explained in a |
| condensed way. For instance, the first summary lists the name |
| of each player and the position he plays in his team. The data are therefore easy |
| to see and understand. The second summary sorts the data |
| by position and the number of home runs. On average, those playing as second basemen hit more home runs than those playing as shortstops. |
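| The comparison of the two positions can be made concrete with the sketch below, which takes the home runs from the |
| comparison table and computes the mean and total for each position. The list names are illustrative. |

```python
from statistics import mean

# Home runs by position, copied from the comparison table.
shortstop_hr = [153, 10, 224, 1, 32, 81, 24, 38, 2, 71, 45, 74, 32, 156]
second_base_hr = [49, 9, 19, 101, 23, 236, 159, 103, 90, 89, 21]

shortstop_mean = mean(shortstop_hr)
second_base_mean = mean(second_base_hr)

print("shortstop:", len(shortstop_hr), sum(shortstop_hr), round(shortstop_mean, 2))
print("second base:", len(second_base_hr), sum(second_base_hr), round(second_base_mean, 2))
```

| Second basemen average about 81.73 home runs against about 67.36 for shortstops. The shortstops have the larger total (943 against 899) |
| only because there are 14 of them and 11 second basemen, so the means are the fairer comparison. |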
| The graphs compare pairs of variables. |
| The first graph compares the position of the player and the years in the league. |
| The other graph compares the position of the players and the number of home runs hit. |
| It shows that players who play as second basemen are likely to hit |
| more home runs than players who play as shortstops. |
| The graphs therefore give a comparison between two variables. |
| The table above gives the age and the height of the players, |
| so the ages and heights of the players are easy to read. |
| Height and the number of home runs were chosen as the additional variables. |