Project One: Data Visualization, Descriptive Statistics, Confidence Intervals

This notebook contains the step-by-step directions for Project One. Run the steps in order, because some steps depend on the outputs of earlier steps. Once you have completed the steps in this notebook, be sure to write your summary report.

You are a data analyst for a basketball team and have access to a large set of historical data that you can use to analyze performance patterns. The coach of the team and your management have requested that you use descriptive statistics and data visualization techniques to study the distributions of key performance metrics included in the data set. The results of these analyses will inform key decisions to improve the performance of the team. You will use the Python programming language to perform the statistical analyses and then prepare a report of your findings to present to the team’s management. Since the managers are not data analysts, you will need to interpret your findings and describe their practical implications.

There are four important variables in the data set that you will study in Project One.

Variable   What does it represent?
---------  ----------------------------------------------------------------
pts        Points scored by the team in a game
elo_n      A measure of the relative skill level of the team in the league
year_id    Year when the team played the games
fran_id    Name of the NBA team

The ELO rating, represented by the variable elo_n, is used as a measure of the relative skill of a team. This measure is inferred based on the final score of a game, the game location, and the outcome of the game relative to the probability of that outcome. The higher the number, the higher the relative skill of a team.
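To make "the probability of that outcome" concrete, here is a minimal sketch of the standard Elo expected-score formula. This notebook does not state the exact variant behind elo_n (for example, how game location enters), so the function name `elo_win_probability` and the 400-point scale are assumptions taken from the textbook Elo definition, not from the data set's documentation.

```python
def elo_win_probability(team_elo: float, opp_elo: float, scale: float = 400.0) -> float:
    """Textbook Elo expected score for the first team.

    Assumption: plain Elo with a 400-point scale and no adjustment
    for game location or margin of victory.
    """
    return 1.0 / (1.0 + 10.0 ** ((opp_elo - team_elo) / scale))


# Illustrative inputs only: two ratings that appear in the Step 1 output.
p_team = elo_win_probability(1598.2924, 1531.7449)
p_opp = elo_win_probability(1531.7449, 1598.2924)

print(f"Higher-rated side: {p_team:.3f}, lower-rated side: {p_opp:.3f}")
```

Under this formula, two teams with equal ratings each have a probability of 0.5, and the two probabilities always sum to 1.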

In addition to studying data on your own team, your management has assigned you a second team so that you can compare its performance with your own team's.

Team           What does it represent?
-------------  ----------------------------------------------------------------
Your Team      This is the team that has hired you as an analyst. This is the team that you will pick below. See Step 2.
Assigned Team  This is the team that the management has assigned to you to compare against your team. See Step 1.

Reminder: It may be beneficial to review the summary report template for Project One prior to starting this Python script. That will give you an idea of the questions you will need to answer with the outputs of this script.

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Step 1: Data Preparation & the Assigned Team

   game_id       year_id  fran_id  pts  opp_pts  elo_n      opp_elo_n  game_location  game_result
0  199511030CHI  1996     Bulls    105  91       1598.2924  1531.7449  H              W
1  199511040CHI  1996     Bulls    107  85       1604.3940  1458.6415  H              W
2  199511070CHI  1996     Bulls    117  108      1605.7983  1310.9349  H              W
3  199511090CLE  1996     Bulls    106  88       1618.8701  1452.8268  A              W
4  199511110CHI  1996     Bulls    110  106      1621.1591  1490.2861  H              W

printed only the first five observations...

Number of rows in the data set = 246
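The selection behind this output can be sketched as a filter on team name and year range. The version below is a standard-library illustration, not the notebook's own code: the helper `select_team_games` is hypothetical, the rows are the five observations printed above (with only some columns kept), and the 1996 to 1998 range is the one named in Step 7.

```python
from typing import Dict, List

# The five Step 1 observations printed above (subset of columns).
ROWS: List[Dict] = [
    {"game_id": "199511030CHI", "year_id": 1996, "fran_id": "Bulls", "pts": 105, "elo_n": 1598.2924},
    {"game_id": "199511040CHI", "year_id": 1996, "fran_id": "Bulls", "pts": 107, "elo_n": 1604.3940},
    {"game_id": "199511070CHI", "year_id": 1996, "fran_id": "Bulls", "pts": 117, "elo_n": 1605.7983},
    {"game_id": "199511090CLE", "year_id": 1996, "fran_id": "Bulls", "pts": 106, "elo_n": 1618.8701},
    {"game_id": "199511110CHI", "year_id": 1996, "fran_id": "Bulls", "pts": 110, "elo_n": 1621.1591},
]


def select_team_games(rows: List[Dict], team: str, first_year: int, last_year: int) -> List[Dict]:
    """Keep rows for one franchise whose year_id falls in [first_year, last_year]."""
    return [
        row for row in rows
        if row["fran_id"] == team and first_year <= row["year_id"] <= last_year
    ]


bulls_games = select_team_games(ROWS, "Bulls", 1996, 1998)
print("Number of rows in the sample =", len(bulls_games))
```

Applied to the full data set, the same filter would produce the 246 rows reported above; with only the five printed rows as input, it returns five.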

Step 2: Pick Your Team

   game_id       year_id  fran_id  pts  opp_pts  elo_n      opp_elo_n  game_location  game_result
0  201211020ATL  2013     Hawks    102  109      1532.7664  1524.9491  H              L
1  201211040OKC  2013     Hawks    104  95       1551.4714  1640.7040  A              W
2  201211070ATL  2013     Hawks    89   86       1555.2542  1551.0842  H              W
3  201211090ATL  2013     Hawks    89   95       1547.6481  1667.3300  H              L
4  201211110LAC  2013     Hawks    76   89       1540.6207  1587.7803  A              L

printed only the first five observations...

Number of rows in the data set = 246

Step 3: Data Visualization: Points Scored by Your Team

Step 4: Data Visualization: Points Scored by the Assigned Team

Step 5: Data Visualization: Comparing the Two Teams
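Steps 3 to 5 look at the distribution of points scored by each team. A histogram of that kind rests on bin counts, which can be sketched without a plotting library. The helper `bin_counts` and the 10-point bin width are assumptions made for illustration, and the inputs are only the five pts values printed for each team in Steps 1 and 2, so the counts say nothing about the full 246-game distributions.

```python
from collections import Counter
from typing import Dict, List


def bin_counts(values: List[int], bin_width: int = 10) -> Dict[int, int]:
    """Count values per bin; each key is the lower edge of a bin of width bin_width."""
    counts = Counter((value // bin_width) * bin_width for value in values)
    return dict(sorted(counts.items()))


# pts values from the first five observations printed for each team.
hawks_pts = [102, 104, 89, 89, 76]
bulls_pts = [105, 107, 117, 106, 110]

for name, pts in (("Hawks", hawks_pts), ("Bulls", bulls_pts)):
    for lower_edge, count in bin_counts(pts).items():
        # One '#' per game gives a rough text histogram.
        print(f"{name:5s} {lower_edge:3d}-{lower_edge + 9:3d} {'#' * count}")
```

Placing both teams' counts on the same bin edges is what makes a side-by-side comparison readable.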

Step 6: Descriptive Statistics: Relative Skill of Your Team

Your Team's Relative Skill in 2013 to 2015

-------------------------------------------------------

Mean = 1539.22

Median = 1513.5

Variance = 4917.03

Standard Deviation = 70.12
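The four statistics above can be reproduced in outline with Python's `statistics` module. This is a sketch under assumptions: the input is only the five elo_n values printed in Step 2, so its results differ from the full-sample figures, and it uses the sample variance (divisor n - 1), which this notebook's output does not confirm.

```python
import statistics
from typing import Dict, List


def describe(values: List[float]) -> Dict[str, float]:
    """Mean, median, sample variance and sample standard deviation."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "variance": statistics.variance(values),  # assumption: divisor n - 1
        "stdev": statistics.stdev(values),        # square root of the variance above
    }


# elo_n values from the five Hawks observations printed in Step 2.
hawks_elo = [1532.7664, 1551.4714, 1555.2542, 1547.6481, 1540.6207]
summary = describe(hawks_elo)

for label, value in summary.items():
    print(f"{label} = {round(value, 2)}")
```

As a consistency check on the reported figures, the standard deviation should be the square root of the variance: 70.12 squared is about 4916.8, which agrees with 4917.03 up to rounding.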

Step 7: Descriptive Statistics: Relative Skill of the Assigned Team

Assigned Team's Relative Skill in 1996 to 1998

-------------------------------------------------------

Mean = 1739.8

Median = 1751.23

Variance = 2651.55

Standard Deviation = 51.49

Step 8: Confidence Intervals for the Average Relative Skill of All Teams in Your Team's Years

Confidence Interval for Average Relative Skill in the years 2013 to 2015

------------------------------------------------------------------------------------------------------------

95% confidence interval (unrounded) for Average Relative Skill (ELO) in the years 2013 to 2015 = (1502.0236894390478, 1507.1824625533618)

95% confidence interval (rounded) for Average Relative Skill (ELO) in the years 2013 to 2015 = ( 1502.02 , 1507.18 )
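An interval of this form is typically the sample mean plus or minus a critical value times the standard error. The sketch below assumes a normal (z) critical value and a standard error of s divided by the square root of n. The notebook's output does not say which method it used, and the helper `mean_confidence_interval` is hypothetical. The second half only unpacks the reported interval into its midpoint and margin of error.

```python
import math
import statistics
from statistics import NormalDist
from typing import List, Tuple


def mean_confidence_interval(values: List[float], confidence: float = 0.95) -> Tuple[float, float]:
    """Normal-approximation confidence interval for a mean (assumed method)."""
    mean = statistics.mean(values)
    stderr = statistics.stdev(values) / math.sqrt(len(values))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return (mean - z * stderr, mean + z * stderr)


# Unpack the interval reported above into its center and half-width.
reported = (1502.0236894390478, 1507.1824625533618)
center = (reported[0] + reported[1]) / 2
margin = (reported[1] - reported[0]) / 2

print("center =", round(center, 2))
print("margin of error =", round(margin, 2))

# Tiny made-up sample, only to show the function running.
low, high = mean_confidence_interval([1.0, 2.0, 3.0, 4.0, 5.0])
```

The midpoint of a symmetric interval is the point estimate of the mean, so the reported bounds imply an average ELO of roughly 1504.6 with a margin of about 2.58.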

Probability a team has Average Relative Skill LESS than the Average Relative Skill (ELO) of your team in the years 2013 to 2015

----------------------------------------------------------------------------------------------------------------------------------------------------------

Which of the two choices is correct?

Choice 1 = 0.3797

Choice 2 = 0.6203
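The two choices are complements (0.3797 + 0.6203 = 1), so only the direction matters. The sketch below assumes ELO ratings follow a normal model centered at the midpoint of the Step 8 interval. The standard deviation of that model is not printed in this notebook, so the trial values are hypothetical; the point is that any positive spread gives the same side of 0.5.

```python
from statistics import NormalDist


def prob_below(value: float, mean: float, stdev: float) -> float:
    """P(X < value) for X ~ Normal(mean, stdev)."""
    return NormalDist(mean, stdev).cdf(value)


team_mean = 1539.22  # Step 6 mean for your team
league_mean = (1502.0236894390478 + 1507.1824625533618) / 2  # midpoint of the Step 8 interval
choices = {"Choice 1": 0.3797, "Choice 2": 0.6203}

# Hypothetical spreads: the probability stays above 0.5 for each of them.
trial_probs = [prob_below(team_mean, league_mean, stdev) for stdev in (50.0, 100.0, 150.0)]

# Keep the choice that falls on the same side of 0.5 as the team mean does.
consistent = [
    name for name, p in choices.items()
    if (p > 0.5) == (team_mean > league_mean)
]
print(consistent)
```

Because your team's mean of 1539.22 lies above the league average of about 1504.6, the probability of a rating below it must exceed 0.5, which points to Choice 2 = 0.6203.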

Step 9: Confidence Intervals for the Average Relative Skill of All Teams in the Assigned Team's Years

Confidence Interval for Average Relative Skill in the years 1996 to 1998

------------------------------------------------------------------------------------------------------------

95% confidence interval (unrounded) for Average Relative Skill (ELO) in the years 1996 to 1998 = (1487.6565859527095, 1493.6465501840999)

95% confidence interval (rounded) for Average Relative Skill (ELO) in the years 1996 to 1998 = ( 1487.66 , 1493.65 )

Probability a team has Average Relative Skill LESS than the Average Relative Skill (ELO) of Bulls in the years 1996 to 1998

----------------------------------------------------------------------------------------------------------------------------------------------------------

Answer = 0.9732