BI
Sports Analytics OVERVIEW
August 2019
Welcome!
The goal of this deck is to provide faculty and students with a gentle introduction to the world of sports analytics
Our mission is to get you excited about learning analytical techniques (big data, stats, predictive modeling) so you can do class or club projects, create interesting capstone projects, and maybe leverage those things to get a job in the sports industry!
For faculty, our goal is to facilitate your use of sports examples to make your lectures even more interesting, and to encourage you to reach out to your own athletic departments as sources of projects
For students, our goal is to make learning analytics FUN!
Types of Analytics Shift from History/Stats to Future/Analytics
Just like in industry, we have two broad categories of analytics
Descriptive – looking at history
Basketball: does LeBron James usually drive left or right?
Golf: do pro players have a harder time making putts on the last hole of a tournament when it means they will be in/out of the money?
Prescriptive – what’s the future?
American Football: it’s 3rd and 4 on the opponent 20 yard line. What play has the highest likelihood of success?
Hockey: when we have a penalty play with 6 on 5, given this opponent, what combination of players will give the highest probability of success?
This slide shows the difference between Descriptive and Prescriptive analytics. Historically, sports has gathered a lot of stats on what happened, which is backward-looking. But just like in the business world, building predictive models is where the value lies. In the examples on this page, I focused on Descriptive and Prescriptive questions for Team Ops. If you want to have a discussion, ask whether students could build a similar set of questions for Business Ops. (You could also give this as a homework assignment.)
How to make sense of all the ways to catalogue the users of analytics
A Taxonomy of Sports Analytics
Start with People: Who Can Benefit from Sports Data/Analytics? How Would You Organize These?
Teams
Players
Coaches
Trainers
Leagues
Opponent Analysts
Referees
Owners
Marketing
Sales
Facility
Operations
Finance
Sports Social Media
Agents
Ticketing
Admissions
Can you think of other people in sports? E.g., ESPN broadcasters!
Equipment Managers
Scouts
Athletic Directors
Compliance Officers
Sponsors
Donor Development
The more you think about it, the more people you find who need to use data to do their daily job. Use this page to see if students can come up with examples of how each of these categories of people in the sports ecosystem uses data. Note that most of these positions have analogues in every sport, so their use of analytics is a transferable skill.
Taxonomy
We will focus on the use of analytics at the college level in a typical athletic department. Many of these users of analytics have direct analogues at the pros.
The categories map to particular people/jobs:
Athletic Directors – administrative analytics
Business Operations – ticket sales and financial concerns like scholarships and donors
Team Operations – coaches, recruiters, medical staff, trainers
League Commissioners – scheduling, tournament design
6 Areas Where Analytics Can Drive Better Decisions
School
Front Office
Back Office
Conference
Fan / Donor Analytics
Team Tactics Analytics
Health/Safety Analytics
Roster/Recruiting Analytics
Athletic Department Administration Analytics
League Analytics
Most Teams Divide the Analytics Users Into 2 Buckets
FRONT OFFICE
The people who are customer-facing, where customers could be fans, donors, business box suite buyers, and media
Other Front Office people are those who worry about the money and affordability. Typical sports organizations face cost constraints imposed by leagues (pro drafts) or compliance and scholarship rules set by the NCAA
BACK OFFICE
You will immediately think of coaches, but there are also recruiters/scouts, training and medical staff
The recruiting function spans both areas, actually – coaches/scouts and financials
In addition, there is the Office of the Athletic Director with administrative analytics for oversight functions, plus the League Commissioners
Sample Decisions: Front Office
Ticket pricing
Season ticket holder retention
Fan engagement
Sponsorship valuation
Tools:
Customer relationship management
Digital marketing
Market research
Data visualization
Sample Decisions – Team Operations
3 Focus Areas: Player acquisition, development, deployment
Who to recruit? Who to cut? Who to trade?
What combinations of players have the highest chance of success?
What plays to run under what situations?
What weaknesses do the opponents have, and how can we exploit them?
What training approaches work best? How to develop athletes?
How to keep them healthy?
Think “360 Degree View of the Athlete”
Also think about the athlete’s sequence of events, from being recruited to graduating or being cut
Same idea as the previous slide, except now the focus is on Team Operations.
A key point to make is that these are not “new” decisions. They have always been made, but often without the benefit of data or deeper insights.
Examples
Later in this section, you’ll find a tab for each of these functions with a longer PowerPoint deck that walks through various examples
In the rest of this deck, we’ll give you a sampler of analytics for the Front and Back Office. Most of these projects involve:
Data Visualization, Dashboards
Predictive Models, often Regressions to pick out key Factors
Statistics
In a later deck, we will show you several Student Projects that have been done in various parts of the Taxonomy
Fan / Donor Analytics
Dashboards for the Front Office
From the vendor JumpForward; there are many others!
Source: http://www.jumpforward.com/predictive-analytics-college-athletic-departments/#ad
Pricing in Sports - Previous Work on Single Tickets: Major League Baseball – Dynamic Pricing
Early study: San Francisco Giants data for 12 games in 2010
10 of 29 regression model factors influence ticket prices:
Team Performance: Home Team Performance - Past 10 Games; Opponent Made Playoffs Previous Year; Opponent from Same Division
Individual Player Reputations: Which Pitcher? What’s His Earned Run Average? Number of All Stars on Opponent’s Roster
Seat Location
Time-Related Variables: Game Start Time; Part of the Season; Days Before the Game
© 2016 Teradata
NBA Season Ticket - Retention Model Factors?
What are the positive and negative factors that drive an NBA fan with a season ticket to renew?
Positive Correlations (+)
Tenure
Attendance
Discounts vs. Cost on Secondary Market
Team Performance
Negative Factors (-)
Business Account
Number of Seats
Distance
Time of Game, Day of Week
Source: MIT SSAC 2016 Conference, Presentation Notes – Talk by Matt Wolf
February 2018
Donor Study – thesis at Bryant University
The analysis includes statistical models intended to identify which characteristics make an
individual likely to transition from non-donor to donor status, what ask techniques are most
successful for a philanthropic campaign, which individuals are most likely to provide large
donations, and which individuals will give consecutive gifts over several years. Statistical
modeling builds on current research within the field of university development office data
mining; it serves as an evaluation of several studies that indicate that a negative growth rate in
giving occurs around the retirement age; this does not appear to be the case at this particular
institution. In addition, it builds upon evidence suggesting which majors at predominantly
business colleges have the strongest likelihood of providing large gifts to their alma mater.
http://digitalcommons.bryant.edu/cgi/viewcontent.cgi?article=1003&context=honors_mathematics
Donor Study – Thesis at JMU
http://commons.lib.jmu.edu/cgi/viewcontent.cgi?article=1118&context=diss201019
Sample Factors – which drive donations?
Attitudes towards universities as charities
Demographics: income, degree, SAT Scores
Loans vs. scholarships
Satisfaction with student experience
Participation while on campus
Participation as an alum in events
Donations
Avoidance behaviors
Channel effectiveness
Customizing communications
Recruiting Analytics
Recruiting Tool - FrontRush
Football Recruiting
This was a video created a while ago as a mockup of the kinds of tests and data that could be used to recruit wide receivers in football
Thanks to the Bryant University football coach, staff, and players for playing the actors in this!
WATCH:
https://www.teradatauniversitynetwork.com/About-Us/Whats-New/BSI-Sports-Analytics-Precision-Football
Team Tactics Analytics
Basketball: Shot Heat Map for LeBron James
Individual Athlete Analysis in Sports
Source: Grantland, The Shape of Cavs to Come, http://grantland.com/the-triangle/lebron-james-kevin-love-kyrie-irving-cavs-offense
See also http://grantland.com/features/how-lebron-james-transformed-game-become-highly-efficient-scoring-machine/
This visual from Grantland (a key publication that covers basketball) shows what you can do with the information from each game. It is a shot chart that shows, for a particular player, the spots on the court from which he shoots. You can see that LeBron James (who plays for the Cleveland Cavaliers) often shoots the ball near the basket (at the bottom of the picture). It is also noteworthy that he tends to “go left” when he dribbles down the court, which may be valuable information for a defender, especially if the percentage of shots he makes on the left (41%) is higher than when he is on the right (32%).
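If students want to see how a left vs. right comparison like this is computed, a minimal sketch follows. The shot log is invented for illustration (it is not LeBron James's data), and the function name `shooting_pct_by_side` is our own.

```python
from collections import defaultdict

# Hypothetical shot log, invented for illustration: (side of court, shot made?)
shots = [
    ("left", True), ("left", False), ("left", True), ("left", False), ("left", False),
    ("right", False), ("right", True), ("right", False), ("right", False),
]


def shooting_pct_by_side(shot_log):
    """Return the fraction of shots made for each side of the court."""
    made = defaultdict(int)
    attempts = defaultdict(int)
    for side, was_made in shot_log:
        attempts[side] += 1
        made[side] += int(was_made)
    return {side: made[side] / attempts[side] for side in attempts}


pct = shooting_pct_by_side(shots)
for side, value in pct.items():
    print(f"{side}: {value:.0%} of shots made")
```

A real shot chart does the same counting per court zone rather than per side, then colors each zone by the result.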
Great Machine Learning Application to Basketball: Second Spectrum (LA-Based)
Sources: https://www.ted.com/talks/rajiv_maheswaran_the_math_behind_basketball_s_wildest_moves and http://www.secondspectrum.com/
American Football – A Predictive Model
Field Goal Kicking – What Factors Matter?
MIT Sloan Sports Analytics Conference, February 2013
Research Paper Track
See also the 25-minute video on YouTube at: https://www.youtube.com/watch?v=scEhJdBytu8
Going for 3 Points – Which Factors Matter? Binomial Logistic Regression Model for All FGs in 2000-2011 Seasons
11,896 attempts, 3,410 games, 51 stadiums
Environmental Factors
distance: YES!
cold temperature (<50): YES!
field surface (artificial): YES!
altitude (>4000ft): YES!
precipitation: YES!
wind (>10mph): YES!
humidity (>60%): NOPE!
Psychological Factors
regular vs. postseason: NOPE!
situational pressure: NOPE!
home/away: NOPE!
whether kicker was iced: NOPE!
Source: ://www.sloansportsconference.com/wp-content/uploads/2013/Going%20for%20Three%20Predicting%20the%20Likelihood%20of%20Field%20Goal%20Success%20with%20Logistic%20Regression.pdf
Video http://www.sloansportsconference.com/?p=10200
This example shows how an analytic model was built to determine which of many factors drive the success of field goal kicking. Should a team go for the 3-point field goal, and what does that depend on? Distance from the goalposts is obvious, but do environmental factors like temperature and weather (rain or snow) matter? How about whether they are playing on grass or artificial turf? At an elevation like “Mile High” Denver vs. Miami? Does wind matter? And what about the psychological factors, like whether it is a regular season game or a playoff game? Does home vs. away field matter? Crowd noise? Does “icing the kicker” (taking a timeout so that the field goal kicker has to “think about” the kick) matter? This is an animated slide: you can have some fun asking the students to guess which factors predict field goal success, then show them the (hopefully surprising) result of the model. The last page shows in a very practical way that the decision a coach makes depends on the stadium; some are notoriously hard or easy with regard to field goals.
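For a class exercise, the kind of model described here can be sketched in a few lines of Python. This is a minimal illustration, not the paper's model or data: the kicks are simulated, the "true" coefficients inside `simulate_kicks` are invented, only two factors (distance and a cold-weather flag) are included, and the fit uses plain gradient descent rather than a statistics package. All function names are our own.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def features(distance, cold):
    # Intercept, distance centred at 40 yards in units of 10 yards, cold flag.
    return (1.0, (distance - 40.0) / 10.0, float(cold))


def simulate_kicks(n, seed=0):
    """Return n synthetic kicks as (distance_yards, cold, made) tuples."""
    rng = random.Random(seed)
    kicks = []
    for _ in range(n):
        distance = rng.uniform(18.0, 60.0)
        cold = 1 if rng.random() < 0.3 else 0
        # Invented "true" relationship, for illustration only.
        p_make = sigmoid(1.5 - 1.0 * (distance - 40.0) / 10.0 - 0.3 * cold)
        made = 1 if rng.random() < p_make else 0
        kicks.append((distance, cold, made))
    return kicks


def fit_logistic(kicks, steps=400, lr=0.5):
    """Fit three weights by batch gradient descent on the log loss."""
    rows = [(features(d, c), made) for d, c, made in kicks]
    w = [0.0, 0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for x, made in rows:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - made
            for j in range(3):
                grad[j] += err * x[j]
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w


def predict(w, distance, cold):
    """Estimated probability that a kick from this distance is good."""
    x = features(distance, cold)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


kicks = simulate_kicks(800)
weights = fit_logistic(kicks)
print("weights (intercept, per 10 yards, cold):", [round(v, 2) for v in weights])
print("P(make) from 25 yards, warm:", round(predict(weights, 25, 0), 2))
print("P(make) from 55 yards, cold:", round(predict(weights, 55, 1), 2))
```

The sign and size of each fitted weight is what tells you whether a factor "matters"; a real study would also report significance tests, which this sketch leaves out.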
Visual Occlusion Training for Baseball Peter Fadde - Southern Illinois University
GameSense Sports research:
Pick up cues in the pitcher's delivery
Tell the type of pitch and where it will end up over the plate
Decide early whether to swing or adjust the swing
Results: a 10-17% improvement; used by a Minor League team in summer 2017 and by a Florida camp in 2019
YouTube: https://www.youtube.com/watch?v=wvjas1oS5zg
Vimeo: https://vimeo.com/78455467
Source: https://www.sporttechie.com/gamesense-sports-pitch-recognition-video-occlusion/
Visualizing Soccer Plays Disney Research
Source: http://www.sloansportsconference.com/wp-content/uploads/2015/02/SSAC15-RP-Finalist-Quality-vs-Quantity.pdf
See also http://www.sloansportsconference.com/wp-content/uploads/2014/02/2014_SSAC_Win-at-Home-Draw-Away.pdf
Why is the penalty kick success rate so high?
Are corner kicks higher-payoff than open play in terms of goals scored?
This slide shows the results from a Disney Research paper presented at the MIT Sloan Sports Analytics Conference. Note that these visuals on different play types are quite similar to what could be done for football or basketball. You can assign the paper (available online) as a reading assignment if you want.
Health and Safety Analytics
Custom Sports Fueling (Hydration) - Gatorade
| Situation | Hydration needed for athlete health and performance; various training and game situations, e.g., high humidity and heat in Florida, cold in North Dakota |
| Problem | Athletes are not all equal; perspiration rates differ (sweat tests); electrolyte needs differ |
| Solution | Customized hydration system: Gatorade Gx; LED cap, names on bottles, 9 different mix pods; control system so trainers/coaches can see who is hydrating correctly, with text messages for problems; hydration reminders for athletes |
| Impact | Used by Brazil National Soccer team, World Cup; at Rio Olympics: 3 golds, 1 silver – beach volleyball; in trials at KC Chiefs, Denver Broncos, Florida high schools; seeing fewer soft tissue injuries |
http://gatorade.newsmarket.com/news-announcements/gatorade-gx-sports-fuel-customization/s/eb039b87-e5f8-4fe8-a706-340147b6c652
February 2017
Company: Large wireless provider
Situation: Customer attrition in the wireless industry is approximately 2.0% per month. With a customer base of 40 million customers and an average monthly bill of $54, this adds up to a loss of over $43 million per month.
Problem: Attrition models are able to identify who’s likely to leave; however, a model is only as good as the data available. The model was built on usage data, but it could not effectively predict which customers were at risk of leaving. Each 1% improvement in the model translates to identifying 8,000 customers whose behavior you can influence.
Solutions: By consolidating the data into one centralized warehouse, the analyst was able to use demographic data to improve the models by 2%, then incorporate customer satisfaction data to boost the model by 10%, and finally use web log data to identify who was shopping for new plans, which is a great indicator of churn. Web log data boosted the model by 10-12%, for a total improvement of 25%. By identifying these customers in advance and providing rules that identify the likely reason for churn, they were able to take action to alter behavior.
Impact:
Churn rate was 30% higher in the control group
Contract renewal was 59% higher in the treated group
Lowest churn rate of all wireless providers: < 1.5%
Source: restricted. No public use of customer name
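The dollar figures in this case study follow from simple arithmetic, sketched below. The variable names are our own, and reading "each 1% improvement" as 1% of the customers lost each month is an assumption; it happens to reproduce the 8,000 figure quoted above.

```python
# Figures taken from the case study text above.
customer_base = 40_000_000       # customers
monthly_attrition_pct = 2        # 2.0% of customers leave each month
avg_monthly_bill = 54            # dollars per customer per month

# Customers lost each month, and the monthly billing that leaves with them.
lost_customers = customer_base * monthly_attrition_pct // 100
lost_monthly_revenue = lost_customers * avg_monthly_bill

# Assumption: a "1% improvement" means correctly flagging 1% more of the
# customers who are about to leave.
customers_per_model_point = lost_customers // 100

print(f"Customers lost per month: {lost_customers:,}")
print(f"Monthly billing lost:     ${lost_monthly_revenue:,}")
print(f"Customers per 1% gain:    {customers_per_model_point:,}")
```

This gives 800,000 customers and $43.2 million per month, consistent with the "over $43 million" in the Situation statement.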
Sensors at University of Arkansas
Reporter Roland Liwag sports articles
Athlete Moses Kingsley story
Source: http://www.arkansasrazorbacks.com/superhumans-among-us-fastest40/
Moses the Athlete Vs. Roland the Reporter – Workout Comparison Using Zephyr Vest
The Zephyr measures two sets of data points, internal and external loads:
Internal load stats include heart rate, respiration, and core temperature, while
External loads measured include peak acceleration, activity level, and forward, backward and lateral (side-to-side) movement.
Ideally, Razorback coaches want to see low internal and high external loads.
Think about it in terms of performance; you want to get the most return from running and jumping up-and-down on a basketball court while spending the least amount of energy and effort.
Source: http://www.arkansasrazorbacks.com/superhumans-among-us-fastest40/
Moses Kingsley vs. Reporter Roland – No Contest
Source: http://www.arkansasrazorbacks.com/superhumans-among-us-fastest40/
The Effects of Sleep Extension on the Athletic Performance of Collegiate Basketball Players
11 healthy students on the Stanford University men's varsity basketball team
Subjects maintained their habitual sleep-wake schedule for a 2–4 week baseline, followed by a 5–7 week sleep extension period with a minimum goal of 10 hours in bed every 24 hours
Measures of athletic performance specific to basketball were recorded after every practice including a timed sprint and shooting accuracy.
Source: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3119836/
Stanford Findings
Subjects demonstrated a faster timed sprint following sleep extension
16.2 ± 0.61 sec at baseline vs.
15.5 ± 0.54 sec at end of sleep extension, (P < 0.001)
Shooting accuracy improved, with
free throw percentage increasing by 9% and
3-point field goal percentage increasing by 9.2% (P < 0.001)
Source: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3119836/
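To put the sprint result in relative terms, the short calculation below uses only the two mean times reported above. The variable names are our own, and the calculation ignores the ± spread around each mean.

```python
# Mean timed-sprint results reported in the study (seconds).
baseline_sprint = 16.2
extended_sleep_sprint = 15.5

seconds_saved = baseline_sprint - extended_sleep_sprint
percent_faster = seconds_saved / baseline_sprint * 100

print(f"Seconds saved: {seconds_saved:.1f}")
print(f"Relative improvement: {percent_faster:.1f}%")
```

That is 0.7 seconds, or roughly a 4.3% faster sprint, which students can compare with the 9% and 9.2% gains in shooting accuracy.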
A Few Reading Suggestions
Best Book I’ve Found – Across Many Sports
From AMAZON.com:
In Scorecasting, University of Chicago behavioral economist Tobias Moskowitz teams up with veteran Sports Illustrated writer L. Jon Wertheim to overturn some of the most cherished truisms of sports, and reveal the hidden forces that shape how basketball, baseball, football, and hockey games are played, won and lost.
Source: http://www.amazon.com/Scorecasting-Hidden-Influences-Behind-Sports/dp/0307591808
You may want to buy these books, or recommend that students buy and share them. I’ve already recommended the Scorecasting book.
Best Book on American Football
I’m giving this out as Christmas gifts
Amazing Book - Just On Soccer
From AMAZON.com:
Moneyball meets Freakonomics in this myth-busting guide to understanding—and winning—the most popular sport on the planet - now with a new afterword on the 2014 World Cup! Innovation is coming to soccer, and at the center of it all are the numbers—a way of thinking about the game that ignores the obvious in favor of how things actually are. In The Numbers Game, Chris Anderson, a former professional goalkeeper turned soccer statistics guru, teams up with behavioral analyst David Sally to uncover the numbers that really matter when it comes to predicting a winner. Investigating basic but profound questions—How valuable are corners? Which goal matters most? Is possession really nine-tenths of the law? How should a player’s value be judged?—they deliver an incisive, revolutionary new way of watching and understanding soccer.
Source: https://www.amazon.com/Numbers-Game-Everything-About-Soccer/dp/014314560/
I know that many international students are interested in soccer, and this book is fascinating. One of the authors, Chris Anderson, has presented often at MIT events. You might ask the students to Google him to find out more.
New Book – The Next Generation of Moneyball
The original Moneyball book (and movie) showed how the Oakland A’s got a competitive edge in recruiting by using analytics to find new factors, not used by other teams, that could predict success.
This new book looks at how to improve players you have already recruited by using data to train them better.
Wrap Up
After reading this deck, you should be familiar with
At least 6 areas where analytics can help make better sports decisions
Examples that show how analytics can help the Front Office
Examples that show how analytics can help the Back Office
A few books that will take your learning to a deeper level
Thank you.
©2018 Teradata
©2019 Teradata