Draw conclusions in R using a provided dataset
ST 625: Fall 2023 Prof. Cherveny
Individual Data Assignment Due: Thursday, Dec 7, 11:59pm
Goals
1. Demonstrate mastery of ST625 content
2. Communicate results of a statistical analysis to a general audience
Description
You’ll prepare a report for your client (played by me). The project description is on the next page. The primary components of your report should be:
1. Abstract: A brief statement (few sentences at most) summarizing the purpose of the report as well as the results and what they mean in substantive rather than statistical terms. Be brief and to the point, stimulating your reader’s interest. You essentially have 15 seconds to let the reader know what’s in the report and if they should read it.
2. Introduction: Give background and motivate the question to be investigated. Introduce the data, perhaps with visualization or descriptive statistics, but only if you think it significantly adds to the narrative. A reader could jump from here to Results. While a formal research study has explicit hypotheses, you likely won’t have any, but at least try to suggest which variables in what form you expect to be important predictors.
3. Methodology: Describe the approach used to analyze the problem while keeping in mind that your reader likely never took or doesn’t remember ST625. What was your approach? Why is your approach appropriate? What are the assumptions? Are there any concerns about these assumptions?
4. Results: Present the recommendation simply and clearly. Use graphical display, table, and discussion as you see appropriate. You are providing an answer to the question outlined in the introduction.
5. Conclusion: Briefly summarize everything. Does the model make sense? Do the predictors seem reasonable? What does it all mean? Suggestions for further analysis or other data might be appropriate.
Guidelines
• The report must be typed and self-contained, with equations or plots incorporated.
• While there is no length requirement, your report will likely be a few pages. Significantly shorter is probably not thorough, while much longer is likely too wordy and contains irrelevant information... your client will not appreciate either of these!
• Your report should be statistically correct, professional, engaging, efficient, and written at the correct level. Correct level means that your client with only a general education should be able to follow it. I’m not interested in how much statistics you can show off, and in fact doing so will hurt your grade. In your professional life, you’ll need to get the statistics right on your own and communicate the results to people who aren’t statistics professors!
• On that note, undigested computer output is most definitely not appropriate.
• When writing the methodology, please focus on what will interest your reader or be important to them. Many issues you can gloss over or dismiss with a single sentence if there was nothing extraordinary. You certainly should not give a detailed rundown of every single model you tried.
• Any plots should be fully labeled and also referenced in the report.
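As an illustration of the labeling guideline, here is a minimal sketch of a fully labeled plot in base R. It uses simulated values rather than the BIKESHARE data, and the figure title, variable names, and rental counts are all hypothetical.

```r
# Simulated hourly rentals (NOT the BIKESHARE data); values are illustrative only
set.seed(2)
hr <- rep(6:23, each = 5)
rentals <- rpois(length(hr), lambda = 40 + 30 * sin((hr - 6) / 17 * pi))

# Average rentals for each hour of the day
mean_by_hr <- tapply(rentals, hr, mean)

# A title, both axis labels, and units make the plot readable on its own
plot(as.integer(names(mean_by_hr)), mean_by_hr, type = "b",
     main = "Figure 1: Average rentals by hour of day (simulated)",
     xlab = "Hour of day (6 = 6am-7am, 23 = 11pm-12am)",
     ylab = "Average rentals per hour")
```

Numbering the figure in its title makes it easy to reference from the report text.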
Further Comments
• There is no “correct” final model or a minimum R². Rather, there are things you should check and try, and things you should uncover.
• This project involves no outside research on bike sharing, weather, Washington DC, etc. Just do as much as you can with the dataset provided and the context.
• I’m always willing to give general advice, but in the interest of time I won’t be reading rough drafts from everybody. It takes forever! I’m also not going to say what to try, or if your model is “good enough”. It’s on you to decide if it’s good enough!
• You’ll turn in the report by uploading a Word .doc file to Blackboard.
• This is an individual assignment. Submitting your report is affirmation that you alone did the work.
INCLUDE A SCREENSHOT OF THE FINAL MODEL(S) SUMMARY IN AN APPENDIX
The Problem: Bike Sharing
Bike sharing is a 21st century take on traditional bike rental in which the whole process from membership to rental to return has been automated. Through these systems, a user is able to rent a bike from a particular location and return the bike at another point easily. Understanding the factors influencing demand for bikes is important for addressing traffic, environmental and health issues. Fortunately, because the system is fully automated, extensive data about each trip is readily available.
Capital Bikeshare is metro DC’s bike sharing system (operated by the same parent company as Boston’s Bluebikes). With more than 4,500 bikes available 24/7 from 500+ stations across seven jurisdictions of Washington DC, Capital Bikeshare offers a large bike sharing network for the nation’s capital. Users rent bikes either for a single 30-minute ride or a 24-hour period (casual), or subscribe to a 30-day or annual membership plan (registered).
The data set BIKESHARE contains 752 one-hour observations of the following variables:
• casual: Number of unregistered/non-subscription rentals
• registered: Number of subscription rentals
• season: Season
• hr: Hour (6 to 23, where 6 means 6am-7am and 23 means 11pm-12am)
• holiday: Federal holiday or DC Emancipation Day (1 is yes and 0 is no)
• day: Day of the week (1 to 7, where 1 is Sunday and 7 is Saturday)
• weather: Weather situation, labelled by
    1 = Clear or Partly cloudy
    2 = Cloudy and/or Mist
    3 = Light Rain or Light Snow
    4 = Heavy Rain or Heavy Snow or Ice Pellets
• temp: Temperature (Fahrenheit)
• feelslike: “Feels like” temperature (Fahrenheit)
• hum: Humidity (percent)
• windspeed: Windspeed (miles per hour)
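Several of these variables are numeric codes for categories, so it may help to store them as factors before modeling. The sketch below shows one way to do that in R, using a small synthetic data frame with the same column names. The object name `bikes`, the short level labels, and all values are hypothetical; the real data would presumably come from `load("BIKESHARE.Rdata")`, whose object name is not stated here. The coding of season is not given, so it is left out.

```r
# Synthetic stand-in with some of the BIKESHARE column names (values are made up)
set.seed(1)
n <- 50
bikes <- data.frame(
  casual     = rpois(n, 30),
  registered = rpois(n, 120),
  hr         = sample(6:23, n, replace = TRUE),
  holiday    = sample(0:1, n, replace = TRUE),
  day        = sample(1:7, n, replace = TRUE),
  weather    = sample(1:4, n, replace = TRUE),
  temp       = runif(n, 20, 95)
)

# Convert coded categories to factors so lm() creates dummy variables for them
bikes$holiday <- factor(bikes$holiday, levels = 0:1, labels = c("no", "yes"))
bikes$day <- factor(bikes$day, levels = 1:7,
                    labels = c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"))
bikes$weather <- factor(bikes$weather, levels = 1:4,
                        labels = c("clear", "mist", "light", "heavy"))

# Inspect the resulting column types
str(bikes)
```

Whether hr is better treated as a number or as a factor is a modeling choice this sketch leaves open.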
Assignment
Using the range of models, tools, and techniques studied in ST625, build and present two models: one for predicting casual bike rentals using the independent variables, and another for predicting registered bike rentals. Your ultimate goal is to explain the factors that influence bike demand, and as such your model/variable selection should be based both on context and statistics. Model interpretation will be a very important part of your report! What influences demand for bikes, and how, and to the extent plausible why?
You should at a minimum perform linear regression using all the available independent variables as well as consider some types of complex models (terms that are higher-order, interaction, dummy), then perform variable selection/compare models. And to be very clear, casual should not be a variable used to predict registered, nor vice-versa.
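The workflow described above (a full linear model, a model with higher-order, interaction, and dummy terms, then selection and comparison) could be sketched in R roughly as follows. This is only an illustration on simulated data: the data frame `sim`, its coefficients, and the particular terms chosen are assumptions, not a recommended model. Backward selection by AIC via `step()` is one of several possible selection methods.

```r
# Simulated data with a few BIKESHARE-style columns (values are made up)
set.seed(625)
n <- 300
sim <- data.frame(
  hr        = sample(6:23, n, replace = TRUE),
  weather   = factor(sample(1:3, n, replace = TRUE)),
  temp      = runif(n, 20, 95),
  hum       = runif(n, 20, 100),
  windspeed = runif(n, 0, 30)
)
sim$registered <- round(pmax(0, 50 + 2 * sim$temp - 0.5 * sim$hum -
  20 * (as.integer(sim$weather) - 1) + rnorm(n, sd = 25)))

# Baseline: linear regression on all available predictors
full <- lm(registered ~ hr + weather + temp + hum + windspeed, data = sim)

# More complex model: hour dummies, a weather-by-temperature interaction,
# and a quadratic temperature term
complex <- lm(registered ~ factor(hr) + weather * temp + I(temp^2) +
                hum + windspeed, data = sim)

# Backward variable selection by AIC, starting from the complex model
selected <- step(complex, direction = "backward", trace = 0)

# Compare the three candidates; lower AIC is better
AIC(full, complex, selected)
```

The same steps would be repeated with casual as the response, never using one rental count to predict the other.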
Further Remarks
• The dataset is a curated sample assembled from https://www.capitalbikeshare.com as well as historical weather databases. It is a simple random sample of a size sufficient for a good deal of analysis but not so large that everything you try looks significant. I tailored the number of holiday and heavy rain/snow observations.
• Rides between 12am and 6am are intentionally omitted. Although it’s possible to rent bikes during these times, round-the-clock rentals are likely periodic after controlling for other factors and so naturally modeled with trig functions... I didn’t want that.
• The exact date of each rental is also intentionally omitted. Feel free to ask for the specific calendar date of a very small number of observations if you think it will be of use.