Project

RMO619

1. Introduction

Over the past five years, commercial bike sharing has boomed, especially in densely populated metropolitan areas such as New York and Seattle, which gives us a good opportunity to analyze the data and develop solutions and predictions. The whole process, from membership to rental and return, has become automated. As the service becomes more user-friendly and environmental awareness grows, the potential for data analysis is considerable.

A great deal of research has been done on traditional public transport systems such as subways and buses, as well as on services such as Uber and Lime. With this project, we will dive deeper into this emerging industry.

2. Project Aim

The project we have selected examines the causative factors that influence Capital Bikeshare ridership in Washington, DC. We aim to evaluate the role of multiple variables in the given data set, identify trends, and predict ridership counts.

3. Dataset and Plan of Action

The data set we are working with consists of variables describing the conditions that affect rental bike ridership. These conditions include weather, temperature, humidity, and wind speed, among others. We aim to evaluate the impact of these variables on ridership.
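As a rough illustration of how the impact of each condition could be measured, the sketch below computes the Pearson correlation between each variable and the ridership count. The rows, the column names (`temp`, `humidity`, `windspeed`, `count`), and the helper `correlations_with_target` are made up for this example and are not taken from the actual data set.

```python
from math import sqrt

# Made-up illustrative rows; the real data set's columns and values may differ.
rows = [
    {"temp": 9.8, "humidity": 81, "windspeed": 0.0, "count": 16},
    {"temp": 14.2, "humidity": 75, "windspeed": 6.0, "count": 40},
    {"temp": 20.5, "humidity": 60, "windspeed": 12.9, "count": 110},
    {"temp": 26.1, "humidity": 50, "windspeed": 8.9, "count": 180},
    {"temp": 30.3, "humidity": 45, "windspeed": 11.0, "count": 210},
    {"temp": 18.0, "humidity": 88, "windspeed": 19.0, "count": 30},
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlations_with_target(rows, target="count"):
    """Correlate every non-target column with the target column."""
    ys = [row[target] for row in rows]
    return {
        name: pearson([row[name] for row in rows], ys)
        for name in rows[0]
        if name != target
    }


result = correlations_with_target(rows)
for name, r in sorted(result.items()):
    print(f"{name}: {r:+.2f}")
```

A coefficient near +1 or -1 would suggest a strong linear relationship with ridership, while a value near 0 would suggest little linear relationship. Correlation alone does not establish causation.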

4. Tools or Algorithms to Be Used in the Project

Machine learning techniques are widely used to predict trends and to identify the key variables behind an upward or downward movement. To study the impact of the chosen variables, we plan to use three different methods:

Logistic Regression

Decision Tree

Hypothesis Test
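The three methods listed above can be sketched in a few lines each. This is a minimal standard-library illustration under stated assumptions, not the project's implementation: all numbers are made up, ridership is converted to a binary high/low label (an assumption, since logistic regression models a categorical outcome rather than a count), the decision tree is reduced to a single split, and the hypothesis test is represented by Welch's t statistic only. The function names `fit_logistic`, `fit_stump`, and `welch_t` are hypothetical.

```python
from math import exp, sqrt
from statistics import mean, pstdev, variance

# Made-up temperatures (Celsius) and hourly rider counts, for illustration only.
temps = [5, 8, 12, 15, 18, 22, 25, 28]
counts = [20, 35, 60, 90, 120, 170, 200, 230]

# Assumption: an hour is labelled high-ridership (1) when its count
# exceeds the mean count, and low-ridership (0) otherwise.
cutoff = mean(counts)
labels = [1 if c > cutoff else 0 for c in counts]


def fit_logistic(xs, ys, lr=0.1, steps=2000):
    """One-feature logistic regression fitted by batch gradient descent."""
    mu, sd = mean(xs), pstdev(xs)
    zs = [(x - mu) / sd for x in xs]  # standardise the feature
    w = b = 0.0
    for _ in range(steps):
        preds = [1 / (1 + exp(-(w * z + b))) for z in zs]
        w -= lr * mean([(p - y) * z for p, y, z in zip(preds, ys, zs)])
        b -= lr * mean([p - y for p, y in zip(preds, ys)])
    return lambda x: 1 / (1 + exp(-(w * (x - mu) / sd + b)))


def fit_stump(xs, ys):
    """Depth-one regression tree: the split that minimises squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keep both sides non-empty
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        sse = sum((y - mean(left)) ** 2 for y in left)
        sse += sum((y - mean(right)) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, mean(left), mean(right))
    return best[1:]


def welch_t(a, b):
    """Welch's t statistic for a difference in means (unequal variances)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


predict_high = fit_logistic(temps, labels)
threshold, low_mean, high_mean = fit_stump(temps, counts)

# Made-up counts for clear and rainy hours.
clear_counts = [150, 180, 210, 170, 190]
rainy_counts = [60, 90, 75, 100, 80]
t_stat = welch_t(clear_counts, rainy_counts)

print(f"P(high ridership at 10 C) = {predict_high(10):.2f}")
print(f"Split at temp <= {threshold}: mean {low_mean:.1f} vs {high_mean:.1f}")
print(f"Welch t statistic (clear vs rainy) = {t_stat:.2f}")
```

On the real data set, each method would more likely come from an established library, with a p-value accompanying the test statistic. If the goal is to predict the count itself rather than a high/low label, a regression-type model may be a better fit than logistic regression.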

5. Conclusion

In this project, we will analyze the existing correlations, positive or negative, between each of the variables and ridership. In parallel, we will show how the various conditions relate to increases or decreases in ridership.