Probability Density Function and Binomial Convergence using excel

profilehrutikkshroff
ProjectMath117Spring2019.pdf

Pace University Math 117 Instructor: Vinh Ly Spring 2019 Data Analysis Project March 13th, 2019

Final Project

Conceptual Problems: Probability Density Function and Binomial Convergence

Problem 1

a/Use Excel to find the discrete probability and cumulative probability of the Binomial

distribution with probability of success p = 0.6 and n = 120. Find its mean and variance.

b/Based upon the chart on Excel, what can you conclude about the binomial convergence?

Hint: Use binom.dist function on Excel and sketch the curve.

Problem 2

Use Excel/Megastat to create the density probability plot of normal distribution. Take µ = 80

and σ = 2.3. Set up the range values of x from 60 to 100.

Hint: Use norm.dist function and sketch the curve with tick = 0.1

Problem 3 (extra credit)

Demonstrate Central Limit Theorem(CLT) of the sample mean by sampling a 100 uniform

distribution data with 50 variables. Verify the result by computing the sample mean, sample

variance and sketch the histogram on Excel/Megastat.

Hint: Generate 100 datasets of 50 variables and calculate 50 sample means to determine the

distribution of X̅ and SX̅. It should converge to a model that we’ve learned in class.

Practical Problems: Historical Trends and Patterns in Stock Prices

Choose two publicly traded companies (Crypto is accepted) with a sample size of 50 and a time

range (long term/short term) for its stock prices so that you can download the data from

quote.yahoo.com or google.com/finance. Make sure to STATE the COMPANY HISTORY and WHY

YOU PICK THESE 2 STOCKS for comparison.

Problem 4

Use a statistical package (Excel) to find the summary measurements: min, max, mean, median, standard deviation of the historical prices of 2 stocks

a/Draw a Box Plot for both stocks that you picked. Identify the highest and lowest stock prices.

b/Repeat this exercise with histogram/bar-charts and scatter plots for each stock.

c/Write a short paragraph describing how the two stock prices are different. Hint: Base your answer on any notable differences you observe in the two Box Plots.

Pace University Math 117 Instructor: Vinh Ly Spring 2019 Data Analysis Project March 13th, 2019

Problem 5

Use Megastat/Excel to find the value of the linear correlation coefficient between these 2 stock prices? Is the correlation significant? Explain the reason for your answer.

a) If the correlation is significant, what does it imply about the trend in the predicted stock?

b) Find the equation for the least squares regression (LSR) line.

c) Interpret the meaning of the slope of the LSR line.

d) Based on the equation of LSR line, what is the “best predicted” value for the stock that you treated as a dependent variable?

Problem 6

Run the hypothesis testing for difference in mean prices of 2 stocks that you picked from the

sample size of at least 50 datasets for each stock. Assume the difference prices between the

two and compare the result with equal/unequal variances or matched-paired hypothesis.

Hint: we can use F – test = 𝜎1

2

𝜎2 2 to compare the variances of 2 stocks before choosing the right

model (equal or unequal variances) to test the hypothesis.

Problem 7

Test the hypothesis that the periodic returns follow a normal distribution, as required by

certain models of mathematical finance. Predict the periodic returns that you want to be tested

for both population stock prices and run the hypothesis testing based upon your prediction.

Hint: Use z points on the table to test for linearity and periodic return r = 𝑃𝑟𝑖𝑐𝑒2−𝑃𝑟𝑖𝑐𝑒1

𝑃𝑟𝑖𝑐𝑒1 .

Problem 8 (extra credits)

Investigate models which predict the stock price at time t as a function of the stock price at

previous times (say St-1 and St-2 for example).

Hint: Run a multiple linear regression model (time series model). Forecast are made on St+1 with

respect to St, St-1 and St-2.

Problem 9 (extra credits)

Categorize these stock returns of each company (continue from question 7) into 4 or 5 different

categories. You can choose to collect a number of frequencies on each assumptions. For

example: high returns (2% or more), low return (between 1% and 2%), no return (between 1%

to 0%), negative return (0% to 1%) and highly negative return (-1% or more) and so on. Test on

the assumed proportions (you need to set up the hypothetical proportions).

Hint: Use Chi-square test on goodness-of-fit.

Notes: You are ONLY allowed for 1 extra credits problem.