Probability Density Function and Binomial Convergence using excel
Pace University Math 117 Instructor: Vinh Ly Spring 2019 Data Analysis Project March 13th, 2019
Final Project
Conceptual Problems: Probability Density Function and Binomial Convergence
Problem 1
a/Use Excel to find the discrete probability and cumulative probability of the Binomial
distribution with probability of success p = 0.6 and n = 120. Find its mean and variance.
b/Based upon the chart on Excel, what can you conclude about the binomial convergence?
Hint: Use binom.dist function on Excel and sketch the curve.
Problem 2
Use Excel/Megastat to create the density probability plot of normal distribution. Take µ = 80
and σ = 2.3. Set up the range values of x from 60 to 100.
Hint: Use norm.dist function and sketch the curve with tick = 0.1
Problem 3 (extra credit)
Demonstrate Central Limit Theorem(CLT) of the sample mean by sampling a 100 uniform
distribution data with 50 variables. Verify the result by computing the sample mean, sample
variance and sketch the histogram on Excel/Megastat.
Hint: Generate 100 datasets of 50 variables and calculate 50 sample means to determine the
distribution of X̅ and SX̅. It should converge to a model that we’ve learned in class.
Practical Problems: Historical Trends and Patterns in Stock Prices
Choose two publicly traded companies (Crypto is accepted) with a sample size of 50 and a time
range (long term/short term) for its stock prices so that you can download the data from
quote.yahoo.com or google.com/finance. Make sure to STATE the COMPANY HISTORY and WHY
YOU PICK THESE 2 STOCKS for comparison.
Problem 4
Use a statistical package (Excel) to find the summary measurements: min, max, mean, median, standard deviation of the historical prices of 2 stocks
a/Draw a Box Plot for both stocks that you picked. Identify the highest and lowest stock prices.
b/Repeat this exercise with histogram/bar-charts and scatter plots for each stock.
c/Write a short paragraph describing how the two stock prices are different. Hint: Base your answer on any notable differences you observe in the two Box Plots.
Pace University Math 117 Instructor: Vinh Ly Spring 2019 Data Analysis Project March 13th, 2019
Problem 5
Use Megastat/Excel to find the value of the linear correlation coefficient between these 2 stock prices? Is the correlation significant? Explain the reason for your answer.
a) If the correlation is significant, what does it imply about the trend in the predicted stock?
b) Find the equation for the least squares regression (LSR) line.
c) Interpret the meaning of the slope of the LSR line.
d) Based on the equation of LSR line, what is the “best predicted” value for the stock that you treated as a dependent variable?
Problem 6
Run the hypothesis testing for difference in mean prices of 2 stocks that you picked from the
sample size of at least 50 datasets for each stock. Assume the difference prices between the
two and compare the result with equal/unequal variances or matched-paired hypothesis.
Hint: we can use F – test = 𝜎1
2
𝜎2 2 to compare the variances of 2 stocks before choosing the right
model (equal or unequal variances) to test the hypothesis.
Problem 7
Test the hypothesis that the periodic returns follow a normal distribution, as required by
certain models of mathematical finance. Predict the periodic returns that you want to be tested
for both population stock prices and run the hypothesis testing based upon your prediction.
Hint: Use z points on the table to test for linearity and periodic return r = 𝑃𝑟𝑖𝑐𝑒2−𝑃𝑟𝑖𝑐𝑒1
𝑃𝑟𝑖𝑐𝑒1 .
Problem 8 (extra credits)
Investigate models which predict the stock price at time t as a function of the stock price at
previous times (say St-1 and St-2 for example).
Hint: Run a multiple linear regression model (time series model). Forecast are made on St+1 with
respect to St, St-1 and St-2.
Problem 9 (extra credits)
Categorize these stock returns of each company (continue from question 7) into 4 or 5 different
categories. You can choose to collect a number of frequencies on each assumptions. For
example: high returns (2% or more), low return (between 1% and 2%), no return (between 1%
to 0%), negative return (0% to 1%) and highly negative return (-1% or more) and so on. Test on
the assumed proportions (you need to set up the hypothetical proportions).
Hint: Use Chi-square test on goodness-of-fit.
Notes: You are ONLY allowed for 1 extra credits problem.