Statistical reports, R

Final Project Assignment (50 points)

Written Report Due Date: 12/09/2019 (On Carmen)

Instructions: All aspects of this project must be done independently and cannot be discussed with anyone other than the instructor. You are responsible for ensuring that other students do not have access to your project. Any violation of these instructions constitutes academic misconduct and will be reported to the Committee on Academic Misconduct (COAM).

Introduction: Computer simulations are frequently used to evaluate proposed statistical techniques. Typically, these simulations require that we obtain observed values of random variables with a specific prescribed distribution. Most computer systems contain a subroutine that provides observed values of a random variable U~Uniform(0,1). In R, one observed value of this random variable can be generated using the function:

u=runif(1,0,1).

The question that arises is: how can we generate observed values of other random variables, such as the Exponential with mean β, using only observed values of the random variable U? One approach is called the Inverse Transform Method.

The general approach to this problem proceeds as follows. We know that a Uniform(0,1) random variable can take values in the interval [0,1] only. For the purpose of this project, let’s ignore the boundaries, resulting in (0,1). Let F denote the cumulative distribution function (cdf) for a general random variable X. Then, from properties of cdfs we know that F is non-decreasing; for the continuous random variables considered in this project, F is also continuous and strictly increasing on the support of X. Therefore, for any $u \in (0,1)$, there is a unique value x such that $F(x) = u$. Then, the inverse of F, denoted as $F^{-1}$, will provide a value x of the random variable X that corresponds to u. I will demonstrate this in more detail in class. The Inverse Transform Method allows one to use observed values of U and the form of $F^{-1}$ to generate observed values of the random variable X. The key requirement is that $F^{-1}$ can be obtained in closed form. We will restrict our attention to this case.
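The reason the method works can be sketched in one line of algebra. The display below is an illustration, assuming F is continuous and strictly increasing on the support of X (so that $F^{-1}$ exists) and defining $X = F^{-1}(U)$ with U~Uniform(0,1); it is not a substitute for the in-class demonstration.

```latex
\begin{align*}
P(X \le x) &= P\bigl(F^{-1}(U) \le x\bigr) \\
           &= P\bigl(U \le F(x)\bigr) \\
           &= F(x), \qquad \text{since } P(U \le u) = u \text{ for } 0 < u < 1.
\end{align*}
```

In other words, the transformed variable $F^{-1}(U)$ has cdf F, which is exactly the distribution we set out to sample from.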

The general steps to deriving the Inverse Transform Method are as follows (also see Example 6.5 on page 306 in your book for guidance):

1. Derive the cdf F of X.

2. Set the cdf F, a function of x, equal to u and find the inverse of the cdf, $F^{-1}$, as a function of u (i.e., write x in terms of u). This provides the transformation needed to generate one observed value of the random variable X.

3. Then, to generate such a value on a computer, first generate one observed value of the random variable U using runif, denoted by u, and transform it using the formula derived in (2) to generate an observed value of the random variable X.

To generate multiple observed values of the random variable X, repeat step (3) multiple times.
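The steps above can be sketched in R as follows. This is a minimal illustration, not a required implementation: the helper name `inverse_transform_sample` is hypothetical, and the example distribution (cdf F(x) = x^2 on (0,1), so that x = sqrt(u)) is chosen only for this sketch and is not one of the project distributions. Because `runif` is vectorized, repeating step (3) n times amounts to a single call.

```r
# Hypothetical helper (not part of the assignment): generate n observed
# values of X, given a closed-form inverse cdf supplied as an R function.
inverse_transform_sample <- function(n, inv_cdf) {
  u <- runif(n, 0, 1)  # step (3): n observed values of U ~ Uniform(0,1)
  inv_cdf(u)           # apply x = F^{-1}(u) to each value of u
}

# Illustrative distribution, assumed for this sketch only:
# F(x) = x^2 on (0, 1); setting x^2 = u and solving gives x = sqrt(u).
set.seed(1)  # arbitrary seed so the run is reproducible
x <- inverse_transform_sample(100000, function(u) sqrt(u))
mean(x)      # should be close to the theoretical mean E[X] = 2/3
```

Passing the inverse cdf as a function keeps the sampling step separate from the derivation, so the same helper can be reused once a different inverse cdf has been derived.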

Project Assignment: The aim of this project is to derive and implement a computer algorithm to generate observed values (random samples) of three different random variables using the Inverse Transform Method. The three random variables of interest are: (1) X1~Uniform(θ1,θ2), (2) X2~Exponential(β), and (3) X3~Cauchy(θ). We have not introduced the Cauchy random variable in this class but its probability density function (pdf) is given by:

$$f(x) = \frac{1}{\pi\left[1 + (x - \theta)^2\right]}, \qquad -\infty < x < \infty, \quad -\infty < \theta < \infty.$$

The first part of your assignment is to derive the above procedure for the three random variables X1, X2 and X3.
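As a rough guide to where the derivations should end up, the cdfs and their inverses are summarized below. This outline rests on two assumptions: that Exponential(β) is parameterized by its mean β (as in the introduction), and that Cauchy(θ) has location θ and scale 1, with pdf $f(x) = 1/\{\pi[1 + (x-\theta)^2]\}$. The intermediate algebra is omitted here and would still need to be written out and explained in the report.

```latex
\begin{align*}
X_1 \sim \text{Uniform}(\theta_1, \theta_2):\quad
  & F(x) = \frac{x - \theta_1}{\theta_2 - \theta_1},\ \ \theta_1 \le x \le \theta_2
  && \Longrightarrow\quad x = \theta_1 + (\theta_2 - \theta_1)\,u \\[4pt]
X_2 \sim \text{Exponential}(\beta):\quad
  & F(x) = 1 - e^{-x/\beta},\ \ x > 0
  && \Longrightarrow\quad x = -\beta \ln(1 - u) \\[4pt]
X_3 \sim \text{Cauchy}(\theta):\quad
  & F(x) = \frac{1}{2} + \frac{1}{\pi}\arctan(x - \theta)
  && \Longrightarrow\quad x = \theta + \tan\!\bigl(\pi(u - \tfrac{1}{2})\bigr)
\end{align*}
```

Each inverse is defined for 0 < u < 1, which is why the boundaries of the Uniform(0,1) interval are ignored.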

The second part of your assignment is to implement the method in R and provide the following results:

1. Generate a random sample of size 100,000 from a Uniform(-2,1).

2. Generate a random sample of size 100,000 from an Exponential(3).

3. Generate a random sample of size 100,000 from a Cauchy(2).

For cases (1) and (2), report the mean and variance of your random sample, and display its histogram. Compare the mean, variance and histogram obtained using your approach to those obtained via a random sample using the built-in R functions runif(100000,-2,1) and rexp(100000,1/3). For case (3), report the median of your random sample and compare it to the median obtained via a random sample using the built-in R function rcauchy(100000,2). In all cases, DO NOT print the actual random samples you have generated. Those are not important and will destroy hundreds of trees.
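One possible layout for the R portion is sketched below. It assumes the inverse cdfs x = θ1 + (θ2 − θ1)u, x = −β ln(1 − u), and x = θ + tan(π(u − 1/2)), with Exponential(β) parameterized by its mean and Cauchy(θ) by its location (scale 1). The function names, the seed, and the number of histogram bins are illustrative choices, not requirements of the assignment.

```r
set.seed(3201)  # arbitrary seed, used only to make the run reproducible
n <- 100000

# Inverse cdfs under the assumed parameterizations (names are hypothetical)
inv_unif   <- function(u, theta1, theta2) theta1 + (theta2 - theta1) * u
inv_exp    <- function(u, beta) -beta * log(1 - u)
inv_cauchy <- function(u, theta) theta + tan(pi * (u - 0.5))

# Inverse Transform Method samples
x1 <- inv_unif(runif(n, 0, 1), -2, 1)
x2 <- inv_exp(runif(n, 0, 1), 3)
x3 <- inv_cauchy(runif(n, 0, 1), 2)

# Reference samples from the built-in generators
r1 <- runif(n, -2, 1)
r2 <- rexp(n, 1/3)   # rexp takes a rate, so mean 3 corresponds to rate 1/3
r3 <- rcauchy(n, 2)

# Summary statistics only; the samples themselves are never printed
results <- data.frame(
  case         = c("Uniform(-2,1)", "Exponential(3)"),
  mean_inverse = c(mean(x1), mean(x2)),
  mean_builtin = c(mean(r1), mean(r2)),
  var_inverse  = c(var(x1), var(x2)),
  var_builtin  = c(var(r1), var(r2))
)
print(results)
cat("Cauchy(2) median, inverse transform:", median(x3), "\n")
cat("Cauchy(2) median, rcauchy:          ", median(r3), "\n")

# Side-by-side histograms for cases (1) and (2)
op <- par(mfrow = c(2, 2))
hist(x1, breaks = 50, main = "Uniform(-2,1): inverse transform", xlab = "x")
hist(r1, breaks = 50, main = "Uniform(-2,1): runif", xlab = "x")
hist(x2, breaks = 50, main = "Exponential(3): inverse transform", xlab = "x")
hist(r2, breaks = 50, main = "Exponential(3): rexp", xlab = "x")
par(op)
```

For reference, the theoretical values are mean −0.5 and variance 0.75 for Uniform(−2,1), mean 3 and variance 9 for Exponential(3), and median 2 for Cauchy(2). The Cauchy case is compared through the median because its mean and variance do not exist.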

The written report should include all of the relevant derivations of the algorithm and the results, as well as a clear description of the results. The derivations should be accompanied by text that explains each step in the procedure. Also, please make sure that all plots are easy to read and that the results are well-organized in the write-up. Part of your grade will be devoted to the presentation of results in your report. The entire write-up should be no longer than three pages in 12-point font. In addition, you should include all of the R code that generated the results as an Appendix. If you have questions about the project, please feel free to see me during office hours or ask during lecture.