ECON 4400 Statistics Review 08/23/18

Random variables

Random variable: a variable that takes on numerical values and whose outcome is determined by an experiment.

Example: The number of pips that show when you roll a 6-sided die. Here, the possible numerical values are in the set {1,2,3,4,5,6} and the experiment is rolling the die.

Notation: We usually denote a random variable with a capital letter. For example we can denote the “number of pips” random variable with the letter X.

Discrete Random Variables

The random variable X that we just defined is an example of the first of two types of random variables: a discrete random variable.

A discrete random variable is one that takes on a finite (or countably infinite) number of values.

We denote the probability that X takes on some specific value, x, by P(X=x). In our case we have P(X=x)=1/6 for all our possible values (assuming that we aren’t being cheated by a weighted die).

Continuous Random Variables

The other type of random variable is a continuous random variable.

A continuous random variable is one that can take on any value in some interval of real numbers, so it has infinitely many (uncountably many) possible values.

A continuous random variable takes on any real value with probability zero. That is, for any continuous RV, Y, P(Y=y)=0 for all possible y.

This may seem like an odd result at first blush, but let's imagine a die with infinitely many sides. Let the number of sides be denoted by S. Then the probability of the die landing on any particular side is 1/S. As we take S to infinity, this value goes to 0.

Probability Density Functions

A probability density function (pdf) assigns probabilities to outcomes of a random variable. We usually denote the pdf of a random variable by the lowercase letter f.

In the case of a discrete random variable, f(x)=P(X=x) which in our case is 1/6 for all possible values.

In the case of a continuous random variable, since P(Y=y)=0, we don't measure probabilities of individual outcomes, but rather of ranges of outcomes. So we measure events such as P(a ≤ Y ≤ b) for constants a and b. This is measured by the area under the pdf between points a and b.

Cumulative distribution functions

For continuous random variables, working with cumulative distribution functions (cdf) is more convenient for calculating probabilities of ranges of outcomes. We usually denote the cdf of a random variable by the capital letter F.

The cdf of a random variable Y is defined as F(y) = P(Y ≤ y).

Some key properties for calculating probabilities:

P(Y > c) = 1 - F(c) for any number c.

For continuous random variables, P(Y ≥ c) = P(Y > c), since single points have zero probability. This is not necessarily so for discrete random variables.
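As a rough illustration of how a cdf turns into probabilities of ranges, the sketch below uses a standard normal distribution as an assumed example (the slides do not name a particular distribution). It compares F(b) - F(a) with a numerical approximation of the area under the pdf between a and b, and also computes P(Y > b) as 1 - F(b). The names Y, a, b and steps are illustrative choices.

```python
from statistics import NormalDist

# Assumed example distribution: a standard normal random variable Y.
Y = NormalDist(mu=0.0, sigma=1.0)
a, b = -1.0, 1.0

# Probability of the range from the cdf: F(b) - F(a)
prob_from_cdf = Y.cdf(b) - Y.cdf(a)

# The same probability as the area under the pdf between a and b,
# approximated with a midpoint Riemann sum.
steps = 10_000
width = (b - a) / steps
area_under_pdf = sum(Y.pdf(a + (i + 0.5) * width) for i in range(steps)) * width

# Upper-tail probability: P(Y > b) = 1 - F(b)
prob_above_b = 1 - Y.cdf(b)

print(round(prob_from_cdf, 4), round(area_under_pdf, 4), round(prob_above_b, 4))
```

The two ways of computing the probability of the range should agree up to numerical error.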

PDFs and CDFs

Try it for yourself

From our die-rolling random variable X, calculate the value of F(1), F(3), and F(6).
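One way to check your answers is to add up the pdf over all outcomes at or below the point of interest. The sketch below does this with exact fractions for the fair die; the names pmf and cdf are illustrative.

```python
from fractions import Fraction

# pdf of a fair six-sided die: f(x) = 1/6 for x = 1, ..., 6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(c):
    """F(c) = P(X <= c): add up f(x) over all outcomes x <= c."""
    return sum((p for x, p in pmf.items() if x <= c), Fraction(0))

for c in (1, 3, 6):
    print(f"F({c}) = {cdf(c)}")
```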

Try it for yourself

For a random variable, Z, we have: . What is F(-1)?

We know from slide 6 that so then we have:

Joint distributions and independence

We can define a joint distribution for two discrete random variables X and Y by:

f(x,y) = P(X=x, Y=y)

We can also define a joint distribution for continuous random variables, but that is more complicated and unnecessary for our purposes.

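The joint distribution is important for the notion of independence of two random variables. X and Y are said to be independent if

f(x,y) = f_X(x) f_Y(y) for all x and y, where f_X and f_Y are the pdfs of X and Y.

In other words, P(X=x, Y=y) = P(X=x) P(Y=y).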
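The independence condition can be checked by brute force for small discrete examples. The sketch below is an illustration under assumed examples, not something from the slides: it rolls two fair dice and compares (first die, second die), which should be independent, with (first die, sum of both dice), which should not be. The helper names joint_pmf and is_independent are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def joint_pmf(pairs):
    """Joint pdf f(x, y) built from a list of equally likely (x, y) outcomes."""
    pmf = defaultdict(Fraction)
    for pair in pairs:
        pmf[pair] += Fraction(1, len(pairs))
    return dict(pmf)

def is_independent(joint):
    """Check f(x, y) == f_X(x) * f_Y(y) for every combination of possible values."""
    fx, fy = defaultdict(Fraction), defaultdict(Fraction)
    for (x, y), p in joint.items():
        fx[x] += p  # marginal pdf of X
        fy[y] += p  # marginal pdf of Y
    return all(
        joint.get((x, y), Fraction(0)) == fx[x] * fy[y] for x in fx for y in fy
    )

rolls = list(product(range(1, 7), repeat=2))  # 36 equally likely (first, second) rolls
two_dice = joint_pmf(rolls)                              # X = first die, Y = second die
die_and_sum = joint_pmf([(a, a + b) for a, b in rolls])   # X = first die, Y = sum of both

print(is_independent(two_dice))     # True
print(is_independent(die_and_sum))  # False
```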

Expected values

An expected value is a measure of central tendency of a probability distribution. We denote the expected value of a random variable X by E[X].

For a discrete random variable with k possible outcomes x_1, ..., x_k, the expected value is calculated as:

E[X] = x_1 f(x_1) + x_2 f(x_2) + ... + x_k f(x_k)

Calculate the expectation of our die-rolling random variable X:

f(x)=1/6 for all of the outcomes 1 to 6, so

E[X] = (1/6)(1) + (1/6)(2) + (1/6)(3) + (1/6)(4) + (1/6)(5) + (1/6)(6) = 21/6 = 3.5
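The same calculation can be written as a probability-weighted sum in a few lines. This is a minimal sketch using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

outcomes = range(1, 7)
f = {x: Fraction(1, 6) for x in outcomes}  # pdf of the fair die

# E[X] = sum over outcomes of x * f(x)
expected_value = sum(x * f[x] for x in outcomes)

print(expected_value)         # 7/2
print(float(expected_value))  # 3.5
```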

Properties of expected values

For any constant c, E[c]=c

For any constants a and b, E[aX+b]=aE[X]+b

For constants a_1, ..., a_n and random variables X_1, ..., X_n, E[a_1 X_1 + ... + a_n X_n] = a_1 E[X_1] + ... + a_n E[X_n]

We refer to this third property as expectation being a linear operator.

Conditional expectations

If two random variables, say X and Y, are not independent, then the outcome of one random variable may affect the distribution of the other. The conditional expectation function (CEF) of Y given X is denoted by E[Y|X]. We call it a function to emphasize that the expectation of the random variable Y changes with the value of X.

A conditional expectation is a single point from the CEF and is denoted by E[Y|X=x] or E[Y|x] for short.

Conditional expectation example

Suppose we expand upon our die-rolling experiment by adding coin-tossing. Specifically, we first roll the die then depending on the outcome we flip a coin that many times. Define the random variable Y as the total number of heads from all of the coin-flips. It should be clear that the distribution of Y depends on the outcome of X.

Consider the CEF of Y given X for two possible outcomes of X, x=1 and x=2.

In the first case Y takes on two possible values {0,1} with probability ½ for each, thus E[Y|X=1]= ½

In the second case Y takes on three possible values {0,1,2} with probabilities {1/4, 1/2, 1/4}, thus E[Y|X=2]=1
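The two cases above can be reproduced, and extended to every die outcome, by listing all equally likely sequences of coin flips. The sketch below is one way to do that; the function name cond_expectation_heads is an illustrative choice.

```python
from fractions import Fraction
from itertools import product

def cond_expectation_heads(x):
    """E[Y | X = x]: average number of heads over all 2**x equally likely flip sequences."""
    sequences = list(product((0, 1), repeat=x))  # 1 = heads, 0 = tails
    return Fraction(sum(sum(seq) for seq in sequences), len(sequences))

# The CEF evaluated at each possible outcome of the die
cef = {x: cond_expectation_heads(x) for x in range(1, 7)}

for x, value in cef.items():
    print(f"E[Y|X={x}] = {value}")
```

Each value works out to x/2, which is why E[Y|X] is a function of X rather than a single number.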

Properties of CEFs

E[c(X)|X]=c(X) for any function c(X)

E[a(X)Y+b(X)|X]=a(X)E[Y|X]+b(X) for functions a() and b()

If X and Y are independent, then E[Y|X]=E[Y]

Law of iterated expectations: E[E[Y|X]]=E[Y]
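The law of iterated expectations can be checked on the die-and-coin example. The sketch below computes E[Y] two ways: by averaging the CEF over the distribution of X, and directly from the joint distribution of X and Y. The helper p_y_given_x is a hypothetical name, and the binomial formula for the number of heads in x fair flips is a standard fact rather than something stated on the slides.

```python
from fractions import Fraction
from math import comb

p_x = Fraction(1, 6)  # P(X = x) for each face of a fair die

def p_y_given_x(y, x):
    """P(Y = y | X = x): probability of y heads in x fair coin flips."""
    return Fraction(comb(x, y), 2**x)

# Inner step: E[Y | X = x] for each possible x
cef = {x: sum(y * p_y_given_x(y, x) for y in range(x + 1)) for x in range(1, 7)}

# Outer step: E[E[Y|X]] averages the CEF over the distribution of X
lie_value = sum(p_x * cef[x] for x in cef)

# Direct calculation: E[Y] from the joint distribution P(X = x, Y = y)
direct_value = sum(
    y * p_x * p_y_given_x(y, x) for x in range(1, 7) for y in range(x + 1)
)

print(lie_value, direct_value)  # 7/4 7/4
```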

Try it for yourself

Suppose that E[X]=3 and E[Y|X]=2. What is E[X+YX]?

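First we will use linearity of expectations:

E[X+YX] = E[X] + E[YX]

Next, E[X]=3, and the law of iterated expectations tells us that E[YX] = E[E[YX|X]]. Thus,

E[X+YX] = 3 + E[E[YX|X]]

Further, we have the property of CEFs that for any function c(X), E[c(X)Y|X]=c(X)E[Y|X]; therefore E[YX|X] = X E[Y|X]. We are also given that E[Y|X]=2. So to finish:

E[X+YX] = 3 + E[X E[Y|X]] = 3 + E[2X] = 3 + 2E[X] = 3 + 2(3) = 9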

Note the step toward the end where we pulled the 2 out of the expectation operator. If you are not comfortable with this step you should go back and look at the properties of expectations.
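To see the result on a concrete case, the sketch below builds one hypothetical pair of random variables that satisfies the two given facts and computes E[X+YX] exactly. The specific distribution is an assumption made only for illustration: X is uniform on {1,...,5}, so E[X]=3, and Y = 2 + eX where e is -1 or +1 with equal probability, independent of X, so E[Y|X]=2 even though Y is not independent of X.

```python
from fractions import Fraction

x_values = range(1, 6)   # hypothetical X: uniform on {1, ..., 5}, so E[X] = 3
noise_values = (-1, 1)   # hypothetical e: -1 or +1 with probability 1/2 each
p = Fraction(1, 5) * Fraction(1, 2)  # probability of each (x, e) combination

def y(x, e):
    """Hypothetical Y = 2 + e*x, so E[Y | X = x] = 2 + x*E[e] = 2."""
    return 2 + e * x

e_x = sum(p * x for x in x_values for e in noise_values)
cef = {x: sum(Fraction(1, 2) * y(x, e) for e in noise_values) for x in x_values}
answer = sum(p * (x + y(x, e) * x) for x in x_values for e in noise_values)

print(e_x)     # 3
print(answer)  # 9
```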

Variance

The variance of a random variable is a measure of the spread or variability of a probability distribution.

We calculate the variance of a random variable X as

Var(X) = E[(X - E[X])^2] = E[X^2] - (E[X])^2

I will omit the proof of this as it is good practice for properties of expectations. Note, however, that E[X] is a constant.

The standard deviation (sd) is the square root of the variance.

Properties of the variance:

If there exists a constant c such that P(X=c)=1, then Var(X)=0

For any constants a and b, Var(aX+b) = a^2 Var(X)

For any constants a and b and random variables X and Y, Var(aX+bY) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X,Y)
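The variance of the die-rolling random variable gives a quick check of these formulas. The sketch below computes Var(X) from the definition E[(X - E[X])^2], compares it with the shortcut E[X^2] - (E[X])^2, and then checks that Var(aX+b) = a^2 Var(X) for one arbitrary choice of constants (a=3 and b=10 are illustrative values).

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # fair die

def expect(g):
    """E[g(X)] for the die: probability-weighted sum of g(x)."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(g):
    """Var(g(X)) from the definition E[(g(X) - E[g(X)])^2]."""
    mean = expect(g)
    return expect(lambda x: (g(x) - mean) ** 2)

var_x = variance(lambda x: x)
shortcut = expect(lambda x: x**2) - expect(lambda x: x) ** 2

a, b = 3, 10  # arbitrary constants for illustration
var_ax_b = variance(lambda x: a * x + b)

print(var_x, shortcut)          # 35/12 35/12
print(var_ax_b, a**2 * var_x)   # 105/4 105/4
```

Adding the constant b shifts the distribution without changing its spread, which is why b drops out.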

Covariance and correlation coefficient

The covariance of two random variables X and Y is given by

Cov(X,Y) = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]E[Y]

If X and Y are independent then Cov(X,Y)=0

Covariance measures linear dependence of two random variables, but magnitudes of the relationship are not easily interpretable from the covariance.

The correlation coefficient between X and Y overcomes this issue by standardizing the covariance. The correlation coefficient is:

Corr(X,Y) = Cov(X,Y) / (sd(X) sd(Y))

The correlation coefficient is always between -1 and 1.
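As an illustration, the sketch below computes the covariance and correlation coefficient for the die-and-coin example, where X is the die roll and Y is the number of heads in X coin flips. The joint pdf uses the binomial formula for the number of heads, which is a standard fact rather than something from the slides; the helper name expect is illustrative.

```python
from fractions import Fraction
from math import comb, sqrt

# Joint pdf: P(X = x, Y = y) = P(X = x) * P(y heads in x fair flips)
joint = {
    (x, y): Fraction(1, 6) * Fraction(comb(x, y), 2**x)
    for x in range(1, 7)
    for y in range(0, x + 1)
}

def expect(g):
    """E[g(X, Y)]: probability-weighted sum over the joint distribution."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
cov = expect(lambda x, y: (x - mean_x) * (y - mean_y))
var_x = expect(lambda x, y: (x - mean_x) ** 2)
var_y = expect(lambda x, y: (y - mean_y) ** 2)

# Standardize the covariance by the two standard deviations
corr = float(cov) / sqrt(float(var_x) * float(var_y))

print(cov)             # 35/24
print(round(corr, 3))
```

The covariance is positive because more flips tend to produce more heads, and dividing by the standard deviations puts it on a scale between -1 and 1.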

Populations and samples

Our goal in statistical inference is to answer some interesting question about a population. A population in statistics refers to a well-defined group of subjects. If I were interested in answering a question about class performance, then this class would be the population of interest.

Many populations of interest are near impossible to get data for (imagine trying to survey every person in the United States). Instead we usually perform data analysis on samples. A sample is a subset of a particular population. For example this class is a (non-random) sample of students at OSU.

Parameters vs. sample statistics

When we are measuring random variables at the population level we talk about population parameters (e.g. expected value and variance).

Often, though, we are working with samples instead. Measurements at the sample level are called sample statistics. Sample statistics will vary depending on the sample chosen, whereas population parameters are constants.

Common sample statistics

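Consider a sample of size n drawn from a population with sample outcomes given by {x_1, x_2, ..., x_n}.

The sample average is given by (x_1 + x_2 + ... + x_n)/n. We usually denote the sample average of a random variable X as X̄.

The sample variance is given by

s^2 = [(x_1 - X̄)^2 + (x_2 - X̄)^2 + ... + (x_n - X̄)^2] / (n-1)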

The sample standard deviation is simply the square root of the sample variance.
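These three statistics are easy to compute by hand in code. The sketch below uses a small made-up sample (the numbers are hypothetical, chosen only for illustration) and cross-checks the hand calculation against Python's statistics module, whose variance function also divides by n-1.

```python
import math
import statistics

sample = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]  # hypothetical sample outcomes
n = len(sample)

# Sample average: sum of the outcomes divided by n
sample_mean = sum(sample) / n

# Sample variance: sum of squared deviations divided by (n - 1)
sample_var = sum((x - sample_mean) ** 2 for x in sample) / (n - 1)

# Sample standard deviation: square root of the sample variance
sample_sd = math.sqrt(sample_var)

print(sample_mean, sample_var, round(sample_sd, 4))
print(statistics.mean(sample), statistics.variance(sample))
```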

Estimators and bias

The sample average and sample variance are estimators for the expected value and variance of a distribution. An estimator is a function of sample data used to estimate population parameters.

An estimator W for the parameter θ is said to be unbiased if E[W] = θ. Otherwise the estimator is said to be biased.

The reason that (n-1) is in the denominator of the sample variance calculation is so that the sample variance is an unbiased estimator of the population variance (assuming random sampling, more on that next week).
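The effect of the denominator can be seen exactly in a small case. The sketch below is an illustration under assumptions not in the slides: it takes random samples of size n=3 from the fair die (independent rolls), lists all 216 equally likely samples, and computes the expected value of the variance estimator with n-1 in the denominator and with n in the denominator.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
mu = Fraction(sum(faces), 6)
population_var = sum((x - mu) ** 2 for x in faces) / 6  # Var(X) for the fair die

n = 3
samples = list(product(faces, repeat=n))  # all 6**n equally likely samples

def sum_sq_dev(sample):
    """Sum of squared deviations from the sample average."""
    mean = Fraction(sum(sample), len(sample))
    return sum((x - mean) ** 2 for x in sample)

# Expected value of each estimator: average over all possible samples
expected_s2 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)
expected_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)

print(population_var)  # 35/12
print(expected_s2)     # 35/12 (unbiased)
print(expected_div_n)  # 35/18 (too small on average)
```

Dividing by n underestimates the population variance by a factor of (n-1)/n on average, which is what the n-1 denominator corrects.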

Notation of parameters and sample stats

Sampling variance

We are often interested in looking at the variance of an estimator. This is called the sampling variance.

For example, the variance of the sample average of a random sample of size n drawn from the distribution of X is Var(X̄) = σ^2/n, where σ^2 denotes the population variance.
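The σ^2/n formula can be verified exactly for the fair die with small sample sizes. The sketch below assumes random sampling (independent rolls), lists every possible sample of size n, and computes the variance of the resulting sample averages; the function name variance_of_sample_average is illustrative.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
mu = Fraction(sum(faces), 6)                     # population mean, 7/2
sigma2 = sum((x - mu) ** 2 for x in faces) / 6   # population variance, 35/12

def variance_of_sample_average(n):
    """Exact variance of the sample average over all 6**n equally likely samples."""
    means = [Fraction(sum(s), n) for s in product(faces, repeat=n)]
    center = sum(means) / len(means)
    return sum((m - center) ** 2 for m in means) / len(means)

results = {n: variance_of_sample_average(n) for n in (1, 2, 3, 4)}

for n, value in results.items():
    print(n, value, sigma2 / n)  # the last two columns should match
```

The sampling variance shrinks as n grows, so larger samples give more precise estimates of the expected value.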

Types of data

Data: facts and numbers about a sample or population.

Data structures:

Cross-sectional: a sample of individuals, households, firms, cities, states, etc. at a given point in time. Example: This class today.

Time-series: observations on a variable (or variables) over time. Example: Information on a single student over the entire semester.

Panel (longitudinal) data: a time series for each cross-sectional member in a data set. Example: information for all students over the entire semester.
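The three structures can be sketched with a toy data set. Everything in the block below is hypothetical (the student labels, weeks, and scores are made up for illustration): a panel holds one row per student per week, a cross-section is the slice at one point in time, and a time series is the slice for one student.

```python
# Hypothetical panel: one observation per (student, week) pair
panel = [
    {"student": "A", "week": 1, "score": 70},
    {"student": "A", "week": 2, "score": 75},
    {"student": "B", "week": 1, "score": 82},
    {"student": "B", "week": 2, "score": 80},
    {"student": "C", "week": 1, "score": 64},
    {"student": "C", "week": 2, "score": 71},
]

# Cross-sectional data: every student at a single point in time
cross_section = [row for row in panel if row["week"] == 1]

# Time-series data: a single student followed over time
time_series = [row for row in panel if row["student"] == "A"]

print(len(panel), len(cross_section), len(time_series))  # 6 3 2
```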