
ECO 6380 Prof. Tom Fomby Predictive Analytics for Economists Spring 2019

EXERCISE 2

Purpose: To learn how to bin continuous data and do stratified random sampling in XLMiner ©. Go to the website for this course and download the file “Boston_Housing.xlsx”. Use this file and XLMiner to answer parts b) and c) of this exercise. Hand in your work on CANVAS on Tuesday, February 5.

a) Using the XLMiner © help file and looking under “DataUtilities”, briefly define the following terms:

A Sample
Sampling with replacement
Simple random sampling
Strata
Stratified random sampling
Proportionate to stratum size
Binning continuous data

b) Consider the data file Boston_Housing.xlsx. “Bin” the variable “AGE” into 4 categories (bins) with an equal count of observations in each bin, so that the binned variable takes the rank of the bin, from 1 to 4. Create an Excel file containing the variables “row id”, “AGE”, and “Binned_AGE”. Print out the first 25 observations of this file and hand in the printout with this exercise.
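For intuition only, equal-count binning can be sketched outside XLMiner. The short Python example below is an illustration under stated assumptions, not XLMiner's procedure: the function name `equal_count_bins` is hypothetical, the values are made up rather than taken from Boston_Housing.xlsx, and tied values are split by original order, which XLMiner may handle differently.

```python
def equal_count_bins(values, n_bins=4):
    """Return a bin rank 1..n_bins for each value, with near-equal counts per bin."""
    n = len(values)
    # Indices of the observations sorted by value (ties keep their original order).
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, i in enumerate(order):
        # rank runs 0..n-1; integer arithmetic maps it to a bin rank 1..n_bins.
        bins[i] = rank * n_bins // n + 1
    return bins


# Made-up values for illustration only (hypothetical, not the real AGE column).
age = [12.0, 88.5, 47.3, 63.0, 29.9, 95.1, 71.4, 35.6]
binned_age = equal_count_bins(age)

# Print the three columns the exercise asks for: row id, AGE, Binned_AGE.
for row_id, (a, b) in enumerate(zip(age, binned_age), start=1):
    print(row_id, a, b)
```

With 8 made-up values and 4 bins, each bin receives 2 observations, and the smallest values get rank 1 while the largest get rank 4.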

c) Again consider the data file Boston_Housing.xlsx. We are going to use stratified random sampling to draw a sample of approximately 100 cases from the original population of 506 cases. Use the variable RAD as the stratifying variable (it has 9 strata) and include the variables Crim, ZN, and Indus in the sampled data. Then randomly draw a sample of approximately 100 cases, selecting records proportionate to stratum size. (Use the random number seed 12345.) Print out the stratified random sample that XLMiner produces in this case.
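The idea of sampling proportionate to stratum size can be sketched in Python as follows. This is a minimal illustration under assumptions, not XLMiner's algorithm: the function `stratified_sample`, the three strata "A", "B", "C", and their sizes are all hypothetical, and the seed 12345 will not reproduce the sample XLMiner draws. Each stratum contributes roughly the fraction 100/506 of its records, and per-stratum rounding is why the total is only approximately 100 in general.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(records, stratum_of, target_size, seed):
    """Sample without replacement, allocating draws proportionately to stratum size."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[stratum_of(rec)].append(rec)
    fraction = target_size / len(records)  # e.g. 100 / 506
    sample = []
    for key in sorted(strata):
        members = strata[key]
        # Assumption: every stratum contributes at least one record.
        k = min(len(members), max(1, round(fraction * len(members))))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population of 506 records in three made-up strata.
sizes = {"A": 300, "B": 150, "C": 56}
records = []
for stratum, size in sizes.items():
    for _ in range(size):
        records.append({"row_id": len(records) + 1, "stratum": stratum})

sample = stratified_sample(records, lambda r: r["stratum"], target_size=100, seed=12345)
counts = Counter(r["stratum"] for r in sample)
print(len(sample), sorted(counts.items()))
```

In this made-up case the strata contribute 59, 30, and 11 records, which happen to sum to exactly 100; with 9 strata of uneven size, as for RAD, the rounded allocations need not sum to exactly 100.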
