1st Statistical Software Assignment.
Directions for turning in: Save your workspace and turn in that file. You may use Stata or R.
1. Use the data in WAGE1 for this exercise:
a. Find the average education level in the sample. What are the highest and lowest years of education?
b. Find the average hourly wage in the sample. Does it seem high or low?
c. The wages are reported in 1976 dollars. Using the internet, find the CPI for the years 1976 and 2013.
d. Use the CPI data from part (c) to find the average hourly wage in 2013 dollars.
e. How many women are in the sample? How many men?
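Hint for part (d): converting a dollar amount from one year's prices to another's only requires the ratio of the two CPI values. The sketch below shows the arithmetic in Python purely as an illustration; the same calculation can be done in Stata or R. The function name to_target_dollars and all three numbers are made-up placeholders, not the actual CPI figures or the WAGE1 average.

```python
def to_target_dollars(amount, cpi_base, cpi_target):
    """Rescale a dollar amount from base-year prices to target-year prices."""
    return amount * cpi_target / cpi_base


# Hypothetical placeholder values (NOT real CPI data, NOT the WAGE1 mean).
wage_base = 10.0    # average wage in base-year dollars
cpi_base = 50.0     # CPI in the base year
cpi_target = 200.0  # CPI in the target year

wage_target = to_target_dollars(wage_base, cpi_base, cpi_target)
print(wage_target)  # 40.0
```

Because prices in this toy example quadruple, the wage expressed in target-year dollars is four times the base-year figure.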
2. Use the data in BWGHT to answer this question.
a. How many women are in the sample, and how many smoked during pregnancy?
b. What is the average number of cigarettes smoked per day? Is the average a good measure of the “typical” woman in this case? Explain.
c. Among women who smoked during pregnancy, what is the average number of cigarettes smoked per day? How does this compare with your answer from part (b), and why?
d. Find the average of in the sample. Why are only 1192 observations used to compute this average?
e. Report the average family income and its standard deviation in dollars.
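Hint for parts (b) through (d): an average taken over everyone can differ sharply from an average taken over a subgroup, and a variable with missing values is averaged over fewer observations than the full sample. The Python sketch below illustrates both ideas with made-up numbers; the lists cigs and educ are hypothetical toy data, not values from BWGHT, and the same steps can be carried out in Stata or R.

```python
from statistics import mean, median

# Hypothetical toy values: cigarettes per day for ten women, mostly zeros.
cigs = [0, 0, 0, 0, 0, 0, 0, 0, 10, 20]

overall_mean = mean(cigs)             # average over all ten women
overall_median = median(cigs)         # the middle of the distribution
smokers = [c for c in cigs if c > 0]  # keep only women who smoked
smoker_mean = mean(smokers)           # average among smokers only

print(overall_mean, overall_median, smoker_mean)

# Hypothetical toy values with missing entries recorded as None.
educ = [12, None, 16, None, 14]
observed = [e for e in educ if e is not None]  # drop missing entries
print(len(educ), len(observed), mean(observed))
```

In the toy data the overall mean is pulled above the median by two smokers, and the second average uses only three of the five observations because two are missing.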
3. Use the data in COUNTYMURDERS to answer this question. Use only the year 1996. One variable gives the number of murders reported in the county; another gives the number of executions of people sentenced to death in that county. Most states in the U.S. have the death penalty, but several do not.
a. How many counties are there in the data set? Of these, how many have zero murders? What percentage of counties have zero executions?
b. What is the largest number of murders? What is the largest number of executions? Why is the average number of executions so small?
c. Compute the correlation coefficient between the number of murders and the number of executions, and describe what you find.
d. You should have computed a positive correlation in part (c). Do you think that more executions cause more murders to occur? What might explain the positive correlation?
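Hint for part (c): the correlation coefficient is the sum of cross-products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The Python sketch below spells out that formula; the function name pearson_r and the lists x and y are hypothetical illustrations, not COUNTYMURDERS data. Stata and R both have built-in commands for this, so the sketch is only meant to show what those commands compute.

```python
import math


def pearson_r(x, y):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical toy values.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(round(r, 3))
```

The result always lies between -1 and 1. Its sign shows whether the two variables tend to move together, but it says nothing about which one, if either, drives the other.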
4. The data set in ALCOHOL contains information on a sample of men in the U.S. Two key variables are self-reported employment status and alcohol abuse (along with many other variables). The employment and alcohol-abuse variables are both binary, or indicator, variables: they take on only the values zero and one.
a. What is the percentage of the men in the sample that report abusing alcohol? What is the employment rate?
b. Consider the group of men who abuse alcohol. What is the employment rate?
c. What is the employment rate for the group of men who do not report abusing alcohol?
d. Discuss the difference in your answers to parts (b) and (c). Does this allow you to conclude that alcohol abuse causes unemployment? Explain.
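Hint for parts (a) through (c): the mean of a zero-one variable is the fraction of observations equal to one, so a rate within a subgroup is just the mean taken over that subgroup. The Python sketch below illustrates this with made-up indicators; the function name rate, the list names abuse and employ, and all values are hypothetical and are not taken from ALCOHOL. The same subgroup means can be computed in Stata or R.

```python
def rate(values):
    """Share of ones in a list of 0/1 indicators (identical to the mean)."""
    return sum(values) / len(values)


# Hypothetical toy indicators for eight men.
abuse = [1, 1, 0, 0, 0, 1, 0, 0]   # 1 = reports abusing alcohol
employ = [1, 0, 1, 1, 1, 1, 0, 1]  # 1 = employed

abuse_rate = rate(abuse)
employ_rate = rate(employ)

# Employment rate within each group, selected by the abuse indicator.
employ_if_abuse = rate([e for e, a in zip(employ, abuse) if a == 1])
employ_if_no_abuse = rate([e for e, a in zip(employ, abuse) if a == 0])

# Multiply by 100 to report percentages.
print(100 * abuse_rate, 100 * employ_rate)
print(100 * employ_if_abuse, 100 * employ_if_no_abuse)
```

A gap between the two subgroup rates in a sample like this describes an association only; it does not by itself show what causes the difference.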