Homework #1
SOCY 3115
Spring 20
Read the Syllabus and FAQ on how to do your homework before beginning the assignment!
To get consideration for full credit, you must:
· Follow directions;
· Show all work required to arrive at the answer (statistical calculations often require multiple steps, so you need to write these down, not just skip to the final answer);
· Use appropriate statistical notation at all times (e.g., if you are calculating a population mean, begin with the equation for the population mean);
· Use units in your answer, where appropriate (e.g., a mean time would be “6.5 hours” rather than just “6.5”).
Understanding the Structure of Data
1. For the following rectangular dataset:
Id | Highest degree | Works full-time | Annual income cat
1 | Did not grad HS | Yes | Low
2 | HS dip | Yes | Low
3 | HS dip | No | Med
4 | BA | No | Low
5 | BA | Yes | Med
6 | MA | Yes | High
7 | HS dip | Yes | Med
a. What is the unit-of-analysis of the dataset?
b. How many variables are in the dataset?
c. How many observations/cases are in the dataset?
d. For each variable that is not named “id”:
i. What is the variable name?
ii. What is the level-of-measurement?
iii. What are the values for the variable?
iv. If you had to make a guess, what do you think the “question” was that was asked of the unit-of-analysis to get these data? (for example, if we had a continuous variable called “num_pets” the question might be “How many pets live in your household?”)
2. For the following rectangular dataset:
Id | num_bdrms | num_bthrms | sqft | Ranch
1 | 4 | 3 | 3200 | Yes
2 | 2 | 1.5 | 2800 | Yes
3 | 2 | 1 | 1200 | Yes
4 | 3 | 2 | 1500 | No
5 | 2 | 2 | 1100 | No
a. What is the unit-of-analysis of the dataset?
b. How many variables are in the dataset?
c. How many observations/cases are in the dataset?
d. For each variable that is not named “id”:
i. What is the variable name?
ii. What is the level-of-measurement? Before answering, be sure to consult the slide called “Level of measurement – language to use”. Use the formal language!
iii. What are the values for the variable?
iv. If you had to make a guess, what do you think the “question” was that was asked of the unit-of-analysis to get these data? (for example, if we had a continuous variable called “num_pets” the question might be “How many pets live in your household?”)
3. For each of the following questions: (1) construct a dataset with one variable and three observations; (2) add data that could have theoretically been collected (just make up the actual responses to the question); and (3) indicate the level-of-measurement of the variable. I’ve done two examples for you.
Example#1:
What is your current age? (individual is the unit-of-analysis)
id age
1 25
2 32
3 61
The age variable is continuous/interval ratio.
Example#2:
What is the size of this hospital based on number of beds? (hospital is the unit-of-analysis) Answers can be small (1-100 beds), medium (101-500 beds), large (501-1000 beds), or extra large (1001+ beds).
id hosp_size
1 med
2 med
3 extra large
The hospital size variable is categorical ordinal.
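The two example datasets above can also be written out as data structures, which may help make the terms "observation", "variable", and "level-of-measurement" concrete. The following is a minimal sketch in plain Python, not a required part of the assignment. The helper names `describe` and `largest`, and the category labels "small" and "large" (abbreviated by analogy with "med"), are assumptions made for illustration.

```python
# Each dict is one observation (row); each key is a variable (column).
ages = [
    {"id": 1, "age": 25},
    {"id": 2, "age": 32},
    {"id": 3, "age": 61},
]

hospitals = [
    {"id": 1, "hosp_size": "med"},
    {"id": 2, "hosp_size": "med"},
    {"id": 3, "hosp_size": "extra large"},
]

# An ordinal variable has categories with a meaningful order.
# Assumed labels: "small" and "large" do not appear in the example data.
HOSP_SIZE_ORDER = ["small", "med", "large", "extra large"]


def describe(dataset):
    """Return (number of observations, variable names other than id)."""
    variables = [name for name in dataset[0] if name != "id"]
    return len(dataset), variables


def largest(dataset, variable, order):
    """Return the highest-ranked category observed for an ordinal variable."""
    return max((row[variable] for row in dataset), key=order.index)


print(describe(ages))
print(describe(hospitals))
print(largest(hospitals, "hosp_size", HOSP_SIZE_ORDER))
```

Ranking the categories is meaningful for `hosp_size` because it is ordinal, but subtracting or averaging them would not be; those operations are reserved for interval-ratio variables such as `age`.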
Now you do it:
a. Should the Senate vote to remove Trump from office, yes or no? (individual is the unit-of-analysis)
b. What was your total income for 2019? (individual is the unit-of-analysis)
c. Which of the following income categories contains your total income for 2019? Answers can be $0; $1-$49,999; $50,000-$99,999; $100,000+ (individual is the unit-of-analysis)
d. Do you think we should abolish the Electoral College system of electing the president? Answers can be yes absolutely; yes probably; not sure; probably not; absolutely not. (individual is the unit-of-analysis)
4. For each of the following datasets: (a) identify the unit-of-analysis and (b) indicate the level of measurement for each variable.
Dataset 1
Country | Gini index of inequality | Population (millions)
Austria | 26 | 8
France | 31 | 66
US | 41 | 316
UK | 34 | 66
Russia | 41 | 143
Mexico | 47 | 122
Dataset 2
Credit Score | Race/ethnicity | PersonID
5 | 1 | 1
6 | 1 | 2
2 | 3 | 3
9 | 2 | 4
10 | 1 | 5
2 | 2 | 6
Dataset 3
Company: Hulu, Hooli, Garigo, Placebo
Mean age of workers: 42, 41, 38, 26, 51
Percent of workers w/bach degree: 82, 86, 82, 70, 64
Dataset 4
Average temperature | State | Crime Rate (crimes/10,000 people)
60 | New Mexico | 35
52 | New York | 46
56 | Wyoming | 66
53 | Indiana | 35
56 | Kansas | 40
51 | Colorado | 34
5. For each of the following survey questions, identify the type of variable/level-of-measurement that would be created from the data collected.
a. What do you think is the ideal number of children to have?
b. Do you have any pets living in your household?
c. What is the highest level of education you have achieved up to this point?
· Less than high school
· High school diploma or equivalent (e.g. GED)
· Some college
· Bachelor’s degree
· Graduate degree
d. Which of the following best describes your support for the U.S. plan to construct a Death Star Galactic Superweapon?
· Strongly support
· Support
· No opinion
· Do not support
· Strongly do not support
e. How many living cousins do you have?
f. In your opinion, does your use of marijuana impact your ability to complete your school work?
· No
· Yes
g. How many minutes does it take for you to commute to campus on a typical day?
h. On how many occasions in your adult life have you slept in your car because you had nowhere else to sleep?
· 0
· 1-3
· 4 or more times
6. The bulleted items below are actual Denver homicide data!
· In 2015, there were 50 homicides in Denver.
· In 2016, there were 56 homicides in Denver.
a. From 2015 to 2016, the annual number of Denver homicides increased by ________ homicides. Is this a relative or absolute measure?
b. From 2015 to 2016, the annual number of Denver homicides increased by ______ percent. Is this a relative or absolute measure?
· In 2015, 35 of Denver’s 50 homicides were by gunshot.
· In 2016, 44 of Denver’s 56 homicides were by gunshot.
a. In 2015, _________ percent of Denver’s homicides were by gunshot. Is this a relative or absolute measure?
b. In 2016, _________ percent of Denver’s homicides were by gunshot. Is this a relative or absolute measure?
c. From 2015 to 2016, the percent of Denver’s homicides that were gunshot homicides increased ________ percentage points. Is this a relative or absolute measure?
d. From 2015 to 2016, the percent of Denver’s homicides that were gunshot homicides increased __________ percent. Is this a relative or absolute measure?
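The distinction between an absolute change, a relative (percent) change, and a change in percentage points can be sketched in a few lines of Python. The numbers below are made up for illustration (a hypothetical city's burglary counts) and are not the Denver figures; the function names are likewise assumptions.

```python
def absolute_change(old, new):
    """Difference in the original units (e.g. number of burglaries)."""
    return new - old


def relative_change_pct(old, new):
    """Change expressed as a percent of the starting value."""
    return (new - old) * 100 / old


def share_pct(part, whole):
    """Percent of the whole made up by the part."""
    return part * 100 / whole


# Hypothetical data: 200 burglaries in year 1, 250 in year 2.
count_abs = absolute_change(200, 250)      # in burglaries
count_rel = relative_change_pct(200, 250)  # in percent

# Hypothetical data: 60 of the 200 and 100 of the 250 involved forced entry.
share_1 = share_pct(60, 200)
share_2 = share_pct(100, 250)

# Subtracting two percents gives percentage points (an absolute change).
points = absolute_change(share_1, share_2)
# Dividing that difference by the starting percent gives a relative change.
share_rel = relative_change_pct(share_1, share_2)

print(count_abs, count_rel)
print(share_1, share_2, points, round(share_rel, 1))
```

Note that the same pair of percents yields two different answers depending on whether the change is measured in percentage points or in percent, which is why the units must always be stated.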
7. Assume we collected data on 20 students in this class. We asked them these questions:
· Do you feel you are mostly Republican, Democrat, or Independent?
· What is your current age?
· I’d like to ask your opinion on the amount of spending in different areas by the US government. For each type of spending, indicate whether you think the US government is spending too little, too much, or just right.
i. Spending on the military
ii. Spending on social programs
Id | military_spending | socialprog_spending | party_id
1 | too little | too much | repub
2 | too much | just right | dem
3 | too little | too much | repub
4 | just right | just right | indep
5 | too little | too much | indep
6 | just right | just right | dem
7 | just right | too much | repub
8 | just right | too little | dem
9 | too little | too much | dem
10 | just right | just right | indep
11 | too much | too much | repub
12 | too much | just right | dem
13 | too much | just right | dem
14 | just right | too little | dem
15 | too little | too much | dem
16 | just right | just right | indep
17 | too much | too much | repub
18 | too much | just right | dem
19 | too much | just right | dem
20 | just right | just right | indep
a. Using these data, create two separate frequency tables, one that summarizes opinions on military spending and one that summarizes opinions on social program spending. The tables should include columns for frequency, percent, and cumulative percent. Be sure each has a title at the top, labels at the bottom, and column headings. Be sure to include a “total” row at the bottom (see text for examples). Do this by hand, not by computer software.
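The assignment itself is to be done by hand, but the meaning of the three columns (frequency, percent, cumulative percent) can be illustrated with a short sketch. The example below uses a made-up ordinal variable, self-rated health for 10 hypothetical respondents, not the class data. The function name `frequency_table` and the variable are assumptions for illustration only.

```python
from collections import Counter


def frequency_table(values, order):
    """Build rows of (category, frequency, percent, cumulative percent)."""
    counts = Counter(values)
    total = len(values)
    rows = []
    running = 0  # running count, used for the cumulative percent
    for category in order:
        freq = counts[category]
        running += freq
        rows.append((category, freq, freq * 100 / total, running * 100 / total))
    return rows, total


# Hypothetical responses, listed in no particular order.
health = ["good", "poor", "fair", "good", "good",
          "fair", "poor", "good", "fair", "good"]

rows, total = frequency_table(health, ["poor", "fair", "good"])

print("Self-rated health (hypothetical data)")
print(f"{'health':<8}{'freq':>6}{'pct':>8}{'cum pct':>10}")
for category, freq, pct, cum_pct in rows:
    print(f"{category:<8}{freq:>6}{pct:>8.1f}{cum_pct:>10.1f}")
print(f"{'total':<8}{total:>6}{100.0:>8.1f}")
```

The cumulative percent for each row is the percent of observations at or below that category, so the last row always reaches 100 and the column is only interpretable when the categories are listed in a meaningful order.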