
Homework #1

SOCY 3115

Spring 20

Read the Syllabus and FAQ on how to do your homework before beginning the assignment!

To get consideration for full credit, you must:

· Follow directions;

· Show all work required to arrive at the answer (statistical calculations often require multiple steps, so you need to write these down, not just skip to the final answer)

· Use appropriate statistical notation at all times (e.g. if you are calculating a population mean, begin with the equation for population mean)

· Use units in your answer, where appropriate (e.g. a mean time would be “6.5 hours” rather than just “6.5”)

Understanding the Structure of Data

1. For the following rectangular dataset:

Id  Highest degree   Works full-time  Annual income cat
1   Did not grad HS  Yes              Low
2   HS dip           Yes              Low
3   HS dip           No               Med
4   BA               No               Low
5   BA               Yes              Med
6   MA               Yes              High
7   HS dip           Yes              Med

a. What is the unit-of-analysis of the dataset?

b. How many variables are in the dataset?

c. How many observations/cases are in the dataset?

d. For each variable that is not named “id”:

i. What is the variable name?

ii. What is the level-of-measurement?

iii. What are the values for the variable?

iv. If you had to make a guess, what do you think the “question” was that was asked of the unit-of-analysis to get these data? (for example, if we had a continuous variable called “num_pets” the question might be “How many pets live in your household?”)

2. For the following rectangular dataset:

Id  num_bdrms  num_bthrms  sqft  Ranch
1   4          3           3200  Yes
2   2          1.5         2800  Yes
3   2          1           1200  Yes
4   3          2           1500  No
5   2          2           1100  No

a. What is the unit-of-analysis of the dataset?

b. How many variables are in the dataset?

c. How many observations/cases are in the dataset?

d. For each variable that is not named “id”:

i. What is the variable name?

ii. What is the level-of-measurement? Before answering, be sure to consult the slide called “Level of measurement – language to use”. Use the formal language!

iii. What are the values for the variable?

iv. If you had to make a guess, what do you think the “question” was that was asked of the unit-of-analysis to get these data? (for example, if we had a continuous variable called “num_pets” the question might be “How many pets live in your household?”)

3. For each of the following questions: (1) construct a dataset with one variable and three observations; (2) add data that could theoretically have been collected (just make up the responses to the question); and (3) indicate the level-of-measurement of the variable. I’ve done two examples for you.

Example#1:

What is your current age? (individual is the unit-of-analysis)

id age

1 25

2 32

3 61

The age variable is continuous/interval ratio.
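For readers who think in code, the rectangular layout of Example #1 can be sketched in Python as a list of rows, one per observation, with one key per column. This is only an illustration of the row/column structure; the names `rows`, `non_id_columns`, and `n_observations` are made up for this sketch and are not part of the assignment.

```python
# Example #1 as a rectangular dataset: one row per individual
# (the individual is the unit-of-analysis).
rows = [
    {"id": 1, "age": 25},
    {"id": 2, "age": 32},
    {"id": 3, "age": 61},
]

# Every row has the same columns; "id" only labels the case.
non_id_columns = [name for name in rows[0] if name != "id"]
n_observations = len(rows)

print(non_id_columns)   # ['age']
print(n_observations)   # 3
```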

Example#2:

What is the size of this hospital based on number of beds? (hospital is the unit-of-analysis) Answers can be small (1-100 beds), medium (101-500 beds), large (501-1000 beds), or extra large (1001+ beds).

id hosp_size

1 med

2 med

3 extra large

The hospital size variable is categorical ordinal.

Now you do it:

a. Should the Senate vote to remove Trump from office, yes or no? (individual is the unit-of-analysis)

b. What was your total income for 2019? (individual is the unit-of-analysis)

c. Which of the following income categories contains your total income for 2019? Answers can be $0; $1-$49,999; $50,000-$99,999; $100,000+ (individual is the unit-of-analysis)

d. Do you think we should abolish the Electoral College system of electing president? Answers can be yes absolutely; yes probably; not sure; probably not; absolutely not. (individual is the unit-of-analysis)

4. For each of the following datasets: (a) identify the unit-of-analysis and (b) indicate the level of measurement for each variable.

Dataset 1

Country  Gini index of inequality  Population (millions)
Austria  26                        8
France   31                        66
US       41                        316
UK       34                        66
Russia   41                        143
Mexico   47                        122

Dataset 2

Credit Score  Race/ethnicity  PersonID
5             1               1
6             1               2
2             3               3
9             2               4
10            1               5
2             2               6

Dataset 3

Company  Mean age of workers  Percent of workers w/bach degree
Google   42                   82
Hulu     41                   86
Hooli    38                   82
Garigo   26                   70
Placebo  51                   64

Dataset 4

Average temperature  State       Crime Rate (crimes/10,000 people)
60                   New Mexico  35
52                   New York    46
56                   Wyoming     66
53                   Indiana     35
56                   Kansas      40
51                   Colorado    34

5. For each of the following survey questions, identify the type of variable/level-of-measurement that would be created from the data collected.

a. What do you think is the ideal number of children to have?

b. Do you have any pets living in your household?

c. What is the highest level of education you have achieved up to this point?

· Less than high school

· High school diploma or equivalent (e.g. GED)

· Some college

· Bachelor’s degree

· Graduate degree

d. Which of the following best describes your support for the U.S. plan to construct a Death Star Galactic Superweapon?

· Strongly support

· Support

· No opinion

· Do not support

· Strongly do not support

e. How many living cousins do you have?

f. In your opinion, does your use of marijuana impact your ability to complete your school work?

· No

· Yes

g. How many minutes does it take for you to commute to campus on a typical day?

h. On how many occasions in your adult life have you slept in your car because you had nowhere else to sleep?

· 0

· 1-3

· 4 or more times

6. The bulleted items below are actual Denver homicide data!

· In 2015, there were 50 homicides in Denver.

· In 2016, there were 56 homicides in Denver.

a. From 2015 to 2016, the annual number of Denver homicides increased by ________ homicides. Is this a relative or absolute measure?

b. From 2015 to 2016, the annual number of Denver homicides increased by ______ percent. Is this a relative or absolute measure?

· In 2015, 35 of Denver’s 50 homicides were by gunshot.

· In 2016, 44 of Denver’s 56 homicides were by gunshot.

a. In 2015, _________ percent of Denver’s homicides were by gunshot. Is this a relative or absolute measure?

b. In 2016, _________ percent of Denver’s homicides were by gunshot. Is this a relative or absolute measure?

c. From 2015 to 2016, the percent of Denver’s homicides that were gunshot homicides increased ________ percentage points. Is this a relative or absolute measure?

d. From 2015 to 2016, the percent of Denver’s homicides that were gunshot homicides increased __________ percent. Is this a relative or absolute measure?
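The blanks above rest on two different calculations: a simple difference and a percent change. The sketch below shows both on made-up numbers, not the Denver figures. The function names `difference` and `percent_change` and all the values are hypothetical, chosen only to show the arithmetic.

```python
def difference(old, new):
    """Change expressed in the original units (new minus old)."""
    return new - old


def percent_change(old, new):
    """Change expressed as a percentage of the starting value."""
    return (new - old) * 100 / old


# Hypothetical counts: 200 events one year, 250 the next.
print(difference(200, 250))      # 50 (events)
print(percent_change(200, 250))  # 25.0 (percent)

# Hypothetical shares: 40 percent one year, 50 percent the next.
print(difference(40, 50))        # 10 (percentage points)
print(percent_change(40, 50))    # 25.0 (percent)
```

Note that when the inputs are themselves percentages, the difference is in percentage points, while the percent change is still a percentage of the starting value.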

7. Assume we collected data on 20 students in this class. We asked them these questions:

· Do you feel you are mostly Republican, Democrat, or Independent?

· What is your current age?

· I’d like to ask your opinion on the amount of spending in different areas by the US government. For each type of spending, indicate whether you think the US government is spending too little, too much, or just right.

i. Spending on the military

ii. Spending on social programs

Id  military_spending  socialprog_spending  party_id
1   too little         too much             repub
2   too much           just right           dem
3   too little         too much             repub
4   just right         just right           indep
5   too little         too much             indep
6   just right         just right           dem
7   just right         too much             repub
8   just right         too little           dem
9   too little         too much             dem
10  just right         just right           indep
11  too much           too much             repub
12  too much           just right           dem
13  too much           just right           dem
14  just right         too little           dem
15  too little         too much             dem
16  just right         just right           indep
17  too much           too much             repub
18  too much           just right           dem
19  too much           just right           dem
20  just right         just right           indep

a. Using these data, create two separate frequency tables, one that summarizes opinions on military spending and one that summarizes opinions on social program spending. The tables should include columns for frequency, percent, and cumulative percent. Be sure each has a title at the top, labels at the bottom, and column headings. Be sure to include a “total” row at the bottom (see text for examples). Do this by hand, not by computer software.

b. Using these data, create two bar graphs, one that summarizes opinions on military spending and one that summarizes opinions on social program spending. Each graph should have a title at the bottom and labels where appropriate. Do this by hand, not by computer software.
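The tables and graphs for this question are to be done by hand. Purely to illustrate how the frequency, percent, and cumulative percent columns in part (a) relate to one another, here is a small Python sketch on made-up responses to a hypothetical agree/neutral/disagree item. The data, the category order, and all variable names are assumptions for illustration and are not drawn from the class dataset.

```python
from collections import Counter

# Hypothetical responses from 10 people (not the class data).
responses = [
    "agree", "neutral", "agree", "disagree", "agree",
    "neutral", "agree", "disagree", "neutral", "agree",
]
order = ["agree", "neutral", "disagree"]  # display order for the rows

counts = Counter(responses)
total = len(responses)

table = []
cumulative = 0.0
for category in order:
    freq = counts[category]
    pct = freq * 100 / total   # share of all responses
    cumulative += pct          # running total of the percent column
    table.append((category, freq, pct, cumulative))

print(f"{'Response':<10}{'Freq':>5}{'Pct':>8}{'Cum pct':>9}")
for category, freq, pct, cum in table:
    print(f"{category:<10}{freq:>5}{pct:>8.1f}{cum:>9.1f}")
print(f"{'Total':<10}{total:>5}{100.0:>8.1f}")
```

The cumulative percent in the last category row reaches 100, which is a quick check on a hand-built table as well.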