STAT 300 Assignment 3 Inferential Statistics Analysis and Writeup
STAT200: Assignment #1 - Descriptive Statistics Analysis Plan - Template
University of Maryland University College
STAT200 - Assignment #1: Descriptive Statistics Data Analysis Plan
Identifying Information
Student (Full Name): Mandy Francisco
Class: STAT200 6392
Instructor: Professor Roger Davis
Date: June 2, 2020
Scenario:
From the data set, I chose UniqueID #36. I am married, with an annual household income of $99,610. I am the head of the household and I am 36 years old. The household consists of two people: my spouse and me. We both hold bachelor’s degrees and are both employed, so the annual household income is the sum of our two incomes.
Table 1. Variables Selected for the Analysis
| Variable Name in the Data Set | Description (See the data dictionary for describing the variables.) | Type of Variable (Qualitative or Quantitative) |
|---|---|---|
| Variable 1: “Income” | Annual household income in USD | Quantitative |
| Variable 2: “Age Head Household” | Head of household’s age group | Quantitative |
| Variable 3: “Family Size” | Household family size | Quantitative |
| Variable 4: “Food Expenditures” | Total annual expenditure on food | Quantitative |
| Variable 5: “Entertainment Expenditures” | Total annual expenditure on entertainment | Quantitative |
Reason(s) for Selecting the Variables and Expected Outcome(s):
Variable 1: “Income” - Having more than one employed person in the household increases the total annual income, which gives us more room in our budget. This variable shows the total annual household income in USD.
Variable 2: “Age Head Household” - I chose this variable because each age group has its own general spending patterns; needs and wants differ between age groups. This variable shows the age of the head of household.
Variable 3: “Family Size” - This variable shows the number of people in the household. With two people, I expect expenditures to be about double those of a one-person household.
Variable 4: “Food Expenditures” - This variable shows the household’s annual food expenditure. I expect it to be about double that of a one-person household.
Variable 5: “Entertainment Expenditures” - This variable shows the household’s annual entertainment expenditure. I expect this to double as well, since there are two of us.
Data Set Description :
Proposed Data Analysis:
Measures of Central Tendency and Dispersion
Complete Table 2. Numerical Summaries of the Selected Variables and briefly explain why you choose those measurements. Note: The information for the required variable, “Income,” has already been completed and can be used as a guide for completing information on the remaining variables.
Table 2. Numerical Summaries of the Selected Variables
Variable Name
Measures of Central Tendency and Dispersion
Rationale for Why Appropriate
Variable 1:
“Income”
- Number of Observations
- Median
- Sample Standard Deviation
I am using median for two reasons:
- If there are any outliers or the data is not normally distributed, the median is the best measure of central tendency.
- The variable is quantitative.
I am using sample standard deviation for three reasons:
- The data is a sample from a larger data set.
- It is the most commonly used measure of dispersion.
- The variable is quantitative.
Variable 2: “Age Head Household”
- Number of Observations
- Median
- Sample Standard Deviation
I am using median for two reasons:
1. The median is a better measure of center than the mean if the age distribution is asymmetric (skewed).
2. The variable is quantitative.
I am using sample standard deviation for two reasons:
1. It shows the spread of the data.
2. It is the most commonly used measure of dispersion.
Variable 3: “Family Size”
- Number of Observations
- Mean
- Standard Deviation
I chose mean because:
- The variable is quantitative.
- The range is small, with few outliers.
I chose standard deviation because:
- The variable is quantitative.
- It is the most commonly used measure of dispersion.
Variable 4: “Food Expenditures”
- Number of Observations
- Mean
- Standard Deviation
I chose mean because:
- The variable is quantitative.
- The range is small, with few outliers.
- The mean is a good choice when there are no extreme values.
I chose standard deviation because:
1. The variable is quantitative.
2. It is the most commonly used measure of dispersion.
Variable 5: “Entertainment Expenditures”
- Number of Observations
- Mean
- Standard Deviation
I am using mean because:
1. The variable is quantitative.
2. The mean is the best measure of center for data that has few outliers and is approximately normally distributed.
I chose standard deviation because:
1. It shows the spread of the data.
2. The variable is quantitative.
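The measures named in Table 2 can be computed with a few lines of Python. The sketch below is illustrative only: the income values are made up (they are not taken from the assignment data set), and the helper name `summarize` is hypothetical. It relies on the standard-library `statistics` module, whose `stdev` function divides by n - 1 and so gives the sample standard deviation.

```python
import statistics

# Hypothetical household incomes in USD, made up for illustration.
# They are NOT values from the assignment data set.
incomes = [52000, 61000, 75000, 99610, 250000]


def summarize(values):
    """Return the count, mean, median, and sample standard deviation."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # statistics.stdev divides by n - 1 (sample standard deviation);
        # statistics.pstdev would divide by n (population version).
        "sample_sd": statistics.stdev(values),
    }


summary = summarize(incomes)
print(summary)
```

In this made-up list, the single large value pulls the mean well above the median, which illustrates why the median is the safer measure of center when outliers or skew are possible.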
Graphs and/or Tables
Table 3. Type of Graphs and/or Tables for Selected Variables
| Variable Name | Graph and/or Table | Rationale for Why Appropriate |
|---|---|---|
| Variable 1: “Income” | Graph: I will use a histogram to show the distribution of income. | A histogram is one of the best plots for showing the distribution of quantitative data, including whether it is roughly normal or skewed. |
| Variable 2: “Age Head Household” | Graph: I will use a box plot to show the distribution of the head of household’s age. | A box plot shows the median, quartiles, and outliers, which makes any skew in the age data easy to see. |
| Variable 3: “Family Size” | Graph: I will use a pie chart to show family size. | A pie chart shows the share of households at each family size. |
| Variable 4: “Food Expenditures” | Graph: I will use a histogram to show annual food expenditures. | A histogram is well suited to showing the distribution of quantitative data. |
| Variable 5: “Entertainment Expenditures” | Graph: I will use a histogram to show annual entertainment expenditures. | A histogram is well suited to showing the distribution of quantitative data. |
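Each graph in Table 3 is a picture of a small numerical summary, and that summary can be sketched in Python without any plotting library. The example below is an illustration under stated assumptions: all values are made up (not from the assignment data set), and the function names `histogram_counts`, `five_number_summary`, and `category_shares` are hypothetical helpers. The quartiles come from `statistics.quantiles` with its default method; other software may use a slightly different quartile rule.

```python
import statistics
from collections import Counter

# Hypothetical values for illustration only (NOT from the assignment data set).
food = [4200, 5100, 5600, 6100, 6800, 7300, 8900]  # annual food spending, USD
ages = [24, 29, 36, 41, 47, 55, 68]                # age of head of household
family_size = [1, 2, 2, 2, 3, 4, 4, 2]             # people per household


def histogram_counts(values, bin_width):
    """Count values per bin [start, start + bin_width): the bars of a histogram."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def five_number_summary(values):
    """Minimum, Q1, median, Q3, maximum: the numbers a box plot draws."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, q2, q3, max(values)


def category_shares(values):
    """Share of observations at each value: the slices of a pie chart."""
    counts = Counter(values)
    return {k: counts[k] / len(values) for k in sorted(counts)}


print(histogram_counts(food, 2000))
print(five_number_summary(ages))
print(category_shares(family_size))
```

A plotting library would draw the same quantities; computing them directly shows what each graph type actually conveys about a variable.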