Brilliant Answer



Recognize Units of Analysis in Research Scenarios

In a survey, a sample of participants is asked a series of questions. Each question becomes a variable in the study. By answering the questions, each person provides values for the variables. In this example, the individual is considered a unit of analysis, the real-world entity that is observed and for which data are recorded and used in statistical analysis. The unit of analysis is recognized as the basic building block of statistical analysis. Why? Because in most studies, the researcher will make observations using several variables and obtain a value on each of the variables for each unit of analysis. An individual person is often, but not always, the unit of analysis. Units of analysis can be groups of people or animals, organizations, physical objects, batches of physical objects, events, and other real-world entities.

Identify Units of Analysis in Data Files

The units of analysis are real-world entities clearly linked to your data. All of the units of analysis in a study comprise the sample, defined as a subset of all possible observations, and are the source of the information contained in the data set. Frequently, the number of units of analysis in the data set is denoted as n and is also called the sample size. For example, if the units of analysis are students and there are 50 students participating in the study, we would say that our sample size is 50 students (n = 50). Data are usually displayed in a matrix, with each row corresponding to the data from a single unit of analysis. In SPSS, your data will be a matrix with n rows where each row corresponds to one of the units of analysis in your study, and the columns correspond to variables, as shown in the image below.

Consider an example in which a psychologist is studying ways of managing stress. Participants are assigned to either a treatment group (in this condition, participants receive the treatment the psychologist wishes to study) or a control group (in this condition, participants do not receive the treatment the psychologist is studying). At the end of the study, the psychologist will measure two aspects of managing stress: perceived control and coping ability.

In the table below, the first variable is an identification number (“Participant Identification Number”) that helps the researcher verify which unit of analysis, or case, is the source of the data in the row. The next column, a variable labeled “Treatment Condition,” can take on one of two values (Control or Experimental). The values for this variable tell whether the observation came from a unit of analysis in the control or experimental group. The final two columns show the values for the variables the researcher measured: “Perceived Control” and “Coping Ability.” By looking at this data set, you can tell that the individual with participant identification number 1 was assigned to the control group, and that the researcher observed perceived control and coping ability scores of 7.65 and 9.16, respectively.

Participant Identification Number	Treatment Condition	Perceived Control	Coping Ability
1	Control	7.65	9.16
2	Control	6.47	12.98
3	Control	8.82	9.16
4	Experimental	70.00	58.02
5	Experimental	61.76	81.68
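
The row-and-column layout described above can be sketched in Python. This is only an illustration, assuming a plain list of dictionaries stands in for the SPSS data file; the names `rows`, `by_id`, and `counts` and the shortened variable names are hypothetical and do not come from the original material. Each dictionary is one row (one unit of analysis), and each key is a variable.

```python
# Each dict is one row of the data matrix: one unit of analysis (a participant).
# Keys are the variables; values are copied from the table above.
rows = [
    {"id": 1, "condition": "Control", "perceived_control": 7.65, "coping_ability": 9.16},
    {"id": 2, "condition": "Control", "perceived_control": 6.47, "coping_ability": 12.98},
    {"id": 3, "condition": "Control", "perceived_control": 8.82, "coping_ability": 9.16},
    {"id": 4, "condition": "Experimental", "perceived_control": 70.00, "coping_ability": 58.02},
    {"id": 5, "condition": "Experimental", "perceived_control": 61.76, "coping_ability": 81.68},
]

# The sample size n is the number of units of analysis (rows).
n = len(rows)

# Index rows by participant identification number to look up a single case.
by_id = {row["id"]: row for row in rows}
first = by_id[1]

# Count the units of analysis in each treatment condition.
counts = {}
for row in rows:
    counts[row["condition"]] = counts.get(row["condition"], 0) + 1

print(n)       # 5
print(first["condition"], first["perceived_control"], first["coping_ability"])
print(counts)  # {'Control': 3, 'Experimental': 2}
```

Counting rows gives the sample size directly (n = 5 here), and looking up participant 1 recovers the same values read off the table by hand.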

Term	Meaning
+∞	Positive infinity.
-.564	Observed value of the test statistic.
-∞	Negative infinity.
.004	p-value
.576	p-value
2-tailed	The alternative hypothesis states simply that there is a difference between the means but does not specify the direction of the difference.
61	61 is the degrees of freedom (df) calculated by n-2 (63-2)
alpha	The probability of a type I error.
box-plot	A graph that displays key elements of distribution.
categorical variables	Variables that have a limited number of possible values; participants in the study get placed into one of a small number of categories for the variable.
central limit theorem	Regardless of the distribution of the population, if the sample size is relatively large (a rule of thumb is n > 30), the sampling distribution of sample means is close to normal.
cohen’s d	A measure of effect size.
confidence intervals	A range of values used to specify the likelihood that the population parameter is contained within a specified range.
continuous variable	A continuous variable is one based on an interval or ratio level of measurement. Between any two values for the variable, there is another possible value.
continuous variables	A continuous variable is one based on an interval or ratio level of measurement. Between any two values for the variable, there is another possible value.
control group	The collection of participants in the condition of an experiment who do not receive the treatment. A group receiving an actual treatment can then be compared to the control group.
dependent variable	A measure of the outcome that allows us to determine whether the independent variable has an effect.
discrete	A variable that is based on an ordinal, interval, or ratio level of measurement and has a countable, not infinite, set of possible values.
distribution of a population	The distribution of all values for all elements of the population.
distribution of a sample	The distribution of actual observations based on the data that you collect.
distribution of the sample	Also called the sample distribution: for a variable, the distribution of values for the elements of the population that are actually observed. Note that the sample distribution is different from the sampling distribution.
element	An entity in the population that may be selected for the sample and then observed.
factor	A categorical independent variable whose levels define the groups or conditions being compared.
frequency distribution	A table or graph that shows the values of a variable and the number (count) of observations associated with each value
general rule	Although different sources give slightly different information about assessing the strength of a correlation coefficient, we can use the following as a general rule for interpreting the correlation coefficient: .8 to 1, very strong; .6 to .8, strong; .4 to .6, moderate; .2 to .4, weak; 0 to .2, very weak to no relationship.
independent variable	The variable that is studied to see if it causes a change in a dependent variable.
interval	The level of measurement that addresses differences, or intervals, between entities.
interval estimates	A range of values that is likely to contain the population parameter.
levels of confidence	The probability that the population parameter is contained within a specified range of values. Usually, the level of confidence is 0.95 or 95%.
levels of measurement	Also called scale of measurement, describes the amount and type of information (nominal, ordinal, interval, and ratio) that is conveyed by the numbers or words assigned to real-world objects during the measurement process.
levene’s test	Tests the null hypothesis that the two populations show equal variance.
margin of error	The amount of estimated error in the point estimate of a population parameter, determined by the level of confidence and the sampling distribution for the sample statistic. In estimating the population mean, the margin of error equals a critical value for the statistic times the standard error of the mean, e.g., $z_{\alpha/2} \cdot \sigma / \sqrt{n}$.
mean	The average of the scores for a variable.
median	The middle value of the ordered observations; an appropriate measure of central tendency when measurement is at the ordinal, interval, or ratio level.
mode	The most frequently occurring value in the data set.
n	n = sample size
n1	n1 = the number of participants in sample 1
n2	n2 = the number of participants in sample 2
negative skew	This refers to the tail of the distribution appearing longer on the left-hand side of the distribution.
nominal	The lowest level of measurement, which addresses naming—identifying or categorizing objects using a name.
one-tailed	The alternative hypothesis is directional and states that one mean is greater than the other.
ordinal	The level of measurement above nominal that addresses ordering real-world entities.
outliers	Observation points that are distant from other observations.
p <.01	This indicates that the p-value (.000) is less than .01 and that the correlation test is statistically significant.
p-value	The probability of obtaining a result equal to or "more extreme" than what was actually observed, when the null hypothesis is true.
pictogram	A graphic character used in picture writing.
point estimate	An estimate of the unknown parameter of interest using a single value.
population	The set of all possible elements (entities and observations) to which the researcher wishes to generalize.
population distribution	For a variable, the distribution of all values for all elements of the population.
positive skew	This refers to the tail of the distribution appearing longer on the right side of the distribution.
qualitative	A variable based on nominal measurement.
quantitative	A variable with an ordinal, interval or ratio level of measurement.
r	r is the symbol indicating a Pearson’s correlation coefficient
r-squared	The proportion of variability in the dependent variable that is accounted for by your model.
random assignment	Random assignment is placing experimental units in treatment conditions or control conditions by use of a random process.
random sampling	The selection of experimental units so that each element in the population has the same chance of being selected for the sample.
random variable	A variable whose value is determined by a random process such as being selected in a survey or being observed in an experiment.
ratio	The level of measurement that addresses proportion, or ratios between entities.
ratio level	The level of measurement that addresses proportion, or ratios, between entities.
relative frequency distribution	A table or graph that shows the values of a variable and the proportion of observations associated with each value using decimal fractions or percentages.
research design	The overall plan for how a researcher will collect data.
sample	A subset of all possible observations.
sampling distribution	The distribution of a sample statistic.
sampling distribution of the sample mean	The distribution of values for the sample mean for all possible random samples of size n.
sampling error	The absolute value of a statistic minus the parameter being estimated.
simple random sampling	Each unit in the population has an equal chance of being selected into the sample.
statistical analyses	The use of probabilistic models to analyze data.
statistical inferences	The process of using sample information to make statements about population parameters.
statistical power	The probability of rejecting a null hypothesis if the null is false (i.e., the alternative is true).
statistically significant	Statistical significance means a null hypothesis has been rejected.
t-test for two independent groups	A statistical test used to examine whether two independent groups have different means on a dependent variable. This test is also sometimes referred to as an independent samples t-test.
two-tailed	The alternative hypothesis states simply that there is a difference between the means but does not specify the direction of the difference.
type i error	Rejecting the null hypothesis if the null is actually true.
type ii error	Incorrectly retaining a false null hypothesis (a "false negative").
unit of analysis	The real-world entity that is observed and for which data are recorded and used in statistical analysis.
value	A single observation defined for a variable.
variable	The mathematical representation of the real-world entity being measured.
variance	Variance is a measure of variability in a set of observations based on the approximate average of squared deviations from the mean.
visual displays of data	Help researchers communicate the distribution and other key information (the story they are telling with their data) both effectively and efficiently.
µ1	mean for population 1
µ2	mean for population 2
β	β is the symbol researchers use when they report a standardized regression coefficient.
μ not primed	This indicates the population mean for the “not primed” condition.
μ not primed - μ primed >0	The alternative hypothesis specifies that the “not primed” condition will score higher than the “primed” condition.
μ primed	This indicates the population mean for the “primed” condition.
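
A few of the glossary entries describe small calculations: the general rule for correlation strength, the degrees of freedom shown as n - 2, and the margin of error. The Python sketch below is one possible way to express them, not a definitive implementation. The function names are hypothetical, and two choices are assumptions that the glossary does not state: the general rule is applied to the absolute value of r, and each band includes its lower bound.

```python
import math


def correlation_strength(r):
    """Label a correlation coefficient using the glossary's general rule.

    Assumptions (not stated in the glossary): the rule is applied to the
    absolute value of r, and each band includes its lower bound.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)
    if size >= 0.8:
        return "very strong"
    if size >= 0.6:
        return "strong"
    if size >= 0.4:
        return "moderate"
    if size >= 0.2:
        return "weak"
    return "very weak to no relationship"


def degrees_of_freedom(n):
    """Degrees of freedom computed as n - 2, as in the glossary entry for 61."""
    return n - 2


def margin_of_error(z_critical, sigma, n):
    """Critical value times the standard error of the mean, sigma / sqrt(n)."""
    return z_critical * sigma / math.sqrt(n)


print(correlation_strength(0.65))  # strong
print(degrees_of_freedom(63))      # 61
# 1.96 is the usual two-tailed z critical value for a 95% level of confidence;
# sigma = 10 and n = 100 are made-up inputs for illustration only.
print(round(margin_of_error(1.96, 10, 100), 2))  # 1.96
```

Because the standard error shrinks with the square root of n, quadrupling the sample size in `margin_of_error` halves the margin of error for the same critical value and sigma.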