Recognize Units of Analysis in Research Scenarios

In a survey, a sample of participants is asked a series of questions. Each question becomes a variable in the study. By answering the questions, each person provides values for the variables. In this example, the individual is considered a unit of analysis, the real-world entity that is observed and for which data are recorded and used in statistical analysis. The unit of analysis is recognized as the basic building block of statistical analysis. Why? Because in most studies, the researcher will make observations using several variables and obtain a value on each of the variables for each unit of analysis. An individual person is often, but not always, the unit of analysis. Units of analysis can be groups of people or animals, organizations, physical objects, batches of physical objects, events, and other real-world entities.

Identify Units of Analysis in Data Files

The units of analysis are real-world entities clearly linked to your data. Together, the units of analysis in a study make up the sample, defined as a subset of all possible observations, and are the source of the information contained in the data set. Frequently, the number of units of analysis in the data set is denoted as n and is also called the sample size. For example, if the units of analysis are students and 50 students participate in the study, we would say that our sample size is 50 students (n = 50). Data are usually displayed in a matrix, with each row corresponding to the data from a single unit of analysis. In SPSS, your data will be a matrix with n rows, where each row corresponds to one of the units of analysis in your study and the columns correspond to variables, as shown in the image below.

Consider an example in which a psychologist is studying ways of managing stress. Participants are assigned to either a treatment group (in this condition, participants receive the treatment the psychologist wishes to study) or a control group (in this condition, participants do not receive the treatment the psychologist is studying). At the end of the study, the psychologist will measure two aspects of managing stress: perceived control and coping ability.

In the table below, the first variable is an identification number (“Participant Identification Number”) that helps the researcher verify which unit of analysis, or case, is the source of the data in the row. The next column, a variable labeled “Treatment Condition,” can take on one of two values (Control or Experimental). The values for this variable tell whether the observation came from a unit of analysis in the control or experimental group. The final two columns show the values for the variables the researcher measured: “Perceived Control” and “Coping Ability.” By looking at this data set, you can tell that the individual with participant identification number 1 was assigned to the control group, and the researcher observed the perceived control and coping ability scores to be 7.65 and 9.16, respectively.

Participant Identification Number | Treatment Condition | Perceived Control | Coping Ability
1 | Control | 7.65 | 9.16
2 | Control | 6.47 | 12.98
3 | Control | 8.82 | 9.16
4 | Experimental | 70.00 | 58.02
5 | Experimental | 61.76 | 81.68
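The row-per-unit layout described above can be sketched in Python. This is only an illustration of the idea, not how SPSS stores data; the key names (id, condition, perceived_control, coping_ability) are assumptions made for the example, while the values come from the table.

```python
# Each dictionary is one row, that is, one unit of analysis (a participant).
# Each key is a variable. The key names are illustrative assumptions.
rows = [
    {"id": 1, "condition": "Control", "perceived_control": 7.65, "coping_ability": 9.16},
    {"id": 2, "condition": "Control", "perceived_control": 6.47, "coping_ability": 12.98},
    {"id": 3, "condition": "Control", "perceived_control": 8.82, "coping_ability": 9.16},
    {"id": 4, "condition": "Experimental", "perceived_control": 70.00, "coping_ability": 58.02},
    {"id": 5, "condition": "Experimental", "perceived_control": 61.76, "coping_ability": 81.68},
]

# The sample size n is the number of rows (units of analysis).
n = len(rows)

# Group participant ids by treatment condition.
by_condition = {}
for row in rows:
    by_condition.setdefault(row["condition"], []).append(row["id"])

print("n =", n)
print(by_condition)
```

Counting rows gives the sample size because each unit of analysis occupies exactly one row.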

Term

Meaning

+∞

Positive infinity.

-.564

Observed value of the test statistic.

-∞

Negative infinity.

.004

p-value

.576

p-value

2-tailed

The alternative hypothesis states simply that there is a difference between the means but does not specify the direction of the difference.

61

61 is the degrees of freedom (df), calculated as n - 2 (63 - 2).

alpha

The probability of a type I error.

box-plot

A graph that displays key elements of distribution.

categorical variables

Variables that have a limited number of possible values; participants in the study get placed into one of a small number of categories for the variable.

central limit theorem

Regardless of the distribution of the population, if the sample size is relatively large (a rule of thumb is n > 30), the sampling distribution of sample means is close to normal.
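A small simulation can make the central limit theorem concrete. The sketch below assumes a skewed (exponential) population with mean 1 and standard deviation 1, a sample size of 40, and 5,000 repeated samples; all of these choices are illustrative, not part of the definition.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible


def sample_means(sample_size, num_samples):
    """Draw repeated samples from a skewed population and return their means."""
    means = []
    for _ in range(num_samples):
        sample = [random.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


means = sample_means(sample_size=40, num_samples=5000)

# The sample means should center near the population mean (1.0), with a
# spread near the population standard deviation divided by sqrt(n).
center = statistics.fmean(means)
spread = statistics.stdev(means)
print(round(center, 3), round(spread, 3))
```

Even though individual observations are strongly skewed, a histogram of `means` would look roughly bell-shaped.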

Cohen’s d

A measure of effect size.

confidence intervals

A range of values used to specify the likelihood that the population parameter is contained within a specified range.

continuous variable

A continuous variable is one based on an interval or ratio level of measurement. Between any two values for the variable, there is another possible value.

control group

The collection of participants in the condition of an experiment who do not receive the treatment. A group receiving an actual treatment can then be compared to the control group.

dependent variable

A measure of the outcome that allows us to determine whether the independent variable has an effect.

discrete

A variable that is based on an ordinal, interval, or ratio level of measurement and has a countable, not infinite, set of possible values.

distribution of a population

The distribution of all values for all elements of the population.

distribution of a sample

The distribution of actual observations based on the data that you collect.

distribution of the sample

Also called the sample distribution: for a variable, the distribution of values for the elements of the population that are actually observed. Note that a sample distribution is different from a sampling distribution.

element

An entity in the population that may be selected for the sample and then observed.

factor

An independent variable, usually categorical, whose levels define the groups or conditions being compared.

frequency distribution

A table or graph that shows the values of a variable and the number (count) of observations associated with each value.

general rule

Although different sources give slightly different information about assessing the strength of a correlation coefficient, we can use the following as a general rule for interpreting the correlation coefficient:
.8 to 1: very strong
.6 to .8: strong
.4 to .6: moderate
.2 to .4: weak
0 to .2: very weak to no relationship
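The general rule can be written as a small lookup function. This sketch makes two assumptions the rule itself does not state: strength is judged on the absolute value of r (so -.5 counts as moderate), and a value on a shared boundary such as .6 falls into the stronger category. The function name correlation_strength is hypothetical.

```python
def correlation_strength(r):
    """Label the strength of a correlation coefficient using the general rule."""
    size = abs(r)  # assumption: the sign shows direction, not strength
    if size > 1:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    # Assumption: boundary values belong to the stronger category.
    if size >= 0.8:
        return "very strong"
    if size >= 0.6:
        return "strong"
    if size >= 0.4:
        return "moderate"
    if size >= 0.2:
        return "weak"
    return "very weak to no relationship"


print(correlation_strength(0.85))
print(correlation_strength(-0.5))
```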

independent variable

The variable that is studied to see if it causes a change in a dependent variable.

interval

The level of measurement that addresses differences, or intervals, between entities.

interval estimates

A range of values that is likely to contain the population parameter.

levels of confidence

The probability that the population parameter is contained within a specified range of values. Usually, the level of confidence is 0.95 or 95%.

levels of measurement

Also called scale of measurement, describes the amount and type of information (nominal, ordinal, interval, and ratio) that is conveyed by the numbers or words assigned to real-world objects during the measurement process.

Levene’s test

Tests the null hypothesis that the two populations show equal variance.

margin of error

The amount of estimated error in the point estimate of a population parameter, determined by the level of confidence and the sampling distribution for the sample statistic. In estimating the population mean, the margin of error equals a critical value for the statistic times the standard error of the mean, e.g., $z_{\alpha/2} \cdot \sigma/\sqrt{n}$.
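As a worked illustration of the margin of error for a mean, the sketch below multiplies the critical z value by the standard error. It assumes the population standard deviation is known; the value sigma = 10 is hypothetical, and n = 50 echoes the sample size example earlier in this document.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sigma, n, confidence=0.95):
    """Critical z value times the standard error of the mean."""
    alpha = 1 - confidence
    # Critical value that leaves alpha / 2 in each tail of the standard normal.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sigma / sqrt(n)


# sigma = 10 is a made-up population standard deviation.
moe = margin_of_error(sigma=10, n=50)
print(round(moe, 2))
```

With these inputs the margin of error is about 2.77, so a 95% interval estimate would be the sample mean plus or minus roughly 2.77.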

mean

The average of the scores for a variable.

median

The middle value of the ordered observations for a variable; an appropriate measure of central tendency when measurement is at the ordinal, interval, or ratio level.

mode

The most frequently occurring value in the data set.

n

n = sample size

n1

n1 = the number of participants in sample 1

n2

n2 = the number of participants in sample 2

negative skew

This refers to the tail of the distribution appearing longer on the left-hand side of the distribution.

nominal

The lowest level of measurement, which addresses naming—identifying or categorizing objects using a name.

one-tailed

The alternative hypothesis is directional and states that one mean is greater than the other.

ordinal

The level of measurement above nominal that addresses ordering real-world entities.

outliers

Observation points that are distant from other observations.

p <.01

This indicates that the p-value (.000) is less than .01 and that the correlation test is statistically significant.

p-value

The probability of obtaining a result equal to or "more extreme" than what was actually observed, when the null hypothesis is true.

pictogram

A graphic character used in picture writing.

point estimate

An estimate of the unknown parameter of interest using a single value.

population

The set of all possible elements (entities and observations) to which the researcher wishes to generalize.

population distribution

For a variable, the distribution of all values for all elements of the population.

positive skew

This refers to the tail of the distribution appearing longer on the right side of the distribution.

qualitative

A variable based on nominal measurement.

quantitative

A variable with an ordinal, interval or ratio level of measurement.

r

r is the symbol indicating a Pearson’s correlation coefficient.

r-squared

The proportion of variability in the dependent variable that is accounted for by your model.

random assignment

Random assignment is placing experimental units in treatment conditions or control conditions by use of a random process.

random sampling

The selection of experimental units so that each element in the population has the same chance of being selected for the sample.

random variable

A variable whose value is determined by a random process such as being selected in a survey or being observed in an experiment.

ratio

The level of measurement that addresses proportion, or ratios, between entities.

relative frequency distribution

A table or graph that shows the values of a variable and the proportion of observations associated with each value using decimal fractions or percentages.
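Frequency and relative frequency distributions can be tabulated in a few lines. This sketch uses the Treatment Condition values from the five-participant table earlier in this document; the variable names are illustrative.

```python
from collections import Counter

# Treatment Condition values for the five participants in the example table.
conditions = ["Control", "Control", "Control", "Experimental", "Experimental"]

# Frequency distribution: the count of observations for each value.
frequency = Counter(conditions)

# Relative frequency distribution: each count divided by the sample size.
n = len(conditions)
relative_frequency = {value: count / n for value, count in frequency.items()}

for value in frequency:
    print(value, frequency[value], f"{relative_frequency[value]:.0%}")
```

The relative frequencies always sum to 1 (or 100%), which makes distributions from samples of different sizes easier to compare.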

research design

The overall plan for how a researcher will collect data.

sample

A subset of all possible observations.

sampling distribution

The distribution of a sample statistic.

sampling distribution of the sample mean

The distribution of values for the sample mean for all possible random samples of size n.

sampling error

The absolute value of a statistic minus the parameter being estimated.

simple random sampling

Each unit in the population has an equal chance of being selected into the sample.

statistical analyses

The use of probabilistic models to analyze data.

statistical inferences

The process of using sample information to make statements about population parameters.

statistical power

The probability of rejecting a null hypothesis if the null is false (i.e., the alternative is true).

statistically significant

Statistical significance means a null hypothesis has been rejected.

t-test for two independent groups

A statistical test used to examine whether two independent groups have different means on a dependent variable. This test is also sometimes referred to as an independent samples t-test.
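The test statistic for two independent groups can be sketched as follows. This version assumes equal population variances (the pooled-variance form), which is the assumption Levene’s test is used to check; the function name independent_t and the scores are hypothetical.

```python
from math import sqrt
from statistics import fmean, variance


def independent_t(group1, group2):
    """Return (t, df) for a pooled-variance independent samples t-test."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    standard_error = sqrt(pooled * (1 / n1 + 1 / n2))
    t = (fmean(group1) - fmean(group2)) / standard_error
    return t, df


# Made-up scores for two small groups, used only to exercise the function.
t, df = independent_t([1, 2, 3], [2, 3, 4])
print(round(t, 3), df)
```

The degrees of freedom follow the same n - 2 pattern as the glossary entry for 61, where n is the combined size of both groups. Turning t into a p-value requires the t distribution, which statistical software such as SPSS reports directly.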

two-tailed

The alternative hypothesis states simply that there is a difference between the means but does not specify the direction of the difference.

type i error

Rejecting the null hypothesis if the null is actually true.

type ii error

Incorrectly retaining a false null hypothesis (a "false negative").

unit of analysis

The real-world entity that is observed and for which data are recorded and used in statistical analysis.

value

A single observation defined for a variable.

variable

The mathematical representation of the real-world entity being measured.

variance

Variance is a measure of variability in a set of observations based on the approximate average of squared deviations from the mean.    
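One way to read "approximate average" here is that the sum of squared deviations is divided by n - 1 rather than n, which is the usual sample variance. The sketch below makes that assumption and checks a hand-written version against Python's statistics.variance, using the three control-group Perceived Control scores from the example table.

```python
import statistics


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    mean = sum(values) / len(values)
    squared_deviations = [(v - mean) ** 2 for v in values]
    return sum(squared_deviations) / (len(values) - 1)


# Perceived Control scores for the three control-group participants.
scores = [7.65, 6.47, 8.82]
manual = sample_variance(scores)
library = statistics.variance(scores)
print(round(manual, 4), round(library, 4))
```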

visual displays of data

Help researchers communicate the distribution and other key information (the story they are telling with their data) both effectively and efficiently.

µ1

mean for population 1

µ2

mean for population 2

β

β is the symbol researchers use when they report a standardized regression coefficient.

μ not primed

This indicates the population mean for the “not primed” condition.

μ not primed - μ primed >0

The alternative hypothesis specifies that the “not primed” condition will score higher than the “primed” condition.

μ primed

This indicates the population mean for the “primed” condition.