8210 wk2 assignment
Recognize Units of Analysis in Research Scenarios
In a survey, a sample of participants is asked a series of questions. Each question becomes a variable in the study. By answering the questions, each person provides values for the variables. In this example, the individual is considered a unit of analysis, the real-world entity that is observed and for which data are recorded and used in statistical analysis. The unit of analysis is recognized as the basic building block of statistical analysis. Why? Because in most studies, the researcher will make observations using several variables and obtain a value on each of the variables for each unit of analysis. An individual person is often, but not always, the unit of analysis. Units of analysis can be groups of people or animals, organizations, physical objects, batches of physical objects, events, and other real-world entities.
Identify Units of Analysis in Data Files
The units of analysis are real-world entities clearly linked to your data. All of the units of analysis in a study comprise the sample, defined as a subset of all possible observations, and are the source of the information contained in the data set. Frequently, the number of units of analysis in the data set is denoted as n and is also called the sample size. For example, if the units of analysis are students and there are 50 students participating in the study, we would say that our sample size is 50 students (n = 50). Data are usually displayed in a matrix, with each row corresponding to the data from a single unit of analysis. In SPSS, your data will be a matrix with n rows where each row corresponds to one of the units of analysis in your study, and the columns correspond to variables, as shown in the image below.
Consider an example in which a psychologist is studying ways of managing stress. Participants are assigned to either a treatment group (in this condition, participants receive the treatment the psychologist wishes to study) or control group (in this condition, participants do not receive the treatment the psychologist is studying). At the end of the study, the psychologist will measure two aspects of managing stress: perceived control and coping ability.
In the table below, the first variable is an identification number (“Participant Identification Number”) that helps the researcher verify which unit of analysis, or case, is the source of the data in the row. The next column, a variable labeled “Treatment Condition,” can take on one of two values (Control or Experimental). The values for this variable tell whether the observation came from a unit of analysis in the control or experimental group. The final two columns show the values for the variables the researcher measured: “Perceived Control” and “Coping Ability.” By looking at this data set, you can tell that the individual with participant identification number 1 was assigned to the control group, and the researcher observed perceived control and coping ability scores of 7.65 and 9.16, respectively.
| Participant Identification Number | Treatment Condition | Perceived Control | Coping Ability |
|---|---|---|---|
| 1 | Control | 7.65 | 9.16 |
| 2 | Control | 6.47 | 12.98 |
| 3 | Control | 8.82 | 9.16 |
| 4 | Experimental | 70.00 | 58.02 |
| 5 | Experimental | 61.76 | 81.68 |
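The row-per-unit layout described above can be sketched in Python. This is only an illustration, not SPSS output: the field names (`participant_id`, `condition`, `perceived_control`, `coping_ability`) are assumptions chosen for readability, and the values are copied from the table.

```python
from collections import Counter

# Each dictionary is one row of the data matrix, i.e., one unit of analysis.
# Field names are illustrative assumptions; values are copied from the table.
data = [
    {"participant_id": 1, "condition": "Control", "perceived_control": 7.65, "coping_ability": 9.16},
    {"participant_id": 2, "condition": "Control", "perceived_control": 6.47, "coping_ability": 12.98},
    {"participant_id": 3, "condition": "Control", "perceived_control": 8.82, "coping_ability": 9.16},
    {"participant_id": 4, "condition": "Experimental", "perceived_control": 70.00, "coping_ability": 58.02},
    {"participant_id": 5, "condition": "Experimental", "perceived_control": 61.76, "coping_ability": 81.68},
]

# The sample size n is the number of units of analysis (rows).
n = len(data)

# The variables are the columns shared by every row.
variables = list(data[0].keys())

# Count how many units of analysis fall in each treatment condition.
condition_counts = Counter(row["condition"] for row in data)

print(f"n = {n}")
print("variables:", variables)
print("units per condition:", dict(condition_counts))
```

Counting rows gives the sample size, and each key corresponds to one column (variable) of the data matrix.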
NEED
Within APA 7.0 there is a subsection called 'clarity.' There are two aspects of clarity. The first is the selection of words, sometimes with immediately following dependent clauses that provide operational definitions, to ensure a universal audience can readily access and understand the text. When using case-specific terms (e.g., mode), make sure to include a definition so an audience is not inadvertently excluded. The second aspect of clarity is to avoid personal and demonstrative pronouns, which can exclude an audience. Take a look: "They indicated their idea was in the best interest of all of them." Any questions? Pronouns cripple a passive audience. Do not be lazy or informal: use the antecedents.
When sharing nominal data with a general, passive (cannot respond) audience, an effective writer always provides an explanatory text just before each table, figure, or graph. Informally, the writer offers a figurative heads-up: a visual display of data is coming, and here is an explanation. The explanatory text identifies the nominal data to be discussed (e.g., "see Figure 1") and orients an audience to what will be seen. The explanatory narrative considers:
- what is the purpose of the data in relationship to the general content discussed?
- what is the explanation of the data as organized [designed, presented] so the audience is not distracted trying to orient to labels, columns, and so on?
- link data to the statistical analysis used, for example: "...the t test of independent samples indicates a significant change in mean after treatment."
An effective explanatory text has an audience anticipating as opposed to questioning.
NOTE: there is not a single personal or demonstrative pronoun within the announcement [except the obvious example].
INSTRUCTION FEEDBACK Please open and reflect upon the course announcement: Assignment Tutorial. Within, you will see the step-by-step expected product format. The title page is malformed. The initial heading does not inform. There is no topic sentence. The first sentence starts with a pronoun [?]. "It is possible...", "In order to...", and "It is important..." are phrases best avoided within quant methods and, in addition, mark a writing pattern not allowed in APA. Where are the expected, separated, explained SPSS data displays within the product?
Four Levels of Measurement
In social scientific research, measurement is the process of assigning numbers or words to observations made in the real world. It is the foundation of the research process because it results in data that can be examined using mathematical operations. Because different things have different properties, what is being measured (what type of variable) determines how it is measured or, more specifically, the level at which it can be measured. That’s what the term levels of measurement describes, and there are four recognized levels.
1. Ratio
2. Interval
3. Ordinal
4. Nominal
The level at which something will be or was measured affects the values recorded for the units of analysis and the mathematical operations that can be performed on the data collected: In general, the higher the level of measurement the less restricted researchers are in the types of mathematical operations they can perform on the data. Researchers, therefore, want to collect data at the highest level possible.
The graphic shows the four levels of measurement in order from highest (least restricted/most informative) to lowest (most restricted/least informative).
Although the levels of measurement are ranked from most to least informative, all four levels have a role in conducting social science research. Again, knowing the level of measurement is important because those at lower levels may limit the kinds of statements you can make about your results. A closer look at the pyramid, starting at the bottom or lowest level, follows.
Nominal Measurement
Merely assigning objects to unique categories is a type of measurement. For example, when researchers label participants in a study as male or female, they are performing measurement; specifically, they are measuring the variable of gender using what is known as a nominal scale. The nominal level of measurement addresses naming, that is, identifying or categorizing objects using a name. A variable that can be named is said to meet the naming criterion of measurement, which is that two things with the same name carry the same value. Also, note that the categories for a nominal variable should be mutually exclusive.
A nominal scale is the weakest form of measurement. Again, the values for variables measured at the nominal level merely identify or name the real-world entity being measured. Numbers are not even necessary for scale values when using a nominal scale; instead, the values can be descriptive names. For gender, for example, you might have two values: female (= 1) and male (= 2). There is no quantitative relationship between the two numerical values (that is, the numbers should not be interpreted to mean that males are "one unit more" or larger than females); the numbers just stand in for the names of the categories. You could just as easily write female and male as the scale values. Other examples of nominal variables are the numbers on baseball players' uniforms, social security numbers, and race.
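The point that nominal codes only stand in for category names can be sketched in Python. The five observations below are hypothetical; only the coding (female = 1, male = 2) comes from the text.

```python
from collections import Counter
from statistics import mode

# Hypothetical nominal observations, coded as in the text: female = 1, male = 2.
codes = [1, 2, 2, 1, 2]
labels = {1: "female", 2: "male"}

# Counting category membership and finding the mode are meaningful
# operations at the nominal level.
counts = Counter(labels[c] for c in codes)
most_common_category = labels[mode(codes)]

# Swapping the codes (female = 2, male = 1) changes nothing about the
# categories, which shows the numbers carry no quantity or order.
recoded = [3 - c for c in codes]
recoded_labels = {2: "female", 1: "male"}
recoded_counts = Counter(recoded_labels[c] for c in recoded)

print(counts)
print(most_common_category)
print(counts == recoded_counts)
```

An average of the codes could be computed, but it would change under the recoding, so it would say nothing about the real-world categories.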
Ordinal Measurement
Moving up the pyramid, the ordinal level of measurement is next; it addresses ordering real-world entities. A variable that can be ordered is said to meet the ordering criterion of measurement. More specifically, this criterion requires that objects can be placed in ascending or descending order.
Example 1: Emojis
A set of emojis can serve as a good starting point for understanding ordinal measurement, as ordinal scales are sometimes challenging:
The emojis of course symbolize real-world emotions ranging from very happy (left) to very sad (right). The set of emojis comprises a scale, and if you associate a number with each emotional state, the ordinal scale could look like this:
The larger the number, the happier the emotion. Note, though, that it is difficult to say that the difference between Very Sad and Sad is exactly the same as the difference between any other two consecutive values on the scale. This is a characteristic of the ordinal level of measurement: ordinal measurement allows us to assign values that are in rank order, but the distances between the values are not necessarily equal. Note, also, that there is no value that represents zero in the real world; that is, there is no such thing as zero emotion. Noting this feature is important for determining the level of measurement for a scale, which will be discussed more below in the section on ratio measurements. Researchers often treat ordinal data as being on an interval level of measurement; you will learn more about this as you continue your studies in statistics.
Example 2: Measuring the Hardness of Minerals
For now, consider a more academic example of ordinal measurement: that of measuring the hardness of minerals. Hard minerals have the property that they can scratch softer minerals, and the softer mineral cannot scratch the harder mineral. The measurement process involves taking two minerals and seeing which mineral scratches the other. Diamond is the hardest mineral and scratches all other minerals, and talc is the softest mineral and cannot scratch any other mineral. Geologists use this process, and the result is the Mohs Hardness Scale, shown below.
| Mineral | Hardness |
|---|---|
| Talc | 1 |
| Gypsum | 2 |
| Calcite | 3 |
| Fluorite | 4 |
| Apatite | 5 |
| Orthoclase | 6 |
| Quartz | 7 |
| Topaz | 8 |
| Corundum | 9 |
| Diamond | 10 |
Mohs Hardness Scale. Source: http://geology.com/minerals/mohs-hardness-scale.shtml
Note that the differences between adjacent numbers do not reflect equal differences in hardness. Researchers cannot say, for example, that diamond is ten times harder than talc. However, looking at the values above, you can say that diamond is harder than apatite, which is harder than talc. The claim is that the greater the number, the harder the mineral. The variable mineral hardness, therefore, satisfies the ordering criterion of measurement.
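The ordering criterion for the Mohs values can be sketched in Python. The helper name `is_harder` is an assumption for illustration; the scale values are copied from the table.

```python
# Mohs hardness values copied from the table above.
mohs = {
    "Talc": 1, "Gypsum": 2, "Calcite": 3, "Fluorite": 4, "Apatite": 5,
    "Orthoclase": 6, "Quartz": 7, "Topaz": 8, "Corundum": 9, "Diamond": 10,
}

def is_harder(a: str, b: str) -> bool:
    """Return True if mineral a ranks above mineral b on the scale."""
    return mohs[a] > mohs[b]

# Ordering comparisons and ranking are meaningful at the ordinal level.
ranked = sorted(mohs, key=mohs.get, reverse=True)

print(is_harder("Diamond", "Apatite"))
print(is_harder("Apatite", "Talc"))
print(ranked)

# Arithmetic such as mohs["Diamond"] / mohs["Talc"] can be computed, but the
# result is not meaningful: the scale does not claim equal steps in hardness.
```

Only comparisons such as greater than, less than, and rank position are supported by the scale.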
Interval and Ratio Measurement
Interval Measurement
The level of measurement second from the top of the pyramid is interval measurement, which addresses differences, or intervals, between entities. The interval criterion requires that differences in the real world correspond to differences in the values a scale uses. In addition to allowing researchers to categorize and rank-order entities, interval measurement has the additional property that the intervals on the scale are equivalent. That is, variables measured at the interval level contain the properties of both nominal and ordinal variables but also have equal distance between values on the scale. It is important to note, though, that the zero value for interval scales tends to be rather arbitrary and does not correspond to a meaningful real-world observation. In other words, a zero value on an interval scale does not indicate the absence of the quantity being measured.
The measurement of temperature would be considered an interval level of measurement. Considering temperature measured in both degrees centigrade and degrees Fahrenheit, as presented in the table that follows, may help you to understand interval scales of measurement:
| Real-World Entity Being Measured | Degrees Centigrade | Degrees Fahrenheit |
|---|---|---|
| A | 0 | 32 |
| B | 50 | 122 |
| C | 50 | 122 |
| D | 100 | 212 |
· First, notice the differences in temperature. The temperature difference between real-world entities A and C is the same as the difference between C and D, regardless of whether you use temperature measured in degrees Centigrade (50 degrees in both cases) or Fahrenheit (90 degrees in both cases).
· In contrast, the proportions are not the same. In Centigrade, D (100) has a value twice that of C (50), but in degrees Fahrenheit, D (212) is only about 1.74 times C (122).
· Using the definition of degree, physicists would agree that neither scale measures an absolute zero, but rather that equal intervals on either scale have the same real-world meaning.
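The difference-versus-ratio contrast can be checked with a short Python sketch using the table's values for entities A, C, and D. The function name `c_to_f` is an assumption for illustration; the conversion formula itself is the standard one.

```python
def c_to_f(celsius: float) -> float:
    """Convert degrees Centigrade to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Centigrade values for entities A, C, and D from the table.
a_c, c_c, d_c = 0.0, 50.0, 100.0
a_f, c_f, d_f = c_to_f(a_c), c_to_f(c_c), c_to_f(d_c)

# Equal differences on one scale stay equal on the other scale.
diff_c = (c_c - a_c, d_c - c_c)
diff_f = (c_f - a_f, d_f - c_f)

# Ratios are not preserved, because zero is arbitrary on both scales.
ratio_c = d_c / c_c
ratio_f = d_f / c_f

print("Centigrade differences:", diff_c)
print("Fahrenheit differences:", diff_f)
print("Centigrade ratio D/C:", ratio_c)
print("Fahrenheit ratio D/C:", round(ratio_f, 2))
```

The differences match within each scale while the D-to-C ratio changes with the scale, which is the behavior expected of interval measurement.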
Interval scales are used in a variety of settings. They are popular, in part, because many statistical procedures are based on the premise that differences in scale values have real-world meaning. Examples of constructs that are often viewed as interval measurement include intelligence as measured by an IQ test, depression measured by a self-report questionnaire in which item scores are added together, and some subjective judgments.
Ratio Measurement
The ratio level of measurement is at the top of the pyramid. Ratio measurement addresses proportions, or ratios, between entities. It allows researchers to make a variety of statements about the data, including that scale values are in proportion to one another. The ratio criterion, then, requires that equal proportions on the measurement scale correspond to equal proportions in the real world. In short, the important test of the ratio criterion is that a measurement of zero has real-world meaning.
Here’s another way to think of it:
Variables that can be measured at the ratio level contain all of the properties of nominal, ordinal, and interval variables, but have the additional property that the zero value for the variable is meaningful. Throughout the process of measurement, determining whether a measurement meets the two lower levels of measurement (nominal and ordinal) is often easy, but determining whether the two higher levels (interval and ratio) are met can be more challenging.
The key is paying attention to how differences in a given real-world construct are mapped into numbers along with the real-world concept that is mapped into zero.
For example, it’s easy to tell that weight is a ratio variable because zero means an entity weighs nothing. Measurement for the number of items correct on a math test is also easily a ratio measurement because a person can get zero items correct. However, mathematical ability as a psychological construct measured by the math test would be harder to defend as a ratio scale. The numbers on the test may order individuals according to mathematical ability and the interval criterion may also be met, but what does it mean in the real world to have zero mathematical ability? The researcher is responsible for arguing, possibly using data or other research sources, that the scale being used has the properties of measurement being claimed.
In general, if you are counting items or comparing them to a physical standard, you can use a ratio scale.
· Measures of distance, weight, dollars, and other things where zero has clear real-world meaning are also usually easily identified as ratio scale.
· Often, you can identify a scale as a ratio scale by examining changes to different units of measurement.
· If intervals that are equal on one scale are also equal on a second scale, and zero is the same on both scales, then the measurement process is most likely ratio measurement.
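The unit-change check in the list above can be sketched in Python for weight. The weights are hypothetical, and the kilogram-to-pound factor is an approximate value used only for illustration.

```python
import math

# Approximate conversion factor (an assumption for illustration).
KG_TO_LB = 2.20462

# Hypothetical weights in kilograms, including a true zero.
weights_kg = [0.0, 10.0, 20.0, 40.0]
weights_lb = [w * KG_TO_LB for w in weights_kg]

# Zero means "no weight" on both scales.
zero_preserved = weights_lb[0] == 0.0

# Equal intervals in kilograms (0 to 10, 10 to 20) remain equal in pounds.
interval_1_lb = weights_lb[1] - weights_lb[0]
interval_2_lb = weights_lb[2] - weights_lb[1]
intervals_equal = math.isclose(interval_1_lb, interval_2_lb)

# Ratios are preserved as well: 40 kg is twice 20 kg in either unit.
ratio_kg = weights_kg[3] / weights_kg[2]
ratio_lb = weights_lb[3] / weights_lb[2]
ratios_equal = math.isclose(ratio_kg, ratio_lb)

print(zero_preserved, intervals_equal, ratios_equal)
```

Unlike the temperature example, converting units here only multiplies by a constant, so zero, intervals, and ratios all carry over.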
The levels of measurement are, ultimately, cumulative or additive. That is, the ratio level of measurement satisfies all the measurement criteria required for the levels below it, the interval level those below it, the ordinal level the level below it. This progression is shown below, along with a summary of each level’s mathematical focus.
| Term | Meaning |
|---|---|
| +∞ | Positive infinity. |
| -.564 | Observed value of the test statistic. |
| -∞ | Negative infinity. |
| .004 | p-value |
| .576 | p-value |
| 2-tailed | The alternative hypothesis states simply that there is a difference between the means but does not specify the direction of the difference. |
| 61 | 61 is the degrees of freedom (df) calculated by n-2 (63-2) |
| alpha | The probability of a type I error. |
| box-plot | A graph that displays key elements of a distribution. |
| categorical variables | Variables that have a limited number of possible values; participants in the study get placed into one of a small number of categories for the variable. |
| central limit theorem | Regardless of the distribution of the population, if the sample size is relatively large (a rule of thumb is n > 30), the sampling distribution of sample means is close to normal. |
| Cohen’s d | A measure of effect size. |
| confidence intervals | A range of values used to specify the likelihood that the population parameter is contained within a specified range. |
| continuous variable | A continuous variable is one based on an interval or ratio level of measurement. Between any two values for the variable, there is another possible value. |
| continuous variables | A continuous variable is one based on an interval or ratio level of measurement. Between any two values for the variable, there is another possible value. |
| control group | The collection of participants in the condition of an experiment who do not receive the treatment. A group receiving an actual treatment can then be compared to the control group. |
| dependent variable | A measure of the outcome that allows us to determine whether the independent variable has an effect. |
| discrete | A variable that is based on an ordinal, interval, or ratio level of measurement and has a countable, not infinite, set of possible values. |
| distribution of a population | The distribution of all values for all elements of the population. |
| distribution of a sample | The distribution of actual observations based on the data that you collect. |
| distribution of the sample | Also called the sample distribution: for a variable, the distribution of values for the elements of the population that are actually observed. (Note that a sample distribution is different from a sampling distribution.) |
| element | An entity in the population that may be selected for the sample and then observed. |
| factor | An independent variable, usually categorical, whose levels define the groups or conditions being compared. |
| frequency distribution | A table or graph that shows the values of a variable and the number (count) of observations associated with each value. |
| general rule | Although different sources give slightly different information about assessing the strength of a correlation coefficient, we can use the following as a general rule for interpreting the correlation coefficient: .8 to 1: very strong; .6 to .8: strong; .4 to .6: moderate; .2 to .4: weak; 0 to .2: very weak to no relationship. |
| independent variable | The variable that is studied to see if it causes a change in a dependent variable. |
| interval | The level of measurement that addresses differences, or intervals, between entities. |
| interval estimates | A range of values that is likely to contain the population parameter. |
| levels of confidence | The probability that the population parameter is contained within a specified range of values. Usually, the level of confidence is 0.95 or 95%. |
| levels of measurement | Also called scale of measurement; describes the amount and type of information (nominal, ordinal, interval, and ratio) that is conveyed by the numbers or words assigned to real-world objects during the measurement process. |
| Levene’s test | Tests the null hypothesis that the two populations show equal variance. |
| margin of error | The amount of estimated error in the point estimate of a population parameter determined by the level of confidence and the sampling distribution for the sample statistic. In estimating the population mean, the margin of error equals a critical value for the statistic times the standard error of the mean, e.g., $z_{\alpha/2} \cdot \sigma / \sqrt{n}$. |
| mean | The average of the scores for a variable. |
| median | The middle value when the scores are ordered; an appropriate measure of central tendency when a measurement is at the ordinal, interval, or ratio level. |
| mode | The most frequently occurring value in the data set. |
| n | n = sample size |
| n1 | n1 = the number of participants in sample 1 |
| n2 | n2 = the number of participants in sample 2 |
| negative skew | This refers to the tail of the distribution appearing longer on the left-hand side of the distribution. |
| nominal | The lowest level of measurement, which addresses naming: identifying or categorizing objects using a name. |
| one-tailed | The alternative hypothesis is directional and states that one mean is greater than the other. |
| ordinal | The level of measurement above nominal that addresses ordering real-world entities. |
| outliers | Observation points that are distant from other observations. |
| p <.01 | This indicates that the p-value (.000) is less than .01 and that the correlation test is statistically significant. |
| p-value | The probability of obtaining a result equal to or "more extreme" than what was actually observed, when the null hypothesis is true. |
| pictogram | A graphic character used in picture writing. |
| point estimate | An estimate of the unknown parameter of interest using a single value. |
| population | The set of all possible elements (entities and observations) to which the researcher wishes to generalize. |
| population distribution | For a variable, the distribution of all values for all elements of the population. |
| positive skew | This refers to the tail of the distribution appearing longer on the right side of the distribution. |
| qualitative | A variable based on nominal measurement. |
| quantitative | A variable with an ordinal, interval, or ratio level of measurement. |
| r | r is the symbol indicating a Pearson’s correlation coefficient. |
| r-squared | The proportion of variability in the dependent variable that is accounted for by your model. |
| random assignment | Random assignment is placing experimental units in treatment conditions or control conditions by use of a random process. |
| random sampling | The selection of experimental units so that each element in the population has the same chance of being selected for the sample. |
| random variable | A variable whose value is determined by a random process such as being selected in a survey or being observed in an experiment. |
| ratio | The level of measurement that addresses proportions, or ratios, between entities. |
| ratio level | The level of measurement that addresses proportions, or ratios, between entities. |
| relative frequency distribution | A table or graph that shows the values of a variable and the proportion of observations associated with each value using decimal fractions or percentages. |
| research design | The overall plan for how a researcher will collect data. |
| sample | A subset of all possible observations. |
| sampling distribution | The distribution of a sample statistic. |
| sampling distribution of the sample mean | The distribution of values for the sample mean for all possible random samples of size n. |
| sampling error | The absolute value of a statistic minus the parameter being estimated. |
| simple random sampling | Each unit in the population has an equal chance of being selected into the sample. |
| statistical analyses | The use of probabilistic models to analyze data. |
| statistical inferences | The process of using sample information to make statements about population parameters. |
| statistical power | The probability of rejecting a null hypothesis if the null is false (i.e., the alternative is true). |
| statistically significant | Statistical significance means a null hypothesis has been rejected. |
| t-test for two independent groups | A statistical test used to examine whether two independent groups have different means on a dependent variable. This test is also sometimes referred to as an independent samples t-test. |
| two-tailed | The alternative hypothesis states simply that there is a difference between the means but does not specify the direction of the difference. |
| Type I error | Rejecting the null hypothesis if the null is actually true. |
| Type II error | Incorrectly retaining a false null hypothesis (a "false negative"). |
| unit of analysis | The real-world entity that is observed and for which data are recorded and used in statistical analysis. |
| value | A single observation defined for a variable. |
| variable | The mathematical representation of the real-world entity being measured. |
| variance | Variance is a measure of variability in a set of observations based on the approximate average of squared deviations from the mean. |
| visual displays of data | Help researchers communicate the distribution and other key information (the story they are telling with their data) both effectively and efficiently. |
| µ1 | Mean for population 1 |
| µ2 | Mean for population 2 |
| β | β is the symbol researchers use when they report a standardized regression coefficient. |
| μ not primed | This indicates the population mean for the “not primed” condition. |
| μ not primed - μ primed >0 | The alternative hypothesis specifies that the “not primed” condition will score higher than the “primed” condition. |
| μ primed | This indicates the population mean for the “primed” condition. |
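The glossary's margin of error entry (a critical value times the standard error of the mean) can be sketched in Python. This is a minimal sketch under stated assumptions: the population standard deviation of 10 is hypothetical, the sample size of 50 echoes the earlier student example, and the function name `margin_of_error` is chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sigma: float, n: int, confidence: float = 0.95) -> float:
    """Critical z value times the standard error of the mean (sigma / sqrt(n))."""
    alpha = 1 - confidence
    # Two-tailed critical value: the z score that leaves alpha / 2 in the upper tail.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sigma / sqrt(n)

sigma = 10.0  # hypothetical population standard deviation
n = 50        # sample size, as in the earlier example of 50 students

z_95 = NormalDist().inv_cdf(0.975)  # critical value for a 95% level of confidence
moe = margin_of_error(sigma, n)

print(f"critical z for 95% confidence: {z_95:.2f}")
print(f"margin of error: {moe:.2f}")
```

Because n appears under a square root, quadrupling the sample size halves the margin of error.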