Research Methods Lecture on Chapter 8: Control, The Keystone of the Experimental
Method
Control has several different meanings in scientific research. A scientist has control over what
he or she studies; the individual decides what hypothesis to create and how best to test it. A
scientist controls how subjects are selected to participate in the research. A scientist controls
how the study will be set up, its basic design, how many groups, what will be done to the groups
in what order, etc. And finally there is a meaning of control that relates to logic: how the logical
progression from cause to effect is "captured" in the research design and analysis. Your
textbook discusses each of these meanings of control and I'll
cover them here as well.
Control in Subject Selection and Assignment
Random Sampling is the single best way to select subjects to participate in your study. However,
a true random sample is not possible in practice. You are limited in resources, time, and distance. You
randomly sample from a select group and hope for the best. It's not good, but it's not too bad.
True random sample means everyone had an equal chance of being in
our experiment. Everyone? That's not really possible. And that's why
we can't ever really randomly sample. So we use the next best thing,
the subject pool. This is at least a pool of people from a variety of
majors all taking a required general studies course. So in a sense they
are a random selection of college students in a given geographic
region. Once you randomly sample from the pool, you next randomly
assign them to groups. The laws of chance tell us that by randomly
assigning to groups, the groups will be roughly equal, on average, on things like
intelligence, driving ability, personality, interests, even height and
weight! Any differences between them before you begin your
study will be due to chance alone, and those chance differences shrink as the groups get
larger. In your basic statistics class you learned about the Sampling
Distribution and the Normal Curve. Those are also the "laws of
chance" I'm talking about here that mean your sample of subjects, drawn at random, will be a
reasonably accurate representation of the population. Even a modest random sample can reflect
the characteristics of a huge population, because accuracy depends on the size of the sample, not
the size of the population. A sample of only five people, though, is too small to count on. And an
accurate sample is what you want.
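To make the "laws of chance" concrete, here is a small simulation sketch. Everything in it is assumed for illustration: the made-up population of scores (mean 100, SD 15), the function name `spread_of_sample_means`, and the number of draws. It repeatedly draws random samples of a given size and reports how much the sample means bounce around.

```python
import random
import statistics

def spread_of_sample_means(population, n, draws=2000, seed=1):
    """Draw many random samples of size n and return the standard
    deviation of their means (an estimate of the standard error)."""
    rng = random.Random(seed)
    means = [statistics.mean(rng.sample(population, n)) for _ in range(draws)]
    return statistics.stdev(means)

# Hypothetical population of 20,000 scores, invented for this example.
pop_rng = random.Random(0)
population = [pop_rng.gauss(100, 15) for _ in range(20_000)]

spread_5 = spread_of_sample_means(population, 5)
spread_100 = spread_of_sample_means(population, 100)

print(f"Spread of sample means with n = 5:   {spread_5:.2f}")
print(f"Spread of sample means with n = 100: {spread_100:.2f}")
```

The means of the larger samples cluster much more tightly around the population mean, which is the Sampling Distribution at work.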
But what if you want to study a particular group like drug addicts, autistic kids, etc.? There are
ways to do that involving special sampling techniques within the special population. There are
ways to properly design and implement that kind of Quasi-Experiment or Program Evaluation,
and we will talk about that in this class. It's a whole other series of lectures!
In the previous paragraph I mentioned
random selection of subjects and
random assignment. Selection refers to
acquiring a single group of people from
a population. Assignment refers to
putting them into groups. So you
randomly select 20 people from the
subject pool (population), and then you
randomly assign them to two groups of
10. Typically, one group "gets the
treatment" and is called the
experimental group. The other group
does not get the treatment and is called
the control group. Random selection
relates to external validity. Only by
random selection can you be confident that your sample accurately represents the population. If
the sample is not representative of the population, your results are "externally invalid" before
you even conduct the study! Random assignment relates to internal validity, the extent to
which the independent variable, the treatment, is the cause of changes in the dependent variable, the
outcome or measurement. If you don't randomly assign, there's a good chance that your two
groups will be different before the experiment and thus would of course be different after the
experiment. In that case the pre-experiment difference, not your independent variable, may be
the true cause of the post-experiment difference.
The most common way for us to randomly
assign subjects to groups is to use a random
number table. Assign each subject a
number, 1-20, then using the random
number table you assign them to groups. For
example, looking at this table I see 61424 in
the first column and can start there (although
you can start anywhere in the table you want and move any direction you want). Also remember
that with 20 subjects we are using two-digit numbers, so we must group the numbers in the
random number table in sets of two digits. So my first number is 61. I don't have 61 subjects so
I just ignore that and go to the next set, which is 42. Again I ignore that and move on. And I see 42
again. Moving on I next see 04. Ah ha! Now I do have a subject 4, so that person joins group
1. Next I see 19, and that means subject number 19 goes into group 2. Next comes 86, 54, 60,
and then 05. So subject 5 goes into group 1. And I keep going on like this, alternating between
the groups and skipping any number that has already come up, until all 20 subjects
have been randomly assigned to the two groups.
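If you don't have a random number table handy, a computer can do the same job. The sketch below is not the table procedure itself: it swaps in Python's pseudo-random shuffle, and the function name `randomly_assign` and the seed are my own choices for illustration.

```python
import random

def randomly_assign(subject_ids, seed=None):
    """Shuffle the subject numbers and split them into two equal groups."""
    rng = random.Random(seed)      # a fixed seed makes the result repeatable
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return sorted(shuffled[:half]), sorted(shuffled[half:])

# Subjects numbered 1-20, as in the example above.
group1, group2 = randomly_assign(range(1, 21), seed=42)
print("Group 1:", group1)
print("Group 2:", group2)
```

Every subject lands in exactly one group, and each group gets 10 people.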
What if I wanted an equal number of men and women in my two groups? This is called an
equating procedure. We could assign numbers 1-10 to the men and 11-20 to the women and
proceed to use the random number table. If we notice that one group is getting too many women
or men, we can simply reassign the extra man or woman to the other group.
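One way to get the same result without reassigning anyone after the fact is to randomize separately within the men and within the women. This is a variation on the procedure described above, not the same thing, and the names here (`assign_within_strata`, `strata`) are invented for the sketch.

```python
import random

def assign_within_strata(strata, seed=None):
    """Randomly split each subgroup in half, so both groups end up
    with the same number of people from every subgroup."""
    rng = random.Random(seed)
    group1, group2 = [], []
    for members in strata.values():
        shuffled = list(members)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        group1.extend(shuffled[:half])
        group2.extend(shuffled[half:])
    return group1, group2

# Numbers 1-10 are the men and 11-20 are the women, as in the text.
strata = {"men": range(1, 11), "women": range(11, 21)}
g1, g2 = assign_within_strata(strata, seed=7)

men_in_g1 = sum(1 for s in g1 if s <= 10)
women_in_g1 = sum(1 for s in g1 if s > 10)
print("Group 1:", sorted(g1), "men:", men_in_g1, "women:", women_in_g1)
print("Group 2:", sorted(g2))
```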
Another potential issue that often comes up in research relates to assigning subjects to groups
based on arrival time. If you assign the first 10 people to group 1 and the second 10 to group 2,
you now have one group of early arrivals and one group of late arrivals and they are no longer
equal to each other. You may alternate them, so that the first arrival goes into group 1, the second
arrival goes into group 2, etc. This is better, but still has a bit too much regularity: 1,2,1,2,1,2,
etc. It's always 1, then 2. This we know can introduce some small bias into our assignment and
we'd like to avoid bias, so instead we assign like this: 1,2,2,1,1,2,2,1,1, etc. Now you see that it's
1,2...then 2,1...then 1,2...etc. This is called counterbalancing and is the preferred method.
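The 1,2,2,1 pattern is easy to generate for any number of arrivals. A minimal sketch (the function name `abba_sequence` is just a label I've made up):

```python
from itertools import cycle, islice

def abba_sequence(n_arrivals):
    """Return group numbers for arrivals in order, repeating 1,2,2,1."""
    return list(islice(cycle([1, 2, 2, 1]), n_arrivals))

order = abba_sequence(20)
print(order)
# First arrival -> group 1, second -> group 2, third -> group 2, and so on.
```

With 20 arrivals each group ends up with 10 people, and neither group is always "first" in a pair.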
By using random selection and random assignment we have created two groups that should be the
same, within the limits of chance, on all things before we begin our study! That means that the one
difference we make between
the groups, the experimental treatment, the independent variable, would be the cause of any
measured differences (dependent variable) that we may find.
A study found that those who eat breakfast are healthier than those who don't. The researchers
claimed they used random selection so there should not be a problem. True? Well, this is an
observational study and cannot make cause-effect conclusions, like eating breakfast makes one
healthier than not eating breakfast. The researchers did not use random assignment! Remember,
for a true experiment, we must use both random selection and random assignment. So if they had
randomly assigned people to an "eat breakfast group" and to a "no breakfast group," imposed
these conditions for 6 months or so, and then measured the health of the subjects, a healthier
breakfast group would let us conclude that eating breakfast makes one healthier.
Control in Experimental Design
A research design is the plan for collecting data. This includes the number of groups you will be
using, the levels of the independent variable (IV) to be used, and the strategy for measuring
behavior. A good plan, or good design, eliminates all threats to validity (see chapter two).
Eliminates? Yeah, right. Maybe in that Ideal World that doesn't exist! We can never eliminate
accident and error, but we can reduce them. Random Sampling and Random Assignment help.
Another strategy most commonly used is "Holding Conditions Constant." By holding conditions
constant the effects of things like history and maturation are equally present in two groups, the
control group and the experimental group (and in the other groups if any). As long as we treat the
two groups exactly the same (meaning we hold all conditions constant) except for the IV, then
we can detect the effects of the IV despite history and maturation because those two are
happening equally for both groups. History is the passage of time and events, while maturation is
the natural growth and development of people. History can be as simple as a researcher talking to
one group more than another group; now the two groups have different histories. Maturation is
both long-term and short-term. For example, it's well-known that drug addicts typically quit by
the time they reach the age of 50 or so (the ones that make it to that age). So you can imagine a
drug rehab program that runs over a period of years. Since older drug addicts are more likely to
quit, it may not be the program that causes them to quit, but just plain old maturation.
The easiest and therefore most common experiment is the
posttest only design. Two groups, a control group and an
experimental group, are measured on some behavior. The
control group receives none of the IV, while the
experimental group does get the IV. In discussing designs we often use Code Letters. For
example:
R - Grp Exp - T - M
indicates that subjects were Randomly assigned (R) to the Experimental Group (Grp Exp) and
were given some treatment (T), then measured on some behavior (M). The control group would
then be:
R - Grp Con - X - M
where the X means no treatment given.
This design, the posttest only design, controls for maturation and history because those two
factors are happening to both the control group and the experimental group equally.
There are better designs. Consider the pretest-posttest control group design. The name sounds
bad, but the design is very good. There are two groups, a control group and an experimental
group. They are both pretested (a Before measure).
The treatment is imposed on the experimental group,
then after the treatment both groups are post-tested
(the After measure). This design controls for all the
threats to internal validity. By pretesting and post-
testing a control group you can monitor the changes
taking place that are due to history, maturation,
instrumentation, mortality, etc. You do not eliminate
them, you control them by observing them, if they
occur, in the changes in the control group.
Another, better, design is the Solomon Four Group
design. This design is similar to the pretest posttest control group design, but with two additional
groups: another control and experimental group, but they don't get the pretest. This design lets
you observe the effects of pretesting, if any, and controls for carry-over effects and practice
effects in addition to all the other threats to internal validity. In the diagram you see the R's.
They mean Random Assignment. The O's indicate an Observation, which is a Measurement or
M, in this case it is a Pretest, but only for two
groups. The X here means the treatment or T. I
know it's a pain, but no one seems to agree on the
best way to abbreviate these designs, so you will
see them diagramed all kinds of ways. The final
O's here are the post-test observations or
measurements.
Four Characteristics of True Experiments
To be considered a true experiment such that a cause-effect relationship between the IV and the
DV can be found (if it exists), a study needs four things. First, subjects must be randomly
selected and randomly assigned to groups. Second, there must be at least two levels of the IV,
the treatment. At the very least you need to have one group "get" the treatment and one group
"not-get" the treatment. This is often referred to as Presence vs. Absence of the IV. You could
have more than some vs. none. For example you may want to give an amount, say one glass of
juice for one group, two glasses for another group, and no juice for a third group. Third, true
experiments control for threats to internal validity. These are discussed in detail in chapter
seven, but I've mentioned a couple of these here (maturation and history). We'll look at them in
detail later. Fourth, true experiments tend to compare alternative versions of a theory, or two
different theories, or at the very least determine whether a hypothesis is supported or not.
Control and the Logic of Experiments
How does the basic experiment give us truthful answers to questions? It's a process of
converting an idea or question into some means of manipulating a believed cause while
carefully watching for any changes in the believed effect, then taking the resulting numbers (the
measurements/observations) and converting them into the answer to the question. Simple. Not.
This all gets very statistical very quickly. The basic inferential statistics like the t-ratio and the
F-ratio tell us if the difference we see between or among groups is due to chance or due to a real
effect of the treatment. The "ratio" part is the heart of it all. The top number in the ratio, the
numerator, is the differences you measured. Let's say it's 90. That means your groups differed
by 90 units of whatever you measured. The bottom number, the denominator, is the amount of
chance, called error, in your measurements. The ratio is thus "differences/error," or 90/error.
Let's say the chance factors (that produce error) are equal to 90 also. That means we have as much
difference as chance or error: 90/90. And in ratio this means 1.00. We don't like it when error
and difference are the same. It means the difference between our groups is no bigger than the
difference we would expect from error alone. Our "treatment effect" is really just error; the
treatment made no detectable difference. So a test statistic near 1.00 gives us no evidence that the
treatment affected the experimental group. F-ratios (and t-ratios) that are sufficiently greater
than 1.00, meaning larger than the critical value for your degrees of freedom and alpha level, tell
us that the treatment probably did have an effect.
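Here is the "differences/error" idea worked out for a t-ratio. This is a sketch using the standard pooled-variance formula for two independent groups; the scores are made up for illustration and the function name `t_ratio` is mine.

```python
import math
import statistics

def t_ratio(group_a, group_b):
    """Independent-groups t: difference between means divided by
    the standard error of that difference (pooled variance)."""
    n_a, n_b = len(group_a), len(group_b)
    difference = statistics.mean(group_a) - statistics.mean(group_b)
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return difference / standard_error

# Hypothetical scores, invented for this example.
experimental = [12, 15, 14, 16, 13]   # mean 14
control = [10, 11, 9, 12, 8]          # mean 10

t = t_ratio(experimental, control)
print(f"t = {t:.2f}")   # difference of 4 over a standard error of about 1
```

With 8 degrees of freedom the two-tailed critical value at alpha = .05 is 2.306, so a t of about 4 would count as a real difference, not just chance.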
When we calculate the difference Between Groups we call it between-group variance. Variance
means differences. When we calculate the error we call it within-group variance. So anytime
you see "MS within-groups" think Error! When you see "MS between-groups" think treatment
effect. MS is another way of saying variance (mean square).
We also sometimes call MS within-groups the "Mean Square Error Term."
Within-group variance (error) gets larger with larger
individual differences in our subjects. So if we don't treat
them all the same, we see more error, larger "mean-squares."
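The same logic gives the F-ratio: MS between-groups on top, MS within-groups (error) on the bottom. The sketch below follows the usual one-way ANOVA arithmetic; the scores are hypothetical and the function name `f_ratio` is my own.

```python
import statistics

def f_ratio(groups):
    """Return (MS between, MS within, F) for a one-way design."""
    all_scores = [score for group in groups for score in group]
    grand_mean = statistics.mean(all_scores)
    k = len(groups)                 # number of groups
    n_total = len(all_scores)       # total number of subjects

    # Between-groups: how far each group mean sits from the grand mean.
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2
        for group in groups
    )
    # Within-groups (error): how far each score sits from its own group mean.
    ss_within = sum(
        (score - statistics.mean(group)) ** 2
        for group in groups
        for score in group
    )
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between, ms_within, ms_between / ms_within

# Hypothetical scores, invented for this example.
experimental = [12, 15, 14, 16, 13]
control = [10, 11, 9, 12, 8]

ms_between, ms_within, f = f_ratio([experimental, control])
print(f"MS between = {ms_between:.2f}, MS within = {ms_within:.2f}, F = {f:.2f}")
```

Here MS between is 40 and MS within is 2.5, so F is 16: the difference between groups is sixteen times the size of the error.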
Of course everything is error-prone. We make mistakes.
Confounds, or confounding variables, are one kind of
mistake we can make in conducting a study. The idea of a confound is that it may be the true
cause of any differences we see, instead of the treatment we gave (the IV). These in turn mean
we can make two kinds of errors in converting our numbers (statistics) into the answer to our
original question, conveniently called Type I and Type II. A Type I error is
deciding that the treatment had an
effect when it really didn't; the chance of making it is called alpha. A Type II error is
deciding that the
treatment had no effect when it really did; the chance of making it is called beta. The null
hypothesis says that the groups did not differ. So
the Type I error says reject the null hypothesis,
the groups did differ...but this is a mistaken
conclusion. The Type II error says accept (more precisely, fail to reject) the null
hypothesis, the groups do not differ...but this is a
mistake. The Type II error also relates to external
validity. Remember, external validity is whether
or not your results are true in the real world.
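You can watch the Type I error happen in a simulation. In this sketch both groups are drawn from the same made-up population, so the null hypothesis is true by construction, and any "significant" result is a false alarm. The population values, the function names, and the number of simulated experiments are all assumptions; 2.101 is the two-tailed critical t for alpha = .05 with 18 degrees of freedom.

```python
import math
import random
import statistics

CRITICAL_T = 2.101  # two-tailed, alpha = .05, df = 18 (two groups of 10)

def t_ratio(group_a, group_b):
    """Independent-groups t with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    difference = statistics.mean(group_a) - statistics.mean(group_b)
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    return difference / math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))

def type_i_error_rate(n_per_group=10, experiments=2000, seed=3):
    """Run many experiments with NO real treatment effect and count
    how often we would wrongly reject the null hypothesis."""
    rng = random.Random(seed)
    false_alarms = 0
    for _ in range(experiments):
        group_a = [rng.gauss(50, 10) for _ in range(n_per_group)]
        group_b = [rng.gauss(50, 10) for _ in range(n_per_group)]
        if abs(t_ratio(group_a, group_b)) > CRITICAL_T:
            false_alarms += 1
    return false_alarms / experiments

rate = type_i_error_rate()
print(f"Proportion of false alarms: {rate:.3f}")
```

The proportion should land somewhere near .05, which is what setting alpha at .05 means.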
Conclusion
A true experiment includes random sampling and random assignment. It will involve at least two
groups or two conditions, one serving as a control and the other as the treatment. A true
experiment is designed to control for the threats to internal validity without which the cause-
effect relationship between the IV and DV cannot be determined.