46 OCTOBER 2017 | EMSWORLD.com
The practice of medicine has come
a long way over the past 150 years.
For example, routine use of leeches
to remove “bad blood” no longer
occurs, and everyone involved in health-
care knows to wear personal protective
equipment.
Changes to practice have (at times)
been slow in coming. However, we sim-
ply can no longer routinely rely on provid-
ing care without evidence that it works.
Investigators conduct rigorous studies to
determine the efficacy of treatments. This
philosophy of testing clinical practices
using research methods to validate their
efficacy and safety is known as evidence-
based medicine.1
To gather the data needed when evalu-
ating a treatment, researchers use a struc-
tured approach that utilizes critical think-
ing. EMS personnel and other clinicians
engage in critical thinking as a method of
problem-solving every day when deciding
upon the best course of action to help a
patient. These same skills are used in con-
ducting research. The Center for Critical
Thinking’s Linda Elder and Richard Paul
wrote (in part) that when engaged in sci-
entific study, we should examine the:
• Purpose of the inquiry;
• Best questions to ask;
• Types of inferences typically drawn;
• Viewpoints in the profession;
• Investigators’ assumptions;
• Implications of the inquiry;
• Types of data (information) to collect.2
These dimensions of critical thinking
yield different types of data. Authors
might differ slightly, but there are gen-
erally four accepted types (or levels) of
data. These are (from lowest level of rigor
to highest) nominal, ordinal, interval and
ratio. Each of these levels of data allows
increasingly complex and rigorous statistical
testing.
Nominal
These data are labels or categories that,
in themselves, cannot indicate increased
or decreased value. For example, two U.S.
states have different names. Despite the
love one might have for their home state,
one of these labeled areas is not more of
a state than the other. Nominal data are
variables such as sex/gender, race, ethnic-
ity, political affiliation and place of birth.
For instance, an investigator might col-
lect information on the numbers of males
and females who work each type of shift
schedule (e.g., 24-hour and 12-hour).
These labels for sex/gender are con-
structs; they are titles society has agreed
to use. Neither sex (nor gender) is more
valuable than the other. These data can
only be used for lower-level compari-
sons. For example, after gathering these
data it would be acceptable to report the
numbers and distributions of males and
females on each type of shift. The number
of each sex/gender is an objective mea-
sure that can be used for comparisons of
the groups.
This would allow for creation of graphics
(like bar charts) and performance of
low-level statistical tests. A researcher
could calculate a chi square to find dif-
ferences between sex/genders. It would
be necessary to code sex/gender as a
dichotomous (i.e., 0 or 1) variable for this
calculation. Keep in mind that this still
does not indicate a greater value for one
gender/sex over the other.3–5
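To make the chi-square idea concrete, here is a minimal sketch using only the Python standard library. The counts are hypothetical (they are not data from this article), and the function name chi_square_2x2 is just an illustrative label. Sex/gender is coded 0 or 1 only so that each person lands in one row of the table; the code is a label, not a value.

```python
import math

# Hypothetical counts (not from the article). Rows are sex/gender coded 0 or 1;
# columns are shift type (12-hour, 24-hour).
observed = [
    [30, 20],  # coded 0
    [25, 25],  # coded 1
]

def chi_square_2x2(table):
    """Return (chi-square statistic, p-value) for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Count expected in this cell if sex/gender and shift were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    # A 2x2 table has 1 degree of freedom, where the upper-tail
    # probability of the chi-square distribution is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

stat, p_value = chi_square_2x2(observed)
print(f"chi-square = {stat:.3f}, p = {p_value:.3f}")
```

A large p-value here would only mean the distribution across shifts looks similar for both groups. Swapping which group is coded 0 and which is coded 1 leaves the statistic unchanged.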
Ordinal
Sometimes an investigator wants to
know the order in which things occur. If
a researcher were to stand outside of an
emergency room and create a log of the
order in which ambulances arrived, this list
would contain ordinal data (Table 1). While
it is true that these are more powerful than
nominal data, they have a key weakness: If
the researcher only focuses upon the order
in which the ambulances arrive, they could
know which was first to arrive, which was
second and so on. They would not know the
time needed for the ambulances to reach
the emergency room; this would require
the additional (higher-level) information
included in the table.
NOT ALL EVIDENCE IS CREATED EQUAL
Changes in practice require the highest possible level of statistical testing
By Sandy Hunter, PhD, NRP
This is the first in a four-part series on evidence-based practice
produced in partnership with the UCLA Prehospital Care
Research Forum. Visit www.cpc.mednet.ucla.edu/pcrf
While the table indicates the order of
ambulance arrival, other data could be
collected for better comparisons. You can
also see that the en route times (times to
drive to the hospital) are not the same for
all the ambulances. The researcher can
calculate a correlation between the en
route times and the number of minutes
on scene or the number of minutes out
at the hospital (as long as they captured
those data). They could also calculate the
level of correlation between the order of
the ambulances’ arrival and the time the
ambulances spent on scene or the number
of hours the crew had been on duty.3–5
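A rank-based (Spearman) correlation is one common choice when one of the variables is an order of arrival. The sketch below is an illustration, not a description of any particular software package. It assumes the rows of Table 1 are listed in order of hospital arrival (the arrival times increase down the table) and uses the en route minutes printed there; the helper names are made up for this example.

```python
import math

# En route minutes from Table 1, assumed to be listed in order of arrival.
en_route_minutes = [10, 12, 6, 14, 5, 61, 20, 9, 5, 44]
arrival_order = list(range(1, len(en_route_minutes) + 1))

def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

rho = spearman(arrival_order, en_route_minutes)
print(f"Spearman rho = {rho:.2f}")
```

Because only the ranks are used, the 61-minute call counts as "longest" rather than as an extreme number, which is the trade-off ordinal data make.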
Interval
This is the first of the continuous data,
meaning you’re observing data that have
equal distances between points of mea-
surement (as you would see on a measur-
ing tape). They also have the strengths
(but not the weaknesses) of nominal and
ordinal data.
A characteristic that distinguishes inter-
val data from the highest level of data is
that they do not have a (reasonable) zero
point on their scale of measure.4 For exam-
ple, a typical written exam would allow
scores to range from 0–100. If the exam
were administered to a group of paramedic
students and someone earned a zero
because they missed all the questions,
it is unlikely the student has absolutely
no knowledge about being a paramedic.
The score only indicates how the student
performed on this exam. Further, the dif-
ference between a score of 50 for one stu-
dent and a score of 100 for another stu-
dent does not indicate that one has twice
as much overall knowledge as the other.
Another example is body temperature.
When a thermometer is used, the heat
of the body is measured on a scale that
has consistent markings (or digital incre-
ments). The body might reach a tempera-
ture of zero on commonly used scales, but
it is unlikely the patient would reach a tem-
perature of absolute zero, at which there is
no molecular movement.
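A short calculation shows why "twice as much" does not hold on an interval scale. The two readings below are illustrative numbers, not patient data. The same pair of temperatures gives a different ratio on each scale, while equal steps stay equal; only the Kelvin scale, which starts at absolute zero, supports a ratio statement.

```python
import math

def fahrenheit_to_celsius(deg_f):
    return (deg_f - 32) * 5 / 9

def fahrenheit_to_kelvin(deg_f):
    return fahrenheit_to_celsius(deg_f) + 273.15

cool_f, warm_f = 50.0, 100.0  # illustrative readings, not patient data

# The "same" comparison yields a different ratio on each scale.
ratio_f = warm_f / cool_f
ratio_c = fahrenheit_to_celsius(warm_f) / fahrenheit_to_celsius(cool_f)
ratio_k = fahrenheit_to_kelvin(warm_f) / fahrenheit_to_kelvin(cool_f)

# Equal intervals stay equal: a 10-degree Fahrenheit step is the same
# Celsius step whether it starts at 50 or at 90.
step_low = fahrenheit_to_celsius(60.0) - fahrenheit_to_celsius(50.0)
step_high = fahrenheit_to_celsius(100.0) - fahrenheit_to_celsius(90.0)

print(f"ratio in F: {ratio_f:.2f}, in C: {ratio_c:.2f}, in K: {ratio_k:.2f}")
print(f"equal steps: {math.isclose(step_low, step_high)}")
```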
Table 1: 12-Hour and 24-Hour Times Per Call
12-hour on scene   24-hour on scene   Dep. scene   Arr. hosp.   En route time
10                 35                 11:00        11:10        10 mins.
12                 24                 11:02        11:14        12 mins.
6                  53                 11:10        11:16        6 mins.
14                 4                  11:24        11:38        14 mins.
5                  30                 11:50        11:55        5 mins.
61                 10                 11:00        12:01        61 mins.
20                 9                  11:42        12:02        20 mins.
9                  5                  12:00        12:09        9 mins.
5                  11                 12:10        12:15        5 mins.
14                 7                  12:01        12:45        44 mins.
Interval data allow for powerful calculations,
including comparing one group
with another and looking for significant
differences, while controlling for variables
that can affect your results, known
as extraneous variables.3–5 Most readers
are familiar with research results that note
a study controlled for specific things; this
is the type of variable best used here. For
example, an investigator might want to
know whether length of shift (e.g., 12-hour
vs. 24-hour) plays a role in the number of
traffic accidents involving ambulances. It
is possible that the types of ambulances
involved also play some unknown role in
this phenomenon. However, since length of
shift is the focus, the researcher will control
for ambulance type by including it at a
specific point in the statistical calculation.
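In practice, controlling for a variable is usually done inside a statistical model (for example, as a covariate). A simpler way to see the idea is stratification: compare shift lengths separately within each ambulance type, so the type cannot drive the difference. The sketch below uses stratification as a stand-in for that modeling step. Every record in it is hypothetical, and the type labels are placeholders.

```python
from collections import defaultdict

# Hypothetical records: (shift length, ambulance type, crashes, shifts worked).
records = [
    ("12-hour", "Type I", 4, 2000),
    ("24-hour", "Type I", 6, 2000),
    ("12-hour", "Type III", 3, 1000),
    ("24-hour", "Type III", 9, 2000),
]

def crash_rates_by_stratum(rows):
    """Return {ambulance type: {shift length: crashes per 1,000 shifts}}."""
    totals = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for shift, amb_type, crashes, shifts_worked in rows:
        totals[amb_type][shift][0] += crashes
        totals[amb_type][shift][1] += shifts_worked
    return {
        amb_type: {shift: 1000 * c / n for shift, (c, n) in by_shift.items()}
        for amb_type, by_shift in totals.items()
    }

rates = crash_rates_by_stratum(records)
for amb_type, by_shift in rates.items():
    print(amb_type, by_shift)
```

If the 24-hour rate is higher within every ambulance type, the difference is harder to blame on the mix of vehicles.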
Ratio
These are similar to interval data. They
are so similar that they can be mistaken
for (and used in place of) each other. Ratio
data are also measured on a scale that has
consistent intervals. The key difference is
that ratio data allow for the possibility of
a true zero on the scale used to measure
a variable. Two examples are speed and
hours on duty.
Two vehicles traveling at 30 mph and 40
mph have the same separation between
their speeds as two other vehicles traveling
at 55 mph and 65 mph. A vehicle traveling
at 100 mph is traveling twice as fast as one
traveling at 50 mph. Both vehicles can also
be completely still (0 mph), which is a true
zero. Similarly, the number
of hours on duty can be expressed
along a scale of consistent intervals and
have a zero point.3–5
Each variable used in a study must be
evaluated for its strength if you want to use
it as grounds for making a change in clini-
cal practice. Authors led by AHRQ’s David
Atkins suggest we evaluate evidence col-
lected to decide whether a new or differ-
ent clinical approach is needed using the
GRADE (grading of recommendations
assessment, development and evalua-
tion) system.6,7
Within this system, evidence is graded
as high, moderate, low or very low.8 High-
quality evidence (e.g., a randomized
controlled trial [RCT]) leads to a conclusion,
and gathering more research would prob-
ably not influence the decision(s) being
made. Moderate-quality evidence leads
to a conclusion, but more research is likely
to influence the decision(s) being made.
With low-quality evidence, more research
is very likely to lead to a different outcome. If
the evidence is very low quality, any deci-
sions being made based upon it should be
suspect.
An example of data that would be ini-
tially seen as high quality could come from
an RCT. These data could be demoted to a
weaker status if a review of the study finds
problems such as weak internal validity
(e.g., lack of randomization or blinding) or
weak external validity (e.g., small sample
size or a sample not representative of the
population).
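The demotion described above can be pictured as moving down a four-step ladder. The sketch below is a deliberately simplified illustration, not the GRADE Working Group's procedure. It assumes, for the sake of the example, that each problem a reviewer finds costs exactly one level; the list of concerns is hypothetical.

```python
# The four GRADE ratings named in this article, ordered from weakest to strongest.
GRADE_LEVELS = ["very low", "low", "moderate", "high"]

def downgrade(level, steps=1):
    """Move a rating down by `steps` levels, never below 'very low'."""
    index = GRADE_LEVELS.index(level)
    return GRADE_LEVELS[max(0, index - steps)]

# Hypothetical review of one RCT. Assumption for this sketch only:
# each concern marked True lowers the rating by one level.
concerns = {
    "no blinding": True,
    "sample not representative": True,
    "small sample size": False,
}

rating = downgrade("high", steps=sum(concerns.values()))
print(f"Starting at high, final rating: {rating}")
```

In a real appraisal, how far to downgrade is a judgment call, which is the subjective element discussed next.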
Strengths of the GRADE system include
growing and general acceptance of the
model and ease of use.8 It does contain
subjective elements that might be an
issue (e.g., at what point does a researcher
decide data from an RCT should be down-
graded?). To lower the perceived strength
of evidence requires both at least a general
understanding of the research model being
used and an understanding of the subject
being investigated.
This can be seen in a research study on
the effect of a new prehospital respira-
tory medication. If the study were carried
out in a laboratory setting, testing a drug
and a placebo with neither the patient
nor the treatment administrator knowing
which was used, this evidence would be
valued as high. If the trial were carried out
by asking a few paramedics to administer
the new medication to some patients dur-
ing a short period of time in the field, then
comparing these to patients who received
nothing, the evidence would be weaker.
EMS Example
When studying a clinical topic (such as
hypertension or hemorrhage control),
an investigator needs to use the highest
level(s) of data available to apply rigorous
and appropriate statistical tests. These
tests reduce the likelihood that results are
due to chance alone. An example would be
an agency that wanted to know whether
using 24-hour shifts vs. 12-hour shifts
would affect patient outcomes. This is a
broad question, and a primary step would
be to narrow the focus. So here, it will be
limited to: Does having two-person para-
medic units responding to nonarrest car-
diac calls during 24-hour shifts result in
higher patient morbidity and/or mortality?
A directional hypothesis for this example
is: 12-hour crews will have statistically sig-
nificantly higher average patient morbid-
ity and mortality rates than 24-hour crews;
and 12-hour crews will have statistically
significantly worse patient condition out-
comes at discharge than 24-hour crews
(note: frame hypotheses as the opposite
of what you think is true). Some of the data
to be collected will be: shift length, crew
demographics (e.g., age, experience, etc.),
type of cardiac complaint, time of the call,
en route (elapsed) time, patient condition
at the emergency room, and patient final
condition at discharge.
The data listed above include some
nominal variables (e.g., shift schedule),
some extraneous variables to be con-
trolled for (e.g., crew demographics) and
some interval data (e.g., en route time).
Condition at the emergency room could be
coded so that it is interval data: You would
create a scale (e.g., from 1–10, with 1 being
dead and 10 being asymptomatic). Each
patient’s medical record would be reviewed
and placed into the appropriate category.
Coding for the scale based upon a pre-
determined group of symptoms and signs
is appropriate because there is an a priori
(deduced) argument that being dead is
much worse than being asymptomatic
and happy, and you could determine what
would constitute the other levels based
upon the patient’s condition. Some might
argue that data on condition and disposi-
tion are ratio-level data; that is a reason-
able postulate. Here that is acceptable.
Moving forward in this study, all the
nonarrest cardiac calls run over a pre-
determined time would be reviewed. An
investigator would need two groups from
which to collect data. These could be two
(or more) sets of paramedics working at
the same time (e.g., 12-hour and 24-hour
shifts over 6 months) or one set working
over a longer period (e.g., 12-hour shifts
for 6 months and then 24-hour shifts for
the next 6 months). The former allows
you to better control for extraneous vari-
ables such as changes in seasons or pay
or updates to protocols.
Reviewing data on the two groups of
paramedics allows reporting of descriptive
statistics. This would include numerical
and graphical summaries suited to each
variable’s level: counts and the mode for
nominal data, the median for ordinal data,
and the mean and standard deviation for
interval and ratio data. Nominal and ordinal data
are important but should not be used to
make critical decisions. It is the higher lev-
els of data that allow for the most power-
ful testing if the data are still high-value
(using the GRADE system).
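As a small illustration of descriptive statistics, the sketch below summarizes the en route minutes printed in Table 1 using Python's standard statistics module. The list of shift labels is hypothetical and is included only to show that a nominal variable is summarized with counts rather than averages.

```python
import statistics
from collections import Counter

# En route minutes as printed in Table 1 (ratio-level data).
en_route_minutes = [10, 12, 6, 14, 5, 61, 20, 9, 5, 44]

# Hypothetical shift labels (nominal data), for illustration only.
shift_types = ["12-hour", "24-hour", "24-hour", "12-hour", "24-hour"]

summary = {
    "mean": statistics.mean(en_route_minutes),
    "median": statistics.median(en_route_minutes),
    "mode": statistics.mode(en_route_minutes),
    "stdev": statistics.stdev(en_route_minutes),  # sample standard deviation
}
counts = Counter(shift_types)

for name, value in summary.items():
    print(f"{name}: {value:.1f}")
print(dict(counts))
```

Note how the two long transports (61 and 44 minutes) pull the mean well above the median; reporting both gives a fairer picture of a typical call.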
A researcher could use interval and
ratio data to compare the averages for
the patients’ conditions upon arrival at
the emergency room. These data allow
one to determine whether there is a sta-
tistically significant difference between
the groups (12-hour vs. 24-hour shifts). If
there is a significant difference, an agency
needs to consider the real-world impact
that difference represents.
This decision-making will be aided by a
combination of classic critical thinking and
the use of GRADE. For example, an agency
might find that after 400 cardiac calls
(n=200 of the 24-hour shift and 200 of
the 12-hour shift), software indicates there
is a statistically significant difference
between groups in patients’ conditions
at the emergency room. However, the
average for one group could be 8 and the
other 8.4 (on a scale of 1–10), or the data
might be gathered from a study with low
internal validity. This significant difference
might not be large enough or trustworthy
enough to disrupt the agency’s practices.
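The gap between "statistically significant" and "large enough to matter" can be checked with a few lines of arithmetic. The sketch below uses the group sizes and averages from the example above. The standard deviation of 1.5 is an assumption made for illustration (the example does not report one), and with 200 calls per group a normal approximation is used in place of a full t-test.

```python
import math

# From the example: 200 calls per group, averages of 8 and 8.4 on a 1-10 scale.
n_a, n_b = 200, 200
mean_a, mean_b = 8.0, 8.4
assumed_sd = 1.5  # hypothetical; the example does not give a standard deviation

# Standard error of the difference between two independent group means.
standard_error = math.sqrt(assumed_sd**2 / n_a + assumed_sd**2 / n_b)
z = (mean_b - mean_a) / standard_error

# Two-sided p-value under the normal approximation.
p_value = math.erfc(abs(z) / math.sqrt(2))

# Cohen's d: the difference expressed in standard-deviation units.
cohens_d = (mean_b - mean_a) / assumed_sd

print(f"z = {z:.2f}, p = {p_value:.4f}, Cohen's d = {cohens_d:.2f}")
```

Under these assumptions the difference clears the usual 0.05 threshold, yet it amounts to roughly a quarter of a standard deviation, which by common rules of thumb is a small effect. That is the kind of result an agency should weigh carefully before changing practice.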
Summary
Each level of evidence is useful. Nominal
data allow for solid descriptive report-
ing. Ordinal data can allow for stronger
tests (e.g., correlation) and be controlled
for in complex testing. However, to per-
form the types of in-depth statistical
tests needed before changing a clinical
practice, an investigator should strive to
gather the highest levels of data available.
Interval and ratio data allow a researcher
to compare groups with precision and
confidence.
Therefore, as an investigator plans a
research project, it is incumbent upon her or
him to think about what types of questions
are being asked, collect the appropriate
level(s) of data and evaluate the strength of
the specific variables being used. Changes
in practice affect lives, and decisions relat-
ed to how medicine is practiced require the
strongest possible evidence.
REFERENCES
1. McAlister FA, Straus SE, Sackett DL. Why we need large, simple studies of the clinical examination: the problem and a proposed solution. Lancet, 1999 Nov 13; 354(9191): 1721–4.
2. Elder L, Paul R. The Thinker’s Guide to Analytic Thinking: How to Take Thinking Apart and What to Look For When You Do. Dillon Beach, CA: Foundation for Critical Thinking, 2007.
3. Forister JG, Blessing JD. Introduction to Research and Medical Literature for Health Professionals, 4th ed. Burlington, MA: Jones & Bartlett, 2016.
4. Gay LR, Mills GE, Airasian PW. Educational Research: Competencies for Analysis and Applications, 10th ed. Boston: PEAR, 2012.
5. Salkind NJ. Statistics for People Who (Think They) Hate Statistics: Excel 2010 Edition, 3rd ed. Thousand Oaks, CA: SAGE Publications, 2013.
6. Atkins D, Eccles M, Flottorp S, et al. Systems for grading the quality of evidence and the strength of recommendations I: critical appraisal of existing approaches, The GRADE Working Group. BMC Health Serv Res, 2004 Dec 22; 4(1): 38.
7. GRADE Working Group. GRADE, http://www.gradeworkinggroup.org/.
8. American Thoracic Society. The GRADE Approach (Part 2 of 12), https://www.youtube.com/watch?v=IjxZ_-HI8BE.
ABOUT THE AUTHOR
Sandy Hunter, PhD, NRP, is a professor with the paramedic program at Eastern Kentucky University and a graduate of the doctoral program in educational psychology at the
University of Kentucky. He holds a master’s in health education and an undergraduate degree in emergency medical care. His research interests include diversity, self-efficacy, learning theories and EMS safety.
Copyright of EMS World is the property of SouthComm Inc. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use.