BASIC SAMPLING CONCEPTS
We begin by reviewing some terms associated with sampling—terms that are used primarily (but not exclusively) in quantitative research.
Populations
A population (the “P” of PICO questions) is the entire aggregation of cases in which a researcher is interested. For instance, if we were studying American nurses with doctoral degrees, the population could be defined as all U.S. citizens who are registered nurses (RNs) and who have a PhD, DNSc, DNP, or other doctoral-level degree. Other possible populations might be all patients who had cardiac surgery in Princess Alexandria Hospital in 2015, all women with irritable bowel syndrome in Sweden, or all children in Canada with cystic fibrosis. Populations are not restricted to humans. A population might consist of all hospital records in a particular hospital or all blood samples at a particular laboratory. Whatever the basic unit, the population comprises the aggregate of elements in which the researcher is interested.
It is sometimes useful to distinguish between target and accessible populations. The accessible population is the aggregate of cases that conform to designated criteria and that are accessible for a study. The target population is the aggregate of cases about which the researcher would like to generalize. A target population might consist of all diabetic people in New York, but the accessible population might consist of all patients with diabetes who attend a particular clinic. Researchers usually sample from an accessible population and hope to generalize to a target population.
TIP: Many quantitative researchers fail to identify their target population or to discuss the generalizability of the results. The population of interest needs to be carefully considered in planning and reporting a study.
Eligibility Criteria
Researchers must specify criteria that define who is in the population. Consider the population of American nursing students. Does this population include students in all types of nursing programs? How about RNs returning to school for a bachelor’s degree? Or students who took a leave of absence for a semester? Do foreign students enrolled in American nursing programs qualify? Insofar as possible, the researcher must consider the exact criteria by which it could be decided whether an individual would or would not be classified as a member of the population. The criteria that specify population characteristics are the eligibility criteria or inclusion criteria. Sometimes, a population is also defined in terms of characteristics that people must not possess (i.e., exclusion criteria). For example, the population may be defined to exclude people who cannot speak English.
In thinking about ways to define the population and delineate eligibility criteria, it is important to consider whether the resulting sample is likely to be a good exemplar of the population construct in which you are interested. A study’s construct validity is enhanced when there is a good match between the eligibility criteria and the population construct.
Of course, eligibility criteria for a study often reflect considerations other than substantive concerns. Eligibility criteria may reflect one or more of the following:
· Costs. Some criteria reflect cost constraints. For example, when non-English-speaking people are excluded, this does not usually mean that researchers are uninterested in non-English speakers but rather that they cannot afford to hire translators or multilingual data collectors.
· Practical constraints. Sometimes, there are other practical constraints, such as difficulty including people from rural areas, people who are hearing impaired, and so on.
· People’s ability to participate in a study. The health condition of some people may preclude their participation. For example, people who have cognitive impairments, who are in a coma, or who are in an unstable medical condition may need to be excluded.
· Design considerations. As noted in Chapter 10, it is sometimes advantageous to define a homogeneous population as a means of controlling confounding variables.
The criteria used to define a population for a study have implications for the interpretation and generalizability of the findings.
Example of Inclusion and Exclusion Criteria: Schallom and colleagues (2015) studied the relationship between gastric reflux and pulmonary aspiration in hospitalized patients receiving gastric tube feedings. To be eligible, patients had to have a confirmed gastric location of a feeding tube, be mechanically ventilated, and be aged 18 years or older. Patients were excluded if they were pregnant, had a documented history of GERD, had any airborne infectious disease, or had oral trauma.
Samples and Sampling
Sampling is the process of selecting cases to represent an entire population, to permit inferences about the population. A sample is a subset of population elements, which are the most basic units about which data are collected. In nursing research, elements most often are humans.
Samples and sampling plans vary in quality. Two key considerations in assessing a sample in a quantitative study are its representativeness and size. A representative sample is one whose key characteristics closely approximate those of the population. If the population in a study of patients who fall is 50% male and 50% female, then a representative sample would have a similar gender distribution. If the sample is not representative of the population, the study’s external validity and construct validity are at risk.
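One rough way to check representativeness on a single characteristic is to compare category shares in the sample with those in the population. The following Python sketch illustrates the idea for the gender example above. The function names and the sample of 40 patients are hypothetical, introduced only for illustration, and a real assessment would usually consider several characteristics at once.

```python
from collections import Counter

def proportions(values):
    """Return the share of each category in a list of values."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}

def max_gap(population_props, sample_props):
    """Largest absolute difference between population and sample shares."""
    categories = set(population_props) | set(sample_props)
    return max(
        abs(population_props.get(c, 0.0) - sample_props.get(c, 0.0))
        for c in categories
    )

# Population of patients who fall: 50% male, 50% female (from the text).
population_props = {"male": 0.5, "female": 0.5}

# Hypothetical sample of 40 patients (illustrative numbers only).
sample = ["male"] * 12 + ["female"] * 28
sample_props = proportions(sample)

# A large gap suggests the sample is not representative on this trait.
gap = max_gap(population_props, sample_props)
print(sample_props, round(gap, 2))
```

In this illustrative case, men make up 30% of the sample but 50% of the population, a gap that would raise concerns about representativeness with respect to gender.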
Certain sampling methods are less likely to result in biased samples than others, but a representative sample can never be guaranteed. Researchers operate under conditions in which error is possible. Quantitative researchers strive to minimize errors and, when possible, to estimate their magnitude.
Sampling designs are classified as either probability sampling or nonprobability sampling. Probability sampling involves random selection of elements. In probability sampling, researchers can specify the probability that an element of the population will be included in the sample. Greater confidence can be placed in the representativeness of probability samples. In nonprobability samples, elements are selected by nonrandom methods. There is no way to estimate the probability that each element has of being included in a nonprobability sample, and usually not every element has a chance for inclusion.
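As a minimal sketch of what "specifying the probability of inclusion" means, consider simple random sampling without replacement, one common form of probability sampling. When n elements are drawn at random from a list of N, each element has the same inclusion probability, n / N. The list of 500 patient IDs and the function names below are hypothetical.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n elements at random, without replacement."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(population, n)

def inclusion_probability(population_size, n):
    """Chance that any one element is selected in a simple random sample."""
    return n / population_size

# Hypothetical list of 500 patient IDs from which the sample is drawn.
frame = [f"patient_{i:03d}" for i in range(500)]

sample = simple_random_sample(frame, 50, seed=42)
p = inclusion_probability(len(frame), len(sample))
print(len(sample), p)
```

With a nonprobability method, no comparable calculation is possible, because selection does not follow a known random mechanism.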
Strata
Sometimes, it is useful to think of populations as consisting of subpopulations, or strata. A stratum is a mutually exclusive segment of a population, defined by one or more characteristics. For instance, suppose our population was all RNs in the United Kingdom. This population could be divided into two strata based on gender. Or, we could specify three strata of nurses younger than 30 years of age, nurses aged 30 to 45 years, and nurses 46 years or older. Strata are often used in sample selection to enhance the sample’s representativeness.
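The three age strata described above can be expressed as a simple classification rule. The sketch below assumes that age is recorded in whole years. The nurse records and function names are hypothetical. Because each record is assigned to exactly one stratum, the strata are mutually exclusive.

```python
from collections import defaultdict

def age_stratum(age):
    """Map an age in whole years to one of the three age strata in the text."""
    if age < 30:
        return "younger than 30"
    if age <= 45:
        return "30 to 45"
    return "46 or older"

def stratify(records, stratum_of):
    """Group records by stratum; each record lands in exactly one group."""
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)
    return dict(strata)

# Hypothetical nurse records (illustrative values only).
nurses = [
    {"id": 1, "age": 24},
    {"id": 2, "age": 30},
    {"id": 3, "age": 45},
    {"id": 4, "age": 46},
    {"id": 5, "age": 61},
]

strata = stratify(nurses, lambda nurse: age_stratum(nurse["age"]))
stratum_sizes = {name: len(members) for name, members in strata.items()}
print(stratum_sizes)
```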
Staged Sampling
Samples are sometimes selected in multiple phases, in what is called multistage sampling. In the first stage, large units (such as hospitals or nursing homes) are selected. Then, in the next stage, individuals are sampled. In staged sampling, it is possible to combine probability and nonprobability sampling. For example, the first stage can involve the deliberate (nonrandom) selection of study sites. Then, people within the selected sites can be selected through random procedures.
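The combination just described, deliberate choice of sites followed by random selection of people within sites, might be sketched as follows. The site names, rosters, and the function two_stage_sample are hypothetical, and the sketch simply takes everyone at a site that has fewer people than the requested number.

```python
import random

def two_stage_sample(sites, chosen_site_names, per_site, seed=None):
    """Stage 1: use the sites the researcher chose deliberately (nonrandom).
    Stage 2: draw a random sample of people within each chosen site."""
    rng = random.Random(seed)
    selected = {}
    for name in chosen_site_names:
        people = sites[name]
        k = min(per_site, len(people))  # small sites contribute everyone
        selected[name] = rng.sample(people, k)
    return selected

# Hypothetical sites and rosters (illustrative only).
sites = {
    "Hospital A": [f"A{i}" for i in range(40)],
    "Hospital B": [f"B{i}" for i in range(25)],
    "Nursing Home C": [f"C{i}" for i in range(8)],
}

selected = two_stage_sample(
    sites, ["Hospital A", "Nursing Home C"], per_site=10, seed=1
)
print({name: len(people) for name, people in selected.items()})
```

Because the first stage is nonrandom, people at sites that were not chosen have no chance of inclusion, even though selection within the chosen sites is random.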
Sampling Bias
Researchers work with samples rather than with populations because it is cost-effective to do so. Researchers seldom have the resources to study all members of a population. It may be possible to obtain reasonably accurate information from a sample, but data from samples can be erroneous. Finding 100 people willing to participate in a study may be easy, but it is usually hard to select 100 people who are an unbiased subset of the population. Sampling bias refers to the systematic over- or underrepresentation of a population segment on a characteristic relevant to the research question.
As an example of consciously biased selection, suppose we are investigating patients’ responsiveness to nurses’ touch and decide to recruit the first 50 patients meeting eligibility criteria. We decide, however, to omit Mr. Z from the sample because he has been hostile to nursing staff. Mrs. X, who has just lost a spouse, is also bypassed. These decisions to exclude certain people do not reflect bona fide eligibility criteria. This can lead to bias because responsiveness to nurses’ touch (the outcome variable) may be affected by patients’ feelings about nurses or their emotional state.
Sampling bias often occurs unconsciously, however. If we were studying nursing students and systematically interviewed every 10th student who entered the nursing school library, the sample would be biased in favor of library-goers, even if we were conscientious about including every 10th student regardless of age, gender, or other traits.
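The library example can be made concrete with a small sketch. The student body of 200, of whom 80 ever use the library, is entirely hypothetical. The point is only that a careful every-10th rule cannot correct for the fact that students who never enter the library cannot be selected.

```python
def every_kth(arrivals, k):
    """Systematically pick every k-th arrival (the k-th, 2k-th, and so on)."""
    return arrivals[k - 1::k]

# Hypothetical student body: 200 students, 80 of whom ever use the library.
students = [{"id": i, "uses_library": i < 80} for i in range(200)]

# Only library users can appear among people entering the library.
library_arrivals = [s for s in students if s["uses_library"]]

sample = every_kth(library_arrivals, 10)

# Share of library users in the population versus in the sample.
population_share = sum(s["uses_library"] for s in students) / len(students)
sample_share = sum(s["uses_library"] for s in sample) / len(sample)
print(len(sample), population_share, sample_share)
```

In this sketch, 40% of students use the library, but library users make up 100% of the sample, a systematic overrepresentation of one population segment.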
TIP: Internet surveys are attractive because they can be distributed to geographically dispersed people. However, there is an inherent bias in such surveys, unless the population is defined as people who have easy access to, and comfort with, a computer and the Internet.
Sampling bias is partly a function of population homogeneity. If population elements were all identical on key attributes, then any sample would be as good as any other. Indeed, if the population were completely homogeneous—exhibited no variability at all—then a single element would be sufficient. For many physiologic attributes, it may be safe to assume reasonably high homogeneity. For example, the blood in a person’s veins is relatively homogeneous and so a single blood sample is adequate. For most human attributes, however, homogeneity is the exception rather than the rule. Age, health status, stress, motivation—all these attributes reflect human heterogeneity. When variation occurs in the population, then similar variation should be reflected, to the extent possible, in a sample.
TIP: One easy way to increase a study’s generalizability is to select participants from multiple sites (e.g., from different hospitals, nursing homes, communities). Ideally, the different sites would be sufficiently divergent that good representation of the population would be obtained.
NONPROBABILITY SAMPLING
Nonprobability sampling is less likely than probability sampling to produce representative samples. Despite this fact, most studies in nursing and other health disciplines rely on nonprobability samples.
Convenience Sampling
Convenience sampling entails using the most conveniently available people as participants. For example, a nurse who conducts a study of teenage risk-taking at a local high school is relying on a convenience sample. The problem with convenience sampling is that those who are available might be atypical of the population with regard to critical variables.
Sometimes, researchers seeking people with certain characteristics place an advertisement in a newspaper, put up signs in clinics, or post messages on online social media. These “convenient” approaches are subject to bias because people select themselves as volunteers in response to posted notices and likely differ from those who do not volunteer.
Snowball sampling (also called network sampling or chain sampling) is a variant of convenience sampling. With this approach, early sample members (called seeds) are asked to refer other people who meet the eligibility criteria. This approach is often used when the population involves people who might otherwise be difficult to identify (e.g., people who are afraid of hospitals).
Convenience sampling is the weakest form of sampling. In heterogeneous populations, there is no other sampling approach in which the risk of sampling bias is greater. Yet, convenience sampling is the most commonly used method in many disciplines.
Example of a Convenience Sample: Krueger and colleagues (2015) studied fetal response (fetal heart rate and movement) to live and recorded maternal speech following a history of fetal exposure to a passage spoken by the mother. The study participants were a convenience sample of 21 pregnant women.
TIP: Rigorous methods of sampling hidden populations, such as the homeless or injection drug users, are emerging. Because standard probability sampling is inappropriate for such hidden populations, a method called respondent-driven sampling (RDS), a variant of snowball sampling, has been developed. RDS, unlike traditional snowballing, allows the assessment of relative inclusion probabilities based on mathematical models (Magnani et al., 2005). McCreesh and colleagues (2012) have undertaken a recent evaluation of RDS.
Quota Sampling
A quota sample is one in which the researcher identifies population strata and determines how many participants are needed from each stratum. By using information about population characteristics, researchers can ensure that diverse segments are represented in the sample, in the proportion in which they occur in the population.
Suppose we were interested in studying nursing students’ attitude toward working with AIDS patients. The accessible population is a school of nursing with 500 undergraduate students; a sample of 100 students is desired. The easiest procedure would be to distribute questionnaires in classrooms through convenience sampling. Suppose, however, that we suspect that male and female students have different attitudes. A convenience sample might result in too many men or women. Table 12.1 presents fictitious data showing the gender distribution for the population and for a convenience sample (second and third columns). In this example, the convenience sample overrepresents women and underrepresents men. We can, however, establish “quotas” so that the sample includes the appropriate number of participants from both strata. The far-right column of Table 12.1 shows the number of men and women required for a quota sample for this example.
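The arithmetic behind proportional quotas can be sketched in Python. The population of 500 students and the desired sample of 100 come from the example above. The split of 100 men and 400 women is a hypothetical assumption made for illustration and is not taken from Table 12.1. The function name is also hypothetical. When exact quotas are not whole numbers, this sketch rounds down and gives any remaining places to the strata with the largest fractional remainders, which is one reasonable convention among several.

```python
def proportional_quotas(stratum_sizes, sample_size):
    """Allocate sample places to strata in proportion to their population size."""
    total = sum(stratum_sizes.values())
    exact = {s: sample_size * size / total for s, size in stratum_sizes.items()}
    quotas = {s: int(x) for s, x in exact.items()}  # round down first
    leftover = sample_size - sum(quotas.values())
    # Give remaining places to strata with the largest fractional remainders.
    by_remainder = sorted(exact, key=lambda s: exact[s] - quotas[s], reverse=True)
    for s in by_remainder[:leftover]:
        quotas[s] += 1
    return quotas

# Hypothetical gender split for the 500 students (not from Table 12.1).
population = {"men": 100, "women": 400}

quotas = proportional_quotas(population, 100)
print(quotas)
```

Under these assumed figures, men are 20% of the population, so the quota sample of 100 would include 20 men and 80 women. Participants within each stratum are still typically recruited by convenience, so quota sampling remains a nonprobability method.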