Qualitative and Quantitative Methods
PSY326 Research Methods Week 2 Guidance
View the video on the Week 2 Overview screen and read Chapter 2 of your textbook.
After completing this instructional unit, you will be able to:
· Analyze various research designs and understand when each should be used.
· Compare and contrast characteristics of non-experimental and experimental research designs.
· Differentiate between qualitative and quantitative research approaches.
· Explain the concepts of validity, reliability, and variable measurement scales.
In the discussion this week, you will choose an appropriate research design from the ones described in section 2.1 of the textbook, explain why you chose that design for the proposed study, and give an overview of what would be done in the study. Your discussion post should cite at least two recent peer-reviewed sources, so be sure to search the Ashford Library resources for journal articles on research design. All references should be cited in APA format. You can find instructions and examples for APA style in the Ashford Writing Center, under Learning Resources in the left navigation panel.
Other assignments for the week are a quiz on concepts of research design and measurement, and a short paper on qualitative and quantitative research methods. In the paper, describe the characteristics of qualitative research and quantitative research, and explain their similarities and differences. Determine which approach was used in the study you selected in Week 1 and give evidence for your classification. Also, specify whether the study is experimental or non-experimental. Refer to the textbook and peer-reviewed sources from the Ashford Library to support your judgments.
Newman (2016) categorizes research designs into three types: descriptive, predictive, and experimental (including quasi-experimental). Descriptive and predictive research designs are non-experimental. Chapter 2 presents an overview of the different types of research designs. In the following weeks, we will cover each type in more detail. Another way of categorizing research designs is by whether they use qualitative or quantitative methods. The textbook provides some insight into the similarities and differences between these two approaches, but the following are additional characteristics to look for as you decide how to classify the methods used in a study.
Qualitative research:
· Relatively small sample sizes
· Uses purposeful (specific) sampling to get participants who have knowledge and experience of the topic of the research
· Narrative data (words, pictures, etc.)
· Starts with a broad research question but may end with a hypothesis
· Uses special coding methods for data analysis
· Value is based on trustworthiness
Quantitative research:
· Relatively large sample sizes
· Sampling techniques may be random to get a wide variety of participants representing the population
· Numerical data in the form of variables
· Hypothesis and/or research question(s) must be specified before data collection
· Uses statistical techniques for data analysis
· Value is based on validity and reliability
Additional features of qualitative and quantitative research can be found at James Neill’s website, http://wilderdom.com/research/QualitativeVersusQuantitativeResearch.html (Neill, 2007).
An important part of this week’s lesson concerns psychometrics, which is the field of psychological measurement. You are familiar with physical measurement instruments, such as tape measures, rulers, and scales, but did you know that tests, questionnaires, and surveys are measurement instruments for attitudes or mental characteristics? This is a quantitative research concept.
A construct is something that we believe exists, but it cannot be measured directly because it is something that occurs inside of the mind. To try to measure a construct, we have to measure a collection of things that we think are related to the construct, and then we put those things together. An example is happiness. How can we tell if a person is happy? How happy is the person? We might note that the person is smiling or saying positive things. When we see or hear this evidence, we might conclude that the person is happy. On the other hand, if the person is frowning or crying, or complaining about something, we might infer that the person is not happy. But the verdict on whether the person is happy or not is a conclusion we draw based on what we consider the evidence, not a direct measurement of happiness or the lack of it. We need good psychometrics to help us measure psychological states, because they cannot be measured directly like physical states.
Two very important concerns in psychometrics are validity and reliability. The concept of validity refers to how appropriately and meaningfully a construct is measured by an instrument. Does the instrument really measure what it claims to measure? For instance, some of the first intelligence tests really measured how well people could read English or how familiar they were with middle class American culture instead of how intelligent they were (Holah, 2008). Imagine deciding that a recent immigrant to the United States who was just starting to learn English and was too poor to own a television was of low intelligence. It took a while for people to realize that this was unfair and did not make sense. These days, psychometricians, the people who develop tests and questionnaires, are much more careful about watching out for things like this. The important thing with validity is how the measurement is interpreted. In the example noted above, it would have been appropriate to interpret a low score on knowledge of popular television shows as an indication that the person probably did not watch much television, instead of concluding that the person was “a moron.”
There are multiple types of evidence for the validity of using an instrument for a particular interpretation. These include content validity, construct validity, and criterion validity. A quantitative research report should include information about the validity of the instruments the researchers used for the purpose of their study, so look for it as you read.
Content validity has to do with whether or not the questions (called items) are well-written and represent all of the important aspects of the field being tested. For example, a licensing exam for a nurse would not have content validity if it left out some of the most important things that nurses have to do in their work, such as taking vital signs. Content validity is usually checked by having a group of experts in the field go over the test and see if they think anything important is missing or if anything in it is not important and needs to be removed.
Construct validity has to do with whether the combination of items on the instrument is consistent with the theory that explains the construct. How accurately does the instrument measure what it is supposed to measure? Sub-categories of construct validity are face validity, convergent validity, and discriminant validity. Face validity simply means that experts in the field believe the items are asking about the construct in question and not some other construct or situation. Convergent and discriminant validity both involve calculating a correlation coefficient. With convergent validity, you would compare your instrument with instruments that are known to measure related constructs, and you would expect to find a positive correlation. With discriminant validity, you would compare your instrument to one that is known to accurately measure a different construct; in this case, you would expect the two instruments not to be correlated. If they are correlated, your instrument may be measuring something other than the intended construct.
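To make the convergent and discriminant checks concrete, here is a minimal Python sketch. The scores are invented for illustration, and the instrument names (a new happiness scale, a life satisfaction scale, and a vocabulary test) are hypothetical examples that do not come from the textbook. The helper pearson_r is written out by hand so the arithmetic is visible; it is one common way to compute a correlation coefficient, not the only one.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

# Hypothetical scores for the same eight participants (invented data).
new_happiness_scale = [12, 15, 9, 20, 17, 11, 14, 18]
life_satisfaction = [30, 34, 25, 41, 37, 27, 33, 39]   # related construct
vocabulary_test = [50, 42, 47, 46, 52, 44, 49, 45]     # different construct

convergent_r = pearson_r(new_happiness_scale, life_satisfaction)
discriminant_r = pearson_r(new_happiness_scale, vocabulary_test)

print(f"Convergent check (expect a strong positive r): {convergent_r:.2f}")
print(f"Discriminant check (expect r near zero): {discriminant_r:.2f}")
```

With these invented numbers, the first correlation comes out strongly positive and the second comes out close to zero, which is the pattern you would hope to see if the new scale measures happiness and not vocabulary.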
Finally, criterion validity relates the measurement of the construct to the measurement of the behavior that is expected when the construct is present. Two sub-categories within criterion validity are concurrent validity and predictive validity (Newman, 2016). Concurrent validity means you are determining how similar your instrument is to another instrument given about the same time that measures the same thing and has already been shown to be valid. The two instruments should have a high positive correlation if they really are measuring the same thing. One measure might be a series of knowledge questions, while the other might be a test of hands-on skills. Predictive validity is a little different, because usually the two measurements being compared are taken at different times. An example is the correlation between an aptitude test and actual job performance. If it can be shown that people who score high on that particular aptitude test tend to have good performance evaluations on a job using those aptitudes, then the aptitude test has predictive validity. The point is to use the right kind of test to predict the behavioral outcome (i.e., job performance) in which you are interested.
There is a saying in psychometrics, “reliability is a necessary but not sufficient condition for validity.” This means that unless the measures are reliable, they cannot be valid for a particular use or interpretation, and that the fact that measures are reliable does not necessarily mean that they are valid for the given interpretation or decision. If I teach a unit on t tests, but then I give a test on ANOVA instead, my test might be a reliable measurement of my students’ knowledge of ANOVA, but it is not a valid measurement of their knowledge of t tests. In other words, it might be a good test for one thing, but it is not the right test for what I am trying to measure.
We define the concept of reliability as the extent to which measurement is free from error, or how consistent a measure is. The difference between the true measurement of a construct and the test score for an individual is called error. This type of error does not mean a wrong answer on a test. It means that our measurement includes some things we did not intend and that we may not get consistent measurements because of this. Some items may be interpreted differently by different participants due to unfamiliar vocabulary or the use of words that might have more than one meaning. The environment in which the instrument is given may not be perfectly consistent between testing sites, such as different temperatures, lighting, or sound levels. Even within the same room, there are things that could make the testing experience different for different participants. For example, people whose seats were near a window might have been distracted by something happening outdoors. Characteristics of the participants may also influence test scores, such as illness, problems at home, or hearing something about the topic of the test on the radio on the way to the test site. These things may influence test scores in ways that have nothing to do with the characteristic being measured.
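The idea of error described above is often written as a simple equation. The notation below is an assumption borrowed from classical test theory and is not taken from the textbook: X stands for the score a person actually obtains, T for the person's true score, and E for error. The second equality in the reliability formula also assumes that true scores and errors are unrelated to each other.

```latex
\[
X = T + E, \qquad E = X - T
\]
\[
\text{reliability} \;=\; \frac{\sigma_T^{2}}{\sigma_X^{2}}
\;=\; 1 - \frac{\sigma_E^{2}}{\sigma_X^{2}}
\qquad \text{when } \sigma_X^{2} = \sigma_T^{2} + \sigma_E^{2}
\]
```

Read this way, reliability is the share of the variation in observed scores that comes from real differences among people and not from error. If there were no error at all, the ratio would be 1.00; if the scores were nothing but error, it would be zero.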
Measurement of reliability almost always takes the form of a correlation coefficient. The values of reliability coefficients range from zero, indicating no consistency between two sets of measures, to 1.00, indicating perfect consistency between the two sets of measures. As a rule of thumb, reliability coefficients should be greater than .50, although it is commonly accepted that reliability of .70 or higher is necessary for most purposes.
Although there are several types of reliability coefficients for tests or surveys, the most popular type is Cronbach’s alpha, a measure of internal consistency (also called interitem reliability). Before Cronbach developed the procedure for the alpha coefficient, researchers had to administer two different but equivalent tests to the same group of people (alternate forms reliability), give the same test to the same people on two different occasions (test-retest reliability), or split a large test into two parts and compare the half scores (split-half reliability). Coefficient alpha quickly caught on because it requires only one administration of one test, and even though the calculation is complex, it can be done easily and quickly by modern computers. (It is said that when the procedure was first developed, it took a group of Cronbach’s graduate assistants about two weeks to perform all of the calculations by hand.) Conceptually, Cronbach’s alpha reflects how strongly each item on the instrument is related to every other item on the instrument.
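For readers who want to see what the computer is doing, the following is a small Python sketch of coefficient alpha. It uses the commonly published variance form of the formula, alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The function name cronbach_alpha and the ratings are hypothetical and invented for illustration; statistical packages may handle details such as missing responses differently.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Coefficient alpha for a list of rows (one row of item scores per participant)."""
    k = len(responses[0])                       # number of items on the instrument
    items = list(zip(*responses))               # regroup scores: one tuple per item
    item_variances = [variance(item) for item in items]   # sample variance of each item
    total_scores = [sum(row) for row in responses]        # each participant's total score
    return (k / (k - 1)) * (1 - sum(item_variances) / variance(total_scores))

# Hypothetical 5-point ratings from six participants on a four-item scale.
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
    [4, 4, 5, 4],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.2f}")
```

In this invented data set, participants who rate one item high tend to rate the other items high as well, so alpha comes out well above the .70 level mentioned earlier. Items that did not hang together would pull the coefficient down.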
Here is an excellent video that explains reliability and validity in different terms than your text does (Centennial, 2007). It is not required, but it is worth watching. http://www.youtube.com/watch?v=H7fiJLUNQxI
Your assignments are listed on the Overview page for Week 2 as well as in the Course Guide. If you have any questions about this week’s readings or assignments, email your instructor or post your question on the “Ask Your Instructor” forum. Remember, use the forum only for questions that may concern the whole class. For personal issues, use email.
References
Centennial, L. [LyndaCentennial]. (2007, November 14). Reliability and validity [Video file]. Retrieved from http://www.youtube.com/watch?v=H7fiJLUNQxI
Holah, M. (n.d.). Summary of Stephen Jay Gould’s 1982 article: A nation of morons. Retrieved from http://www.holah.karoo.net/gouldstudy.htm
Neill, J. (2007, February 28). Qualitative versus quantitative research: Key points in a classic debate. Retrieved from http://wilderdom.com/research/QualitativeVersusQuantitativeResearch.html
Newman, M. (2016). Research methods in psychology (2nd ed.). San Diego, CA: Bridgepoint Education, Inc.
Contributors to this Guidance are Gary Boyles, Pamela Murphy, and Jessica Wayman.