Validity

Week 6: Validity

Introduction

In the last 2 weeks, you examined reliability—test consistency or stability. Test validity is another important test characteristic. Consider how you might validate a test of honesty. Choosing items would be difficult, because dishonest people would tend to lie when they responded to your test. One approach would be to create many items that you think might work, give your test to newly hired employees, and then note any employees who are caught in dishonest acts while on the job. You could then see if your test distinguishes these employees from your other new hires. If you wanted to create a measure of “criminality,” you could create items based on expert opinion and personality theory. You could then test to see if your test discriminated between criminals and noncriminals with similar demographic characteristics. You would also need to demonstrate that your test was not affected by irrelevant factors. For instance, you would not want law-abiding persons who are less educated or who have certain cultural backgrounds to score high.

There is no magic formula for validity. Determining validity is based upon a preponderance of evidence, often accumulated over a period of years or even decades.

The example above represents one type of validity, although there are many different types. This week, you concentrate on construct validity and the impact of reliability on validity.

Objectives

Students will:

· Calculate scale transformation

· Calculate concurrent validity coefficient between a predictor scale and criterion measure

· Critique test items

· Analyze assessment of construct validity

· Analyze the influence of reliability on validity

Readings

· Anastasi, A., & Urbina, S. (1997). Psychological testing (7th ed.). Upper Saddle River, NJ: Prentice Hall.

. Chapter 5, “Validity: Basic Concepts”

. Chapter 6, “Validity: Measurement and Interpretation”

· American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.

. Chapter 1, “Validity”

· Anastasi, A. (1986). Evolving concepts of test validation. Annual Review of Psychology, 37(1), 1–15. Retrieved from the Walden Library databases.

· Hunsley, J., & Meyer, G. J. (2003). The incremental validity of psychological testing and assessment: Conceptual, methodological, and statistical issues. Psychological Assessment, 15(4), 446–455. Retrieved from the Walden Library databases.

· Whitely, S. E. (1983). Construct validity: Construct representation versus nomothetic span. Psychological Bulletin, 93(1), 179–197. Retrieved from the Walden Library databases.

Knowledge Assessment

For this Knowledge Assessment, you calculate the concurrent validity coefficient between a predictor scale and criterion measure in the dataset provided. First, you will be guided through the process of how to create new variable scales. Then, you calculate the validity measure on one of the scales.

The MoneyData.sav dataset that you have been provided contains three scales that measure financial attitudes:

· LIFESTYLE (L1 to L6) measures the desire for a luxurious lifestyle

· DEPENDENCE (D1 to D6) measures the tendency to depend on others for financial support (high scores) vs. supporting others (low scores)

· RISKTAKING (R1 to R6) measures the tendency to take financial risks in investments and careers

Create Three New Variables Showing the Scores on These Three Scales

To create the RISKTAKING scale, click TRANSFORM>COMPUTE VARIABLE. In the “Target Variable” field, type “RISKTAKING.” In the “Numeric Expression” field, type SUM(R1 TO R6).

To create the DEPENDENCE scale, click TRANSFORM>COMPUTE VARIABLE. In the “Target Variable” field, type “DEPENDENCE.” In the “Numeric Expression” field, type SUM(D1 TO D6).

On the LIFESTYLE items, item L6 (“I’d rather have a modest lifestyle because it is less stressful”) is scored in the reverse direction from the other items. People endorsing this item want a less extravagant lifestyle; endorsing the other items suggests the desire for a more extravagant lifestyle. The scoring on this item therefore needs to be reversed before the items are summed. To create the reversed L6 item, click TRANSFORM>COMPUTE VARIABLE. In the “Target Variable” field, type “L6R.” In the “Numeric Expression” field, type 6 - L6 (use a plain hyphen as the minus sign). Subtracting each response from six reverses it: 5 becomes 1, 4 becomes 2, and so on. To create the LIFESTYLE scale, click TRANSFORM>COMPUTE VARIABLE. In the “Target Variable” field, type “LIFESTYLE.” In the “Numeric Expression” field, type SUM(L1 TO L5, L6R).
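
If you would like to check the scoring logic outside the SPSS menus, the reversal and summing steps above can be sketched in Python. This is only an illustration and is not part of the assignment: the responses are invented, the function names reverse_score and scale_total are hypothetical, and the code assumes every item is answered on a 1-to-5 scale with no missing responses.

```python
def reverse_score(response, scale_max=5, scale_min=1):
    """Reverse a Likert response: on a 1-5 scale, 5 -> 1, 4 -> 2, and so on."""
    return (scale_max + scale_min) - response


def scale_total(item_scores):
    """Sum the item scores for one respondent (assumes no missing responses)."""
    return sum(item_scores)


# Invented responses for one person on L1..L6 (assumed 1-5 scale).
lifestyle_items = {"L1": 4, "L2": 5, "L3": 3, "L4": 4, "L5": 5, "L6": 2}

# L6 is worded in the opposite direction, so reverse it before summing.
l6r = reverse_score(lifestyle_items["L6"])

# Equivalent in spirit to SUM(L1 TO L5, L6R) for this one respondent.
lifestyle = scale_total(
    [lifestyle_items[name] for name in ("L1", "L2", "L3", "L4", "L5")] + [l6r]
)

print("L6R =", l6r)
print("LIFESTYLE =", lifestyle)
```

Writing the reversal as (maximum + minimum) minus the response gives the same result as 6 minus the response on a 1-to-5 scale, and it makes the scale range an explicit assumption rather than a hidden one.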

Calculate a Validity Measure for One of the Scales

There are a number of other variables in the data file, such as income, sex, age, and marital status. Create a hypothesis about an expected correlation. Here is an example: You might expect financially dependent people to have lower incomes. So, you would predict a negative correlation between DEPENDENCE and participant income (INC1). If you use SPSS to calculate the correlation between DEPENDENCE and income (ANALYZE>CORRELATE>BIVARIATE), you get r = -.192, p < .001. The correlation is in the predicted direction and statistically significant, so it supports the hypothesis and provides one piece of evidence for the validity of the DEPENDENCE scale.
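
To see what the bivariate correlation is doing, here is a minimal Python sketch of a Pearson r computed by hand, along with the t statistic used to test it against zero. The scores below are invented for illustration and are not taken from MoneyData.sav; the function names pearson_r and t_statistic are hypothetical. SPSS remains the tool to use for the assignment itself.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length (at least 3)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Invented scores for eight respondents; NOT taken from MoneyData.sav.
dependence = [12, 18, 9, 22, 15, 25, 11, 20]
income = [52, 40, 61, 33, 47, 30, 58, 41]  # hypothetical, in thousands

r = pearson_r(dependence, income)
t = t_statistic(r, len(dependence))
print(f"r = {r:.3f}, t({len(dependence) - 2}) = {t:.2f}")
```

A negative r here would mean that higher DEPENDENCE scores go with lower incomes in the invented sample, which is the direction the example hypothesis predicts.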

Think of another relationship that might support the validity of one of the scales and then test your hypothesis using the data. You will need to submit:

· Your validity hypothesis and a brief explanation about why you expect the hypothesis to be supported

· The results of your statistical test of your validity hypothesis

· Your conclusion about validity, given the results of your statistical test

QUESTION 1

Submit: Your validity hypothesis and a brief explanation about why you expect the hypothesis to be supported.

QUESTION 2

Submit: The results of your statistical test of your validity hypothesis.

QUESTION 3

Submit: Your conclusion about validity given the results of your statistical test.

Discussion 1: Item Critique

In a scholarly community, critique is an important process that fosters the spread of ideas and information, improves quality of work, and encourages academic discourse. Critiques should be grounded in academic knowledge, current literature, and professional experience, rather than unsupported opinions. In test development, specifically, experts may be called upon to write items for a test or to critique the items written by others. In this Discussion, you have the opportunity to share the test items you developed in Week 5 with your colleagues and provide constructive feedback on each other’s work. As you review your colleagues’ items, think about how well the items measure the construct and whether the items are clear and unambiguous.

With these thoughts in mind:

Post the test items you developed in Week 5 and a description of the construct that you are measuring. Then provide constructive feedback to two of your colleagues about their test items, including potential suggestions for revision.

Be sure to support your postings and responses with specific references to the Learning Resources.

Discussion 2: Construct Validity

Validity measures the usefulness of a test for specific purposes. For instance, tests of depression, such as the Beck Depression Inventory (BDI), correlate well with clinician assessments of depression, so they can be considered valid measures of depression. On the other hand, the BDI does not predict job performance well, so it would be invalid for that purpose.

Types of validity include criterion-related, content, construct, and face validity.

· Criterion-related validity looks at the correlation between test scores and a criterion that the test scores could be expected to predict. For instance, SAT scores for high school seniors could be expected to correlate with their first-year college GPA.

· Content validity generally depends on expert opinion and looks at whether the test adequately samples the content of interest. A math test for high school students should sample the types of math problems that are found in high school math textbooks and should cover the domains that high school math teachers say are important.

· Face validity is important but is not validity in the technical sense. A test has face validity if it appears to measure what it purportedly measures. For instance, a math test for truck drivers should ask questions about gas mileage and load weight. If it contained questions about cake recipe proportions, it would lack face validity.

· Construct validity is usually accumulated over time and represents the accumulation of many validity studies. It indicates whether the test appropriately measures a theoretical construct.

For this Discussion, you concentrate on construct validity. To prepare, choose a construct to use (one that was not used in the examples) and consider how you might assess the construct validity of a measure of that construct. Then think about how the reliability of your measure might influence its validity.
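
As a starting point for thinking about reliability and validity, classical test theory holds that measurement error attenuates correlations: the observed validity coefficient cannot exceed the square root of the product of the test's reliability and the criterion's reliability. The Python sketch below illustrates that ceiling and the standard correction for attenuation. The reliability and validity values are hypothetical and chosen only for illustration, and the function names max_validity and correct_for_attenuation are not from the Learning Resources.

```python
import math


def max_validity(rel_test, rel_criterion=1.0):
    """Upper bound on the observed validity coefficient, given reliabilities."""
    return math.sqrt(rel_test * rel_criterion)


def correct_for_attenuation(r_observed, rel_test, rel_criterion):
    """Estimate the correlation between true scores from an observed r."""
    return r_observed / math.sqrt(rel_test * rel_criterion)


# Hypothetical values chosen only for illustration.
rel_test = 0.70       # reliability of the scale
rel_criterion = 0.80  # reliability of the criterion measure
r_observed = 0.30     # observed validity coefficient

ceiling = max_validity(rel_test, rel_criterion)
r_true = correct_for_attenuation(r_observed, rel_test, rel_criterion)
print(f"ceiling = {ceiling:.3f}, disattenuated r = {r_true:.3f}")
```

The lower either reliability is, the lower the ceiling on the validity coefficient, which is one reason an unreliable measure cannot show strong validity evidence even when it targets the right construct.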

With these thoughts in mind:

Post a description of the construct you selected. Then explain how you might assess the construct validity of a measure of that construct. Finally, explain the influence of reliability of your measure on the magnitude of its validity. Support your response using the Learning Resources and the current literature.

Be sure to support your postings and responses with specific references to the Learning Resources.