Validity Notes
Cronbach (1971) indicated that what needs to be valid is the meaning or interpretation of the
scores as well as any implications for action that this meaning entails. The extent to which score
meaning and action implications hold across persons or population groups and across settings or
contexts is a persistent and perennial empirical question. This is the main reason that validity is
an evolving property and validation a continuing process.
Messick (1989) indicated that validity is an overall evaluative judgment of the degree to which
empirical evidence and theoretical rationale support the adequacy and appropriateness of
interpretations and actions based on test scores or other modes of assessment. Validity is not a
property of the test or assessment as such, but rather of the meaning of the test scores. These
scores are a function not only of the items or stimulus conditions, but also of the persons
responding as well as the context of the assessment.
Messick (1994) elaborated that validity, reliability, comparability, and fairness are not
just measurement issues; they are social values that have meaning and force outside of
measurement whenever evaluative judgments and decisions are made. Validation combines
scientific inquiry with rational argument to justify or nullify score interpretation and use.
Validity Types
Although validity is generally discussed as a unified concept, one can distinguish several
aspects of validity. Note that instrument developers often present more than one type of validity
evidence.
Construct validity is the extent to which a particular instrument can be shown to measure a
hypothetical construct, that is, “a theoretical construction about the nature of human behavior.”
Construct validity is important to consider when planning a research study that proposes to test a
hypothetical or latent construct. One might attempt to establish construct validity for a measure
of test anxiety or a measure of mathematical ability.
Content validity is the degree to which the sample of test items represents the content that the
test is designed to measure. The test does not need to cover all content given in a course of study
to be content valid, but must cover a representative sample of the content. Appraise content
validity by an objective comparison of the test items with the curriculum content. Content
validity is very important when selecting tests to use in experiments involving the effect of
teaching methods on achievement.
Criterion-related validity takes two forms: (a) predictive validity and (b) concurrent validity.
Predictive validity is the degree to which the predictions made by a test are confirmed by
the later behavior of the subjects. A correlation is generated to indicate the degree of predictive
validity. The criterion measure is extremely important and should be valid and reliable.
Researchers usually cross-validate by administering the same test to a new sample drawn from
the same population.
Concurrent validity of a test is determined by relating test scores of a group of subjects to
a criterion measure administered at the same time or within a short interval of time.
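As a minimal sketch of the correlation described above, the following Python computes a Pearson correlation between test scores and a criterion measure. The data, the variable names, and the choice of first-year GPA as the criterion are hypothetical and serve only as illustration. The same calculation applies to concurrent validity; the only difference is that the criterion is collected at about the same time as the test rather than later.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical original sample: test scores and a criterion measured later
test_scores = [52, 61, 70, 48, 85, 77, 66, 90]
later_gpa = [2.4, 2.9, 3.1, 2.2, 3.7, 3.3, 2.8, 3.9]
predictive_validity = pearson_r(test_scores, later_gpa)

# Hypothetical cross-validation sample drawn from the same population
new_test_scores = [58, 73, 44, 81, 69, 95, 62, 50]
new_gpa = [2.7, 3.0, 2.3, 3.4, 3.2, 3.8, 2.6, 2.5]
cross_validated = pearson_r(new_test_scores, new_gpa)

print("original sample r:", round(predictive_validity, 2))
print("cross-validation r:", round(cross_validated, 2))
```

If the coefficient in the new sample stays close to the original one, the predictive relationship is less likely to be an accident of the first sample.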
Face validity concerns the degree to which a test appears to measure what it purports to
measure. It rests on appearance alone, whereas the other forms of test validity require evidence.
For this reason, instrument developers do not rely on face validity when establishing validity.
Other Validity Types
Instrument developers sometimes establish convergent validity and divergent (also called
discriminant) validity for an instrument, often by means of a multitrait-multimethod matrix. There is evidence of
convergent validity when there is a high correlation between two instruments that measure the
same construct. There is evidence of divergent validity when there is a low correlation between
two instruments that measure different constructs.
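The comparison of high and low correlations can be illustrated with a small sketch. It assumes three hypothetical instruments given to the same eight respondents: two anxiety scales (same construct) and one vocabulary test (different construct). All scores and names are invented for illustration, and the sketch is a simplified pairwise comparison, not a full multitrait-multimethod matrix, which would cross several traits with several methods.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Return the Pearson correlation between two equal-length score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical scores for the same eight respondents on three instruments
instruments = {
    "anxiety_scale_a": [12, 18, 9, 22, 15, 20, 7, 17],
    "anxiety_scale_b": [14, 19, 10, 24, 15, 21, 9, 18],
    "vocabulary_test": [32, 29, 28, 31, 33, 27, 31, 29],
}

# Correlate every pair of instruments
correlations = {
    (name_1, name_2): pearson_r(instruments[name_1], instruments[name_2])
    for name_1, name_2 in combinations(instruments, 2)
}

for (name_1, name_2), r in correlations.items():
    print(f"{name_1} vs {name_2}: r = {r:.2f}")

# Same construct: a high correlation is evidence of convergent validity
convergent = correlations[("anxiety_scale_a", "anxiety_scale_b")]
# Different constructs: a low correlation is evidence of divergent validity
divergent = correlations[("anxiety_scale_a", "vocabulary_test")]
```

In this invented data set the two anxiety scales correlate strongly with each other and only weakly with the vocabulary test, which is the pattern the paragraph above describes.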