Test Reliability

T & M

Discussion

Ensuring Test Reliability

According to classical test theory, test reliability is based on the notion that every observed test score comprises two parts: a true score and error. A true score is the expected score on a test over an infinite number of testing instances; it is a theoretical idea that can never be known for sure. Errors are inaccuracies that make actual (observed) test scores differ from true scores.
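The decomposition above can be illustrated with a minimal Python sketch. The scores below are hypothetical and chosen only so that the errors average zero and are uncorrelated with the true scores; in practice true scores are never observable, so this is an illustration of the idea rather than a procedure anyone could run on real data.

```python
from statistics import pvariance

# Hypothetical true scores and errors for four test-takers (illustrative only;
# real true scores can never be observed directly).
true_scores = [90, 100, 110, 120]
errors = [2, -2, -2, 2]  # average zero and are uncorrelated with the true scores

# Classical test theory: observed score = true score + error.
observed = [t + e for t, e in zip(true_scores, errors)]


def reliability(true_scores, observed):
    """Share of observed-score variance that is true-score variance."""
    return pvariance(true_scores) / pvariance(observed)


print(observed)                                       # [92, 98, 108, 122]
print(round(reliability(true_scores, observed), 3))   # 0.969
```

With these numbers the true-score variance is 125 and the observed-score variance is 129, so about 97% of the variation in observed scores reflects true differences and the rest reflects error.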

There are several different ways to measure a test’s reliability. Test–retest reliability looks at the correlation between scores on an original test administration and scores on a retest. The span of time between the two administrations should be shorter than the time it takes for true scores to change. Test–retest reliability looks at error due to time.
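As a rough sketch of the test–retest idea, the Python below correlates two sets of scores from the same people. The scores and the variable names are hypothetical, and the helper `pearson_r` is written out here only to keep the example self-contained.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores for the same five people on two administrations.
first_administration = [98, 105, 110, 120, 131]
retest = [100, 103, 112, 118, 133]

test_retest_r = pearson_r(first_administration, retest)
print(round(test_retest_r, 3))  # 0.986
```

A coefficient close to 1 suggests that people kept roughly the same rank order across the two administrations, which is what one would hope for with a stable construct such as intelligence.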

Alternate-form reliability looks at the correlation between two different versions of a test. Split-half reliability is similar to alternate-form reliability: a single test is split into two halves, usually odd and even items, and scores on the two halves are correlated. Cronbach’s coefficient alpha is also similar, essentially providing the average of all possible split-half reliabilities. Alternate-form, split-half, and Cronbach’s coefficient alpha all look at error due to content sampling.
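The split-half and coefficient alpha calculations described above can be sketched in Python as follows. The item scores are hypothetical, the function names are illustrative, and alpha is computed with the commonly used variance formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), where k is the number of items.

```python
from math import sqrt
from statistics import pvariance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def split_half_r(item_scores):
    """Correlate totals on odd-numbered items with totals on even-numbered items."""
    odd_totals = [sum(person[0::2]) for person in item_scores]   # items 1, 3, ...
    even_totals = [sum(person[1::2]) for person in item_scores]  # items 2, 4, ...
    return pearson_r(odd_totals, even_totals)


def cronbach_alpha(item_scores):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(item_scores[0])
    item_variances = [
        pvariance([person[i] for person in item_scores]) for i in range(k)
    ]
    total_variance = pvariance([sum(person) for person in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: each row is one test-taker, each column one item scored 1-5.
item_scores = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]

print(round(split_half_r(item_scores), 3))    # 0.883
print(round(cronbach_alpha(item_scores), 3))  # 0.961
```

Both coefficients are computed from a single administration, which is why they speak to error due to content sampling rather than error due to time.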

For this Discussion, pretend that you have been contracted to create a test to assess intelligence. Think about factors you might consider and steps you might take to ensure reliability of this test.

With these thoughts in mind:

Post a description of factors you might consider and steps you might take to ensure reliability of the intelligence test. Explain why these factors and steps are important.

Be sure to support your postings and responses with specific references to the Learning Resources.

Assignment

Developing Test Items

In Week 3, you submitted your test specifications. Now, you should incorporate your Instructor’s feedback and create your test items. Remember to write your items so that they are clear and unambiguous. Avoid compound items (e.g., “I am happy most days and I usually like to smile”). Also make sure that the items are representative of the construct that you are measuring.

The Assignment

Submit by Day 7 test items for your assessment, based on your test specifications.

Support your Application Assignment with specific references to all resources used in its preparation. You are to provide a reference list for all resources, including those in the Learning Resources for this course.

Readings

· American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.

· Chapter 2, “Reliability/Precision and Errors of Measurement”

· Cortina, J. M. (1993). What is coefficient alpha? An examination of theory and applications. Journal of Applied Psychology, 78(1), 98–104. Retrieved from the Walden Library databases.

· Wainer, H. (1986). Can a test be too reliable? Journal of Educational Measurement, 23(2), 171–173. Wainer, H., Can a Test Be Too Reliable? In Journal of Educational Measurement. Copyright 1986 Blackwell Publishing Journals. Used with permission from the National Council on Measurement in Education via the Copyright Clearance Center.

Test Specifications Template

Determine whether you want to measure a trait, ability, emotional state, disorder, interest, attitude, or other construct:

· Ability, such as musical skill, writing skill, intelligence, or reading comprehension

· Personality trait, such as extroversion, creativity, or deviousness

· Disorder, such as anxiety, depression, or psychotic thought disorder

· Emotion, such as happiness or anger

· Attitude, such as authoritarianism or prejudice

· Interest, such as career-related interests

Other: ____________________________

Describe the specific construct you want to measure in a word or two: Extroversion Personality Trait

Now describe the construct using several sentences. What behaviors are associated with the construct? Does it include more than one quality or dimension?

The various behaviors associated with the introversion personality construct have been extensively explored by tests such as the Myers-Briggs Type Indicator (MBTI), which is based on Carl Jung’s theory of personality. The MBTI is one of the most popular personality inventories used with nonclinical populations. It measures individuals across four bipolar dimensions:

· Attitudes: Extraversion-Introversion. This measures whether someone is “outward-turning” and action-oriented or “inward-turning” and thought-oriented.

· The perceiving function: Sensing-Intuition. This measures whether someone understands and interprets new information using their five senses (sensing) or intuition.

· The judging function: Thinking-Feeling. This measures whether one tends to make decisions based on rational thought or empathic feeling.

· Lifestyle preferences: Judging-Perceiving. This measures whether a person relates to the outside world primarily using their judging function (which is either thinking or feeling) or their perceiving function (which is either sensing or intuition).

Describe your process for initially generating items. Will you interview experts? Review textbooks or journal articles? Look at diagnostic criteria in the DSM?

To generate items initially, I will review textbooks and journal articles written by clinical psychologists and by experts in the psychology of personality, such as Carl Jung, who did extensive research in this field. His theory of personality laid the foundation on which measures such as the Myers-Briggs Type Indicator were developed.

Think about the format and phrasing of your items. For instance, some tests use first-person statements, such as “I enjoy swimming,” while others use questions, such as “Do you enjoy swimming?” Other tests might use single-word prompts, such as “Swimming,” and ask for the test-taker to rate this and other words on a scale of 1–5 in order to indicate the degree of interest or enjoyment. Some tests use pictures rather than words, and some are administered to an informant other than the client, such as a parent or work supervisor.

Think about the response format for your items. Yes/No responses or a Likert scale are popular for personality tests. If you use a Likert scale, consider how many response options there will be and whether your scale will have a neutral midpoint. Multiple-choice is a format that is familiar in academic tests. (Some tests use open-ended responses, but this is difficult to score and too complex for this exercise.)

Now write one typical item for your test, demonstrating your item and response format:

Strongly Disagree Somewhat Disagree No Opinion Somewhat Agree Strongly Agree

I enjoy alone time

How many items will your initial test include? Keep in mind that you need to create about twice as many test items initially, because you will discard about half of them during your item analysis. 40

Incorporate the instructor comments below into this week’s Assignment

Week 3 Assignment

FAIR

Paper demonstrates a fair understanding of the concepts and key points as presented in the text/s and Learning Resources. Paper may be lacking in detail and specificity and/or may not include sufficient pertinent examples or provide sufficient evidence from the readings.

49-55% (13)

Paper is somewhat below graduate-level writing style, with multiple smaller or a few major problems. Paper may be lacking in organization, scholarly tone, APA style, and/or contain many writing and/or spelling errors, or shows moderate reliance on quoting vs. original writing and paraphrasing. Paper may contain inferior resources (number or quality).

21-23% (5.25) = 18.25 = 73% = C

Instructor comments: Week 3 Assignment. Good work; the following suggestions will improve your assignment.

1. Check which category you are testing.

2. Is there a literature base for this assessment? Cite your sources.

3. Reviewing journal articles will not generate test items.

4. A Likert scale of 1-4 requires labels for 1, 2, 3, and 4, not just 1 and 4.

Thanks, Dr. Jay,  jay.greiner@waldenu.edu

References

Assessing Personality | Boundless Psychology. (2019). Retrieved 8 September 2019, from https://courses.lumenlearning.com/boundless-psychology/chapter/assessing