Sunday, August 26, 2007

Relationships among IRT item discrimination and item fit indices in criterion-referenced language testing.

Hudson, T. (1991). Relationships among IRT item discrimination and item fit indices in criterion-referenced language testing. Language Testing, 8(2), 160-181.

Hudson (1991) applied item response theory (IRT) to the analysis of criterion-referenced tests (CRTs). He compared the Rasch model (the one-parameter logistic model) with the two-parameter logistic model, using both to analyze two forms of the General Tests of English Language Proficiency (GTELP). He also reported Cronbach's alpha, an internal consistency reliability estimate normally used with norm-referenced tests, and found all the subtests to be highly reliable. The results showed strong correlations among the point-biserial correlation, the infit and outfit statistics, and the slope parameter. He recommended the two-parameter logistic model over the Rasch model because the slope parameter was easier to interpret; the infit/outfit statistics could serve as a substitute for the slope parameter. He concluded that highly discriminating items should be omitted during test development. Item difficulty should also be taken into account: items with difficulty near the cut-off point should be selected to arrive at more dependable pass/fail decisions.
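To make the last point concrete, here is a minimal Python sketch of the standard logistic item response function and the item information it implies. It is not Hudson's analysis or data. The function names, the item parameters, and the cut-off value are hypothetical, and the sketch assumes the plain logistic form without the 1.7 scaling constant. Setting the slope `a` to 1.0 gives the Rasch form; letting it vary gives the two-parameter form.

```python
import math


def p_correct(theta, b, a=1.0):
    """Probability of a correct response under the 2PL model.

    theta: examinee ability, b: item difficulty, a: slope (discrimination).
    With a fixed at 1.0 this reduces to the Rasch (1PL) form.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, b, a=1.0):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = p_correct(theta, b, a)
    return a * a * p * (1.0 - p)


# Hypothetical cut-off on the ability scale and hypothetical items,
# given as (difficulty b, slope a). None of these values come from the paper.
cut_score = 0.5
items = {
    "easy": (-1.5, 1.0),
    "near_cut": (0.5, 1.0),
    "hard": (2.0, 1.0),
}

# Information each item contributes at the cut-off.
info_at_cut = {
    name: item_information(cut_score, b, a) for name, (b, a) in items.items()
}

# With equal slopes, the item whose difficulty sits at the cut-off
# carries the most information about examinees near that point.
best_item = max(info_at_cut, key=info_at_cut.get)
print(best_item, info_at_cut)
```

With equal slopes, information at a given ability is largest for the item whose difficulty equals that ability, which is one way to read the recommendation to prefer items with difficulty near the cut-off point.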