Assessing fit of item response theory models

2006

Abstract (summary)

Item response theory (IRT) modeling is a statistical technique widely applied in educational and psychological testing. The usefulness of IRT models, however, depends on how well they reflect the data. Model-data fit must therefore be evaluated before a model is applied, by accumulating a wide variety of evidence that supports the proposed uses of the model with a particular set of data.

This thesis addressed issues in the collection of two major sources of fit evidence to support IRT model application: evidence based on model-data congruence, and evidence based on the intended uses of the model and their practical consequences. Specifically, the study (a) proposed a new goodness-of-fit procedure, examined its performance using fitting and misfitting data, and compared its behavior with that of commonly used goodness-of-fit procedures, and (b) investigated through simulations the consequences of model misfit for two major IRT applications: equating and computer adaptive testing (CAT).

In all simulation studies, the three-parameter logistic model (3PLM) was assumed to be the true IRT model, while the one- and two-parameter logistic models (1PLM and 2PLM) were treated as misfitting models. The study found that the newly proposed goodness-of-fit statistic correlated consistently more strongly with the true size of misfit than the commonly used fit statistics did, making it a useful index of the degree of misfit, which is often of interest but unknown in practice. A major issue with the new statistic is that its null distribution and critical values are not appropriately defined; as a result, the new statistical test appeared to be less powerful, but also less prone to type I error.
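For orientation, the three models named above are nested, which is why 1PLM and 2PLM can only misfit data generated under 3PLM. The sketch below uses the standard logistic form of the item response function; the function names are hypothetical, and omitting the scaling constant D = 1.7 is an assumption, since the abstract does not say which metric the thesis uses.

```python
import math

def p_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability; a: discrimination; b: difficulty;
    c: lower asymptote (pseudo-guessing). Logistic metric, with no
    D = 1.7 scaling constant (an assumption for this sketch).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def p_2pl(theta, a, b):
    """2PL model: the 3PL model with the lower asymptote fixed at 0."""
    return p_3pl(theta, a, b, 0.0)

def p_1pl(theta, b):
    """1PL model: the 2PL model with a common discrimination (fixed at 1 here)."""
    return p_2pl(theta, 1.0, b)
```

For a very low-ability examinee, `p_3pl(-10, 1.0, 0.0, 0.2)` stays just above the lower asymptote of 0.2, while `p_2pl(-10, 1.0, 0.0)` falls toward 0. That gap at the low end of the ability scale is one place where 2PLM cannot reproduce 3PLM data.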

In examining the consequences of model-data misfit, the study showed that although 2PLM cannot in theory fit 3PLM data perfectly, the consequences were minimal when 2PLM was used to equate 3PLM data and number-correct scores were reported. This was not true in CAT, however, where 2PLM produced significant bias. The study further emphasized the importance of evaluating fit both through goodness-of-fit statistical tests and by examining the practical consequences of misfit.

Indexing (details)

Subject: Educational evaluation
Classification: 0288: Educational evaluation
Identifier / keyword: Education; Computer adaptive testing; Equating; Goodness-of-fit; Item response; Model fits
Title: Assessing fit of item response theory models
Author: Lu, Ying
Source: DAI-A 67/01, Dissertation Abstracts International
Place of publication: Ann Arbor
Country of publication: United States
ISBN: 9780542522123, 0542522128
Advisor: Hambleton, Ronald K.
University/institution: University of Massachusetts Amherst
University location: United States -- Massachusetts
Source type: Dissertations & Theses
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.