A comparison of item response theory true score equating and item response theory-based local equating

2007 2007

Other formats: Order a copy

Abstract (summary)

The need to compare students across different test administrations, or perhaps across different test forms within the same administration, plays a key role in most large-scale testing programs. In order to do this, these tests must be placed on the same scale. Placing test forms onto the same scale not only allows results from different test forms to be compared to each other, but also facilitates placing the results from different test scores onto a common reporting scale. The statistical method used to place these test scores onto a common metric is called equating.

Estimated true equating, one of the conditional equating methods described by van der Linden (2000), has been shown to be a dramatic improvement over classical based equipercentile equating under some conditions (van der Linden, 2006).

The purpose of the study is to investigate the relative performance of estimated true equating with IRT true score equating under a variety of conditions that are known to impact equating accuracy, namely: anchor test length, data misfit, scaling method, and examinee ability distribution, through simulation study. The results are evaluated based on root mean squared error (RMSE) and bias of the equating functions, as well as decision accuracy when placing examinees in to performance categories. A secondary research question of relative performance of the scaling methods is also investigated.

The results indicate that estimated true equating shows tremendous promise with the dramatically lower bias and RMSE values when compared to IRT true score equating. However, this promise does not bear out when looking at examinee classification. Despite the lack of significant gains in the area of decision accuracy, this new equating method shows promise in its reduction of error attributable to the equating functions themselves, and therefore deserves further scrutiny.

The results fail to indicate a clear choice for a scaling method for use with either equating method. Practitioners still must do their best to rely on the growing body of evidence, and consider the nature of their own testing programs, and the abilities of their examinee population when choosing a scaling method.

Indexing (details)

Educational evaluation
0288: Educational evaluation
Identifier / keyword
Education; Equating; Item response theory; Local equating; Scaling; True score equating
A comparison of item response theory true score equating and item response theory-based local equating
Keller, Robert R., III
Number of pages
Publication year
Degree date
School code
DAI-A 68/11, Dissertation Abstracts International
Place of publication
Ann Arbor
Country of publication
United States
Wells, Craig S.; Keller, Lisa A.
University of Massachusetts Amherst
University location
United States -- Massachusetts
Source type
Dissertations & Theses
Document type
Dissertation/thesis number
ProQuest document ID
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Document URL
Access the complete full text

You can get the full text of this document if it is part of your institution's ProQuest subscription.

Try one of the following:

  • Connect to ProQuest through your library network and search for the document from there.
  • Request the document from your library.
  • Go to the ProQuest login page and enter a ProQuest or My Research username / password.