Assessing the relative performance of local item dependence indexes
A Monte Carlo simulation study was designed to assess the relative performance of 10 local item dependence (LID) indexes: the Likelihood Ratio G², the power-divergence statistic, Q3, the Fisher's r-to-z transformed Q3, the Wald Test statistic in logistic regression, the Likelihood Ratio (LR) Test statistic in logistic regression, the absolute mutual information difference (AMID), the mutual information difference (MID), the Modification Index from structural equation modeling, and the residual correlation in factor analysis. The performance of the 10 LID indexes was examined under two conditions: when local item independence (LII) held and when it was violated. The impact of three independent variables (test length, LID level, and percentage of locally dependent items) on the relative performance of the 10 LID indexes was investigated. The indexes were evaluated in terms of Type I error rate, power, and false positive rate.
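To make two of the indexes listed above concrete, the following is a minimal sketch of Q3 and its Fisher's r-to-z transformation. It is not the study's code. It assumes a Rasch model with known abilities and item difficulties, whereas in practice both would be estimated. The function names (`q3_matrix`, `fisher_z`), the sample sizes, and the way dependence is induced between the first two items are all illustrative assumptions.

```python
import numpy as np

def q3_matrix(responses, theta, difficulty):
    """Q3: correlations between item residuals (observed minus model-expected).

    Assumes a Rasch model with known person abilities `theta` and item
    difficulties `difficulty` (an illustrative simplification).
    """
    # Model-implied probability of a correct response (persons x items).
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - difficulty[None, :])))
    residuals = responses - p
    # Correlate residual columns, giving one Q3 value per item pair.
    return np.corrcoef(residuals, rowvar=False)

def fisher_z(r):
    """Fisher's r-to-z transformation: z = 0.5 * ln((1 + r) / (1 - r))."""
    return np.arctanh(r)

# Simulate locally independent Rasch data (hypothetical sizes).
rng = np.random.default_rng(0)
n_persons, n_items = 2000, 10
theta = rng.normal(size=n_persons)
difficulty = np.linspace(-1.5, 1.5, n_items)
p_true = 1.0 / (1.0 + np.exp(-(theta[:, None] - difficulty[None, :])))
responses = (rng.random((n_persons, n_items)) < p_true).astype(float)

# Induce local dependence: for about 60% of persons, item 1 copies item 0.
copied = rng.random(n_persons) < 0.6
responses[copied, 1] = responses[copied, 0]

q3 = q3_matrix(responses, theta, difficulty)
# Transform only the off-diagonal pairs; the diagonal is 1 by construction.
z_pairs = fisher_z(q3[np.triu_indices(n_items, k=1)])
```

In this sketch the dependent pair (items 0 and 1) yields a clearly positive Q3, while independent pairs stay near zero. A flagging rule would compare each value, or its z transform, against a chosen critical value.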
Results indicated that the Type I error rate of the Wald Test was conservative across the three test length conditions, whereas the Type I error rate of the LR Test was quite liberal. By contrast, the Type I error rate of the Modification Index was relatively close to the conventional significance level (5%). The power of the Fisher's r-to-z transformed Q3 was the highest among the 10 LID indexes across all the LID conditions. Both the Wald Test and the LR Test had the lowest power, especially when the number of items was small. In general, the false positive rate of the Wald Test was the lowest, whereas the false positive rate of the LR Test was the highest.
No LID index performed best on all three criteria: Type I error rate, power, and false positive rate. However, Q3, MID, residual correlation, AMID, and the Modification Index could be recommended for most of the LID conditions because of their relatively high power and low false positive rate. The Wald Test could be the best LID index for tests consisting of a large number of items (e.g., 60-item tests) because of its low Type I error rate and false positive rate, and its high power.
Key words. local item dependence indexes, local item independence, item response model, power, Type I error rate, false positive rate.
0632: Psychological tests