
When to worry more: An empirical investigation of the effects of non-randomly missing data on regression analysis


1992


Abstract (summary)

Statistical remedies exist for most configurations of missing data, but they require specific models or measures of nonresponse that are usually unavailable to the researcher. The practical question is therefore under what conditions non-randomly missing data pose a greater threat to regression analysis. This simulation study addresses that question empirically by assessing how various configurations of non-randomly missing data affect OLS regression analysis when different techniques are used to cope with the missing observations, on samples drawn from varying populations. The configurations of missing data vary by which variable has missing observations, which variable drives the response mechanism, and the strength of the response mechanism. Five techniques for coping with the missing observations are compared: listwise deletion, pairwise deletion, regression estimation without the addition of a residual, regression estimation with the addition of a residual, and EM estimation. The regression analysis is completed on samples of different sizes drawn from populations that vary in the degree of correlation between the independent variables and in the effect sizes. The effects of the non-randomly missing observations are assessed in terms of the deviation of the estimated coefficient from its true value and the increase or decrease in the associated standard error relative to its value based on known population parameters. The effect of the missing observations on inference is examined as well. In general, when the missing data mechanism is weak, all techniques except pairwise deletion perform quite well. When the mechanism is strong, regression estimation with the addition of a residual and EM estimation perform better than the other techniques. Pairwise deletion and regression estimation without the addition of a residual consistently produce the worst results. Finally, the most troublesome situation occurs when the chance of observing a variable depends on the value of that variable itself.
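
The following Python sketch illustrates the kind of comparison the abstract describes. It is not the dissertation's design: the sample size, the correlation of 0.5, the slopes of 0.5, the logistic response mechanism, and the function name simulate_bias are all assumptions chosen for illustration, and only listwise deletion is shown. Values of y are made missing with a probability driven either by a predictor (x2) or by y itself, and the bias of the estimated slope on x1 is reported.

```python
import numpy as np

TRUE_SLOPE = 0.5  # assumed population coefficient on both predictors


def simulate_bias(driver, strength, n=20000, rho=0.5, seed=0):
    """Listwise-deletion estimate of the x1 slope minus its true value.

    driver   -- "y" or "x2": the variable that drives the response mechanism
    strength -- slope of the (assumed) logistic response mechanism
    """
    rng = np.random.default_rng(seed)

    # Two correlated predictors and an outcome with known coefficients.
    cov = [[1.0, rho], [rho, 1.0]]
    x = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    y = 1.0 + x @ np.array([TRUE_SLOPE, TRUE_SLOPE]) + rng.normal(size=n)

    # Non-random missingness in y: the chance of observing y falls as the
    # standardized driver variable rises.
    d = y if driver == "y" else x[:, 1]
    z = (d - d.mean()) / d.std()
    p_observed = 1.0 / (1.0 + np.exp(strength * z))
    observed = rng.random(n) < p_observed

    # Listwise deletion: fit OLS on the complete cases only.
    design = np.column_stack([np.ones(observed.sum()), x[observed]])
    coef, *_ = np.linalg.lstsq(design, y[observed], rcond=None)
    return coef[1] - TRUE_SLOPE


bias_x2_driven = simulate_bias("x2", strength=2.0)
bias_y_weak = simulate_bias("y", strength=0.3)
bias_y_strong = simulate_bias("y", strength=2.0)

print(f"strong mechanism driven by x2: bias = {bias_x2_driven:+.3f}")
print(f"weak mechanism driven by y:    bias = {bias_y_weak:+.3f}")
print(f"strong mechanism driven by y:  bias = {bias_y_strong:+.3f}")
```

Under these assumptions the slope should be nearly unbiased when a predictor in the model drives the mechanism, and attenuated toward zero when y drives its own missingness, more so as the mechanism strengthens. The other four techniques named in the abstract would need their own implementations and are not sketched here.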

Indexing (details)


Subject
Statistics;
Social research
Classification
0463: Statistics
0344: Social research
Identifier / keyword
Social sciences, Pure sciences, missing data
Title
When to worry more: An empirical investigation of the effects of non-randomly missing data on regression analysis
Author
Sellers, Deborah Ellen
Number of pages
213
Publication year
1992
Degree date
1992
School code
0118
Source
DAI-B 53/02, Dissertation Abstracts International
Place of publication
Ann Arbor
Country of publication
United States
Advisor
Anderson, Andy B.
University/institution
University of Massachusetts Amherst
University location
United States -- Massachusetts
Degree
Ph.D.
Source type
Dissertations & Theses
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
9219497
ProQuest document ID
304001466
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Document URL
http://search.proquest.com/docview/304001466