Abstract/Details

Low-cost and robust evaluation of information retrieval systems


2008 2008

Other formats: Order a copy

Abstract (summary)

Research in Information Retrieval has progressed against a background of rapidly increasing corpus size and heterogeneity, with every advance in technology quickly followed by a desire to organize and search more unstructured, more heterogeneous, and even bigger corpora. But as retrieval problems get larger and more complicated, evaluating the ranking performance of a retrieval engine gets harder: evaluation requires human judgments of the relevance of documents to queries, and for very large corpora the cost of acquiring these judgments may be insurmountable. This cost limits the types of problems researchers can study as well as the data they can be studied on.

We present methods for understanding performance differences between retrieval engines in the presence of missing and noisy relevance judgments. The work introduces a model of the cost of experimentation that incorporates the cost of human judgments as well as the cost of drawing incorrect conclusions about differences between engines in both the training and testing phases of engine development. Through adopting a view of evaluation that is more concerned with distributions over performance differences rather than estimates of absolute performance, the expected cost can be minimized so as to reliably differentiate between engines with less than 1% of the human effort that has been used in past experiments.

Indexing (details)


Subject
Computer science
Classification
0984: Computer science
Identifier / keyword
Applied sciences; Information retrieval; Retrieval engines
Title
Low-cost and robust evaluation of information retrieval systems
Author
Carterette, Benjamin A.
Number of pages
255
Publication year
2008
Degree date
2008
School code
0118
Source
DAI-B 69/12, Dissertation Abstracts International
Place of publication
Ann Arbor
Country of publication
United States
ISBN
9780549915607
Advisor
Allan, James
Committee member
Aslam, Javed A.; Croft, W. Bruce; Sitaraman, Ramesh K.; Staudenmayer, John
University/institution
University of Massachusetts Amherst
Department
Computer Science
University location
United States -- Massachusetts
Degree
Ph.D.
Source type
Dissertations & Theses
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
3336974
ProQuest document ID
304566112
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Document URL
http://search.proquest.com/docview/304566112
Access the complete full text

You can get the full text of this document if it is part of your institution's ProQuest subscription.

Try one of the following:

  • Connect to ProQuest through your library network and search for the document from there.
  • Request the document from your library.
  • Go to the ProQuest login page and enter a ProQuest or My Research username / password.