Abstract/Details

Beyond bags of words: Effectively modeling dependence and features in information retrieval


2007 2007

Other formats: Order a copy

Abstract (summary)

Current state of the art information retrieval models treat documents and queries as bags of words. There have been many attempts to go beyond this simple representation. Unfortunately, few have shown consistent improvements in retrieval effectiveness across a wide range of tasks and data sets. Here, we propose a new statistical model for information retrieval based on Markov random fields. The proposed model goes beyond the bag of words assumption by allowing dependencies between terms to be incorporated into the model. This allows for a variety of textual and non-textual features to be easily combined under the umbrella of a single model. Within this framework, we explore the theoretical issues involved, parameter estimation, feature selection, and query expansion. We give experimental results from a number of information retrieval tasks, such as ad hoc retrieval and web search.

Indexing (details)


Subject
Computer science
Classification
0984: Computer science
Identifier / keyword
Applied sciences, Information retrieval, Learning to rank, Ranking functions, Search engines, Web search
Title
Beyond bags of words: Effectively modeling dependence and features in information retrieval
Author
Metzler, Donald A., Jr.
Number of pages
209
Publication year
2007
Degree date
2007
School code
0118
Source
DAI-B 68/11, Dissertation Abstracts International
Place of publication
Ann Arbor
Country of publication
United States
ISBN
9780549330523
Advisor
Croft, William B.
Committee member
Allan, James; Buonaccorsi, John; McCallum, Andrew
University/institution
University of Massachusetts Amherst
Department
Computer Science
University location
United States -- Massachusetts
Degree
Ph.D.
Source type
Dissertations & Theses
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
3289243
ProQuest document ID
304845998
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Document URL
http://search.proquest.com/docview/304845998
Access the complete full text

You can get the full text of this document if it is part of your institution's ProQuest subscription.

Try one of the following:

  • Connect to ProQuest through your library network and search for the document from there.
  • Request the document from your library.
  • Go to the ProQuest login page and enter a ProQuest or My Research username / password.