Algorithms and inference for mixture models with application to protein sequence analysis


2010

Abstract (summary)

Mixture model-based clustering is a commonly used statistical tool. The first part of my dissertation describes new search algorithms for finding the partition that maximizes a criterion function, and new Markov chain Monte Carlo algorithms for drawing partitions from a target distribution. These algorithms are based on a neighborhood pruning technique that incorporates bottom-up hierarchical clustering methods. The second part of my dissertation gives a new estimator of mixture order for multivariate categorical data. The estimator is related to finding mixture order via Bayes factors. Its finite-sample performance is good; its large-sample behavior can be analyzed using rate distortion theory, and it is conjectured not to overestimate mixture order asymptotically. The third part of my dissertation uses a Bayesian mixture profile hidden Markov model to find the subfamilies in a protein family. Applications to simulated and real datasets show that meaningful partitions with the correct numbers of components can be identified. As subfamilies usually differ in their functions, valuable insights can be gained through this cluster analysis.

Indexing (details)


Subject
Biostatistics; Statistics
Classification
0308: Biostatistics
0463: Statistics
Identifier / keyword
Pure sciences; Biological sciences; Cluster analysis; Mixture models; Neighborhood pruning; Profile hidden Markov models; Protein family; Protein sequence
Title
Algorithms and inference for mixture models with application to protein sequence analysis
Author
Fong, Youyi
Number of pages
103
Publication year
2010
Degree date
2010
School code
0250
Source
DAI-B 71/05, Dissertation Abstracts International
Place of publication
Ann Arbor
Country of publication
United States
ISBN
9781109727777
Advisor
Wakefield, Jonathan C.; Rice, Kenneth M.
University/institution
University of Washington
University location
United States -- Washington
Degree
Ph.D.
Source type
Dissertations & Theses
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
3406120
ProQuest document ID
275704246
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Document URL
http://search.proquest.com/docview/275704246