Reconstructing signaling pathways from high throughput data


2006

Abstract (summary)

A network perspective offers a fresh angle on many bioinformatics problems. Taking into account the network constraints on gene interaction, we propose a series of logically coherent approaches to reconstruct signaling pathways from high-throughput expression profiling data. These approaches proceed in three consecutive steps: construction of a co-expression network with controlled biological and statistical significance, network-constrained clustering, and reconstruction of the order of pathway components.

The first step relies on detecting pairwise co-expression of genes. We approach the problem from both frequentist and Bayesian perspectives. We designed and implemented a frequentist two-stage co-expression detection algorithm that controls both the statistical significance (False Discovery Rate, FDR) and the biological significance (Minimum Acceptable Strength, MAS) of the discovered co-expressions. To regularize the variance of the correlation estimates in small-sample scenarios, we also designed and implemented a Bayesian hierarchical model in which the correlation parameters are assumed to be exchangeable and sampled from a parent Gaussian distribution. Using simulated data and the galactose metabolism data, we demonstrated the advantages of our approaches and compared them with one another.
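
As an illustration only, the two-stage idea (first control FDR, then require a minimum correlation strength) can be sketched as follows. The abstract does not specify the test statistic or the interval construction, so this sketch assumes Pearson correlation, a Fisher z-transform, the Benjamini-Hochberg procedure for stage one, and FDR-adjusted confidence intervals for stage two. The names `pearson`, `two_stage_screen`, `fdr`, and `mas` are hypothetical, and the dissertation's actual algorithm may differ.

```python
import math
from statistics import NormalDist


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def two_stage_screen(profiles, fdr=0.05, mas=0.5):
    """Return gene pairs passing an FDR screen and a MAS screen.

    profiles: dict mapping gene name -> list of expression values
              (all lists the same length, at least 4 samples).
    Assumption: Fisher z approximation with standard error 1/sqrt(n-3).
    """
    genes = sorted(profiles)
    n = len(profiles[genes[0]])
    se = 1.0 / math.sqrt(n - 3)

    tests = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson(profiles[g], profiles[h])
            r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
            z = math.atanh(r)
            # Two-sided p-value for H0: correlation = 0.
            p = math.erfc(abs(z) / se / math.sqrt(2))
            tests.append((p, g, h, r, z))

    # Stage 1: Benjamini-Hochberg step-up at level `fdr`.
    tests.sort()
    m = len(tests)
    k = 0
    for rank, t in enumerate(tests, start=1):
        if t[0] <= fdr * rank / m:
            k = rank
    if k == 0:
        return []

    # Stage 2: FDR-adjusted confidence intervals for the k selected
    # pairs; keep a pair only if its interval lies outside [-mas, mas].
    zcrit = NormalDist().inv_cdf(1 - fdr * k / (2 * m))
    kept = []
    for p, g, h, r, z in tests[:k]:
        lo = math.tanh(z - zcrit * se)
        hi = math.tanh(z + zcrit * se)
        if lo > mas or hi < -mas:
            kept.append((g, h, round(r, 3)))
    return kept
```

Calling `two_stage_screen(profiles, fdr=0.05, mas=0.5)` on a dictionary of expression profiles returns the pairs whose correlation is both statistically significant and, under the interval criterion assumed here, at least `mas` in magnitude.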

The second problem considered is distance-based clustering that accounts for "network constraints" extracted from the Giant Connected Component (GCC) of the network discovered from the data. The clustering is performed using a "hybrid" distance matrix composed of the direct distance between adjacent genes and the "shortest-path" distance between non-adjacent genes in the network.

The third problem considered is the reconstruction of the order of pathway components. We applied a first-order Markov model, originally developed for and applied to a network tomography problem in telecommunication networks, to reconstruct three well-known signaling pathways from unordered pathway components. We suggest that the methods proposed here can also be applied to other high-throughput data analysis problems.
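
The "hybrid" distance matrix used for network-constrained clustering can be illustrated with a minimal sketch. It assumes the co-expression network is given as a dictionary of undirected edges with a non-negative direct distance per edge (for example, one minus the absolute correlation), and that the shortest-path distance is the sum of edge distances along the shortest path; the abstract does not state these details. The names `giant_component` and `hybrid_distance` are hypothetical.

```python
import heapq


def giant_component(edges):
    """Return (sorted nodes, adjacency) of the largest connected component.

    edges: dict mapping (u, v) -> non-negative direct distance.
    """
    adj = {}
    for (u, v), d in edges.items():
        adj.setdefault(u, {})[v] = d
        adj.setdefault(v, {})[u] = d

    seen, best = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        comp, stack = [], [start]
        seen.add(start)
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            comp.append(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        if len(comp) > len(best):
            best = comp
    nodes = sorted(best)
    return nodes, {u: adj[u] for u in nodes}


def hybrid_distance(edges):
    """Hybrid distances within the giant connected component.

    Adjacent genes keep their direct distance; non-adjacent genes get
    the shortest-path distance through the network (Dijkstra).
    """
    nodes, adj = giant_component(edges)
    dist = {}
    for src in nodes:
        d = {src: 0.0}
        heap = [(0.0, src)]
        while heap:
            du, u = heapq.heappop(heap)
            if du > d[u]:
                continue  # stale heap entry
            for v, w in adj[u].items():
                nd = du + w
                if nd < d.get(v, float("inf")):
                    d[v] = nd
                    heapq.heappush(heap, (nd, v))
        for v in nodes:
            dist[(src, v)] = adj[src][v] if v in adj[src] else d[v]
    return nodes, dist
```

The resulting `dist` dictionary can be converted to a square matrix over `nodes` and passed to any distance-based clustering routine. Note that under this construction an adjacent pair keeps its direct distance even when a shorter multi-edge path exists.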

Indexing (details)


Subject
Bioinformatics;
Biostatistics
Classification
0715: Bioinformatics
0308: Biostatistics
Identifier / keyword
Biological sciences; Clustering; Expression profiling; High-throughput data; Signaling pathways
Title
Reconstructing signaling pathways from high throughput data
Author
Zhu, Dongxiao
Number of pages
170
Publication year
2006
Degree date
2006
School code
0127
Source
DAI-B 67/07, Dissertation Abstracts International
Place of publication
Ann Arbor
Country of publication
United States
ISBN
9780542791468
Advisor
Hero, Alfred O., III
University/institution
University of Michigan
University location
United States -- Michigan
Degree
Ph.D.
Source type
Dissertations & Theses
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
3224798
ProQuest document ID
305303599
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Document URL
http://search.proquest.com/docview/305303599