Statistical Applications in Genetics and Molecular Biology, Vol. 11, No. 3 (2012), doi:10.1515/1544-6115.1660
Most approaches for analyzing ChIP-Seq data are focused on inferring exact protein binding sites from a single library. However, frequently multiple ChIP-Seq libraries derived from differing cell lines or tissue types from the same individual may be available. In such a situation, a separate analysis for each tissue or cell line may be inefficient. Here, we describe a novel method to analyze such data that intelligently uses the joint information from multiple related ChIP-Seq libraries. We present our method as a two-stage procedure. First, separate single cell line analysis is performed for each cell line. Here, we use a novel mixture regression approach to infer the subset of genes that are most likely to be involved in protein binding in each cell line. In the second step, we combine the separate single cell line analyses using an Empirical Bayes algorithm that implicitly incorporates inter-cell line correlation. We demonstrate the usefulness of our method using both simulated data, as well as real H3K4me3 and H3K27me3 histone methylation libraries.
pi_i = P(Z_i = 1): probability that gene i is methylated; pi = sum_i pi_i
fit model with EM algorithm
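To make the EM step concrete, here is a minimal sketch. It is not the paper's mixture regression model: as a simplifying assumption it fits a plain two-component Poisson mixture to per-gene read counts, with no covariates. The function name `em_two_poisson`, the starting values, and the toy counts are all hypothetical. The returned posterior probabilities play the role of estimates of P(Z_i = 1) given the data.

```python
import math

def poisson_logpmf(x, lam):
    """Log of the Poisson probability mass function."""
    return x * math.log(lam) - lam - math.lgamma(x + 1)

def em_two_poisson(counts, lam_bg=1.0, lam_fg=10.0, weight=0.5,
                   n_iter=200, tol=1e-8):
    """EM for a two-component Poisson mixture (assumed stand-in model).

    weight is the mixing proportion of the "methylated" (foreground)
    component. Returns (posteriors, weight, lam_bg, lam_fg).
    """
    n = len(counts)
    post = [0.0] * n
    prev_ll = -math.inf
    for _ in range(n_iter):
        # E-step: posterior probability that each gene is foreground.
        ll = 0.0
        for i, x in enumerate(counts):
            a = math.log(weight) + poisson_logpmf(x, lam_fg)
            b = math.log(1.0 - weight) + poisson_logpmf(x, lam_bg)
            m = max(a, b)
            log_norm = m + math.log(math.exp(a - m) + math.exp(b - m))
            post[i] = math.exp(a - log_norm)
            ll += log_norm
        # M-step: weighted updates, clamped away from degenerate values.
        s = sum(post)
        weight = min(max(s / n, 1e-6), 1.0 - 1e-6)
        lam_fg = max(sum(p * x for p, x in zip(post, counts)) / max(s, 1e-12), 1e-6)
        lam_bg = max(sum((1.0 - p) * x for p, x in zip(post, counts)) / max(n - s, 1e-12), 1e-6)
        # Stop when the log-likelihood no longer improves.
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return post, weight, lam_bg, lam_fg

# Toy data: seven low-count genes followed by four high-count genes.
toy_counts = [0, 1, 2, 1, 0, 2, 1, 20, 25, 18, 22]
toy_post, toy_weight, toy_bg, toy_fg = em_two_poisson(toy_counts)
```

On the toy counts, the posteriors separate the low-count genes from the high-count genes, and the mixing weight settles near the fraction of high-count genes.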
Multiple libraries
jointly analyze all K libraries to better estimate the pi_i's; the model is adapted from Datta and Zhao (Bioinformatics, 2008)
use "configurations": for each gene, a vector of K binary random variables, one per library
assume that, given the true configuration, the read counts for a given gene in the K libraries are independent => is this equivalent to the covariance of the errors being diagonal?
build a big contingency table of configurations (2^K cells) holding the expected number of genes that have each possible configuration; find the "best" log-linear model for this table by forward selection using BIC; iterate until the difference in ratio between the estimates from two successive steps is below a cut-off threshold
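The model-selection step can be illustrated with a much reduced sketch. Instead of a full forward selection over log-linear models, it only compares the two extreme candidates by BIC: the independence model (main effects only) and the saturated model (one free probability per cell). Treating the table as multinomial counts, the function name `bic_independence_vs_saturated`, and the example tables are assumptions made for illustration, not details from the paper.

```python
import math

def bic_independence_vs_saturated(table):
    """Compare two log-linear models for a 2^K configuration table by BIC.

    table maps each configuration (a tuple of K zeros and ones) to a count.
    Returns a dict with the BIC of the independence and saturated models;
    the smaller value indicates the preferred model.
    """
    K = len(next(iter(table)))
    n = sum(table.values())

    # Saturated model: fitted cell probabilities equal observed proportions.
    ll_sat = sum(c * math.log(c / n) for c in table.values() if c > 0)
    bic_sat = -2.0 * ll_sat + (2 ** K - 1) * math.log(n)

    # Independence model: cell probability is a product of K marginals.
    marg = [sum(c for cfg, c in table.items() if cfg[k] == 1) / n
            for k in range(K)]
    ll_ind = 0.0
    for cfg, c in table.items():
        if c > 0:
            p = 1.0
            for k, z in enumerate(cfg):
                p *= marg[k] if z else 1.0 - marg[k]
            ll_ind += c * math.log(p)
    bic_ind = -2.0 * ll_ind + K * math.log(n)

    return {"independence": bic_ind, "saturated": bic_sat}

# Two libraries whose methylation states tend to agree.
correlated = {(0, 0): 400, (0, 1): 100, (1, 0): 100, (1, 1): 400}
bics = bic_independence_vs_saturated(correlated)
```

When the libraries are strongly correlated, as in the `correlated` table, the interaction term pays for its extra parameter and the saturated model gets the lower BIC. A forward selection would add such terms one at a time, keeping each only if it lowers the BIC.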
the sharing of information alleviates the need for multiple testing corrections
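The sharing of information across libraries can be sketched as follows. Assuming conditional independence given the configuration, the posterior of each configuration is its prior weight times the product of per-library likelihoods, and each library's methylation probability is a marginal of that posterior. The function name `configuration_posterior`, the likelihood values, and the prior weights are hypothetical inputs chosen for illustration.

```python
from itertools import product

def configuration_posterior(lik, prior):
    """Posterior over the 2^K configurations for one gene.

    lik[k] = (P(data_k | z_k = 0), P(data_k | z_k = 1)) for library k.
    prior maps each configuration tuple to its prior probability.
    Returns (posterior dict, per-library marginal P(z_k = 1 | data)).
    """
    K = len(lik)
    joint = {}
    for config in product((0, 1), repeat=K):
        p = prior[config]
        # Conditional independence: multiply the per-library likelihoods.
        for k, z in enumerate(config):
            p *= lik[k][z]
        joint[config] = p
    total = sum(joint.values())
    post = {c: p / total for c, p in joint.items()}
    marginals = [sum(p for c, p in post.items() if c[k] == 1)
                 for k in range(K)]
    return post, marginals

# Library 0 favours methylation; library 1 is uninformative on its own.
example_lik = [(0.2, 0.8), (0.5, 0.5)]
# Prior that puts most weight on the two libraries agreeing.
example_prior = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
example_post, example_marginals = configuration_posterior(example_lik, example_prior)
```

In this example, library 1 alone would leave its methylation probability at 0.5, but the joint analysis moves it to 0.68 because the prior ties it to the evidence from library 0.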