| |
Abstract
BACKGROUND:Membrane proteins form key nodes in mediating the cell's interaction with the surroundings, which is one of the main reasons why the majority of drug targets are membrane proteins.RESULTS:Here we mined the human proteome and identified the membrane proteome subset using three prediction tools for alpha-helices: Phobius, TMHMM, and SOSUI. This dataset was reduced to a non-redundant set by aligning it to the human genome and then clustered with our own interactive implementation of the ISODATA algorithm. The genes were classified ...
|
| |
posted to graph-randomization network-models
by poirel ✚
on 2013-03-22 22:04:12
Abstract
The interactions between the components of complex networks are often directed. Proper modeling of such systems frequently requires the construction of ensembles of digraphs with a given sequence of in- and out-degrees. As the number of simple labeled graphs with a given degree sequence is typically very large even for short sequences, sampling methods are needed for statistical studies. Currently, there are two main classes of methods that generate samples. One of the existing methods first generates a restricted class of ...
|
| |
by Rajesh Raju, Vishalakshi Nanjappa, Lavanya Balakrishnan, et al.Aneesha Radhakrishnan, Joji Kurian K. Thomas, Jyoti Sharma, Maozhen Tian, Shyam Mohan M. Palapetta, Tejaswini Subbannayya, Nirujogi Raja R. Sekhar, Babylakshmi Muthusamy, Renu Goel, Yashwanth Subbannayya, Deepthi Telikicherla, Mitali Bhattacharjee, Sneha M. Pinto, Nazia Syed, Manda Srinivas S. Srikanth, Gajanan J. Sathe, Sartaj Ahmad, Sandip N. Chavan, Ghantasala Sameer S. Kumar, Arivusudar Marimuthu, T. S. Prasad, H. C. Harsha, B. Abdul Rahiman, Osamu Ohara, Gary D. Bader, S. Sujatha Mohan, William P. Schiemann, Akhilesh Pandey
Abstract
We previously developed NetPath as a resource for comprehensive manually curated signal transduction pathways. The pathways in NetPath contain a large number of molecules and reactions which can sometimes be difficult to visualize or interpret given their complexity. To overcome this potential limitation, we have developed a set of more stringent curation and inclusion criteria for pathway reactions to generate high-confidence signaling maps. NetSlim is ...
|
| |
posted to intercellular-signaling review
by poirel ✚
on 2013-03-22 18:53:49
Abstract
Since the initial discovery of the oncogenic activity of WNT1 in mouse mammary glands, our appreciation for the complex roles for WNT signalling pathways in cancer has increased dramatically. WNTs and their downstream effectors regulate various processes that are important for cancer progression, including tumour initiation, tumour growth, cell senescence, cell death, differentiation and metastasis. Although WNT signalling pathways have been difficult to target, improved drug-discovery platforms and new technologies have facilitated the discovery of agents that can alter WNT signalling ...
|
| |
Abstract
Recent genome sequencing studies have shown that the somatic mutations that drive cancer development are distributed across a large number of genes. This mutational heterogeneity complicates efforts to distinguish functional mutations from sporadic, passenger mutations. Since cancer mutations are hypothesized to target a relatively small number of cellular signaling and regulatory pathways, a common practice is to assess whether known pathways are enriched for mutated genes. We introduce an alternative approach that examines mutated genes in the context of a genome-scale ...
|
| |
Abstract
Understanding the information-processing capabilities of signal transduction networks, how those networks are disrupted in disease, and rationally designing therapies to manipulate diseased states require systematic and accurate reconstruction of network topology. Data on networks central to human physiology, such as the inflammatory signalling networks analyzed here, are found in a multiplicity of on-line resources of pathway and interactome databases (Cancer CellMap, GeneGo, KEGG, NCI-Pathway Interactome ...
|
| |
Abstract
Cellular signal transduction generally involves cascades of post-translational protein modifications that rapidly catalyze changes in protein-DNA interactions and gene expression. High-throughput measurements are improving our ability to study each of these stages individually, but do not capture the connections between them. Here we present an approach for building a network of physical links among these data that can be used to prioritize targets for pharmacological intervention. Our method recovers the critical missing links between proteomic and transcriptional data by relating changes ...
|
| |
Abstract
BACKGROUND:Multigenic diseases are often associated with protein complexes or interactions involved in the same pathway. We wanted to estimate to what extent this is true given a consolidated protein interaction data set. The study stresses data integration and data representation issues.RESULTS:We constructed 497 multigenic disease groups from OMIM and tested for overlaps with interaction and pathway data. A total of 159 disease groups had significant overlaps with protein interaction data consolidated by iRefIndex. A further 68 disease overlaps were found only ...
|
| |
Abstract
Motivation: Large amounts of biological network data exist for many species. Analogous to sequence comparison, network comparison aims to provide biological insight. Graphlet-based methods are proving to be useful in this respect. Recently some doubt has arisen concerning the applicability of graphlet-based measures to low edge density networks—in particular that the methods are ‘unstable’—and further that no existing network model matches the structure found in real biological networks.Results: We demonstrate that it is the model networks themselves that are ‘unstable’ at ...
|
| |
Abstract
Complex diseases are caused by a combination of genetic and environmental factors. Uncovering the molecular pathways through which genetic factors affect a phenotype is always difficult, but in the case of complex diseases this is further complicated since genetic factors in affected individuals might be different. In recent years, systems biology approaches and, more specifically, network based approaches emerged as powerful tools for studying complex diseases. These approaches are often built on the knowledge of physical or functional interactions between molecules ...
|
| |
by Andrew Chatr-Aryamontri, Bobby-Joe J. Breitkreutz, Sven Heinicke, et al.Lorrie Boucher, Andrew Winter, Chris Stark, Julie Nixon, Lindsay Ramage, Nadine Kolas, Lara O'Donnell, Teresa Reguly, Ashton Breitkreutz, Adnane Sellam, Daici Chen, Christie Chang, Jennifer Rust, Michael Livstone, Rose Oughtred, Kara Dolinski, Mike Tyers
Abstract
The Biological General Repository for Interaction Datasets (BioGRID: http//thebiogrid.org) is an open access archive of genetic and protein interactions that are curated from the primary biomedical literature for all major model organism species. As of September 2012, BioGRID houses more than 500 000 manually annotated interactions from more than 30 model organisms. BioGRID maintains complete curation coverage of the literature for the budding yeast Saccharomyces ...
|
| |
by Andrea Franceschini, Damian Szklarczyk, Sune Frankild, et al.Michael Kuhn, Milan Simonovic, Alexander Roth, Jianyi Lin, Pablo Minguez, Peer Bork, Christian von Mering, Lars J. Jensen
Abstract
Complete knowledge of all direct and indirect interactions between proteins in a given cell would represent an important milestone towards a comprehensive description of cellular mechanisms and functions. Although this goal is still elusive, considerable progress has been made-particularly for certain model organisms and functional systems. Currently, protein interactions and associations are annotated at various levels of detail in online resources, ranging from raw data ...
|
| |
Abstract
Knowledge of the various interactions between molecules in the cell is crucial for understanding cellular processes in health and disease. Currently available interaction databases, being largely complementary to each other, must be integrated to obtain a comprehensive global map of the different types of interactions. We have previously reported the development of an integrative interaction database called ConsensusPathDB (http://ConsensusPathDB.org) that aims to fulfill this task. In this update article, we report its significant progress in terms of interaction content and web ...
|
| |
Abstract
The Gene Ontology (GO) Consortium (GOC, http://www.geneontology.org) is a community-based bioinformatics resource that classifies gene product function through the use of structured, controlled vocabularies. Over the past year, the GOC has implemented several processes to increase the quantity, quality and specificity of GO annotations. First, the number of manual, literature-based annotations has grown at an increasing rate. Second, as a result of a new ‘phylogenetic annotation’ process, manually reviewed, homology-based annotations are becoming available for a broad range of species. Third, ...
|
| |
by Kai Peng, Wei Xu, Jianyong Zheng, et al.Kegui Huang, Huisong Wang, Jiansong Tong, Zhifeng Lin, Jun Liu, Wenqing Cheng, Dong Fu, Pan Du, Warren A. Kibbe, Simon M. Lin, Tian Xia
Abstract
Disease and Gene Annotations database (DGA, http://dga.nubic.northwestern.edu) is a collaborative effort aiming to provide a comprehensive and integrative annotation of the human genes in disease network context by integrating computable controlled vocabulary of the Disease Ontology (DO version 3 revision 2510, which has 8043 inherited, developmental and acquired human diseases), NCBI Gene Reference Into Function (GeneRIF) and molecular interaction network (MIN). DGA integrates these resources ...
|
| |
by David Fazekas, Mihaly Koltai, Denes Turei, et al.Dezso Modos, Mate Palfy, Zoltan Dul, Lilian Zsakai, Mate S. Beko, Katalin Lenti, Illes Farkas, Tibor Vellai, Peter Csermely, Tamas Korcsmaros
Abstract
BACKGROUND:Signaling networks in eukaryotes are made up of upstream and downstream subnetworks. The upstream subnetwork contains the intertwined network of signaling pathways, while the downstream regulatory part contains transcription factors and their binding sites on the DNA as well as microRNAs and their mRNA targets. Currently, most signaling and regulatory databases contain only a subsection of this network, making comprehensive analyses highly time-consuming and dependent on specific data handling expertise. The need for detailed mapping of signaling systems is also supported ...
|
| |
Abstract
Motivation: Genome-wide fitness is an emerging type of high-throughput biological data generated for individual organisms by creating libraries of knockouts, subjecting them to broad ranges of environmental conditions, and measuring the resulting clone-specific fitnesses. Since fitness is an organism-scale measure of gene regulatory network behaviour, it may offer certain advantages when insights into such phenotypical and functional features are of primary interest over individual gene expression. Previous works have shown that genome-wide fitness data can be used to uncover novel gene ...
|
| |
posted to transcription-modules
by poirel ✚
on 2013-01-22 20:56:23
Abstract
Systematically perturbing a cellular system and monitoring the effects of the perturbations on gene expression provide a powerful approach to study signal transduction in gene expression systems. A critical step of revealing a signal transduction pathway regulating gene expression is to identify transcription factors transmitting signals in the system. In this paper, we address the task of identifying modules of cooperative transcription factors based on results derived from systems-biology experiments at two levels: First, a graph algorithm is developed to identify ...
|
| |
Abstract
Many techniques have been developed to compute the response network of a cell. A recent trend in this area is to compute response networks of small size, with the rationale that only part of a pathway is often changed by disease and that interpreting small subnetworks is easier than interpreting larger ones. However, these methods may not uncover the spectrum of pathways perturbed in a ...
|
| |
Abstract
Modern experimental technology enables the identification of the sensory proteins that interact with the cells' environment or various pathogens. Expression and knockdown studies can determine the downstream effects of these interactions. However, when attempting to reconstruct the signaling networks and pathways between these sources and targets, one faces a substantial challenge. Although pathways are directed, high-throughput protein interaction data are undirected. In order to utilize ...
|
| |
Abstract
BACKGROUND:Intracellular signal transduction is achieved by networks of proteins and small molecules that transmit information from the cell surface to the nucleus, where they ultimately effect transcriptional changes. Understanding the mechanisms cells use to accomplish this important process requires a detailed molecular description of the networks involved.RESULTS:We have developed a computational approach for generating static models of signal transduction networks which utilizes protein-interaction maps generated from large-scale two-hybrid screens and expression profiles from DNA microarrays. Networks are determined entirely by integrating ...
|
| |
Abstract
Networks are typically visualized with force-based or spectral layouts. These algorithms lack reproducibility and perceptual uniformity because they do not use a node coordinate system. The layouts can be difficult to interpret and are unsuitable for assessing differences in networks. To address these issues, we introduce hive plots (http://www.hiveplot.com) for generating informative, quantitative and comparable network layouts. Hive plots depict network structure transparently, are simple to understand and can be easily tuned to identify patterns of interest. The method is computationally ...
|
| |
Abstract
Transcription factors are key cellular components that control gene expression: their activities determine how cells function and respond to the environment. Currently, there is great interest in research into human transcriptional regulation. However, surprisingly little is known about these regulators themselves. For example, how many transcription factors does the human genome contain? How are they expressed in different tissues? Are they evolutionarily conserved? Here, we present an analysis of 1,391 manually curated sequence-specific DNA-binding transcription factors, their functions, genomic organization and ...
|
| |
Integrative biology : quantitative biosciences from nano to macro, Vol. 4, No. 11. (November 2012), pp. 1415-1427, doi:10.1039/c2ib20072d
Abstract
The rapid development of high throughput biotechnologies has led to an onslaught of data describing genetic perturbations and changes in mRNA and protein levels in the cell. Because each assay provides a one-dimensional snapshot of active signaling pathways, it has become desirable to perform multiple assays (e.g. mRNA expression and phospho-proteomics) to measure a single condition. However, as experiments expand to accommodate various cellular conditions, ...
|
| |
Abstract
The advent of functional genomics has enabled the molecular biosciences to come a long way towards characterizing the molecular constituents of life. Yet, the challenge for biology overall is to understand how organisms function. By discovering how function arises in dynamic interactions, systems biology addresses the missing links between molecules and physiology. Top-down systems biology identifies molecular interaction networks on the basis of correlated molecular ...
|
| |
Abstract
BACKGROUND:Cells process signals using complex and dynamic networks. Studying how this is performed in a context and cell type specific way is essential to understand signaling both in physiological and diseased situations. Context-specific medium/high throughput proteomic data measured upon perturbation is now relatively easy to obtain but formalisms that can take advantage of these features to build models of signaling are still comparatively scarce.RESULTS:Here we present CellNOptR, an open-source R software package for building predictive logic models of signaling networks by ...
|
| |
Abstract
Motivation: A promising way to make sense out of gene expression profiles is to relate them to the activity of metabolic and signalling pathways. Each pathway usually involves many genes, such as enzymes, which can themselves participate in many pathways. The set of all known pathways can therefore be represented by a complex network of genes. Searching for regularities in the set of gene expression profiles with respect to the topology of this gene network is a way to automatically extract ...
|
| |
Abstract
Motivation: One of the main goals of high-throughput gene-expression studies in cancer research is to identify prognostic gene signatures, which have the potential to predict the clinical outcome. It is common practice to investigate these questions using classification methods. However, standard methods merely rely on gene-expression data and assume the genes to be independent. Including pathway knowledge a priori into the classification process has recently been indicated as a promising way to increase classification accuracy as well as the interpretability and ...
|
| |
Computational Biology and Bioinformatics, IEEE/ACM Transactions on, Vol. 9, No. 6. (November 2012), pp. 1696-1708, doi:10.1109/tcbb.2012.128
Abstract
Reconstructing the topology of a signaling network by means of RNA interference (RNAi) technology is an underdetermined problem especially when a single gene in the network is knocked down or observed. In addition, the exponential search space limits the existing methods to small signaling networks of size 10-15 genes. In this paper, we propose integrating RNAi data with a reference physical interaction network. We formulate the problem of signaling network reconstruction as finding the minimum number of edit operations on a ...
|
| |
Abstract
BACKGROUND:Protein-protein interaction networks are key to a systems-level understanding of cellular biology. However, interaction data can contain a considerable fraction of false positives. Several methods have been proposed to assess the confidence of individual interactions. Most of them require the integration of additional data like protein expression and interaction homology information. While being certainly useful, such additional data are not always available and may introduce additional bias and ambiguity.RESULTS:We propose a novel, network topology based interaction confidence assessment method called CAPPIC ...
|
| |
Abstract
MOTIVATION: When analysing gene expression time series data, an often overlooked but crucial aspect of the model is that the regulatory network structure may change over time. Although some approaches have addressed this problem previously in the literature, many are not well suited to the sequential nature of the data. RESULTS: Here, we present a method that allows us to infer regulatory network structures that ...
|
| |
Abstract
Motivation: Gene prioritization aims at identifying the most promising candidate genes among a large pool of candidates—so as to maximize the yield and biological relevance of further downstream validation experiments and functional studies. During the past few years, several gene prioritization tools have been defined, and some of them have been implemented and made available through freely available web tools. In this study, we aim at comparing the predictive performance of eight publicly available prioritization tools on novel data. We have ...
|
| |
(18 August 1989)
posted to no-tag
by poirel ✚
on 2012-10-01 17:34:50
|
| |
Abstract
Embedded within large-scale protein interaction networks are signaling pathways that encode response cascades in the cell. Unfortunately, even for well-studied species like S. cerevisiae, only a fraction of all true protein interactions are known, which makes it difficult to reason about the exact flow of signals and the corresponding causal relations in the network. To help address this problem, we introduce a framework for predicting new interactions that aid connectivity between upstream proteins (sources) and downstream transcription factors (targets) of a ...
|
| |
Abstract
The Disease Ontology (DO) database (http://disease-ontology.org) represents a comprehensive knowledge base of 8043 inherited, developmental and acquired human diseases (DO version 3, revision 2510). The DO web browser has been designed for speed, efficiency and robustness through the use of a graph database. Full-text contextual searching functionality using Lucene allows the querying of name, synonym, definition, DOID and cross-reference (xrefs) with complex Boolean search strings. The DO semantically integrates disease and medical vocabularies through extensive cross mapping and integration of MeSH, ...
|
| |
Abstract
The impact of disease-causing defects is often not limited to the products of a mutated gene but, thanks to interactions between the molecular components, may also affect other cellular functions, resulting in potential comorbidity effects. By combining information on cellular interactions, disease--gene associations, and population-level disease patterns extracted from Medicare data, we find statistically significant correlations between the underlying structure of cellular networks and disease comorbidity patterns in the human population. Our results indicate that such a combination of population-level data ...
|
| |
|
| |
Abstract
BACKGROUND:Understanding the evolution of biological networks can provide insight into how their modular structure arises and how they are affected by environmental changes. One approach to studying the evolution of these networks is to reconstruct plausible common ancestors of present-day networks, allowing us to analyze how the topological properties change over time and to posit mechanisms that drive the networks' evolution. Further, putative ancestral networks can be used to help solve other difficult problems in computational biology, such as network alignment.RESULTS:We ...
|
| |
Abstract
We study the behavior of an algorithm derived from the cavity method for the prize-collecting steiner tree (PCST) problem on graphs. The algorithm is based on the zero temperature limit of the cavity equations and as such is formally simple (a fixed point equation resolved by iteration) and distributed (parallelizable). We provide a detailed comparison with state-of-the-art algorithms on a wide range of existing benchmarks, networks, and random graphs. Specifically, we consider an enhanced derivative of the Goemans-Williamson heuristics and the ...
|
| |
Abstract
The principle that 'popularity is attractive' underlies preferential attachment, which is a common explanation for the emergence of scaling in growing networks. If new connections are made preferentially to more popular nodes, then the resulting distribution of the number of connections possessed by nodes follows power laws, as observed in many real networks. Preferential attachment has been directly validated for some real networks (including the ...
|
| |
Abstract
BACKGROUND:Genome and metagenome studies have identified thousands of protein families whose functions are poorly understood and for which techniques for functional characterization provide only partial information. For such proteins, the genome context can give further information about their functional context.RESULTS:We describe a Bayesian method, based on a probabilistic topic model, which directly identifies functional modules of protein families. The method explores the co-occurrence patterns of protein families across a collection of sequence samples to infer a probabilistic model of arbitrarily-sized functional ...
|
| |
|
| |
Abstract
ABSTRACT: The use of biological molecular network information for diagnostic and prognostic purposes and elucidation of molecular disease mechanism is a key objective in systems biomedicine. The network of regulatory miRNA-target and functional protein interactions is a rich source of information to elucidate the function and the prognostic value of ...
|
| |
Abstract
Assessing functional associations between an experimentally derived gene or protein set of interest and a database of known gene/protein sets is a common task in the analysis of large-scale functional genomics data. For this purpose, a frequently used approach is to apply an over-representation-based enrichment analysis. However, this approach has four drawbacks: (i) it can only score functional associations of overlapping gene/proteins sets; (ii) it ...
|
| |
Abstract
In recent years, Markov clustering (MCL) has emerged as an effective algorithm for clustering biological networks-for instance clustering protein-protein interaction (PPI) networks to identify functional modules. However, a limitation of MCL and its variants (e.g. regularized MCL) is that it only supports hard clustering often leading to an impedance mismatch given that there is often a significant overlap of proteins across functional modules. ...
|
| |
Abstract
Signaling networks are essential for cells to control processes such as growth and response to stimuli. Although many “omic” data sources are available to probe signaling pathways, these data are typically sparse and noisy. Thus, it has been difficult to use these data to discover the cause of the diseases. We overcome these problems and use “omic” data to simultaneously reconstruct multiple pathways that are altered in a particular condition by solving the prize-collecting Steiner forest problem. To evaluate this approach, ...
|
| |
Abstract
Systematic chromatin immunoprecipitation (chIP-chip) experiments have become a central technique for mapping transcriptional interactions in model organisms and humans. However, measurement of chromatin binding does not necessarily imply regulation, and binding may be difficult to detect if it is condition or cofactor dependent. To address these challenges, we present an approach for reliably assigning transcription factors (TFs) to target genes that integrates many lines of direct and indirect evidence into a single probabilistic model. Using this approach, we analyze publicly available ...
|
| |
Abstract
ABSTRACT: Cancer sequencing projects are now measuring somatic mutations in large numbers of cancer genomes. A key challenge in interpreting these data is to distinguish driver mutations, mutations important for cancer development, from passenger mutations that have accumulated in somatic cells but without functional consequences. A common approach to identify ...
|
| |
Abstract
External information propagates in the cell mainly through signaling cascades and transcriptional activation, allowing it to react to a wide spectrum of environmental changes. High-throughput experiments identify numerous molecular components of such cascades that may, however, interact through unknown partners. Some of them may be detected using data coming from the integration of a protein–protein interaction network and mRNA expression profiles. This inference problem can be mapped onto the problem of finding appropriate optimal connected subgraphs of a network defined by ...
|
| |
Abstract
Various technologies can be used to produce genome-scale, or 'omics', data sets that provide systems-level measurements for virtually all types of cellular components in a model organism. These data yield unprecedented views of the cellular inner workings. However, this abundance of information also presents many hurdles, the main one being the extraction of discernable biological meaning from multiple omics data sets. Nevertheless, researchers are rising to the challenge by using omics data integration to address fundamental biological questions that would increase ...
|