Dynamical Systems for Discovering Protein Complexes and Functional Modules from Biological Networks
Recent advances in high-throughput experiments and literature-based annotation have provided a wealth of interaction maps for several biomolecular networks, including metabolic, protein-protein, and protein-DNA interaction networks. The architecture of these molecular networks reveals important principles of cellular organization and molecular function. Analyzing such networks, specifically discovering their dense regions, is an important way to identify protein complexes and functional modules. This task has been formulated as the problem of finding heavy subgraphs, the Heaviest k-Subgraph Problem (k-HSP), which is itself NP-hard. However, any method based on k-HSP requires the parameter k to be specified in advance, and even an exact solution of k-HSP may turn out to be a "spurious" heavy subgraph, which reduces its practicality for analyzing large-scale biological networks. We propose a new formulation called rank-HSP and two dynamical systems to approximate its solutions. In addition, we propose a novel metric, the Standard deviation and Mean Ratio (SMR), for dealing with "spurious" heavy subgraphs, so that discovery can be automated by setting a fixed threshold. Empirical results on both simulated graphs and biological networks demonstrate the efficiency and effectiveness of our proposal.
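To make the k-HSP formulation concrete, the following minimal sketch solves it by exhaustive search on a toy weighted graph. It is an illustration only, not the rank-HSP formulation or the dynamical systems proposed here; the function names (`subgraph_weight`, `heaviest_k_subgraph`) and the toy edge weights are hypothetical, and the enumeration is exponential in the number of vertices, in line with the NP-hardness of the problem.

```python
from itertools import combinations


def subgraph_weight(weights, nodes):
    """Sum the weights of all edges with both endpoints in `nodes`."""
    return sum(weights.get(frozenset(pair), 0.0)
               for pair in combinations(nodes, 2))


def heaviest_k_subgraph(vertices, weights, k):
    """Brute-force k-HSP: try every k-vertex subset and keep the heaviest.

    Only feasible for very small graphs; shown purely to illustrate
    the problem definition.
    """
    best = max(combinations(sorted(vertices), k),
               key=lambda subset: subgraph_weight(weights, subset))
    return set(best), subgraph_weight(weights, best)


# Hypothetical toy network: a dense triangle A-B-C plus a weakly attached tail.
edges = {("A", "B"): 1.0, ("A", "C"): 1.0, ("B", "C"): 1.0,
         ("C", "D"): 0.2, ("D", "E"): 0.9}
weights = {frozenset(edge): w for edge, w in edges.items()}
vertices = {"A", "B", "C", "D", "E"}

best_nodes, best_weight = heaviest_k_subgraph(vertices, weights, 3)
wide_nodes, wide_weight = heaviest_k_subgraph(vertices, weights, 4)
```

With k = 3 the search returns the triangle {A, B, C}. With k = 4 it is forced to add the weakly connected vertex D, which illustrates why the result depends on choosing k well and why an exact k-HSP solution can still contain loosely attached members.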