Abstract
Recent work in the analysis of biological networks have identified repeated functional modules, known as motifs, with biological and evolutionary significance. Motifs are present in various kinds of biological data, including gene sequences, metabolic networks, synaptic connection graphs, ecological systems. and protein-protein interaction networks. Motif detection in PPI networks is an NP-complete problem prompting previous solutions such as the methods used in Berg et. al. or those in Shen et. al. to utilize heuristics rather than an optimal approach. In this Project, we intend to design our own heuristic for the identification of motifs in protein-protein interaction networks by utilizing the novel application of feature-based indexing to the problem. Though feature-based indexing is commonly applied to such problems as multimedia fingerprinting and pattern recognition, it has yet to be applied to motif detection in PPI networks, where the large data sets are both spatially and computationally limiting. Our proposed method is evaluated by performing the same experiments as similar motif detection applications then comparing the obtained results for similarity and quality of detected motifs.
Team Members
Carlo PascoeRobert Kirchgessner
Plan of Action
-
Implementation
In order to detect motifs in PPI networks, we will identify a set of features which can be used to represent regions of significance in an input network. Once these features have been identified, a hashing function will be used to generate a characteristic fingerprint for these regions of the network. By storing these fingerprints in a database, motifs can be identified quickly and easily through a quantitative comparison between database entries.
-
Methods of Comparison
We are going to search for other applications (i.e. Kovosh) that can detect motifs in PPI networks, but use different algorithms. Once a sufficient list of candidate applications is determined, we will select one or two that have:
- A well defined/explained algorithm
- Well defined experiments with publicly available data sets
- Quantitative results for which to compare
- Available code to reproduce the experiments with execution times
We will then perform the same experiments with our approach, comparing the similarity and quality of detected motifs.
-
Data Sources
We will be using PPI network data from a variety of databases including:
-
Experiments and Measurements
In verifying this approach, we will compare the motifs discovered with existing tools for motif identification. We will measure:
- Execution Time
- Similarity in detected network motifs between our method and “Kavosh” (tbd)
- P-value to determine the significance of the motif
- Z-score to determine how random our motif network appears in a network
Work Distribution
# | Task | Who |
1 | Read research papers | [K, P] |
2 | Determine method for data mining PPI network data | [K] |
3 | Determine feature types | [K ,P] |
4 | Determine hashing function(s) | [K ,P] |
5 | Hash sorting / storage method | [K] |
6 | Hash searching/comparison method | [P] |
7 | Determine reference application(s) and contact authors for code/data sets | [P] |
8 | Reproduce experiments from reference application(s) paper(s) | [P] |
9 | Perform the same experiments with our newly implemented algorithm | [K,P] |
10 | Compare results | [K,P] |
Schedule & Gantt Chart
References
Juchang Hua, David Koes, Zhenzhen Kou: Finding Motifs in Protein-Protein Interaction Networks.
Elisabeth A. Wong, Brittany Baur: On Network Tools for Network Motif Finding: A Survey Study.
Roded Sharan, Adi Shabi, Tomer Benyami: Analysis of Biological Networks: Network Motifs PPI Networks: Data Processing and functional Annotation.
Alon N, Dao P, Hajirasouliha I, Hormozdiari F, Sahinalp SC: Biomolecular network motif counting and discovery by color coding.
E. Ziv, R. Koytcheff, M. Middendorf, and C. Wiggins: Systematic identificationof statistically significant network measures.
H. Ogata, W. Fujibuchi, S. Goto, and M. Kanehisa: A heuristic graph comparisonalgorithm and its application to detect functionally related enzyme clusters.
Giovanni Ciriello, Concettina Guerra: A review on models and algorithms for motif discovery in protein-protein interaction networks.
Michael Lassig. Johannes Berg: Local graph alignment and motif search in biological networks.
R Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, Chklovskii D., and U. Alon: Network motifs: Simple Building Blocks of Complex Networks.
Zahra Kashani, Hayedeh Ahrabian. Elahe Elahi: Kavosh: a new algorithm for finding network motifs.