Prediction of Cancer using Network Topological Features

Fernanda Brito Correia, Joel P. Arrais, José Luis Oliveira


Several data mining methods have been applied to explore biological data and to understand the mechanisms that regulate genetic and metabolic diseases. The underlying hypothesis is that identifying signatures can support the clinical identification of diseased tissues. Under this principle, many different methodologies have been tested, mostly using unsupervised methods. A common trend is to combine the information obtained from gene expression and protein-protein interaction network analyses or, more recently, to build series of complex networks that model system dynamics. Despite the positive results these works report, they typically fail to generalize to out-of-sample datasets. In this paper we describe a supervised classification approach, with a new methodology for extracting the network topology dynamics embedded in a disease system, that aims to improve cancer prediction using exclusively the topological properties of biological networks as features. Four microarray datasets were used for testing and validation: three from breast cancer experiments and one from a liver cancer experiment. The results corroborate the potential of the proposed methodology to predict a given type of cancer and indicate that different classification models need to be applied to different types of cancer.
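The general idea of classifying samples using only network topology can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not state which topological properties or which classifiers were used, so the three features (mean degree, density, mean local clustering coefficient), the nearest-centroid classifier, the toy edge lists, and the labels "tumour" and "normal" are all assumptions chosen for illustration.

```python
from itertools import combinations
from statistics import mean


def build_graph(edges):
    """Return an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_clustering(adj, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def topological_features(edges):
    """Summarise one network as [mean degree, density, mean clustering]."""
    adj = build_graph(edges)
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # each edge counted twice
    density = 2.0 * m / (n * (n - 1)) if n > 1 else 0.0
    clustering = mean(local_clustering(adj, v) for v in adj)
    return [2.0 * m / n, density, clustering]


def fit_centroids(rows, labels):
    """Nearest-centroid training: average the feature vectors per class."""
    grouped = {}
    for row, label in zip(rows, labels):
        grouped.setdefault(label, []).append(row)
    return {label: [mean(col) for col in zip(*members)]
            for label, members in grouped.items()}


def predict(centroids, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Hypothetical toy networks, one per sample (not real expression data).
train_networks = [
    [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")],               # triangle-rich
    [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d"), ("b", "d")],   # triangle-rich
    [("a", "b"), ("b", "c"), ("c", "d")],                           # path
    [("a", "b"), ("a", "c"), ("a", "d")],                           # star
]
train_labels = ["tumour", "tumour", "normal", "normal"]  # assumed labels

centroids = fit_centroids(
    [topological_features(net) for net in train_networks], train_labels
)

# An unseen, fully connected four-node network.
unseen = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d")]
prediction = predict(centroids, topological_features(unseen))
```

The sketch leaves the features unscaled for brevity; with real data the features would normally be standardized, and performance would be estimated on held-out samples rather than on the networks used for training.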



Paper Citation

in Harvard Style

Correia F., Arrais J. and Oliveira J. (2016). Prediction of Cancer using Network Topological Features. In Proceedings of the 9th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 3: BIOINFORMATICS, (BIOSTEC 2016) ISBN 978-989-758-170-0, pages 207-215. DOI: 10.5220/0005696202070215

in Bibtex Style

author={Fernanda Brito Correia and Joel P. Arrais and José Luis Oliveira},
title={Prediction of Cancer using Network Topological Features},
booktitle={Proceedings of the 9th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 3: BIOINFORMATICS, (BIOSTEC 2016)},

in EndNote Style

JO - Proceedings of the 9th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 3: BIOINFORMATICS, (BIOSTEC 2016)
TI - Prediction of Cancer using Network Topological Features
SN - 978-989-758-170-0
AU - Correia F.
AU - Arrais J.
AU - Oliveira J.
PY - 2016
SP - 207
EP - 215
DO - 10.5220/0005696202070215