INTEGRATING PATHWAY ENRICHMENT AND GENE NETWORK ANALYSIS PROVIDES ACCURATE DISEASE CLASSIFICATION

Maysson Al-Haj Ibrahim, Sabah Jassim, Michael A. Cawthorne, Kenneth Langlands

Abstract

At present, a range of clinical indicators is used to gain insight into the course that a newly presented individual’s disease may take, and so to inform treatment regimes. However, such indicators are not absolutely predictive, and patients with apparently low-risk disease may follow a more aggressive course. Advances in molecular medicine offer the hope of improved disease stratification and personalised treatment. For example, the identification of “genetic signatures” characteristic of disease subtypes is facilitated by high-throughput transcriptional profiling techniques (microarrays), in which expression levels for thousands of genes are measured across a range of biopsy samples. However, selecting a compact gene set that confers the most clinically relevant information from complex, high-dimensional microarray datasets is a challenging task. We reduced this complexity using a Pathway Enrichment and Gene Network Analysis (PEGNA) method, which integrates gene expression data with prior biological knowledge to select a group of strongly correlated genes that provides accurate discrimination of complex disease subtypes. First, pathway enrichment analysis was applied to a microarray dataset to identify the most impacted biological processes. Second, we used gene network analysis to find a group of strongly correlated genes, from which subsets were selected for disease classification with a support vector machine classifier. In this way, we classified disease states more accurately, and with fewer genes, than other methods across a range of biological datasets.



Paper Citation


in Harvard Style

Al-Haj Ibrahim M., Jassim S., Cawthorne M. A. and Langlands K. (2012). INTEGRATING PATHWAY ENRICHMENT AND GENE NETWORK ANALYSIS PROVIDES ACCURATE DISEASE CLASSIFICATION. In Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2012) ISBN 978-989-8425-90-4, pages 156-163. DOI: 10.5220/0003767901560163


in Bibtex Style

@conference{bioinformatics12,
author={Maysson Al-Haj Ibrahim and Sabah Jassim and Michael A. Cawthorne and Kenneth Langlands},
title={INTEGRATING PATHWAY ENRICHMENT AND GENE NETWORK ANALYSIS PROVIDES ACCURATE DISEASE CLASSIFICATION},
booktitle={Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2012)},
year={2012},
pages={156-163},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003767901560163},
isbn={978-989-8425-90-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2012)
TI - INTEGRATING PATHWAY ENRICHMENT AND GENE NETWORK ANALYSIS PROVIDES ACCURATE DISEASE CLASSIFICATION
SN - 978-989-8425-90-4
AU - Al-Haj Ibrahim M.
AU - Jassim S.
AU - Cawthorne M. A.
AU - Langlands K.
PY - 2012
SP - 156
EP - 163
DO - 10.5220/0003767901560163