Authors:
Farhad Maleki 1; Katie L. Ovens 1; Elham Rezaei 2; Alan M. Rosenberg 2 and Anthony J. Kusalik 1
Affiliations:
1 Department of Computer Science, University of Saskatchewan, Saskatoon, SK, Canada; 2 Department of Pediatrics, Royal University Hospital, Saskatoon, SK, Canada
Keyword(s):
Gene Set Analysis, Enrichment Analysis, Pathway Analysis, Gene Expression.
Abstract:
Gene set enrichment analysis is a well-established approach for gaining biological insight from expression data. With many gene set analysis methods available, the question arises of how consistent their results are. In this paper, we address this question with a systematic analysis of ten commonly used gene set analysis methods applied to microarray data. The statistical analysis suggests that the results of these methods differ significantly. A comparison of the 20 most statistically significant gene sets reported by each method showed little to no agreement, regardless of the dataset used. This observation suggests that the outcome of a study can depend heavily on the choice of gene set analysis method. Comparing the 100 most statistically significant gene sets led to the same conclusion. Furthermore, a biological evaluation using a juvenile idiopathic arthritis dataset agreed with the results of the statistical analysis. For some methods, the 20 most statistically significant gene sets were relevant to the biology of juvenile arthritis, supporting their utility, while most methods produced results that were irrelevant or only marginally relevant to the known biology of the disease.
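The top-k comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not state which agreement measure the authors used, so the Jaccard index here is an assumption, and the method names and gene set identifiers are hypothetical placeholders rather than data from the paper.

```python
from itertools import combinations


def top_k(ranked, k):
    """Return the first k gene set names of a ranked list as a set."""
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard index: size of intersection over size of union.

    Defined as 1.0 when both sets are empty.
    """
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_top_k_agreement(rankings, k=20):
    """Map each pair of method names to the Jaccard index of their top-k sets.

    `rankings` maps a method name to its gene sets, ordered from most to
    least statistically significant.
    """
    return {
        (m1, m2): jaccard(top_k(rankings[m1], k), top_k(rankings[m2], k))
        for m1, m2 in combinations(sorted(rankings), 2)
    }


# Hypothetical toy rankings (not results from the paper).
toy_rankings = {
    "method_A": ["gs1", "gs2", "gs3", "gs4"],
    "method_B": ["gs3", "gs4", "gs5", "gs6"],
    "method_C": ["gs7", "gs8", "gs9", "gs1"],
}

agreement = pairwise_top_k_agreement(toy_rankings, k=4)
for pair, score in agreement.items():
    print(pair, round(score, 3))
```

A value near 0 for a pair of methods means their top-k lists barely overlap, which is the kind of disagreement the abstract reports; setting `k=20` or `k=100` mirrors the two cut-offs mentioned there.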