Authors:
Benjamin A. Neely (1) and Paul E. Anderson (2)
Affiliations:
(1) National Institute of Standards and Technology, United States; (2) College of Charleston, United States
Keyword(s):
Genomics, Transcriptomics, Proteomics, Proteogenomics, Proteotranscriptomics.
Abstract:
As the speed and quality of different analytical platforms increase, it is more common to collect data across
multiple biological domains in parallel (i.e., genomics, transcriptomics, proteomics, and metabolomics). There
is a growing interest in algorithms and tools that leverage heterogeneous data streams in a meaningful way.
Since these domains are typically non-linearly related, we evaluated whether results from one domain could
be used to prioritize another domain to increase the power of detection, maintain the type I error rate, and highlight
biologically relevant changes in the secondary domain. To perform this feature prioritization, we developed
a methodology called Complementary Domain Prioritization that utilizes the underpinning biology to relate
complementary domains. Herein, we evaluate how proteomic data can guide transcriptomic differential expression
analysis by analyzing two published colorectal cancer proteotranscriptomic data sets. The proposed
strategy improved detection of cancer-related genes compared to standard permutation-invariant filtering approaches
and did not increase type I error. Moreover, this approach detected differentially expressed genes that
would not have been detected using filtering alone while also highlighting pathways that might have otherwise
been overlooked. These results demonstrate how this strategy can effectively prioritize transcriptomic data
and drive new hypotheses, though subsequent validation studies are still required.