5 CONCLUSIONS
The initial analysis of integrated data shown here
provides evidence that feature selection and model
training based on heterogeneous integrated datasets
are a potential tool for addressing the reproducibility
crisis of array expression experiments. We intend to
investigate a variety of classification techniques
thoroughly to explore whether data integration can be
used to develop robust disease biomarkers. Combining
datasets in this way should ameliorate much of the
reproducibility problem in diagnostic research and
lead to a greater correlation between academic
research and clinical success. We hope that this
direction in research will lead not only to future
diagnostic development but also to advancements in
drug and vaccine development.
ACKNOWLEDGEMENTS
Dartmouth College holds an Institutional Program
Unifying Population and Laboratory Based Sciences
award from the Burroughs Wellcome Fund, and C.
Bobak was supported by this grant (Grant #1014106).
A. Titus was supported by the Office of the U.S.
Director of the National Institutes of Health under award
number T32LM012204. The content is solely the
responsibility of the authors and does not necessarily
represent the official views of the National Institutes
of Health.
REFERENCES
Baker, M. (2016). 1,500 scientists lift the lid on repro-
ducibility. Nature, 533(7604):452–454.
Begley, C. G. and Ellis, L. M. (2012). Drug development:
Raise standards for preclinical cancer research. Na-
ture, 483(7391):531–533.
Blankley, S., Graham, C. M., Turner, J., Berry, M. P. R.,
Bloom, C. I., Xu, Z., Pascual, V., Banchereau, J.,
Chaussabel, D., Breen, R., Santis, G., Blankenship,
D. M., Lipman, M., and O’Garra, A. (2016). The
transcriptional signature of active tuberculosis reflects
symptom status in extra-pulmonary and pulmonary
tuberculosis. PLoS ONE, 11(10):e0162220.
Bloom, C. I., Graham, C. M., Berry, M. P. R., Roza-
keas, F., Redford, P. S., Wang, Y., Xu, Z., Wilkinson,
K. A., Wilkinson, R. J., Kendrick, Y., Devouassoux,
G., Ferry, T., Miyara, M., Bouvry, D., Dominique, V.,
Gorochov, G., Blankenship, D., Saadatian, M., Van-
hems, P., Beynon, H., Vancheeswaran, R., Wickre-
masinghe, M., Chaussabel, D., Banchereau, J., Pas-
cual, V., pei Ho, L., Lipman, M., and O’Garra,
A. (2013). Transcriptional blood signatures distin-
guish pulmonary tuberculosis, pulmonary sarcoido-
sis, pneumonias and lung cancers. PLoS ONE,
8(8):e70630.
Cai, Y., Yang, Q., Tang, Y., Zhang, M., Liu, H., Zhang,
G., Deng, Q., Huang, J., Gao, Z., Zhou, B., Feng,
C. G., and Chen, X. (2014). Increased complement
C1q level marks active disease in human tuberculosis.
PLoS ONE, 9(3):e92340.
Collins, F. S. and Tabak, L. A. (2014). Policy: NIH plans to
enhance reproducibility. Nature, 505(7485):612–613.
Diaz-Uriarte, R. and de Andres, S. A. (2006). Gene selec-
tion and classification of microarray data using ran-
dom forest. BMC Bioinformatics, 7(1):3.
Fang, K., Liu, F., Wen, J., Liu, H., Xiao, S., and Li,
X. (2017). Association of TAP1 and TAP2 polymorphisms
with risk and prognosis of pediatric spinal tuberculosis.
International Journal of Clinical and Experimental Medicine,
10(3):5769–5777.
Firszt, R. and Vickery, B. (2011). An interferon-inducible
neutrophil-driven blood transcriptional signature in
human tuberculosis. Pediatrics, 128(Supplement
3):S145–S146.
Goodman, S. N., Fanelli, D., and Ioannidis, J. P. (2016).
What does research reproducibility mean? Science
Translational Medicine, 8(341):341ps12.
Haynes, W. A., Vallania, F., Liu, C., Bongen, E., Tomczak,
A., Andres-Terre, M., Lofgren, S., Tam, A., Deis-
seroth, C. A., Li, M. D., Sweeney, T. E., and Khatri,
P. (2016). Empowering multi-cohort gene expression
analysis to increase reproducibility. bioRxiv.
Ioannidis, J. P. (2005). Why most published research
findings are false. PLoS Medicine, 2(8):e124.
Jenum, S., Bakken, R., Dhanasekaran, S., Mukherjee, A.,
Lodha, R., Singh, S., Singh, V., Haks, M. C., Otten-
hoff, T. H. M., Kabra, S. K., Doherty, T. M., Ritz, C.,
and Grewal, H. M. S. (2016). BLR1 and FCGR1a
transcripts in peripheral blood associate with the ex-
tent of intrathoracic tuberculosis in children and pre-
dict treatment outcome. Scientific Reports, 6(1).
Kim, B.-H., Shenoy, A. R., Kumar, P., Das, R., Tiwari, S.,
and MacMicking, J. D. (2011). A family of
IFN-γ-inducible 65-kD GTPases protects against bacterial
infection. Science, 332(6030):717–721.
Liaw, A. and Wiener, M. (2002). Classification and
regression by randomForest. R News, 2(3):18–22.
Liu, Y., Jiang, J., Wang, X., Zhai, F., and Cheng, X. (2013).
miR-582-5p is upregulated in patients with active tu-
berculosis and inhibits apoptosis of monocytes by tar-
geting FOXO1. PLoS ONE, 8(10):e78381.
Lu, H. and Huang, H. (2011). FOXO1: A potential
target for human diseases. Current Drug Targets,
12(9):1235–1244.
Maaten, L. v. d. and Hinton, G. (2008). Visualizing data
using t-SNE. Journal of Machine Learning Research,
9(Nov):2579–2605.
Maertzdorf, J., Ota, M., Repsilber, D., Mollenkopf, H. J.,
Weiner, J., Hill, P. C., and Kaufmann, S. H. E. (2011).
Functional correlations of pathogenesis-driven gene