Benson, D. A., Cavanaugh, M., Clark, K., Karsch-Mizrachi,
I., Lipman, D. J., Ostell, J., and Sayers, E. W. (2013).
GenBank. Nucleic acids research, 41(D1):D36–D42.
Chalker, A. F. and Lunsford, R. D. (2002). Rational iden-
tification of new antibacterial drug targets that are es-
sential for viability using a genomics-based approach.
Pharmacology & therapeutics, 95(1):1–20.
Chen, L., Ge, X., and Xu, P. (2015). Identifying essen-
tial Streptococcus sanguinis genes using genome-wide
deletion mutation. Gene Essentiality: Methods and
Protocols, pages 15–23.
Chen, W.-H., Minguez, P., Lercher, M. J., and Bork, P.
(2012). OGEE: an online gene essentiality database.
Nucleic acids research, 40(D1):D901–D906.
Chen, Y. and Xu, D. (2005). Understanding protein dispens-
ability through machine-learning analysis of high-
throughput data. Bioinformatics, 21(5):575–581.
Cheng, J., Xu, Z., Wu, W., Zhao, L., Li, X., Liu, Y., and Tao,
S. (2014). Training set selection for the prediction of
essential genes. PloS one, 9(1):e86805.
Clarke, L. and Carbon, J. (1976). A colony bank containing
synthetic Col E1 hybrid plasmids representative of the
entire E. coli genome. Cell, 9(1):91–99.
Cullen, L. M. and Arndt, G. M. (2005). Genome-
wide screening for gene function using RNAi in
mammalian cells. Immunology and cell biology,
83(3):217–223.
Dalevi, D. and Dubhashi, D. (2005). The Peres-Shields
order estimator for fixed and variable length Markov
models with applications to DNA sequence similarity.
In International Workshop on Algorithms in Bioinfor-
matics, pages 291–302. Springer.
Date, S. V. and Marcotte, E. M. (2003). Discovery of un-
characterized cellular systems by genome-wide anal-
ysis of functional linkages. Nature biotechnology,
21(9):1055–1062.
Deng, J., Deng, L., Su, S., Zhang, M., Lin, X., Wei, L., Mi-
nai, A. A., Hassett, D. J., and Lu, L. J. (2011). Investi-
gating the predictability of essential genes across dis-
tantly related organisms using an integrative approach.
Nucleic acids research, 39(3):795–807.
Giaever, G., Chu, A. M., Ni, L., Connelly, C., Riles, L.,
Veronneau, S., Dow, S., Lucau-Danila, A., Ander-
son, K., Andre, B., et al. (2002). Functional profil-
ing of the Saccharomyces cerevisiae genome. Nature,
418(6896):387–391.
Hagenauer, J., Dawy, Z., Gobel, B., Hanus, P., and Mueller,
J. (2004). Genomic analysis using methods from in-
formation theory. In Information Theory Workshop,
2004. IEEE, pages 55–59. IEEE.
Hutchison, C. A., Chuang, R.-Y., Noskov, V. N., Assad-
Garcia, N., Deerinck, T. J., Ellisman, M. H., Gill, J.,
Kannan, K., Karas, B. J., Ma, L., et al. (2016). Design
and synthesis of a minimal bacterial genome. Science,
351(6280):aad6253.
Itaya, M. (1995). An estimation of minimal genome size
required for life. FEBS letters, 362(3):257–260.
Jacobs, M. A., Alwood, A., Thaipisuttikul, I., Spencer,
D., Haugen, E., Ernst, S., Will, O., Kaul, R., Ray-
mond, C., Levy, R., et al. (2003). Comprehensive
transposon mutant library of Pseudomonas aerugi-
nosa. Proceedings of the National Academy of Sci-
ences, 100(24):14339–14344.
Katz, R. W. (1981). On some criteria for estimating the
order of a Markov chain. Technometrics, 23(3):243–
249.
Lamichhane, G., Zignol, M., Blades, N. J., Geiman, D. E.,
Dougherty, A., Grosset, J., Broman, K. W., and
Bishai, W. R. (2003). A postgenomic method for pre-
dicting essential genes at subsaturation levels of mu-
tagenesis: application to Mycobacterium tuberculo-
sis. Proceedings of the National Academy of Sciences,
100(12):7213–7218.
Letunic, I. and Bork, P. (2016). Interactive tree of life (iTOL)
v3: an online tool for the display and annotation of
phylogenetic and other trees. Nucleic acids research,
page gkw290.
Lu, Y., Deng, J., Rhodes, J. C., Lu, H., and Lu, L. J.
(2014). Predicting essential genes for identifying po-
tential drug targets in Aspergillus fumigatus. Compu-
tational biology and chemistry, 50:29–40.
Luo, H., Lin, Y., Gao, F., Zhang, C.-T., and Zhang, R.
(2014). DEG 10, an update of the database of essen-
tial genes that includes both protein-coding genes and
noncoding genomic elements. Nucleic acids research,
42(D1):D574–D580.
Menéndez, M., Pardo, L., Pardo, M., and Zografos, K.
(2011). Testing the order of Markov dependence in
DNA sequences. Methodology and computing in ap-
plied probability, 13(1):59–74.
Mushegian, A. R. and Koonin, E. V. (1996). A minimal
gene set for cellular life derived by comparison of
complete bacterial genomes. Proceedings of the Na-
tional Academy of Sciences, 93(19):10268–10273.
Nigatu, D., Henkel, W., Sobetzko, P., and Muskhelishvili,
G. (2016). Relationship between digital information
and thermodynamic stability in bacterial genomes.
EURASIP Journal on Bioinformatics and Systems Bi-
ology, 2016(1):1.
Nigatu, D., Mahmood, A., Henkel, W., Sobetzko, P., and
Muskhelishvili, G. (2014). Relating digital informa-
tion, thermodynamic stability, and classes of func-
tional genes in E. coli. In Signal and Information Pro-
cessing (GlobalSIP), 2014 IEEE Global Conference
on, pages 1338–1341. IEEE.
Ning, L., Lin, H., Ding, H., Huang, J., Rao, N., and Guo, F.
(2014). Predicting bacterial essential genes using only
sequence composition information. Genet. Mol. Res,
13:4564–4572.
Papapetrou, M. and Kugiumtzis, D. (2013). Markov chain
order estimation with conditional mutual information.
Physica A: Statistical Mechanics and its Applications,
392(7):1593–1601.
Papapetrou, M. and Kugiumtzis, D. (2016). Markov chain
order estimation with parametric significance tests of
conditional mutual information. Simulation Mod-
elling Practice and Theory, 61:1–13.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V.,
Thirion, B., Grisel, O., Blondel, M., Prettenhofer,
P., Weiss, R., Dubourg, V., Vanderplas, J., Passos,
A., Cournapeau, D., Brucher, M., Perrot, M., and