The bioinformatic tools that we have developed
make use of pre-computed genomic analyses, so the
database can accommodate the continued influx of
genomic sequence data while requiring only new
genomic data to be analysed. The results of analyses
using the database allow end users to easily identify
whether an isolate is exceptionally virulent or not
usually associated with human infection, based on the
presence or absence of known virulence attributes and
AMR genes and on genomic similarity to other known
human pathogens.
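As an illustration only, the incremental idea described above (stored results are reused and only newly added genomes are analysed) can be sketched as follows. This is a minimal sketch and not the implementation used in the database: the function names, the in-memory cache, and the short gene panels are all hypothetical, chosen only to show presence/absence summaries being computed once per genome.

```python
from typing import Dict, List, Set

# Hypothetical marker panels for illustration; a real system would load
# curated virulence and AMR gene lists rather than hard-code a few names.
VIRULENCE_GENES: Set[str] = {"stx1", "stx2", "eae"}
AMR_GENES: Set[str] = {"blaTEM", "tetA"}


def analyse_genome(genes: Set[str]) -> Dict[str, List[str]]:
    """Summarise one genome by which panel genes are present."""
    return {
        "virulence": sorted(genes & VIRULENCE_GENES),
        "amr": sorted(genes & AMR_GENES),
    }


def update_database(
    cache: Dict[str, Dict[str, List[str]]],
    genomes: Dict[str, Set[str]],
) -> List[str]:
    """Analyse only genomes absent from the cache.

    Returns the IDs that were newly analysed; results for genomes
    already in the cache are left untouched.
    """
    new_ids = [gid for gid in genomes if gid not in cache]
    for gid in new_ids:
        cache[gid] = analyse_genome(genomes[gid])
    return new_ids
```

Calling `update_database` a second time after adding one genome analyses only that genome, which is the property that lets such a database grow without recomputing earlier results.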
4.2 Collaboration
Two similar efforts to construct genomic databases for
molecular epidemiology have recently been proposed
(Kupferschmidt, 2011; FDA, 2012). While strain
transport between countries can be difficult or
impossible, genomic sequence information can be
transmitted instantly, allowing rapid analyses and
potentially life-saving interventions. To fully realize
the potential of comparative genomics, international
agencies need to be willing to share information
between databases or to collaborate in building a
single, multi-national database, because individual
strains are best interpreted when analysed alongside
as many similar strains as possible.
ACKNOWLEDGEMENTS
We would like to thank the Canadian Food Inspection
Agency for allowing this research to be conducted
at the Animal Diseases Research Institute. This
work was supported by the Public Health Agency of
Canada and grants from the Natural Sciences and
Engineering Research Council of Canada
(www.nserc-crsng.gc.ca) and Alberta Innovates
Technology Futures (www.albertatechfutures.ca).
REFERENCES
FDA (2012). Press announcements - FDA, UC Davis,
Agilent Technologies and CDC to create publicly
available food pathogen genome database.
http://www.fda.gov/NewsEvents/Newsroom/PressAnnouncements/ucm311661.htm.
The U.S. Food and Drug Administration (FDA), the
University of California, Davis, Agilent Technologies
Inc., and the Centers for Disease Control and
Prevention (CDC) announced today a collaboration to
create a public database of 100,000 foodborne pathogen
genomes to help speed identification of bacteria
responsible for foodborne outbreaks.
Gilmour, M., Graham, M., Van Domselaar, G., Tyler, S.,
Kent, H., Trout-Yakel, K., Larios, O., Allen, V., Lee,
B., and Nadon, C. (2010). High-throughput genome
sequencing of two Listeria monocytogenes clinical
isolates during a large foodborne outbreak. BMC
Genomics, 11(1):120.
Grad, Y. H., Lipsitch, M., Feldgarden, M., Arachchi, H. M.,
Cerqueira, G. C., Fitzgerald, M., Godfrey, P., Haas,
B. J., Murphy, C. I., Russ, C., Sykes, S., Walker,
B. J., Wortman, J. R., Young, S., Zeng, Q., Abouelleil,
A., Bochicchio, J., Chauvin, S., Desmet, T., Gujja, S.,
McCowan, C., Montmayeur, A., Steelman, S.,
Frimodt-Møller, J., Petersen, A. M., Struve, C.,
Krogfelt, K. A., Bingen, E., Weill, F.-X., Lander,
E. S., Nusbaum, C., Birren, B. W., Hung, D. T., and
Hanage, W. P. (2012). Genomic epidemiology of
the Escherichia coli O104:H4 outbreaks in Europe,
2011. Proceedings of the National Academy of
Sciences of the United States of America,
109(8):3065–3070. PMID: 22315421.
Junier, T. and Zdobnov, E. M. (2010). The Newick
utilities: high-throughput phylogenetic tree processing
in the UNIX shell. Bioinformatics (Oxford, England),
26(13):1669–1670. PMID: 20472542.
Köser, C. U., Holden, M. T. G., Ellington, M. J., Cartwright,
E. J. P., Brown, N. M., Ogilvy-Stuart, A. L., Hsu,
L. Y., Chewapreecha, C., Croucher, N. J., Harris,
S. R., Sanders, M., Enright, M. C., Dougan, G.,
Bentley, S. D., Parkhill, J., Fraser, L. J., Betley,
J. R., Schulz-Trieglaff, O. B., Smith, G. P., and
Peacock, S. J. (2012). Rapid whole-genome sequencing
for investigation of a neonatal MRSA outbreak.
The New England Journal of Medicine,
366(24):2267–2275. PMID: 22693998.
Kupferschmidt, K. (2011). Outbreak detectives embrace the
genome era. Science, 333(6051):1818–1819.
Laing, C., Buchanan, C., Taboada, E. N., Zhang, Y.,
Kropinski, A., Villegas, A., Thomas, J. E., and
Gannon, V. P. J. (2010). Pan-genome sequence analysis
using Panseq: an online tool for the rapid analysis of
core and accessory genomic regions. BMC
Bioinformatics, 11:461. PMID: 20843356.
Laing, C., Pegg, C., Yawney, D., Ziebell, K., Steele, M.,
Johnson, R., Thomas, J. E., Taboada, E. N., Zhang,
Y., and Gannon, V. P. J. (2008). Rapid determination
of Escherichia coli O157:H7 lineage types and
molecular subtypes by using comparative genomic
fingerprinting. Applied and Environmental Microbiology,
74(21):6606–6615. PMID: 18791027.
Laing, C., Villegas, A., Taboada, E. N., Kropinski, A.,
Thomas, J. E., and Gannon, V. P. J. (2011).
Identification of Salmonella enterica species- and
subgroup-specific genomic regions using Panseq 2.0.
Infection, Genetics and Evolution: Journal of Molecular
Epidemiology and Evolutionary Genetics in Infectious
Diseases. PMID: 22001825.
Ronquist, F. and Huelsenbeck, J. P. (2003). MrBayes 3:
Bayesian phylogenetic inference under mixed models.
Bioinformatics, 19(12):1572–1574.
Taboada, E. N. et al. (2012). Development and
validation of a comparative genomic fingerprinting
method for high-resolution genotyping of Campylobacter
jejuni. Journal of Clinical Microbiology, 50(3):788–797.
PMID: 22170908.
A Dynamic Whole-genome Database for Comparative Analyses, Molecular Epidemiology and Phenotypic Summary of Bacterial Pathogens