the exchange of genome sequence information can
happen as soon as it becomes available. These international
efforts, which share common goals, should at the least
provide data in a format that can be easily
shared and understood across the various platforms.
The value to the community of users of this shared
computational resource increases as the number of
users contributing data to it increases, which in turn
makes the platform more attractive for others to use and
contribute to. Users are encouraged not only to add
data but also to suggest improvements and additions to
the SuperPhy platform, so that it can be iteratively
developed to meet the needs of the user community.
6 AVAILABILITY
The website is available at http://lfz.corefacility.ca/superphy/.
The software code and database will be
made available upon request.
7 CONCLUSIONS
SuperPhy is a broadly accessible, integrated platform
for the phylogenetic and epidemiological analyses of
bacterial genome data. It provides near real-time analyses
of thousands of genome sequences using novel
computational approaches, with results that are understandable
and useful to a wide community, including
those in the fields of clinical medicine, epidemiology,
ecology, and evolution. The web interface
to this computational platform obviates the need for
command-line skills or a particular computing environment.
As additional members of the research community
use the platform, the number of genome sequences
stored and analyzed will increase, adding
further value to the platform, and in turn attracting
more users. Genomic platforms such as SuperPhy
will become increasingly important in transforming
raw genome data into a format suitable for the development
of a worldwide, real-time surveillance and
analysis network for bacterial genomes.
REFERENCES
Altschul, S. F., Madden, T. L., Schäffer, A. A., Zhang, J.,
Zhang, Z., Miller, W., and Lipman, D. J. (1997).
Gapped BLAST and PSI-BLAST: a new generation
of protein database search programs. Nucleic Acids
Res., 25(17):3389–402.
Antezana, E., Kuiper, M., and Mironov, V. (2009). Biologi-
cal knowledge management: the emerging role of the
semantic web technologies. Briefings in Bioinformat-
ics, 10(4):392–407.
Benson, D. A., Cavanaugh, M., Clark, K., Karsch-Mizrachi,
I., Lipman, D. J., Ostell, J., and Sayers, E. W. (2013).
GenBank. Nucleic Acids Res., 41(Database issue):D36–42.
Bostock, M., Ogievetsky, V., and Heer, J. (2011). D³
data-driven documents. Visualization and Computer
Graphics, IEEE Transactions on, 17(12):2301–2309.
Chen, L., Xiong, Z., Sun, L., Yang, J., and Jin, Q. (2012).
VFDB 2012 update: toward the genetic diversity and
molecular evolution of bacterial virulence factors. Nu-
cleic Acids Res., 40(Database issue):D641–645.
Chen, L., Yang, J., Yu, J., Yao, Z., Sun, L., Shen, Y., and
Jin, Q. (2005). VFDB: a reference database for bacterial
virulence factors. Nucleic Acids Res., 33(Database
issue):D325–328.
Fang, H., Oates, M. E., Pethica, R. B., Greenwood, J. M.,
Sardar, A. J., Rackham, O. J., Donoghue, P. C., Sta-
matakis, A., de Lima Morais, D. A., and Gough, J.
(2013). A daily-updated tree of (sequenced) life as a
reference for genome research. Sci Rep, 3:2015.
Federhen, S. (2012). The NCBI Taxonomy database. Nu-
cleic Acids Res., 40(Database issue):D136–143.
Field, D., Garrity, G., Gray, T., Morrison, N., Selengut, J.,
Sterk, P., Tatusova, T., Thomson, N., Allen, M. J.,
Angiuoli, S. V., Ashburner, M., Axelrod, N., Bal-
dauf, S., Ballard, S., Boore, J., Cochrane, G., Cole,
J., Dawyndt, P., De Vos, P., DePamphilis, C., Ed-
wards, R., Faruque, N., Feldman, R., Gilbert, J.,
Gilna, P., Glockner, F. O., Goldstein, P., Guralnick,
R., Haft, D., Hancock, D., Hermjakob, H., Hertz-
Fowler, C., Hugenholtz, P., Joint, I., Kagan, L.,
Kane, M., Kennedy, J., Kowalchuk, G., Kottmann, R.,
Kolker, E., Kravitz, S., Kyrpides, N., Leebens-Mack,
J., Lewis, S. E., Li, K., Lister, A. L., Lord, P., Maltsev,
N., Markowitz, V., Martiny, J., Methe, B., Mizrachi, I.,
Moxon, R., Nelson, K., Parkhill, J., Proctor, L., White,
O., Sansone, S. A., Spiers, A., Stevens, R., Swift, P.,
Taylor, C., Tateno, Y., Tett, A., Turner, S., Ussery, D.,
Vaughan, B., Ward, N., Whetzel, T., San Gil, I., Wil-
son, G., and Wipat, A. (2008). The minimum informa-
tion about a genome sequence (MIGS) specification.
Nat. Biotechnol., 26(5):541–547.
Goecks, J., Nekrutenko, A., Taylor, J., and The Galaxy Team
(2010). Galaxy: a comprehensive approach for
supporting accessible, reproducible,
and transparent computational research in the life sci-
ences. Genome Biology, 11(8):R86.
Kahn, S. D. (2011). On the future of genomic data. Science,
331(6018):728–729.
Kupferschmidt, K. (2011). Outbreak detectives embrace the
genome era. Science, 333(6051):1818–1819.
Kurtz, S., Phillippy, A., Delcher, A. L., Smoot, M.,
Shumway, M., Antonescu, C., and Salzberg, S. L.
(2004). Versatile and open software for comparing
large genomes. Genome Biology, 5(2):R12.
Laing, C., Buchanan, C., Taboada, E. N., Zhang, Y.,
Kropinski, A., Villegas, A., Thomas, J. E., and Gan-
non, V. P. J. (2010). Pan-genome sequence analysis
using panseq: an online tool for the rapid analysis of