Authors:
Victor Solovyev (1); Asaf Salamov (2); Igor Seledtsov (2); Denis Vorobyev (3) and Alexander Bachinsky (4)
Affiliations:
(1) Royal Holloway, University of London, United Kingdom
(2) Softberry Inc., United States
(3) Softberry Inc., United States
(4) Softberry Inc., United States
Keyword(s):
Bacterial community, Genome annotation, Sequence assembly, RNA-Seq data, Computational pipelines, Classification and diagnostics of pathogenic bacteria, Transcriptome analysis.
Abstract:
To annotate bacterial sequences from an environmental sample, we have developed an automatic annotation pipeline, Fgenesb_annotator, that includes self-training of gene-finding parameters and prediction of CDS, RNA genes, operons, promoters and terminators. The new version of the pipeline includes frameshift correction and a special module with improved prediction accuracy for ribosomal proteins. To analyze next-generation sequencing data, we have developed the OligoZip assembler and the Transomics pipeline, which provide solutions to the following tasks: 1) de novo reconstruction of a genomic sequence; 2) reconstruction of a sequence using a reference genome; 3) SNP discovery; 4) mapping RNA-Seq reads to a reference genome, assembling them into alternative transcripts and quantifying the abundance of these transcripts. Using the OligoZip assembler and the Fgenesb pipeline, we have developed a novel computational approach for identifying toxic and non-toxic bacterial serotypes from next-generation sequencing data. It can be used to detect bacterial infections in wounds and contamination of water or food.