QUASI: A Pipeline for the Quality Assessment and Statistical Inference on Next Generation Sequencing Data from Pooled shRNA Library Screens

Mark Onyango, Carsten Ade, Franz Cemič, Jürgen Hemberger

Abstract

With the development of next-generation high-throughput sequencing solutions for expression profiling, the efficient and effortless handling of such profiling data has become a key challenge for bioinformaticians and biologists alike. We therefore present a "fire and forget" style pipeline implemented in C and R, named QUASI. It performs quality assessment, sequence alignment, shRNA quantification and statistical inference of significant differential sequence abundance on the datasets presented to it. By black-boxing these often complex and laborious steps, QUASI offers a user-friendly and time-efficient solution for handling pooled shRNA library screening data.
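To make the quality assessment and shRNA quantification steps named in the abstract more concrete, the following is a minimal sketch in Python. It is not QUASI's implementation (which is written in C and R). It assumes Phred+33 quality encoding and replaces sequence alignment with exact matching against the library. The function names, the quality threshold and the example sequences are hypothetical.

```python
from collections import Counter


def mean_phred(quality, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 encoding assumed)."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def count_shrnas(reads, library, min_mean_quality=20.0):
    """Count reads per shRNA.

    reads:   iterable of (sequence, quality_string) pairs
    library: dict mapping shRNA sequence -> shRNA identifier
    A read is counted only if its mean quality reaches the threshold and its
    sequence matches a library sequence exactly (a simplification that stands
    in for a real alignment step).
    """
    # Start every library member at zero so absent shRNAs are still reported.
    counts = Counter({name: 0 for name in library.values()})
    for seq, qual in reads:
        if not qual or mean_phred(qual) < min_mean_quality:
            continue  # discard empty or low-quality reads
        name = library.get(seq)
        if name is not None:
            counts[name] += 1
    return counts


# Hypothetical two-member library and four reads.
library = {
    "ACGTACGTACGTACGTACGTA": "shRNA_1",
    "TTGCATTGCATTGCATTGCAT": "shRNA_2",
}
reads = [
    ("ACGTACGTACGTACGTACGTA", "I" * 21),  # high quality, matches shRNA_1
    ("ACGTACGTACGTACGTACGTA", "I" * 21),  # high quality, matches shRNA_1
    ("TTGCATTGCATTGCATTGCAT", "#" * 21),  # matches shRNA_2 but fails the filter
    ("GGGGGGGGGGGGGGGGGGGGG", "I" * 21),  # high quality, not in the library
]
counts = count_shrnas(reads, library)
```

The resulting per-shRNA count table is the kind of input that count-based differential abundance methods operate on. The statistical inference step itself is not shown here.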

Paper Citation


in Harvard Style

Onyango M., Ade C., Cemič F. and Hemberger J. (2013). QUASI: A Pipeline for the Quality Assessment and Statistical Inference on Next Generation Sequencing Data from Pooled shRNA Library Screens. In Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2013) ISBN 978-989-8565-35-8, pages 288-291. DOI: 10.5220/0004220702880291


in Bibtex Style

@conference{bioinformatics13,
author={Mark Onyango and Carsten Ade and Franz Cemič and Jürgen Hemberger},
title={QUASI: A Pipeline for the Quality Assessment and Statistical Inference on Next Generation Sequencing Data from Pooled shRNA Library Screens},
booktitle={Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2013)},
year={2013},
pages={288-291},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004220702880291},
isbn={978-989-8565-35-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2013)
TI - QUASI: A Pipeline for the Quality Assessment and Statistical Inference on Next Generation Sequencing Data from Pooled shRNA Library Screens
SN - 978-989-8565-35-8
AU - Onyango M.
AU - Ade C.
AU - Cemič F.
AU - Hemberger J.
PY - 2013
SP - 288
EP - 291
DO - 10.5220/0004220702880291