Optimisation and Validation of a Minimum Data Set for the Identification and Quality Control of EST Expression Libraries

A. T. Milnthorpe, Mikhail Soloviev

Abstract

Several bioinformatics tools, such as dbEST, DDD, GEPIS, cDNA xProfiler and cDNA DGED, have been widely used to retrieve and analyse EST expression data and to compare gene expression levels, e.g. between cancer and normal tissues. The outcome of any such comparison depends on the EST libraries' annotations and assumes that the actual expression data (EST counts) are correct. None of the existing tools provides a quality control method for the selection and evaluation of the original EST expression libraries. Here we report the selection, optimisation and evaluation of a minimal gene expression data set using CGAP cDNA DGED. Our approach relies solely on the expression data itself and is independent of the libraries' annotations. The reported approach allows tissue typing of expression libraries of different sizes, containing from as few as 249 total EST counts up to 13,929 total EST counts (the highest tested so far).



Paper Citation


in Harvard Style

Milnthorpe A. and Soloviev M. (2013). Optimisation and Validation of a Minimum Data Set for the Identification and Quality Control of EST Expression Libraries. In Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2013) ISBN 978-989-8565-35-8, pages 278-281. DOI: 10.5220/0004194202780281


in Bibtex Style

@conference{bioinformatics13,
author={A. T. Milnthorpe and Mikhail Soloviev},
title={Optimisation and Validation of a Minimum Data Set for the Identification and Quality Control of EST Expression Libraries},
booktitle={Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2013)},
year={2013},
pages={278-281},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004194202780281},
isbn={978-989-8565-35-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2013)
TI - Optimisation and Validation of a Minimum Data Set for the Identification and Quality Control of EST Expression Libraries
SN - 978-989-8565-35-8
AU - Milnthorpe A.
AU - Soloviev M.
PY - 2013
SP - 278
EP - 281
DO - 10.5220/0004194202780281