Authors: Li Jiangyu ; Liu Yang ; Wang Xiaolei ; Mao Yiqing ; Wang Yumin and Zhao Dongsheng

Affiliation: Academy of Military Medical Sciences, China

Keyword(s): High-Throughput Sequencing, Metagenomics, RINS, Hadoop, MapReduce.

Related Ontology Subjects/Areas/Topics: Algorithms and Software Tools ; Bioinformatics ; Biomedical Engineering ; Next Generation Sequencing ; Sequence Analysis

Abstract: Sequencing data have grown rapidly in recent years with the development of high-throughput sequencing technology, and parallel computing is an important way to process such large volumes of sequence data. RINS is a pipeline used to identify nonhuman sequences in deep sequencing datasets. It uses user-provided microbial reference genomes to reduce the number of reads to be processed and thereby improve processing speed. However, all of its steps run serially, so RINS slows down sharply as the sequencing data and reference genomes grow. In this article, we report Hadoop-RINS, a pipeline that processes sequencing data in parallel through Hadoop. In a runtime comparison on the same dataset, Hadoop-RINS was significantly faster than RINS while producing the same computation result.

CC BY-NC-ND 4.0

Paper citation in several formats:
Jiangyu, L.; Yang, L.; Xiaolei, W.; Yiqing, M.; Yumin, W. and Dongsheng, Z. (2013). Hadoop-RINS - A Hadoop Accelerated Pipeline for Rapid Nonhuman Sequence Identification. In Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms (BIOSTEC 2013) - BIOINFORMATICS; ISBN 978-989-8565-35-8; ISSN 2184-4305, SciTePress, pages 296-299. DOI: 10.5220/0004239602960299

@conference{bioinformatics13,
author={Li Jiangyu and Liu Yang and Wang Xiaolei and Mao Yiqing and Wang Yumin and Zhao Dongsheng},
title={Hadoop-RINS - A Hadoop Accelerated Pipeline for Rapid Nonhuman Sequence Identification},
booktitle={Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms (BIOSTEC 2013) - BIOINFORMATICS},
year={2013},
pages={296-299},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004239602960299},
isbn={978-989-8565-35-8},
issn={2184-4305},
}

TY - CONF
JO - Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms (BIOSTEC 2013) - BIOINFORMATICS
TI - Hadoop-RINS - A Hadoop Accelerated Pipeline for Rapid Nonhuman Sequence Identification
SN - 978-989-8565-35-8
IS - 2184-4305
AU - Jiangyu, L.
AU - Yang, L.
AU - Xiaolei, W.
AU - Yiqing, M.
AU - Yumin, W.
AU - Dongsheng, Z.
PY - 2013
SP - 296
EP - 299
DO - 10.5220/0004239602960299
PB - SciTePress