Authors: Kirill Lassounski 1 ; Sahudy Montenegro González 2 ; Annabell del Real Tamariz 1 and Gabriel Lima de Oliveira 1

Affiliations: 1 State University of Norte Fluminense, Brazil ; 2 Federal University of São Carlos, Brazil

Keyword(s): Information Retrieval, Text Mining, PubMed, Evaluation Dataset, Java.

Related Ontology Subjects/Areas/Topics: Algorithms and Software Tools ; Bioinformatics ; Biomedical Engineering

Abstract: The NCBI (National Center for Biotechnology Information) provides information about genes, proteins, scientific literature, and molecular structures, among other biomedical resources. Its PubMed database stores about 21 million scientific articles. Many research projects in the information retrieval field need to obtain useful data from PubMed automatically for evaluation and testing. This work describes a Java library for constructing datasets, so that researchers can evaluate their results easily and quickly. Users set input and output parameters, such as the article attributes to include (title, abstract, keywords, etc.), to shape the dataset, which is constructed as a serializable file. PubMed Dataset was created because the authors needed to build their own datasets to evaluate their system's results. The article also presents the BioSearch Refinement system as a case study: the system uses the library to construct the datasets used to evaluate its algorithm for automatic keyphrase extraction. We also discuss the benefits obtained from using PubMed Dataset.
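As a rough illustration of what "a dataset constructed as a serializable file" can mean in Java, the sketch below stores a list of articles with the attributes named in the abstract (title, abstract, keywords) using standard Java object serialization, then reads it back. It is not the PubMed Dataset API: the class `DatasetSketch`, the nested `Article` type, and the `save` and `load` methods are hypothetical names chosen for this example, and the article values are placeholders rather than PubMed data.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names come from the PubMed Dataset library.
public class DatasetSketch {

    // One article carrying the attributes mentioned in the abstract.
    public static class Article implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String title;
        public final String articleAbstract;
        public final List<String> keywords;

        public Article(String title, String articleAbstract, List<String> keywords) {
            this.title = title;
            this.articleAbstract = articleAbstract;
            this.keywords = new ArrayList<>(keywords); // defensive, serializable copy
        }
    }

    // Write the whole dataset to a single serialized file.
    public static void save(List<Article> dataset, Path file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(new ArrayList<>(dataset));
        }
    }

    // Read the dataset back; the cast is unchecked because of type erasure.
    @SuppressWarnings("unchecked")
    public static List<Article> load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (List<Article>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("pubmed-dataset", ".ser");
        List<Article> dataset = new ArrayList<>();
        // Placeholder values, not real PubMed records.
        dataset.add(new Article("Placeholder title", "Placeholder abstract text.",
                Arrays.asList("information retrieval", "text mining")));
        save(dataset, file);
        List<Article> restored = load(file);
        System.out.println(restored.size() + " article(s) restored: " + restored.get(0).title);
        Files.delete(file);
    }
}
```

A serialized file of this kind can be reloaded by an evaluation program without querying PubMed again, which is presumably the convenience the abstract has in mind; the actual library may organize its data differently.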

CC BY-NC-ND 4.0

Paper citation in several formats:
Lassounski, K.; Montenegro González, S.; del Real Tamariz, A. and Lima de Oliveira, G. (2012). PUBMED DATASET: A JAVA LIBRARY FOR AUTOMATIC CONSTRUCTION OF EVALUATION DATASETS. In Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms (BIOSTEC 2012) - BIOINFORMATICS; ISBN 978-989-8425-90-4; ISSN 2184-4305, SciTePress, pages 343-346. DOI: 10.5220/0003797203430346

@conference{bioinformatics12,
author={Kirill Lassounski and Sahudy {Montenegro González} and Annabell {del Real Tamariz} and Gabriel {Lima de Oliveira}},
title={PUBMED DATASET: A JAVA LIBRARY FOR AUTOMATIC CONSTRUCTION OF EVALUATION DATASETS},
booktitle={Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms (BIOSTEC 2012) - BIOINFORMATICS},
year={2012},
pages={343-346},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003797203430346},
isbn={978-989-8425-90-4},
issn={2184-4305},
}

TY - CONF
JO - Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms (BIOSTEC 2012) - BIOINFORMATICS
TI - PUBMED DATASET: A JAVA LIBRARY FOR AUTOMATIC CONSTRUCTION OF EVALUATION DATASETS
SN - 978-989-8425-90-4
IS - 2184-4305
AU - Lassounski, K.
AU - Montenegro González, S.
AU - del Real Tamariz, A.
AU - Lima de Oliveira, G.
PY - 2012
SP - 343
EP - 346
DO - 10.5220/0003797203430346
PB - SciTePress