StorageBIT - A Metadata-aware, Extensible, Semantic and Hierarchical Database for Biosignals

Carlos Carreiras, Hugo Silva, André Lourenço, Ana Fred



Biomedical data acquisition is now widespread, producing a deluge of data that may contain relevant information for health-care professionals, biosignal researchers, and the individuals themselves. This creates the need to organize the information in a structured way that facilitates collaboration and research. To that end, this paper investigates database systems and file formats, discussing current technologies, requirements, and possible implementations. These implementations were run through a benchmarking package to analyze their insertion, query, and update performance. The final proposal combines HDF5, a hierarchical file format for numerical data, with MongoDB, a NoSQL database, as this pairing showed the best combination of properties among the tested solutions.



Paper Citation

in Harvard Style

Carreiras C., Silva H., Lourenço A. and Fred A. (2013). StorageBIT - A Metadata-aware, Extensible, Semantic and Hierarchical Database for Biosignals. In Proceedings of the International Conference on Health Informatics - Volume 1: HEALTHINF, (BIOSTEC 2013), ISBN 978-989-8565-37-2, pages 65-74. DOI: 10.5220/0004241400650074
