A Data-driven Framework on Mining Relationships between Air Quality and Cancer Diseases

Wei Yuan Chang, En Tzu Wang, Arbee L. P. Chen

Abstract

According to the World Health Organization's report on global health risks, environmental issues urgently need to be addressed worldwide. Air pollution in particular causes great damage to human health. In this work, we build a framework for finding correlations between air pollution and cancer. The framework consists of a data access flow and a data analytics flow. The data access flow processes raw data and makes it accessible through APIs. The cancer statistics are then mapped to the air pollution data through temporal and spatial information. The analytics flow is used to find insights, based on data exploration and data classification methods. The data exploration methods use statistics, clustering, and a series of mining techniques to interpret the data. The data mining methods are then applied to find relationships between air quality and cancer by treating air pollution indicators as features and cancer statistics as labels. The experimental results show that the air pollutants NO and NO2 have a significant influence on breast cancer, and that lung cancer is significantly influenced by NO2, NO, PM10, and O3; these findings are consistent with those obtained from traditional statistical methods. Moreover, our results also cover the findings of several other studies. The proposed framework is flexible and can be applied to other applications with spatiotemporal data.
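As a rough illustration of the mapping step described in the abstract, the following Python sketch joins air pollution readings to cancer statistics on shared spatial and temporal keys, yielding (feature, label) pairs. It is only a minimal sketch under stated assumptions: the record layout, the region codes, the yearly granularity, the function names, and all numeric values are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical toy records; all values are invented for illustration only.
# Each air reading: (region, year, pollutant, concentration)
air_readings = [
    ("A", 2010, "NO2", 20.0),
    ("A", 2010, "NO2", 24.0),
    ("A", 2011, "NO2", 26.0),
    ("B", 2010, "NO2", 10.0),
    ("B", 2011, "NO2", 12.0),
]
# Each cancer statistic: (region, year, incidence)
cancer_stats = [
    ("A", 2010, 50.0),
    ("A", 2011, 55.0),
    ("B", 2010, 30.0),
    ("B", 2011, 33.0),
]


def yearly_mean(readings, pollutant):
    """Average one pollutant's readings per (region, year) key."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for region, year, name, value in readings:
        if name == pollutant:
            acc = sums[(region, year)]
            acc[0] += value
            acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def join_on_region_year(features, stats):
    """Pair each cancer statistic with the pollutant mean for the same key."""
    return [
        (features[(region, year)], label)
        for region, year, label in stats
        if (region, year) in features
    ]


# (feature, label) pairs: pollutant mean as feature, cancer statistic as label.
pairs = join_on_region_year(yearly_mean(air_readings, "NO2"), cancer_stats)
```

Keying both datasets on the same (region, year) tuple is one simple way to align them; real monitoring stations and cancer registries would likely need coarser or fuzzier spatial matching than this sketch assumes.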


Paper Citation


in Harvard Style

Chang W., Wang E. and Chen A. (2017). A Data-driven Framework on Mining Relationships between Air Quality and Cancer Diseases. In Proceedings of the 6th International Conference on Data Science, Technology and Applications - Volume 1: DATA, ISBN 978-989-758-255-4, pages 255-262. DOI: 10.5220/0006471902550262


in Bibtex Style

@conference{data17,
author={Wei Yuan Chang and En Tzu Wang and Arbee L. P. Chen},
title={A Data-driven Framework on Mining Relationships between Air Quality and Cancer Diseases},
booktitle={Proceedings of the 6th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2017},
pages={255-262},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006471902550262},
isbn={978-989-758-255-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 6th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - A Data-driven Framework on Mining Relationships between Air Quality and Cancer Diseases
SN - 978-989-758-255-4
AU - Chang W.
AU - Wang E.
AU - Chen A.
PY - 2017
SP - 255
EP - 262
DO - 10.5220/0006471902550262