6 CONCLUSION
Diabetes is a chronic disease whose prevalence is
growing at a rapid rate throughout the world. In
Canada, one person is diagnosed with diabetes every
three minutes, and one in ten deaths is attributed to
this disease. Due to this prevalence, it has received
global attention and vast amounts of data have been
collected. Unfortunately, this data exists in disparate
repositories and has not been harnessed to its full
potential. A key shortcoming of existing research in
this area is its reliance on non-clinical data collected
through surveys and self-administered questionnaires.
In contrast, the dataset used for this research was
obtained from a health authority and consisted
exclusively of diabetic patients. In order to
make this clinical data valuable for physicians and
other stakeholders, several KPIs were identified
that provide insight into historical trends and
patterns through visual analytics. These metrics were
then presented on a visually appealing dashboard
which consists of top-level reports and numerous
drill-down and drill-through reports for insights at
finer granularity. The data was mined for predictive
analysis. Six representative data mining algorithms
were evaluated for analysis of three target variables.
Overall, accuracies of 83.5%, 92.4% and 96.5% were
observed for I100, I500 and N179, respectively. The
developed models were then incorporated into an
interactive assessment tool that takes input from the
user via a web form and predicts the
likelihood of one of the three comorbidities in the future.
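As a rough illustration of the model comparison summarized above, the following sketch scores candidate models on each target variable by held-out accuracy and keeps the best one per target. The model names, labels and predictions are hypothetical toy values, not the study's data or results, and the selection rule (highest accuracy) is an assumption rather than the documented procedure.

```python
from typing import Dict, List, Tuple


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def best_model_per_target(
    y_true_by_target: Dict[str, List[int]],
    preds_by_model: Dict[str, Dict[str, List[int]]],
) -> Dict[str, Tuple[str, float]]:
    """Return {target: (best_model_name, accuracy)} using held-out accuracy."""
    best = {}
    for target, y_true in y_true_by_target.items():
        scores = {
            model: accuracy(y_true, preds[target])
            for model, preds in preds_by_model.items()
        }
        name = max(scores, key=scores.get)
        best[target] = (name, scores[name])
    return best


# Hypothetical held-out labels (1 = comorbidity present) for the three targets.
Y_TRUE = {
    "I100": [1, 0, 1, 1, 0, 0, 1, 0],
    "I500": [0, 0, 1, 0, 0, 1, 0, 0],
    "N179": [0, 0, 0, 1, 0, 0, 0, 0],
}

# Hypothetical predictions from two illustrative models.
PREDS = {
    "decision_tree": {
        "I100": [1, 0, 1, 0, 0, 0, 1, 1],
        "I500": [0, 0, 1, 0, 0, 0, 0, 0],
        "N179": [0, 0, 0, 0, 0, 0, 0, 0],
    },
    "logistic_regression": {
        "I100": [1, 0, 1, 1, 0, 0, 0, 0],
        "I500": [0, 1, 1, 0, 0, 0, 0, 0],
        "N179": [0, 0, 0, 1, 0, 0, 0, 0],
    },
}

BEST = best_model_per_target(Y_TRUE, PREDS)
```

Accuracy alone can be misleading when a comorbidity is rare, so a fuller comparison would likely also report measures such as sensitivity and specificity.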
In summary, the study methodology consists of
the following steps: integration of a clinical diabetes
dataset into a SQL database, data preprocessing, data
analysis, selection of the input variables and the three
target variables for diabetes comorbidities, evaluation
of the relative performance of various data mining
algorithms, presentation of the results on an interactive
dashboard, and development of an integrated, user-friendly
tool to calculate the risk of developing comorbidities
for individual patients.
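The first steps of this pipeline (loading diagnosis records into a SQL database, deriving per-patient target flags, and computing a simple KPI) can be sketched as follows. This is a minimal sketch using an in-memory SQLite database; the table name, column names, sample rows and the "OTHER" placeholder code are all hypothetical, and the study's actual schema and database engine may differ.

```python
import sqlite3

# The three target codes named in the text.
TARGETS = ("I100", "I500", "N179")

# Hypothetical rows: (patient_id, age, diagnosis_code).
ROWS = [
    (1, 67, "I100"),
    (1, 67, "OTHER"),
    (2, 54, "OTHER"),
    (3, 71, "I500"),
    (3, 71, "N179"),
    (4, 45, "OTHER"),
]


def load_diagnoses(rows):
    """Create an in-memory database holding one row per recorded diagnosis."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE diagnosis (patient_id INTEGER, age INTEGER, code TEXT)"
    )
    conn.executemany("INSERT INTO diagnosis VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def build_target_table(conn, codes):
    """Preprocessing: one row per patient with a 0/1 flag for each target code."""
    flags = ", ".join(f"MAX(code = ?) AS has_{i}" for i in range(len(codes)))
    sql = (
        f"SELECT patient_id, MAX(age), {flags} "
        "FROM diagnosis GROUP BY patient_id ORDER BY patient_id"
    )
    return conn.execute(sql, tuple(codes)).fetchall()


def comorbidity_prevalence(conn, codes):
    """KPI: share of distinct patients with each target code on record."""
    total = conn.execute(
        "SELECT COUNT(DISTINCT patient_id) FROM diagnosis"
    ).fetchone()[0]
    result = {}
    for code in codes:
        n = conn.execute(
            "SELECT COUNT(DISTINCT patient_id) FROM diagnosis WHERE code = ?",
            (code,),
        ).fetchone()[0]
        result[code] = n / total if total else 0.0
    return result


conn = load_diagnoses(ROWS)
patient_rows = build_target_table(conn, TARGETS)
kpi = comorbidity_prevalence(conn, TARGETS)
```

The per-patient rows would then feed model training, while aggregate queries such as the prevalence KPI are the kind of figures a dashboard report could display.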
There is potential for future research in this area.
For instance, it would be desirable to record the type
of diabetes under a dedicated code, separate from the
patients' comorbidity diagnoses. Similarly, physicians
could identify combinations of diagnosis codes that,
grouped together, form larger classes and potentially
yield higher prediction accuracy. Finally, adding a time
dimension to the metrics could enable a longitudinal
study that predicts when a comorbidity is likely to
occur.
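To make the time-dimension idea concrete, the sketch below derives, for each patient, the number of days between a first diabetes record and the first record of a given comorbidity. It assumes that each diagnosis carries a date, which the text does not confirm for the current dataset; the "DIABETES" label, the record layout and the sample dates are hypothetical.

```python
from datetime import date

# Hypothetical dated records: (patient_id, diagnosis_date, code).
RECORDS = [
    (1, date(2015, 3, 1), "DIABETES"),
    (1, date(2017, 3, 1), "I100"),
    (2, date(2016, 6, 15), "DIABETES"),
    (3, date(2014, 1, 10), "DIABETES"),
    (3, date(2014, 7, 9), "N179"),
    (3, date(2016, 1, 10), "N179"),
]


def days_to_first_comorbidity(records, index_code, target_code):
    """Days from each patient's first index diagnosis to the first target
    diagnosis; None if the target never occurs or precedes the index date."""
    first_index = {}
    first_target = {}
    for pid, when, code in records:
        if code == index_code:
            first_index[pid] = min(when, first_index.get(pid, when))
        elif code == target_code:
            first_target[pid] = min(when, first_target.get(pid, when))
    out = {}
    for pid, start in first_index.items():
        event = first_target.get(pid)
        if event is not None and event >= start:
            out[pid] = (event - start).days
        else:
            out[pid] = None
    return out


days_i100 = days_to_first_comorbidity(RECORDS, "DIABETES", "I100")
days_n179 = days_to_first_comorbidity(RECORDS, "DIABETES", "N179")
```

Intervals of this kind, together with the patients for whom no comorbidity has yet been recorded, are the typical inputs to time-to-event (survival) models, which would be one possible route to the timeline predictions suggested above.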