Authors:
Farshideh Einsele 1; Leila Sadeghi 2; Rolf Ingold 3 and Helena Jenzer 2
Affiliations:
1 Berne University of Applied Sciences, Switzerland; 2 Bern University of Applied Sciences, Switzerland; 3 University of Fribourg, Switzerland
Keyword(s):
Data Mining, Association Rules, Nutritional Patterns, Knowledge Interpretation, Lifestyle Diseases, Demographic, Customer Profiles, Disease Diagnosis.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Biomedical Engineering; Business Analytics; Clinical Problems and Applications; Data Engineering; Data Mining; Databases and Information Systems Integration; Datamining; Decision Support Systems; Enterprise Information Systems; Health Information Systems; Pattern Recognition and Machine Learning; Practice-based Research Methods for Healthcare IT; Sensor Networks; Signal Processing; Soft Computing
Abstract:
Background: To date, the analysis of the implications of dietary patterns on lifestyle diseases has been based on data coming either from clinical studies or from food surveys, both of which involve a limited number of participants. This article demonstrates that linking big data from a grocery store sales database with demographic and health data, using data mining tools such as classification and association rules, is a powerful way to determine whether a specific population subgroup is at particular risk of developing a lifestyle disease based on its food consumption patterns.
Objective: The objective of the study was to link big data from grocery store sales with demographic and health data in order to discover critical food consumption patterns linked with lifestyle diseases known to be strongly tied to food consumption.
Design: Food consumption data from a publicly available grocery store database dating from 1997–1998 were gathered along with corresponding demographic and health data from the U.S. West Coast, pre-processed, cleaned and finally integrated into a single database.
Results: This study applied data mining techniques such as classification and association mining analysis. Firstly, the studied population was classified according to the demographic attributes “age groups” and “race”, and data on lifestyle diseases were attributed accordingly. Secondly, association mining analysis was used to derive rules about food consumption and lifestyle diseases. A set of promising preliminary rules and their corresponding interpretation was generated and is reported in the present paper.
Conclusions: Association rules were successfully used to describe and predict links between food consumption patterns and lifestyle diseases. The selected grocery store database also contains further customer attributes of interest, such as marital status, educational background, profession and number of children at home. In-depth research on these attributes is needed to further expand the present demographic database. Because searching the internet for demographic attributes dating back to the year 2000 for the studied population subgroup was extremely laborious, the demographic attributes selected to prove the feasibility of the study were limited to age groups and race.
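
To make the association mining step more concrete, the following is a minimal sketch of how rules linking food categories to demographic and health attributes could be derived with support and confidence thresholds. It is not the authors' implementation: the records, the attribute names (such as `snack_foods` or `diabetes_high`), the thresholds and the helper functions `support` and `mine_rules` are all hypothetical, and the brute-force enumeration stands in for whatever mining algorithm the study actually used.

```python
from itertools import combinations

# Toy records invented for illustration only (NOT data from the study).
# Each record mixes purchased food categories with demographic/health labels.
records = [
    {"snack_foods", "soft_drinks", "age_45_64", "diabetes_high"},
    {"snack_foods", "soft_drinks", "age_45_64", "diabetes_high"},
    {"vegetables", "fruit", "age_25_44", "diabetes_low"},
    {"snack_foods", "vegetables", "age_45_64", "diabetes_high"},
    {"vegetables", "fruit", "age_25_44", "diabetes_low"},
    {"soft_drinks", "fruit", "age_25_44", "diabetes_low"},
]


def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= record for record in records) / len(records)


def mine_rules(records, min_support=0.3, min_confidence=0.8, max_size=3):
    """Enumerate rules 'antecedent -> consequent' by brute force.

    Hypothetical helper: fine for a handful of items, far too slow for a
    real sales database, where an Apriori-style algorithm would be used.
    """
    items = sorted(set().union(*records))
    rules = []
    for size in range(2, max_size + 1):
        for itemset in combinations(items, size):
            s = support(itemset, records)
            if s < min_support:
                continue
            # Try each item as the consequent, the rest as the antecedent.
            for consequent in itemset:
                antecedent = tuple(i for i in itemset if i != consequent)
                confidence = s / support(antecedent, records)
                if confidence >= min_confidence:
                    rules.append((antecedent, consequent, s, confidence))
    return rules


if __name__ == "__main__":
    for antecedent, consequent, s, confidence in mine_rules(records):
        print(f"{' & '.join(antecedent)} -> {consequent} "
              f"(support={s:.2f}, confidence={confidence:.2f})")
```

Support measures how often the items of a rule occur together, and confidence measures how often the consequent holds when the antecedent does. In practice the resulting rules would still need the kind of interpretation the abstract mentions, since a high-confidence rule on its own does not establish a causal link.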