tations. Finally, correlations with low Lift values do
not appear in the graph, such as the rule Q04 = E ⇒
NON DROPOUT with Lift=1.15, which appears in
the tree.
Note that ExARN provides considerable additional
information about the classes. For example, for the
dropout class, other items may also influence it,
such as “Q10=A”, “Q05=C”, “Q03=B”, etc. The rules
related to these items have a good Lift value
(>=1.25), indicating that they should be explored.
The same is true for the non-dropout class.
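As a reminder of how the Lift values quoted above are read, the following is a minimal sketch of the standard Lift measure, lift(A ⇒ C) = supp(A ∪ C) / (supp(A) · supp(C)), where values above 1 indicate a positive correlation. It is not the ExARN implementation: the function name `lift` and the toy records are hypothetical and chosen only for illustration; they are not data from this study.

```python
def lift(transactions, antecedent, consequent):
    """Lift of the rule antecedent => consequent over a list of item sets."""
    n = len(transactions)
    # Count records containing the antecedent, the consequent, and both.
    n_a = sum(1 for t in transactions if antecedent <= t)
    n_c = sum(1 for t in transactions if consequent <= t)
    n_ac = sum(1 for t in transactions if (antecedent | consequent) <= t)
    if n_a == 0 or n_c == 0:
        raise ValueError("antecedent and consequent must both occur in the data")
    # supp(A u C) / (supp(A) * supp(C)), simplified to avoid repeated division.
    return n_ac * n / (n_a * n_c)


# Hypothetical toy records (illustrative only, not the Etec data).
records = [
    {"Q10=A", "DROPOUT"},
    {"Q10=A", "DROPOUT"},
    {"Q10=A", "NON DROPOUT"},
    {"Q05=C", "NON DROPOUT"},
    {"Q05=C", "NON DROPOUT"},
    {"Q05=C", "DROPOUT"},
]

value = lift(records, {"Q10=A"}, {"DROPOUT"})
# Flag the rule for exploration using the 1.25 threshold mentioned in the text.
worth_exploring = value >= 1.25
```

In this toy data the rule Q10=A ⇒ DROPOUT exceeds the 1.25 threshold, while Q05=C ⇒ DROPOUT has a Lift below 1, i.e., a negative correlation.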
Unlike the previous case (without undersampling),
the complementary view that ExARN offers relative
to the one expressed by C4.5 is clear. This view is
important because of the classifier’s poor performance
in predicting the dropout class, even after balancing
the dataset. Finally, it is interesting to note that
dominant items did not appear in the graphs, which
means that, in a sense, the classes are separable.
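The balancing step mentioned above can be illustrated with a minimal sketch of random undersampling, in which every class is reduced to the size of the smallest one. This is an assumption made for illustration: the exact undersampling procedure used in the experiments is not restated here, and the function name `random_undersample`, the `seed` parameter, and the toy rows are hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_undersample(rows, label_of, seed=0):
    """Return a class-balanced subset by randomly discarding majority-class rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[label_of(row)].append(row)
    # Every class is cut down to the size of the smallest class.
    smallest = min(len(group) for group in groups.values())
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    balanced = []
    for group in groups.values():
        balanced.extend(rng.sample(group, smallest))
    return balanced


# Hypothetical toy rows: 3 dropout students and 10 non-dropout students.
rows = [("s%d" % i, "DROPOUT") for i in range(3)]
rows += [("s%d" % i, "NON DROPOUT") for i in range(3, 13)]

balanced = random_undersample(rows, label_of=lambda row: row[1])
class_counts = Counter(label for _, label in balanced)
```

After this step both classes have three rows each, so a classifier or rule miner no longer sees the majority class more often than the minority class.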
6 CONCLUSIONS
This work presented the ExARN approach to treat the
dropout problem as a complementary view to what
is commonly used in the literature, i.e., classification
through C4.5. To this end, experiments were performed
with data from one of the Etec courses. ExARN was
found to be an interesting approach for understanding
the factors that lead a student to drop out. In addition,
it is a good alternative for unbalanced datasets.
It is important to note that C4.5 focuses on improving
accuracy to make good predictions, with interpretability
being a secondary result, while ExARN focuses on
presenting the statistically significant correlations that
exist among items, with prediction being a secondary
result. It is therefore worthwhile to treat the problem
from different views, to help the user better understand
it. This is a gap identified in the literature, described
in Section 2, since only a few works combine techniques
and use hybrid solutions. Thus, efforts should be made
to propose solutions following this idea. Multiple views
are needed when the focus is to understand the domain,
not only to classify.
As future work, we intend to propose a hybrid solution
to the dropout problem, mainly because it is an
important and unbalanced problem. As an indirect
result, an effort must be made at Etecs to store more
information about students in order to better map
their profiles.
ACKNOWLEDGEMENTS
We wish to thank CAPES and FAPESP (2019/04923-2)
for the financial aid.
REFERENCES
Al-shargabi, A. A. and Nusari, A. N. (2010). Discovering
vital patterns from UST students data by applying data
mining techniques. In 2nd International Conference
on Computer and Automation Engineering (ICCAE),
volume 2, pages 547–551.
Datta, S. and Mengel, S. (2015). Multi-stage decision
method to generate rules for student retention. Journal
of Computing Sciences in Colleges, 31(2):65–71.
Delen, D. (2011). Predicting student attrition with data min-
ing methods. Journal of College Student Retention:
Research, Theory & Practice, 13(1):17–35.
Gopalakrishnan, A., Kased, R., Yang, H., Love, M. B.,
Graterol, C., and Shada, A. (2017). A multifaceted
data mining approach to understanding what factors
lead college students to persist and graduate. In Com-
puting Conference, pages 372–381.
Gustian, D. and Hundayani, R. D. (2017). Combination of
AHP method with C4.5 in the level classification level
out students. In International Conference on Comput-
ing, Engineering, and Design (ICCED), page 6p.
Hegazi, M. O., Alhawarat, M., and Hilal, A. (2016). An
approach for integrating data mining with Saudi Uni-
versities database systems: Case study. International
Journal of Advanced Computer Science and Applica-
tions, 7(6):213–218.
Manhães, L. M. B., Cruz, S. M. S., and Zimbrão, G. (2014).
WAVE: An architecture for predicting dropout in un-
dergraduate courses using EDM. In Proceedings of
the 29th Annual ACM Symposium on Applied Com-
puting (SAC), pages 243–247.
Márquez-Vera, C., Cano, A., Romero, C., Noaman, A.
Y. M., Mousa Fardoun, H., and Ventura, S. (2016).
Early dropout prediction using data mining: A case
study with high school students. Expert Systems: The
Journal of Knowledge Engineering, 33(1):107–124.
Padua, R., Calcada, D. B., Carvalho, V. O., and Rezende,
S. O. (2018). Exploring the data using Extended As-
sociation Rule Network. In Brazilian Conference on
Intelligent Systems (BRACIS), pages 330–335.
Pereira, R. T. and Zambrano, J. C. (2017). Application of
decision trees for detection of student dropout profiles.
In 16th IEEE International Conference on Machine
Learning and Applications (ICMLA), pages 528–531.
Pertiwi, A. G., Widyaningtyas, T., and Pujianto, U. (2017).
Classification of province based on dropout rate us-
ing C4.5 algorithm. In International Conference on
Sustainable Information Engineering and Technology
(SIET), pages 410–413.
Tinto, V. (1993). Leaving College: Rethinking the Causes
and Cures of Student Attrition. University of Chicago
Press.
CSEDU 2020 - 12th International Conference on Computer Supported Education