[Plot: J-index (vertical axis, 0.3 to 0.7) for the models fACO, fBest, fAllData, fBagg, and fBoost on the Training and Test sets.]
Figure 2: Experimental results: Predictive accuracy improvement with ACO.
performed and a new expert is learned on public mood data collected from
social networks. The novelty of the proposed model combination lies in
reusing the structural elements of the individual models rather than their
outputs. Combining models this way not only improves the performance of the
composite model but also promotes model interpretability, a property that is
critical in the context of stock market decision making. In particular, it
helps discover which mood-state attributes and which mood scores are
responsible for a particular movement of the stock value. The high
complexity of combining expert chunks was resolved by applying a
metaheuristic, namely the ACO algorithm. Bayesian classifiers are used as an
interpretable type of prediction model. The results obtained on data from a
particular company in the stock market show that the derived model
outperforms four alternative approaches, including bagging and boosting.
Other types of prediction models will be considered in our future work. We
will also continue collecting more representative data from social media, in
particular to mature the problem of stock market prediction and to discover
better predictors.
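To make the role of the metaheuristic concrete, the following is a minimal, generic sketch of an ACO-style search, not the algorithm used in this work. It assumes the combination problem can be viewed as choosing one option per "slot" (for instance, which expert's chunk to reuse at each position) and that the fitness is an accuracy-like score in [0, 1]. The names `aco_select`, `toy_fitness`, and `TARGET`, as well as all parameter values, are hypothetical and chosen only for illustration.

```python
import random

def aco_select(n_slots, n_options, fitness, n_ants=20, n_iters=50, rho=0.1, seed=0):
    """Toy ACO: choose one option per slot so that fitness(solution) is maximized.

    Assumption: fitness returns a value in [0, 1] (an accuracy-like score).
    """
    rng = random.Random(seed)
    # One pheromone value per (slot, option) pair, initially uniform.
    tau = [[1.0] * n_options for _ in range(n_slots)]
    best, best_score = None, float("-inf")
    for _ in range(n_iters):
        for _ in range(n_ants):
            # Each ant samples an option for every slot, proportionally to pheromone.
            sol = [rng.choices(range(n_options), weights=tau[s])[0]
                   for s in range(n_slots)]
            score = fitness(sol)
            if score > best_score:
                best, best_score = sol, score
        for s in range(n_slots):
            # Evaporation: all trails decay by a factor (1 - rho).
            for o in range(n_options):
                tau[s][o] *= (1.0 - rho)
            # Reinforcement: deposit pheromone on the best solution found so far.
            tau[s][best[s]] += max(best_score, 0.0)
    return best, best_score

# Hypothetical toy problem: the score is the fraction of slots that match a
# hidden target assignment. A real fitness would evaluate a composite model.
TARGET = [2, 0, 1, 1, 2]

def toy_fitness(solution):
    matches = sum(1 for a, b in zip(solution, TARGET) if a == b)
    return matches / len(TARGET)

best, score = aco_select(n_slots=5, n_options=3, fitness=toy_fitness)
print(best, score)
```

In this sketch only the best-so-far solution deposits pheromone, which is one of several common update rules; other ACO variants let every ant deposit in proportion to its own score.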
KDIR 2015 - 7th International Conference on Knowledge Discovery and Information Retrieval