Chapelle, O., Schölkopf, B., and Zien, A. (2006).
Semi-supervised learning. MIT Press.
Czarnowski, I. and Jedrzejowicz, P. (2006). Instance reduc-
tion approach to machine learning and multi-database
mining. In Proceedings of the Scientific Session orga-
nized during XXI Fall Meeting of the Polish Informa-
tion Processing Society, Informatica, ANNALES Uni-
versitatis Mariae Curie-Skłodowska, Lublin, pages
60–71.
Devesa, J., Santos, I., Cantero, X., Penya, Y. K., and
Bringas, P. G. (2010). Automatic Behaviour-based
Analysis and Classification System for Malware Detection.
In Proceedings of the 12th International Conference on
Enterprise Information Systems (ICEIS),
pages 395–399.
Garner, S. (1995). Weka: The Waikato environment for
knowledge analysis. In Proceedings of the New
Zealand Computer Science Research Students Confer-
ence, pages 57–64.
Holmes, G., Donkin, A., and Witten, I. H. (1994). Weka: a
machine learning workbench. pages 357–361.
Kang, M., Poosankam, P., and Yin, H. (2007). Renovo:
A hidden code extractor for packed executables. In
Proceedings of the 2007 ACM workshop on Recurring
malcode, pages 46–53.
Kolter, J. and Maloof, M. (2004). Learning to detect ma-
licious executables in the wild. In Proceedings of
the 10th ACM SIGKDD international conference on
Knowledge discovery and data mining, pages 470–
478. ACM New York, NY, USA.
Liu, H. and Motoda, H. (2001). Instance selection and con-
struction for data mining. Kluwer Academic Pub.
Liu, H. and Motoda, H. (2008). Computational methods of
feature selection. Chapman & Hall/CRC.
Martignoni, L., Christodorescu, M., and Jha, S. (2007).
Omniunpack: Fast, generic, and safe unpacking of
malware. In Proceedings of the 23rd Annual Computer
Security Applications Conference (ACSAC),
pages 431–441.
McGill, M. and Salton, G. (1983). Introduction to modern
information retrieval. McGraw-Hill.
Morley, P. (2001). Processing virus collections. In Proceed-
ings of the 2001 Virus Bulletin Conference (VB2001),
pages 129–134. Virus Bulletin.
Moskovitch, R., Stopel, D., Feher, C., Nissim, N., and
Elovici, Y. (2008). Unknown malcode detection via
text categorization and the imbalance problem. In
Proceedings of the 6th IEEE International Conference
on Intelligence and Security Informatics (ISI), pages
156–161.
Namata, G., Sen, P., Bilgic, M., and Getoor, L. (2009). Col-
lective classification for text classification. Text Min-
ing, pages 51–69.
Neville, J. and Jensen, D. (2003). Collective classifica-
tion with relational dependency networks. In Proceed-
ings of the Workshop on Multi-Relational Data Min-
ing (MRDM).
Pyle, D. (1999). Data preparation for data mining. Morgan
Kaufmann.
Royal, P., Halpin, M., Dagon, D., Edmonds, R., and Lee,
W. (2006). Polyunpack: Automating the hidden-code
extraction of unpack-executing malware. In Proceedings
of the 22nd Annual Computer Security Applications
Conference (ACSAC), pages 289–300.
Santos, I., Nieves, J., and Bringas, P. (2011). Semi-
supervised learning for unknown malware detection.
In Abraham, A., Corchado, J., González, S., and
De Paz Santana, J., editors, International Symposium
on Distributed Computing and Artificial Intelligence,
volume 91 of Advances in Intelligent and Soft Com-
puting, pages 415–422. Springer Berlin / Heidelberg.
Santos, I., Penya, Y., Devesa, J., and Bringas, P. (2009).
N-Grams-based file signatures for malware detection.
In Proceedings of the 11th International Conference
on Enterprise Information Systems (ICEIS), Volume
AIDSS, pages 317–320.
Schapire, R. (2003). The boosting approach to machine
learning: An overview. Lecture Notes in Statistics,
pages 149–172.
Schultz, M., Eskin, E., Zadok, E., and Stolfo, S. (2001).
Data mining methods for detection of new malicious
executables. In Proceedings of the 22nd IEEE Symposium
on Security and Privacy, pages 38–49.
Shafiq, M., Khayam, S., and Farooq, M. (2008). Embedded
Malware Detection Using Markov n-Grams. Lecture
Notes in Computer Science, 5137:88–107.
Sharif, M., Yegneswaran, V., Saidi, H., Porras, P., and Lee,
W. (2008). Eureka: A Framework for Enabling Static
Malware Analysis. In Proceedings of the European
Symposium on Research in Computer Security (ESORICS),
pages 481–500.
Singh, Y., Kaur, A., and Malhotra, R. (2009). Compara-
tive analysis of regression and machine learning meth-
ods for predicting fault proneness models. Interna-
tional Journal of Computer Applications in Technol-
ogy, 35(2):183–193.
Tsang, E., Yeung, D., and Wang, X. (2003). OFFSS: op-
timal fuzzy-valued feature subset selection. IEEE
transactions on fuzzy systems, 11(2):202–213.
Zhou, Y. and Inge, W. (2008). Malware detection using
adaptive data compression. In Proceedings of the 1st
ACM workshop on Workshop on AISec, pages 53–60.
ACM New York, NY, USA.
Zhou, Y., Jorgensen, Z., and Inge, M. (2007). Combating
Good Word Attacks on Statistical Spam Filters with
Multiple Instance Learning. In Proceedings of the 19th
IEEE International Conference on Tools with Artificial
Intelligence-Volume 02, pages 298–305.
SECRYPT 2011 - International Conference on Security and Cryptography