Figure 6: Fraud scheme for transaction types.
5 CONCLUSION
For the last three decades, Data Mining and Machine Learning
algorithms have been used to tackle the delicate problem of fraud
detection in the banking industry. These algorithms face three main
problems: (i) the strong reluctance of bankers to supply a third
party with a set of real-world transactions, for confidentiality
reasons, (ii) the representativeness of the data set and, to a lesser
extent, its completeness, and (iii) the huge number of transactions
that must be analyzed to detect potential frauds.
This paper presents an operational program, called TOM4FFS, that
addresses these three problems. The main advantages of the method
are that (i) it is purely syntactic, which guarantees strict
confidentiality, and (ii) it reduces the complexity of the fraud
detection problem from $O(n^2)$ to $O(n)$. The TOM4FFS program is
thus able to handle more than 4 billion transactions a day, online
and in real time, on a standard personal computer. This paper
describes the TOM4FFS program and its application to a real-world
fraud example from a worldwide French bank. Our current work
concerns the extension of the approach to more complex fraud
schemata and its application to the general problem of conformity in
banking and other industries.
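To make the orders of magnitude concrete, the following Python sketch
shows the sustained rate implied by 4 billion transactions a day, and a
generic single-pass scan whose cost grows linearly with the number of
transactions. It is an illustration under stated assumptions, not the
TOM4FFS algorithm: the event layout (timestamp, account, transaction
type), the pattern, and the names required_throughput and scan_stream
are all hypothetical. The scan compares only transaction type symbols,
which is one possible reading of "purely syntactic".

```python
from collections import defaultdict

SECONDS_PER_DAY = 24 * 60 * 60


def required_throughput(transactions_per_day):
    """Average number of transactions per second for a daily volume."""
    return transactions_per_day / SECONDS_PER_DAY


def scan_stream(events, pattern):
    """Flag accounts whose transaction types contain `pattern` in order.

    `events` is an iterable of (timestamp, account, tx_type) tuples
    (hypothetical layout). Each event is examined exactly once, so the
    cost is linear in the number of events. Only the type symbols are
    compared; amounts and identities are never inspected.
    """
    progress = defaultdict(int)  # account -> pattern symbols matched so far
    alerts = []
    for timestamp, account, tx_type in events:
        matched = progress[account]
        if tx_type == pattern[matched]:
            matched += 1
        if matched == len(pattern):
            alerts.append((account, timestamp))
            matched = 0  # start looking for a new occurrence
        progress[account] = matched
    return alerts


if __name__ == "__main__":
    # Hypothetical transaction types, chosen only for illustration.
    events = [
        (1, "A", "open"),
        (2, "B", "open"),
        (3, "A", "credit"),
        (4, "A", "transfer"),
        (5, "B", "transfer"),
        (6, "A", "close"),
    ]
    print(round(required_throughput(4_000_000_000)))
    print(scan_stream(events, ("open", "credit", "close")))
```

Four billion transactions a day correspond to roughly 46,300
transactions per second on average. In the toy stream, only account A
completes the ordered pattern, at timestamp 6.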
Discovering Internal Fraud Models in a Stream of Banking Transactions