transactions with payment of not-covered checks in
other banks, resulting in the formation of loops.
One of the algorithms from (Poppe et al., 2017b) can
be used to solve this problem: M-CET, based on
depth-first search (DFS); T-CET, based on
breadth-first search (BFS); or the hybrid H-CET
method. M-CET is the more memory-efficient of the
first two, T-CET is the faster, and H-CET combines
the benefits of both. H-CET partitions the graph into
subgraphs (graphlets) (Andreev et al., 2004; Karypis
et al., 1995; Tsourakakis et al., 2012), applying DFS
to some graphlets and BFS to the others. Partitioning
a graph into graphlets is itself a non-trivial task.
Even with the hybrid H-CET method, storage
requirements and the number of CPU operations still
grow exponentially for large CET graphs (Poppe et
al., 2017b).
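The memory trade-off between the two traversal orders can be illustrated with a minimal sketch. It is not the M-CET or T-CET algorithm of Poppe et al.; it only contrasts generic DFS and BFS chain enumeration on a hypothetical toy graph (the names GRAPH, chains_dfs and chains_bfs are invented for this illustration).

```python
# Hypothetical toy graph: an edge u -> v means event v may follow event u.
GRAPH = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}


def chains_dfs(graph, node, path=None):
    """Yield maximal chains one at a time.

    Only the paths along the current branch are alive at any moment.
    """
    path = (path or []) + [node]
    if not graph[node]:
        yield path
    for nxt in graph[node]:
        yield from chains_dfs(graph, nxt, path)


def chains_bfs(graph, start):
    """Build chains level by level.

    All partial chains of the current level are held in memory at once.
    """
    frontier, done = [[start]], []
    while frontier:
        next_frontier = []
        for path in frontier:
            successors = graph[path[-1]]
            if not successors:
                done.append(path)
            for nxt in successors:
                next_frontier.append(path + [nxt])
        frontier = next_frontier
    return done


dfs_result = sorted(chains_dfs(GRAPH, "a"))
bfs_result = sorted(chains_bfs(GRAPH, "a"))
print(dfs_result == bfs_result)
```

Both functions return the same chains; they differ only in how many partial chains are held at once, and neither avoids the growth in the number of chains themselves.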
Figure 1: CET graph example.
Table 2 shows an algorithm for executing query
Q1 using the Neo4j graph database.
The CET graph in Figure 1 is built at the level of
transactions between banks. It does not take into
account that a transaction includes not only the name
of the bank but also the account and check number.
In the developed algorithm, a transaction (event) is
described by a tuple (source, destination, q). Each of
the source and destination elements has three fields
(bank, account, check). The flag q is also introduced:
q = 0 denotes a transaction paying a check in another
bank, e.g., A → B (0), and q = 1 denotes covering a
check in another bank, e.g., B → A (1).
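The event tuple described above might be represented as follows. This is only a sketch: the type names Endpoint and Event, and the account and check values, are hypothetical and chosen for illustration.

```python
from typing import NamedTuple


class Endpoint(NamedTuple):
    """One side of a transaction: the three fields named in the text."""
    bank: str
    account: str
    check: str


class Event(NamedTuple):
    """A transaction (source, destination, q)."""
    source: Endpoint
    destination: Endpoint
    q: int  # 0 = paying a check in another bank, 1 = covering a check


# Account and check identifiers below are made up for the example.
a_side = Endpoint("A", "acc-1", "chk-7")
b_side = Endpoint("B", "acc-2", "chk-7")

pay = Event(a_side, b_side, 0)    # A -> B (0)
cover = Event(b_side, a_side, 1)  # B -> A (1)
```

Because named tuples compare field by field, two endpoints are equal only when bank, account and check all match, which is the level of comparison the text asks for.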
A node z is created for the new event e (line 2).
Next, a search is performed for the set Y of events
whose check must be paid or covered in the same
bank (and account) as specified in e.source (line 3).
In the example in Figure 2, Y = {A → B (0), C →
B (1)}. Lines 4-8 establish links between node z and
the nodes in Y. Then, starting from z, a search is
performed for a node x with which e forms a loop
(line 9), for example, x = A → B (0) and z = B → A
(1). A loop is a sign of a possible fraudulent
transaction. If a loop is found (line 10), all chains
leading to the beginning of the loop are searched for
(line 12). One of them is the loop itself. Figure 2
shows an example of a found chain.
Figure 2: Chain for the loop (A→B(0), B→A(1)).
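The node-creation, linking and loop-search steps can be approximated in memory as below. This is a sketch only, not the Neo4j algorithm of Table 2: the class EventGraph is hypothetical, endpoints are reduced to bank names, and the loop condition (a q = 1 event reversing a q = 0 event) is assumed from the example in the text. The search for chains is not covered.

```python
from collections import namedtuple

# Bank-level simplification: in the paper, source and destination each
# carry (bank, account, check); plain strings stand in for them here.
Event = namedtuple("Event", "source destination q")


class EventGraph:
    """Hypothetical in-memory stand-in for the graph kept in Neo4j."""

    def __init__(self):
        self.nodes = []  # events, in arrival order
        self.links = {}  # node id -> ids of linked earlier nodes

    def add_event(self, e):
        z = len(self.nodes)  # create node z for event e
        self.nodes.append(e)
        # Y: earlier events that end where e starts
        y = [i for i in range(z) if self.nodes[i].destination == e.source]
        self.links[z] = y  # link z to every node in Y
        # x: a payment (q = 0) in Y that e covers (q = 1) in reverse
        loops = [x for x in y
                 if e.q == 1 and self.nodes[x].q == 0
                 and self.nodes[x].source == e.destination]
        return y, loops


g = EventGraph()
g.add_event(Event("A", "B", 0))
g.add_event(Event("C", "B", 1))
y, loops = g.add_event(Event("B", "A", 1))
print([g.nodes[i] for i in y])      # the set Y for e = B -> A (1)
print([g.nodes[x] for x in loops])  # events that form a loop with e
```

The linear scan used to build Y is the step a graph database or an index is meant to avoid.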
4 A PROPOSED METHOD FOR
DETECTING FRAUD USING AN
INDEX
The following features of the event flow can be
noted:
1. The right side of a transaction coincides with
the left side of the next transaction (NEXT) at the level
of the bank name, account number and check, for
example, B → C (0) and C → A (0) (see query Q1).
2. A sign of possible fraud is the presence of loops
when covering checks, for example, A → B (0) and
B → A (1) (see C1, C2 in Table 1).
These features make it possible to implement query
Q1 using an index (B-tree or hash table). Table 3
shows the algorithm for processing the event flow.
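The two features can be written as simple predicates. The sketch below is illustrative: the function names is_next and is_loop are hypothetical, and endpoints are reduced to bank names, whereas the text compares bank, account and check together.

```python
from collections import namedtuple

Event = namedtuple("Event", "source destination q")


def is_next(prev, nxt):
    """Feature 1: the right side of prev equals the left side of nxt."""
    return prev.destination == nxt.source


def is_loop(pay, cover):
    """Feature 2: cover (q = 1) goes back to where pay (q = 0) started."""
    return (pay.q == 0 and cover.q == 1
            and pay.destination == cover.source
            and pay.source == cover.destination)


print(is_next(Event("B", "C", 0), Event("C", "A", 0)))
print(is_loop(Event("A", "B", 0), Event("B", "A", 1)))
```

Both predicates are equality tests on an endpoint, which is why a keyed lookup in a B-tree or hash table is sufficient to evaluate them.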
In this algorithm, Z, y and j are global variables, and
index1 and index2 are indices (B-tree or hash table).
To speed up the search, event e is stored in the index
as two records: (e.source; e.destination, e.q, en1) and
(e.destination; e.source, e.q, en1), where the first
attribute is the search key (lines 1 and 2). The search
keys e.source and e.destination need not be unique.
The next four operators determine whether there is a
loop, for example, A → B (0) and e = B → A (1) (see
Figure 2). First, it is determined whether the check is
being covered (line 4). Then the events A → B (0 or
1) are extracted from index1 (line 5). If d (= B) is
equal to e.source (= B) and q = 0 (line 8), then there
is a loop (line 9). The loop is stored in x, and the
chains leading to the beginning of the loop are
searched for (line 10). The problem is the same as in
Algorithm 1, but it is solved in a different way. The
‘chains’ procedure is used for this, so let us look at it
in detail.
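The storage and loop-test steps described above might look as follows with hash tables standing in for the indices. This is a sketch under stated assumptions, not the algorithm of Table 3: the class EventIndex is hypothetical, index1 is assumed to be keyed by source and index2 by destination, en1 is taken to be a sequential event number, and endpoints are reduced to bank names. The chains procedure is not included.

```python
from collections import defaultdict, namedtuple

Event = namedtuple("Event", "source destination q")


class EventIndex:
    """Two hash-table indices; lists allow non-unique search keys."""

    def __init__(self):
        self.index1 = defaultdict(list)  # source -> (destination, q, en)
        self.index2 = defaultdict(list)  # destination -> (source, q, en)
        self.en = 0                      # assumed: running event number

    def process(self, e):
        """Store e as two records and return the numbers of the
        earlier events with which e forms a loop."""
        self.en += 1
        self.index1[e.source].append((e.destination, e.q, self.en))
        self.index2[e.destination].append((e.source, e.q, self.en))
        loops = []
        if e.q == 1:  # the check is being covered
            # events that start where e ends, e.g. A -> B for e = B -> A
            for d, q, en_prev in self.index1.get(e.destination, []):
                if d == e.source and q == 0:
                    loops.append(en_prev)
        return loops


idx = EventIndex()
first = idx.process(Event("A", "B", 0))
second = idx.process(Event("C", "B", 1))
third = idx.process(Event("B", "A", 1))
print(first, second, third)
```

Each new event costs two insertions and one keyed lookup, so the loop test does not scan the events already stored.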
The procedure has the following formal input
parameters: i is the recursion level of the chains
call, a is the key value for searching in index2
(source), and c1 is the final value in the chain
(destination). Let there be a loop A → B (0) and e =
B → A (1) (see Figure 2). Then, at the first call to the
chains procedure, the parameters (1, B, A) (line 10) are