
 
applications. The sequential patterns at lower 
concept levels often carry more specific and 
concrete information and those at higher concept 
levels carry more general information. This requires 
progressively deepening the mining process to 
multiple concept levels. In many cases, concept 
hierarchies over items are available. Given a set of 
transactions and a concept hierarchy over items 
contained in the transactions, association patterns at 
any level of the hierarchy can be found by 
developing appropriate algorithms (Chen et al., 
2001). 
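To illustrate what mining at different concept levels means, the following minimal sketch rolls transactions up a small concept hierarchy and counts support at a chosen level. The hierarchy, the item names, and the helpers `path_to_root`, `ancestor_at_level` and `support_at_level` are hypothetical and are not taken from Chen et al. (2001).

```python
from collections import Counter

# Hypothetical concept hierarchy: child -> parent (None marks the top level).
PARENT = {
    "skim milk": "milk",
    "whole milk": "milk",
    "wheat bread": "bread",
    "milk": "food",
    "bread": "food",
    "food": None,
}

def path_to_root(item):
    """Return [root, ..., item], the chain of concepts from the top level down."""
    path = [item]
    while PARENT.get(path[-1]) is not None:
        path.append(PARENT[path[-1]])
    return path[::-1]

def ancestor_at_level(item, level):
    """Generalize an item to `level` (1 = most general).
    Items that already sit above the requested level are returned unchanged."""
    path = path_to_root(item)
    return path[min(level, len(path)) - 1]

def support_at_level(transactions, level):
    """Count, per concept at `level`, the number of transactions containing it."""
    counts = Counter()
    for t in transactions:
        # A set ensures each concept is counted once per transaction.
        counts.update({ancestor_at_level(i, level) for i in t})
    return counts

transactions = [
    {"skim milk", "wheat bread"},
    {"whole milk"},
    {"skim milk"},
]
level2 = support_at_level(transactions, 2)  # general concepts: milk, bread
level3 = support_at_level(transactions, 3)  # specific items
```

In this toy data "milk" is supported by all three transactions at level 2, while no single level-3 item is, which is the sense in which higher levels carry more general information.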
Quantitative association rules over a set of 
purchased items in a customer transaction database 
were defined over quantitative and categorical 
attributes of the items (Srikant and Agrawal, 1996). 
The values of categorical attributes were mapped to 
a set of contiguous integers. The domain of each 
quantitative attribute was discretized into intervals 
by fine-partitioning the values of the attribute and 
combining adjacent partitions as necessary, and 
the intervals were then mapped to contiguous 
integers. As a result, each attribute had the form 
<attribute, value> where value was the mapped 
integer of an interval for quantitative attributes or a 
single value for categorical attributes. The 
algorithms for finding Boolean association rules can 
then be applied to the transformed database to 
discover quantitative association rules. Several 
such algorithms have been proposed (Agrawal and 
Srikant, 1994; Agrawal and Srikant, 1995; Chen et 
al., 2001). However, few of them address 
quantitative sequential patterns. In addition, 
discretising the domain of quantitative attributes 
yields intervals that may not be concise or 
meaningful enough for human experts to easily 
obtain nontrivial knowledge from the discovered 
rules. 
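The mapping described above can be sketched as follows. This is a simplified illustration that assumes equal-width intervals and skips the step of merging adjacent partitions; the attribute names and the helpers `make_cuts`, `interval_code` and `categorical_codes` are hypothetical, not the procedure of Srikant and Agrawal (1996).

```python
from bisect import bisect_right

def make_cuts(lo, hi, n):
    """Split [lo, hi] into n equal-width intervals and return the n-1 interior cut points."""
    width = (hi - lo) / n
    return [lo + width * k for k in range(1, n)]

def interval_code(value, cuts):
    """Map a quantitative value to the 1-based index of the interval it falls in."""
    return bisect_right(cuts, value) + 1

def categorical_codes(values):
    """Map categorical values to contiguous integers (sorted order, starting at 1)."""
    return {v: k for k, v in enumerate(sorted(set(values)), start=1)}

ages = [23, 31, 38, 45, 59]
cuts = make_cuts(20, 60, 4)                      # cut points at 30, 40, 50
age_codes = [interval_code(a, cuts) for a in ages]
married = categorical_codes(["yes", "no", "yes"])

# Each record becomes a set of <attribute, value> pairs over integer values,
# which a Boolean association-rule algorithm can treat as ordinary items.
records = [("age", c) for c in age_codes]
```

Note that ages 31 and 38 receive the same code while 38 and 45 do not, which shows how a hard interval boundary can separate nearby values.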
In this study, we present an algorithm for 
mining sequential patterns at multiple levels with 
quantitative attributes. Instead of the partition 
method discussed above, fuzzy sets are 
introduced into the algorithm. Fuzzy sets were 
proposed by Zadeh (1965), and much progress has 
since been made in their theory and application 
(Chen et al., 2001). Fuzzy sets are considered 
better suited than the partition method because 
they provide a smooth transition between 
membership and non-membership of a set. The use 
of fuzzy techniques also makes the algorithms more 
resilient to noise and missing values in the 
databases. Fuzzy concepts are not confined to a 
single attribute; they can be defined on a set of 
attributes. 
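The contrast between a crisp partition and a fuzzy one can be sketched for a single quantitative attribute such as purchased quantity. The three linguistic terms, their breakpoints (2, 6 and 10) and the crisp cut at 4 are assumptions chosen only for illustration; the membership functions actually used by the method may differ.

```python
def low(q):
    """Membership in 'low': 1 up to q = 2, falling linearly to 0 at q = 6."""
    return max(0.0, min(1.0, (6 - q) / 4))

def medium(q):
    """Membership in 'medium': triangular, peak at q = 6, zero at q = 2 and q = 10."""
    return max(0.0, 1 - abs(q - 6) / 4)

def high(q):
    """Membership in 'high': 0 up to q = 6, rising linearly to 1 at q = 10."""
    return max(0.0, min(1.0, (q - 6) / 4))

def crisp_label(q, cut=4):
    """Crisp partition for comparison: a hard boundary at `cut`."""
    return "low" if q < cut else "not low"

def fuzzify(q):
    """Return the degree to which q belongs to each linguistic term."""
    return {"low": low(q), "medium": medium(q), "high": high(q)}

near_boundary = [3.9, 4.1]
crisp = [crisp_label(q) for q in near_boundary]   # labels flip at the cut
fuzzy = [fuzzify(q) for q in near_boundary]       # degrees change only slightly
```

Two quantities on either side of the crisp cut receive different labels, whereas their fuzzy membership degrees differ only marginally.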
The proposed method for mining fuzzy 
multiple-level sequential patterns uses a 
hierarchically encoded customer-sequence table, 
instead of the original customer transaction table. 
The problem of mining multiple-level sequential 
patterns with quantitative attributes can be split into 
four steps:  
(1) Transforming the original database into a 
hierarchically encoded customer-sequence table; 
(2) Fuzzy partitioning of each quantitative 
attribute at each concept level; 
(3) Finding all fuzzy large sequences at every 
concept level using a top-down, progressively 
deepening mining process;  
(4) Generating all fuzzy sequential patterns and 
sequential rules from the result of step 3.  
Step 3 is the most crucial step of the method. Once 
all the fuzzy large sequences at each concept 
level have been discovered, it is straightforward 
to derive the corresponding sequential patterns and 
sequential rules. 
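As a rough illustration of step 1, the sketch below groups transactions by customer, orders them by transaction time, and replaces each item with a hierarchy code. The encoding scheme (one digit per concept level, so that "112" means category 1, subcategory 1, brand 2), the item names, and the helpers `to_customer_sequences` and `generalize` are assumptions made for this example; quantities are omitted for brevity.

```python
from collections import defaultdict

# Hypothetical hierarchical codes: one digit per concept level.
CODE = {
    "skim milk A": "111",
    "skim milk B": "112",
    "whole milk A": "121",
    "wheat bread A": "211",
}

def to_customer_sequences(rows):
    """rows: iterable of (customer_id, transaction_time, items).
    Returns {customer_id: [itemset, ...]} with itemsets in time order
    and every item replaced by its hierarchy code."""
    by_customer = defaultdict(list)
    for cid, time, items in rows:
        by_customer[cid].append((time, frozenset(CODE[i] for i in items)))
    return {
        cid: [s for _, s in sorted(ts, key=lambda pair: pair[0])]
        for cid, ts in by_customer.items()
    }

def generalize(sequence, level):
    """View an encoded sequence at a concept level: keep the first `level`
    digits of each code and mask the remaining digits with '*'."""
    return [
        frozenset(code[:level] + "*" * (len(code) - level) for code in s)
        for s in sequence
    ]

rows = [
    (1, 20, ["wheat bread A"]),
    (1, 10, ["skim milk A", "whole milk A"]),
    (2, 15, ["skim milk B"]),
]
seqs = to_customer_sequences(rows)
top = generalize(seqs[1], 1)   # customer 1 viewed at the most general level
mid = generalize(seqs[1], 2)   # customer 1 viewed one level deeper
```

With such an encoding, a top-down search can begin with the level-1 view and examine deeper levels only for concepts that turned out to be large at the level above.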
The paper is organized as follows. Section 2 
introduces some related concepts of multiple-level 
sequential patterns and fuzzy partitions of the 
quantitative attributes. Based on these concepts the 
problem of mining fuzzy multiple-level sequential 
patterns can be formally characterized. Section 3 
describes the method for mining fuzzy multiple-
level sequential patterns in detail. An algorithm for 
discovering large sequences at each concept level is 
presented and discussed. Section 4 concludes this 
study.  
2  PROBLEM STATEMENT  
In a given customer transactions database D, each 
transaction consists of the following fields: 
customer-id, transaction-time, and the items 
purchased in the transaction. No customer has more 
than one transaction at the same transaction-time. 
Each item is a binary variable representing whether 
an item was bought or not. Let $I = \{i_1, i_2, \ldots, i_n\}$ be a 
set of literals. An itemset is a non-empty set of 
items. A sequence is a non-empty and ordered list of 
itemsets. We denote an itemset by $(i_1, i_2, \ldots, i_m)$, 
where $i_j$ is an item. The length of an itemset is the 
number of items in it. An itemset of length k is 
called a k-itemset. We denote a sequence S by 
$\langle s_1, s_2, \ldots, s_n \rangle$, where $s_j$ is an itemset. The 
length of a sequence is the number of itemsets in it. 
A sequence of length k is called a k-sequence. The 
sequence formed by the concatenation of two 
sequences A and B is denoted as $\langle A, B \rangle$. The 
following concept definitions are based on Agrawal 
and Srikant (1995) and Chen et al. (2001). 
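As a brief aside on the notation just introduced, itemsets and sequences can be represented directly as data structures. The sketch below is one possible representation (frozensets inside tuples); the helper names `itemset`, `sequence` and `concat` are hypothetical.

```python
def itemset(*items):
    """A non-empty set of items; its length is the number of items."""
    if not items:
        raise ValueError("an itemset must be non-empty")
    return frozenset(items)

def sequence(*itemsets):
    """A non-empty ordered list of itemsets; its length is the number of itemsets."""
    if not itemsets:
        raise ValueError("a sequence must be non-empty")
    return tuple(itemsets)

def concat(a, b):
    """Concatenation <A, B>: the itemsets of A followed by those of B."""
    return a + b

A = sequence(itemset("i1", "i2"), itemset("i3"))   # a 2-sequence
B = sequence(itemset("i2"))                        # a 1-sequence
AB = concat(A, B)                                  # a 3-sequence
```

Here `len(A[0])` is 2, so the first element of A is a 2-itemset, and `len(AB)` is 3, so the concatenation is a 3-sequence.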
Definition 1: All the transactions of a customer 
can together be viewed as a sequence, where each 
transaction corresponds to a set of items, and the list 
of transactions, ordered by increasing transaction-
time, corresponds to a sequence. A transaction made 
FUZZY MULTIPLE-LEVEL SEQUENTIAL PATTERNS DISCOVERY FROM CUSTOMER TRANSACTION
DATABASES