• Experiment data collection. We collected 8000
documents from Japanese patents from the year
1999 for training and 4000 documents from the
year 2000 for testing.
• Feature selection. We used the CHI_avg feature
selection method to convert each document into a
5000-dimensional vector.
• Base classifier selection. Because of its effectiveness
and wide usage in text categorization, SVMlight was
employed as the base classifier in each module. Each
SVMlight had a linear kernel function, and c (the
trade-off between training error and margin) was set to 1.
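The CHI-average (CHI_avg) feature scoring mentioned above can be sketched as follows; this is a minimal illustration of chi-square term scoring averaged over categories, assuming documents are represented as sets of terms and that categories are weighted by their priors (the paper does not spell out the exact weighting, so that detail is an assumption):

```python
from collections import Counter

def chi2_term_category(docs, labels, term, cat):
    # Contingency counts for term t and category c:
    #   A: docs in c containing t     B: docs outside c containing t
    #   C: docs in c without t        D: docs outside c without t
    N = len(docs)
    A = sum(1 for d, y in zip(docs, labels) if y == cat and term in d)
    B = sum(1 for d, y in zip(docs, labels) if y != cat and term in d)
    C = sum(1 for d, y in zip(docs, labels) if y == cat and term not in d)
    D = N - A - B - C
    denom = (A + C) * (B + D) * (A + B) * (C + D)
    return 0.0 if denom == 0 else N * (A * D - C * B) ** 2 / denom

def chi_avg(docs, labels, term):
    # CHI_avg: chi-square averaged over categories, weighted by priors.
    prior = Counter(labels)
    N = len(docs)
    return sum((n / N) * chi2_term_category(docs, labels, term, c)
               for c, n in prior.items())

def select_features(docs, labels, k):
    # Keep the k highest-scoring terms as the vector dimensions.
    vocab = set().union(*docs)
    return sorted(vocab, key=lambda t: chi_avg(docs, labels, t),
                  reverse=True)[:k]
```

In the experiments above, k would be 5000; each document is then mapped to a vector over the selected terms.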
For evaluating the effectiveness of category assignments
by classifiers to documents, we use the standard
recall, precision, and F1 measures.
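For reference, the standard definitions are P = TP/(TP+FP), R = TP/(TP+FN), and F1 = 2PR/(P+R). A minimal multi-label implementation, assuming micro-averaging over all category assignments (the paper does not state which averaging mode it uses, so micro-averaging is an assumption):

```python
def micro_prf(gold, pred):
    # gold, pred: one set of category labels per document.
    tp = sum(len(g & p) for g, p in zip(gold, pred))  # correct assignments
    fp = sum(len(p - g) for g, p in zip(gold, pred))  # spurious assignments
    fn = sum(len(g - p) for g, p in zip(gold, pred))  # missed assignments
    P = tp / (tp + fp) if tp + fp else 0.0
    R = tp / (tp + fn) if tp + fn else 0.0
    F1 = 2 * P * R / (P + R) if P + R else 0.0
    return P, R, F1
```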
The results comparison is shown in Table 1. In the
table, four methods (Min-max, SCS, ACS, and DTCS)
are listed in the first column, and DTCS using a
pruned C4.5 is also listed as DTCS(p). In the last column,
#module denotes the average number of modules
used to predict a sample in the modular network.
Table 1: Results comparison of patent categorization.

Method    P(%)    R(%)    F1(%)   #module
Min-max   71.51   71.48   71.49   700
ACS       71.51   71.48   71.49   350
SCS       71.51   71.48   71.49   250
DTCS      71.10   71.05   71.08   200
DTCS(p)   71.86   71.83   71.84   140
From Table 1, we can see that DTCS with a
pruned C4.5 has the best performance and the lowest
prediction cost, using only 140 modules per sample
on average, a five-fold reduction relative to min-max.
We also notice that DTCS with an unpruned C4.5
performed worse than min-max, because of the
over-fitting problem of the unpruned C4.5 algorithm.
By selecting fewer modules than the min-max method,
DTCS greatly improves parallel prediction efficiency.
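The module-selection idea can be sketched as follows. In a min-max network, module M(i, j) is trained on class i's samples as positives and class j's as negatives; outputs sharing a positive subset are combined by MIN, and those results by MAX. DTCS evaluates only the modules the decision tree selects, rather than all of them. The `select_modules` interface below is hypothetical, standing in for the C4.5 tree's root-to-leaf decision path:

```python
def min_max_combine(rows):
    # rows[i]: outputs of all modules sharing positive subset i.
    # MIN within a row (every negative subset must agree),
    # then MAX across rows.
    return max(min(vals) for vals in rows)

def dtcs_predict(x, select_modules, modules):
    # select_modules(x): (i, j) index pairs chosen by the decision tree
    #                    (hypothetical interface).
    # modules[(i, j)](x): real-valued output of base classifier M(i, j).
    chosen = select_modules(x)  # a small subset of all modules
    rows = {}
    for i, j in chosen:
        rows.setdefault(i, []).append(modules[(i, j)](x))
    # Same min-max combination, restricted to the selected modules.
    return max(min(vals) for vals in rows.values())
```

Because only the selected modules are evaluated, and those evaluations are independent, the per-sample work and hence the parallel response time shrink with the number of selected modules.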
5 CONCLUSIONS
We applied a decision tree to the module combination
step of the min-max modular network. Compared with
the min-max approach, the advantages of the new
method are its lower time complexity in prediction,
especially in parallel prediction, and its better
adaptability to weak low-layer sub-classifiers.
Our future work is to adapt the traditional decision
tree algorithm to the M3 network, and to continue
analyzing and testing the generalization performance
of DTCS through a series of experiments, with a focus
on large-scale patent categorization.
ACKNOWLEDGEMENTS
This work was partially supported by the Na-
tional Natural Science Foundation of China (Grant
No. 60773090 and Grant No. 90820018),
the National Basic Research Program of China
(Grant No. 2009CB320901), and the National
High-Tech Research Program of China (Grant No.
2008AA02Z315).
REFERENCES
Chu, X., Ma, C., Li, J., Lu, B., Utiyama, M., and Isahara, H.
Large-scale patent classification with min-max modu-
lar support vector machines. In Proc. of International
Joint Conference on Neural Networks, pages 3972–
3977.
Lu, B., Bai, Y., Kita, H., and Nishikawa, Y. (1993). An
efficient multilayer quadratic perceptron for pattern
classification and function approximation. In Proc. of
International Joint Conference on Neural Networks.,
pages 1385–1388.
Lu, B. and Ito, M. (1999). Task decomposition and mod-
ule combination based on class relations: A modular
neural network for pattern classification. IEEE Trans-
actions on Neural Networks, 10(5):1244–1256.
Lu, B., Shin, J., and Ichikawa, M. (2004). Massively par-
allel classification of single-trial EEG signals using a
min-max modular neural network. IEEE Transactions
on Biomedical Engineering, 51(3):551.
Quinlan, J. (1993). C4.5: Programs for Machine Learning.
Morgan Kaufmann.
Vilalta, R. and Drissi, Y. (2002). A perspective view and
survey of meta-learning. Artificial Intelligence Re-
view, 18(2):77–95.
Wang, K., Zhao, H., and Lu, B. (2005). Task decomposition
using geometric relation for min–max-modular svms.
Lecture Notes in Computer Science, 3496:887–892.
Ye, Z. (2009). Parallel Min-Max Modular Support Vector
Machine with Application to Patent Classification (in
Chinese). Master Thesis, Shanghai Jiao Tong Univer-
sity.
Ye, Z. and Lu, B. (2007). Learning Imbalanced Data Sets
with a Min-Max Modular Support Vector Machine.
In Proc. of International Joint Conference on Neural
Networks, pages 1673–1678.
Zhao, H. and Lu, B. (2005). Improvement on response per-
formance of min-max modular classifier by symmetric
module selection. Lecture Notes in Computer Science,
3971:39–44.
Zhao, H. and Lu, B. (2006). A modular reduction method
for k-nn algorithm with self-recombination learning.
Lecture Notes in Computer Science, 3971:537–542.
IJCCI 2009 - International Joint Conference on Computational Intelligence