...
// Algorithm details
var algorithm = "tminer.kdd.classification.tdidt.TDIDTClassifier";
var parameters = new tminer.model.adt.Dictionary();
parameters.add("divisionRule", "GainRatio");
parameters.add("binarySplits", true);
parameters.add("pruning", true);
parameters.add("pruningCF", 0.25);
// Cross-validation experiment
var experiment = new classification.CrossValidation();
experiment.type = algorithm;
experiment.parameters = parameters;
experiment.partitions = 10;
experiment.dataset = dataset;
experiment.encoder = encoder;
experiment.classAttribute = "classLabel";
experiment.discrete = discreteAttributes;
experiment.continuous = continuousAttributes;
experiment.run();
// Experiment results
experiment;
Figure 6: TMinerScript code snippet needed to run a cross-validation experiment.
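For readers unfamiliar with the procedure, the following plain JavaScript sketch illustrates what an experiment with 10 partitions, as configured in Figure 6, amounts to conceptually. It is not TMiner code and makes no claim about how the framework's CrossValidation component is implemented: the names kFoldIndices, crossValidate, and trainAndTest are hypothetical and introduced only for illustration, and the round-robin assignment of instances to partitions is an assumption (a real experiment would typically shuffle or stratify the data first).

```javascript
// Hypothetical helper (not part of TMiner): distribute the instance
// indices 0..n-1 over k partitions in round-robin order.
function kFoldIndices(n, k) {
  var folds = [];
  for (var f = 0; f < k; f++) {
    folds.push([]);
  }
  for (var i = 0; i < n; i++) {
    folds[i % k].push(i);
  }
  return folds;
}

// Hypothetical driver (not part of TMiner): each partition is held out
// once as the test set while the remaining k-1 partitions form the
// training set. trainAndTest(train, test) is a caller-supplied callback
// that builds a classifier and returns its accuracy on the test set.
function crossValidate(n, k, trainAndTest) {
  var folds = kFoldIndices(n, k);
  var total = 0;
  for (var f = 0; f < k; f++) {
    var train = [];
    for (var g = 0; g < k; g++) {
      if (g !== f) {
        train = train.concat(folds[g]);
      }
    }
    total += trainAndTest(train, folds[f]);
  }
  // Average accuracy over the k held-out partitions.
  return total / k;
}
```

Under these assumptions, a call such as crossValidate(dataset.length, 10, callback) would play the role of experiment.run() in Figure 6, with the callback standing in for the classifier selected through experiment.type and experiment.parameters.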
ACKNOWLEDGEMENTS
Work partially supported by research project
TIN2006-07262.
NOTES ON THE ARCHITECTURAL DESIGN OF TMINER - Design and Use of a Component-based Data Mining
Framework