where biologists, biochemists and ILP experts may
solve this kind of problem. The ILP system
available for the experiments is Aleph.
Each site running the application allows domain
and ILP experts to implement active Collaborative
Data Mining tasks. Domain experts provide problems
and data (examples) and the ILP experts develop the
background knowledge predicates for those problems.
Each site has a library of available predicates
organised according to a hierarchical structure
defined by domain experts. Each stored predicate has
an English description of its function, and the
detailed implementation is hidden from the domain
expert.
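
As an illustration only, the predicate library described above could be modelled as a tree of categories whose leaves carry a name, an English description and a hidden implementation. The sketch below is not the application's actual data model: the class names `Predicate` and `Category`, the category paths and the predicate names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Predicate:
    name: str                                     # hypothetical, e.g. "helix/2"
    description: str                              # English text shown to the domain expert
    source: str = field(default="", repr=False)   # implementation, never displayed


@dataclass
class Category:
    name: str
    predicates: list = field(default_factory=list)
    children: dict = field(default_factory=dict)

    def add(self, path, predicate):
        """Store a predicate under a category path such as ["protein", "structure"]."""
        node = self
        for part in path:
            node = node.children.setdefault(part, Category(part))
        node.predicates.append(predicate)

    def find(self, path):
        """Return every predicate stored at or below the given category path."""
        node = self
        for part in path:
            node = node.children.get(part)
            if node is None:
                return []
        found, stack = [], [node]
        while stack:
            current = stack.pop()
            found.extend(current.predicates)
            stack.extend(current.children.values())
        return found

    def describe(self, path):
        """The domain expert's view: names and descriptions, no implementation."""
        return [(p.name, p.description) for p in self.find(path)]
```

Calling `describe(["protein"])` on such a library would list the predicates of that category and of its sub-categories without exposing any Prolog source.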
We have provided an interface (see Figure 2)
where the domain expert may assemble the back-
ground knowledge by searching and choosing pred-
icates from this hierarchically organised library of
predicates. At this stage Web services are used to
search other web sites where the application is
deployed, looking for predicates of the required
category. This procedure may save time in the
development of the background knowledge. An ILP
expert is required only when the domain expert
decides to use some knowledge that is neither encoded
as a predicate locally nor available through the Web
services.
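
The search just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the deployed implementation: each remote site is modelled as a plain Python callable standing in for a Web service call, and the function name `gather_predicates`, the site names and the predicate names are hypothetical.

```python
def gather_predicates(category, local_library, remote_sites):
    """Collect predicates of one category from the local site and from other sites.

    local_library: dict mapping a category name to a list of predicate names.
    remote_sites:  dict mapping a site name to a callable(category) -> list of
                   predicate names (a stand-in for a Web service request).
    """
    local = list(local_library.get(category, []))
    remote = {}
    for site_name, lookup in remote_sites.items():
        try:
            hits = list(lookup(category))
        except OSError:
            # An unreachable site should not abort the whole search.
            continue
        if hits:
            remote[site_name] = hits
    return {
        "local": local,
        "remote": remote,
        # An ILP expert is needed only if nothing was found anywhere.
        "needs_ilp_expert": not local and not remote,
    }
```

The `needs_ilp_expert` flag mirrors the rule stated in the text: the expert is called in only when neither the local library nor any reachable site offers a predicate of the required category.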
Before starting the data analysis experiments the
user may use the UI to inspect existing results of
other experiments on the data set, if publicly
available. This gives the user an idea of which
background knowledge has been tried and what the
corresponding results were, and therefore helps to
avoid repeating useless experiments or choosing
predicates that seem to be of no use for the analysis
of the data.
The expert may carry out a sequence of experiments
in which models are constructed and shown to the
expert. Each step of the experimental process is
recorded, so the expert may inspect previously
constructed models and, in the end, decide which
models to store as final results of the analysis
process. The expert may also decide which information
to make public.
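
One possible way to picture the recording of experiments is a simple log that stores every run, lets the expert check whether a set of predicates has already been tried, and marks selected models as final and optionally public. The sketch below is illustrative only: the classes `Experiment` and `ExperimentLog`, the `accuracy` field and the example names are assumptions, not part of the described application.

```python
from dataclasses import dataclass, field


@dataclass
class Experiment:
    predicates: frozenset   # background knowledge used in this run
    model: str              # learned theory, kept as plain text here
    accuracy: float         # hypothetical quality measure
    final: bool = False     # chosen as a final result by the expert
    public: bool = False    # visible to other sites


@dataclass
class ExperimentLog:
    dataset: str
    steps: list = field(default_factory=list)

    def record(self, predicates, model, accuracy):
        """Record one step of the experimental process."""
        experiment = Experiment(frozenset(predicates), model, accuracy)
        self.steps.append(experiment)
        return experiment

    def already_tried(self, predicates):
        """Earlier runs that used exactly this set of predicates."""
        wanted = frozenset(predicates)
        return [e for e in self.steps if e.predicates == wanted]

    def keep(self, index, public=False):
        """Mark a recorded model as a final result, optionally making it public."""
        self.steps[index].final = True
        self.steps[index].public = public

    def published(self):
        """Final results that the expert chose to share."""
        return [e for e in self.steps if e.final and e.public]
```

Because every step stays in `steps`, earlier models remain available for inspection even when only a few are kept as final results.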
6 CONCLUSIONS
In this paper we have described a framework for
Collaborative Data Mining. At each site the framework
enables the solving of domain problems with the help
of ILP experts who develop the background knowledge
and use ILP systems. Web services search other
sites for publicly available information that is
relevant for the solving of problems.
The use of Web services extends the traditional
approach to Collaborative Data Mining by
implementing a passive form of Collaborative Data
Mining that searches other web sites for relevant
information.
ACKNOWLEDGEMENTS
This study was funded by FCT project “ILP-Web-
services” (PTDC/EIA/70841/2006).
KDIR 2010 - International Conference on Knowledge Discovery and Information Retrieval