
 
2  BACKGROUND 
In general, knowledge can be experience, concepts, 
values, or beliefs that increase an individual’s ability 
to take effective action (F. Zheng, 2008). 
Knowledge can be either implicit or explicit. The 
former consists of tacit experience, which arises 
from individual ideas, intuition, values and 
judgements. This type of knowledge is dynamic in 
nature and can be accessed only through direct 
participation and communication with the field 
experts who possess it. The know-how of each 
knowledge worker is accordingly based on this tacit 
(or implicit) knowledge. Explicit knowledge, by 
contrast, includes anything that can be saved in an 
electronic format; in other words, whatever we are 
able to transcribe and share.  
Knowledge discovery can be defined as “the non-
trivial extraction of implicit, unknown, and 
potentially useful information from data. When 
working with texts, knowledge discovery refers 
generally to the process of extracting interesting 
information from a large amount of unstructured 
textual documents. The goal of this process is to find 
and extract useful patterns. To do this, specific 
methods and algorithms from the fields of machine 
learning and statistics are applied. Text mining is, 
thus, the application of these algorithms and 
methods to texts” (U.M. Fayyad, 1996).   
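As a rough illustration of extracting a simple pattern from unstructured text, the sketch below reports the terms that recur across several documents. It is only a minimal example under stated assumptions: the function name frequent_terms, the stop-word list and the sample documents are hypothetical and are not taken from the cited work, and real text mining would use far richer statistical or machine learning methods.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "on"}


def frequent_terms(documents, min_docs=2):
    """Return terms found in at least `min_docs` documents, with document counts."""
    doc_freq = Counter()
    for text in documents:
        # Count each term once per document, ignoring case and stop words.
        tokens = set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS
        doc_freq.update(tokens)
    return {term: n for term, n in doc_freq.items() if n >= min_docs}


# Hypothetical sample documents.
docs = [
    "Grid resources are shared among institutions.",
    "The workflow routes documents to grid resources.",
    "Knowledge workers share documents.",
]
patterns = frequent_terms(docs)
```

Here the "pattern" is simply a term shared by two or more documents; the threshold min_docs is an arbitrary choice for the example.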
A grid is an infrastructure which allows shared 
resources to be coordinated inside dynamic 
organisations, whether these consist of individuals, 
institutions or resources. It offers a flexible 
environment in which resources can be dynamically 
reorganised without altering any active processing 
on the grid, and it provides connectivity for data 
distributed across different locations. This can 
resolve transparency problems related to location 
while providing a mechanism which allows easier 
access to and management of distributed data, as 
well as the virtualisation and sharing of 
grid-connected resources. To handle computationally 
intensive procedures, the platform can provide 
automatic allocation of resources, scheduling and 
algorithm implementation according to the 
availability, capacity and position of these 
resources. A grid can increase efficiency while 
reducing the cost of computational networks by 
decreasing data processing times, optimizing 
resources, and distributing the workload. Users thus 
obtain the results of large operations faster and at 
lower cost (I. Foster, 2001).   
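The allocation of resources according to availability, capacity and position can be sketched as a simple greedy assignment. This is an illustrative simplification, not the scheduling policy of any particular grid middleware: the Resource fields, the distance measure and the allocate function are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    available: bool   # whether the node currently accepts work
    capacity: int     # free compute units (hypothetical measure)
    distance: int     # hypothetical cost of reaching the data from this node


def allocate(jobs, resources):
    """Greedily give each job to the nearest available node with enough capacity."""
    plan = {}
    for job, demand in jobs.items():
        candidates = [r for r in resources
                      if r.available and r.capacity >= demand]
        if not candidates:
            plan[job] = None  # no suitable node for this job
            continue
        # Prefer the closest node; break ties by larger free capacity.
        best = min(candidates, key=lambda r: (r.distance, -r.capacity))
        best.capacity -= demand
        plan[job] = best.name
    return plan


# Hypothetical nodes and job demands.
resources = [
    Resource("node-a", True, 8, 3),
    Resource("node-b", True, 4, 1),
    Resource("node-c", False, 16, 0),
]
jobs = {"index": 4, "cluster": 6, "report": 2}
plan = allocate(jobs, resources)
```

In this run the unavailable node is never chosen even though it is the closest and largest, and the workload ends up distributed over the two remaining nodes.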
Attempts to automate knowledge processes date 
to the early 1980s. Several processes have been 
employed on parallel computing platforms to 
achieve high performance on the analysis of large 
data sets stored on a single site. Recently, the 
demand for knowledge processes has expanded to 
include the management and analysis of multi-site 
and multi-owner data repositories. This task involves 
large data sets, the geographic distribution of data, 
users and resources, and computationally intensive 
analysis; it therefore demands new parallel and 
distributed platforms for knowledge processes, such 
as computational grid technology. The resulting 
application of grid technology to the knowledge 
field has been termed Knowledge-Grid 
(M. Cannataro, 2001). 
Workflow automation technology has been 
developed to facilitate organizational coordination 
and collaboration by automating entire work 
processes and controlling the flow of information 
among participants. A workflow can be used to 
define the work process, control activity requests, 
route relevant documents to the appropriate agents, 
enforce deadlines, and monitor the progress of work 
(S. X. Sun, 2008). The Workflow Management 
Coalition (WFMC) defines a workflow as “… the 
total or partial automation of business procedures 
where documents, information or tasks are passed 
between participants according to a defined set of 
rules …” (www.wfms.com). A business process is a 
group of necessary tasks together with a set of 
conditions which determine the order of their 
completion. A task is a logical unit of work that a 
resource must perform in its entirety. A resource 
can be a person or a machine, or a group of persons 
or machines, that performs specific tasks. The 
performance of a task by a resource is called an 
activity (Wil van der Aalst, 2002).  
Hence, a workflow can be seen as a structure which 
not only contains tasks/activities but also 
coordinates and supervises their execution. 
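The terms defined above (task, ordering condition, resource, activity) can be illustrated with a small data-structure sketch. It assumes, for simplicity, that each ordering condition is just "these other tasks must be completed first"; the names Task, Activity and run_workflow, and the sample process, are hypothetical and do not correspond to any specific workflow management system.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    # Ordering condition: tasks that must be completed before this one.
    requires: set = field(default_factory=set)


@dataclass
class Activity:
    """The performance of one task by one resource."""
    task: str
    resource: str


def run_workflow(tasks, assignment):
    """Execute tasks in an order that respects their conditions; return the activities."""
    done, log = set(), []
    pending = list(tasks)
    while pending:
        ready = [t for t in pending if t.requires <= done]
        if not ready:
            raise ValueError("ordering conditions cannot be satisfied")
        for t in ready:
            log.append(Activity(t.name, assignment[t.name]))
            done.add(t.name)
            pending.remove(t)
    return log


# Hypothetical business process and task-to-resource assignment.
tasks = [
    Task("approve", {"review"}),
    Task("submit"),
    Task("review", {"submit"}),
]
assignment = {"submit": "author", "review": "editor", "approve": "manager"}
log = run_workflow(tasks, assignment)
```

The returned log plays the supervising role mentioned above: it records which resource performed which task and in what order, regardless of the order in which the tasks were declared.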
Different types of workflow have been identified: 
Collaborative workflows manage less rigid 
processes and allow connections among the users 
closest to the collaboration as well as among work 
groups; 
Structured workflows manage structurally 
well-defined and repeatable activities which can be 
specified through a series of rules. Examples of 
structured workflows are: (a) administrative 
workflows, which manage the flow of electronic 
forms, integrating them with message systems or 
email; and (b) production workflows, which manage 
the flow of well-structured work, defined by 
well-formalised rules and dependencies; 
Ad Hoc workflows are created by using lighter 
systems which give the user the task of identifying 
the correct procedural steps to take each time a 