specific job. To this end, this paper proposes
a heuristic mechanism that computes the reputation
of a single resource by considering a set of attributes
that we refer to as the Resource Operative Context (ROC).
Usually derived from log files, ROC attributes
allow the resource behaviour to be evaluated over
time. To avoid querying information on the behaviour
of every single resource and to obtain information faster
and more reliably, we propose to cluster resources
that exhibit similar behavioural patterns and share
similar operative contexts. Based on this
information, the Grid scheduler may restrict its
choice to the resources that are potentially capable of
providing efficient job execution. The paper presents
preliminary results concerning the implementation
of the proposed approach in a simulated Grid
environment.
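As an illustration only, the clustering idea outlined above might be sketched as follows. The two ROC attributes (failure rate and normalized delay), the sample values, the helper names kmeans and candidate_resources, and the rule for picking the preferred cluster are all assumptions made for this sketch and are not taken from the paper. The code uses a plain Lloyd-style k-means iteration with a deterministic start, which is not necessarily the variant used in the actual implementation.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def kmeans(points, k, iterations=100):
    """Plain Lloyd-style k-means. Returns (centroids, labels)."""
    if k < 1 or k > len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Deterministic start (an assumption): the first k points are the seeds.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep centroid
        if new_labels == labels and new_centroids == centroids:
            break  # fixed point reached
        labels, centroids = new_labels, new_centroids
    return centroids, labels


def candidate_resources(roc, k):
    """Return the names of the resources in the preferred cluster.

    Assumption: every attribute is scaled to [0, 1] and lower is better,
    so the cluster whose centroid has the smallest attribute sum is preferred.
    """
    names = list(roc)
    centroids, labels = kmeans([roc[n] for n in names], k)
    best = min(range(k), key=lambda c: sum(centroids[c]))
    return sorted(n for n, lab in zip(names, labels) if lab == best)


# Hypothetical ROC vectors: (failure rate, normalized execution delay).
roc = {
    "r1": (0.02, 0.10),
    "r2": (0.40, 0.80),
    "r3": (0.03, 0.12),
    "r4": (0.45, 0.90),
    "r5": (0.01, 0.08),
    "r6": (0.38, 0.85),
}

print(candidate_resources(roc, 2))  # ['r1', 'r3', 'r5']
```

With these sample values, the scheduler would consider only r1, r3 and r5 instead of querying all six resources individually. In practice, the attributes would have to be scaled consistently before distances between resources are meaningful.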
2 RELATED WORK
A survey of existing methodologies for trust and
reputation in grid environments is presented in (G.C.
Silaghi et al., 2007), which discusses many reputation-based
trust management systems and their suitability
for grids. The complexities of enterprise grid
environments and the importance of QoS in grids are
discussed in (D.A. Menasce & E. Casalicchio, 2004),
which stresses the need to consider SLAs and cost
constraints in grid scheduling. Concepts related
to trust and reputation are validated by mathematical
definitions in (W. Xinhua et al., 2005). (B. Ma et al.,
2006) present an approach to compute and compare
the trustworthiness of entities in the same
autonomous domain and in different domains. In (Thamarai
Selvi Somasundaram et al., 2007), a general-purpose
trust management system for computational grids is
presented, based on several pieces of information that can be
obtained directly by the grid scheduler. A model of
trust aspects in executing collaborative distributed
services is presented in (C. Argiolas et al., 2008).
3 ASSESSING REPUTATION
The reputation of a resource reveals its reliability in
executing jobs. The previous section emphasizes the
importance of a resource’s past behaviour for
assessing its reputation. However, in an open and
decentralized scientific Grid, there is no
centralized authority that collects and maintains
reputation information. Additionally, users only
submit their jobs and are not asked for feedback.
This makes it impossible for the grid scheduler to
have updated reputation information about the whole
network, since some Grids may have thousands of
resources. Possible approaches include using
distributed data structures or evaluating reputation
from local knowledge of the resource of interest. Our
proposal treats the evaluation of reputation
as a centralized activity supported by a Reputation
Management System (RMS), depicted in Figure 1,
that periodically interacts with every grid resource.
The RMS is intended to support the grid scheduler in
selecting resources by assessing their reputation and
to calculate the delay in executing each user job. The
RMS can be configured to be either proactive,
active, or passive. In a proactive configuration, the
RMS will actively stop all communication with
resources that do not conform to the level of
reputation required by the Grid scheduler. In the
active and passive configurations, the RMS will not intervene
as strongly as in the proactive configuration, but it will
report non-compliant resource behaviours to the Grid
scheduler. In a passive configuration, the RMS will
only log task behaviour, while the resource’s
conformance to the agreed reputation level will be
verified at a later time. The RMS includes a
Performance Module (PM) that estimates the
resource’s reputation as follows. It takes periodic
data on resources from the scheduler log file and
structures this information in relational tables whose
attributes express basic characteristics of the
resource behaviour, which we collectively refer to as the
Resource Operative Context (ROC). Attributes
differ in a number of ways. For example, they can
be of different types (i.e. quantitative or qualitative)
and may contain explicit relationships to one
another. A Database Management System is
responsible for the storage and management of
ROC information. Hence, each ROC is an instance
of a relational table that represents the resource
operative status with respect to time. As mentioned
earlier, we are not interested in evaluating the most
accurate reputation of each single resource, but in
clustering resources that exhibit similar behaviours
in executing jobs. To reduce the effort required to
achieve this, the RMS uses an application that
periodically queries the databases and carries out
further analysis on the
retrieved data. Specifically, this analysis is based on
a more general technique known as instance-based
learning, which uses the retrieved data as a training set
for the k-means algorithm (J.A. Hartigan &
M.A. Wong, 1979), a popular clustering
technique. Taking into consideration the scale of the
considered attribute, each cluster can be labelled by
ICEIS 2008 - International Conference on Enterprise Information Systems