$$T_{N,E} = T_{up} + T_{conv} + T_{down} = N\left[\frac{S_{PNG} + S_{JPG}}{B} + \frac{O_{max}}{E} + K_{OH}\,E\right] \qquad (1)$$
where $S_X$ is the average size of an image in
format X (for the dimensions used in the
conversion), $B$ is the available bandwidth,
$O_{max}$ is the maximum time spent by the slowest
Executor to convert one image, and $K_{OH}$ is an
overhead factor accounting for the creation and
consumption of threads. This model is only a
simplification of the expected behavior, since
numerous variables fluctuate over time, mostly
because of network latency.
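As an illustration, equation (1) can be evaluated directly. The sketch below is a minimal transcription of the model, not the authors' implementation; the function name, the parameter names, and the units (bytes, bytes per second, seconds) are assumptions made for the example.

```python
def total_time(n, e, s_png, s_jpg, bandwidth, o_max, k_oh):
    """Estimate T_{N,E} from equation (1), in seconds.

    n          number of images (N)
    e          number of Executors (E)
    s_png      average PNG size in bytes (assumed unit)
    s_jpg      average JPG size in bytes (assumed unit)
    bandwidth  available bandwidth B in bytes per second (assumed unit)
    o_max      slowest per-image conversion time O_max, in seconds
    k_oh       thread overhead factor K_OH, in seconds per Executor
    """
    if n < 0 or e < 1:
        raise ValueError("n must be >= 0 and e must be >= 1")
    transfer = (s_png + s_jpg) / bandwidth  # upload plus download, per image
    conversion = o_max / e                  # conversion time shared by e Executors
    overhead = k_oh * e                     # cost of creating and consuming threads
    return n * (transfer + conversion + overhead)
```

Setting `k_oh` to zero gives the simplified form in which only the conversion term depends on the number of Executors.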
If we consider the thread overhead to be
negligible for small values of E and N, the equation
reduces to the form:
$$T_{N,E} = \frac{k_{1,N}}{E} + k_{2,N} \qquad (2)$$
which indicates that the optimal performance point
is reached when E is at its maximum, E=N. However,
the gain in performance diminishes as we add more
Executors (for document 3, there is a 48% gain
when increasing E to 2, but only 11% for E=8).
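One way to read equation (2) against equation (1), assuming the overhead term $K_{OH}\,E$ is simply dropped, is to match the constants term by term. The identification of $k_{1,N}$ and $k_{2,N}$ below is inferred from equation (1) and is not stated explicitly in the text. The last line shows why the gain flattens: the time saved by adding one more Executor falls off roughly as $1/E^2$.

```latex
\begin{align*}
T_{N,E} &= N\left[\frac{S_{PNG}+S_{JPG}}{B} + \frac{O_{max}}{E}\right]
        = \frac{k_{1,N}}{E} + k_{2,N}, \\
k_{1,N} &= N\,O_{max}, \qquad
k_{2,N} = \frac{N\,(S_{PNG}+S_{JPG})}{B}, \\
T_{N,E} - T_{N,E+1} &= \frac{k_{1,N}}{E\,(E+1)}.
\end{align*}
```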
Following the Decision block discussion above, we
should define a minimum gain for this conversion
service. Since it is difficult to perform this
calculation with so many variables fluctuating
unpredictably, we relied on our measured results
instead. For a minimum performance gain of 15%, we
set the maximum threshold (number of Executors)
to 6, with no minimum.
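The threshold selection described above could be sketched as follows. This is only one plausible reading: the gain is assumed here to be the relative reduction in run time between consecutive measured Executor counts, and the timings in `SAMPLE_TIMES` are invented for illustration and are not the measurements reported in this paper. The function name is likewise hypothetical.

```python
def max_useful_executors(times, min_gain):
    """Return the largest Executor count whose step from the previous
    measured count still cuts the run time by at least min_gain.

    times     dict mapping Executor count -> measured run time in seconds
    min_gain  minimum relative gain per step, as a fraction (0.15 = 15%)
    """
    counts = sorted(times)
    best = counts[0]
    for prev, cur in zip(counts, counts[1:]):
        gain = (times[prev] - times[cur]) / times[prev]
        if gain < min_gain:
            break  # further Executors no longer pay off
        best = cur
    return best


# Invented timings (seconds) for illustration only, not measured data.
SAMPLE_TIMES = {1: 100.0, 2: 52.0, 4: 33.0, 6: 27.0, 8: 24.0}

print(max_useful_executors(SAMPLE_TIMES, 0.15))  # prints 6
```

With these invented numbers the step from 6 to 8 Executors yields about 11%, below the 15% minimum, so the search stops at 6.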
7 CONCLUSIONS
We have presented a generic Grid-based layer, which
we will develop in order to offer generic and
stateless services that can be executed in parallel.
We have also shown a simple case study in which the
performance of a basic service improved greatly
once we fully used the Grid capabilities, executing
up to five times faster.
The analysis was conducted for a very simple
application scenario. Some work remains to be done to
create robust and intelligent blocks for the Manager.
As future work, we plan to define a minimum set of
simple, application-agnostic services required for
digital libraries. We will analyze their requirements
and define the rules for their Decision blocks. With
all these settings well established, we shall implement
the applications and create Web Services that will
serve as the interface to execute the services.
ACKNOWLEDGEMENTS
This work was funded in part by FCT – Portuguese
Foundation for Science and Technology – grant
number SFRH/BD/23976/2005.
IMPROVING PERFORMANCE OF BACKGROUND JOBS IN DIGITAL LIBRARIES USING GRID COMPUTING